reproducibility
Reproducing and addressing Data Leakage issue: Duplicates in dataset
Hello! In this blog post, I will explore a common issue in machine learning called data leakage, using an example from the paper: Benedetti, P., Perri, D., Simonetti, M., Gervasi, O.
Kyrillos Ishak
Last updated on Aug 24, 2024
SummerofReproducibility24
Final blog: Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon
The project aims to develop a service that facilitates the automated replication of COMPSs experiments within the Chameleon infrastructure.
Archit Dabral, Raül Sirvent
Last updated on Aug 24, 2024
SoR
Final Blogpost: HDEval's LLM Benchmarking for HDL Design
Hello everyone! I’m Ashwin Bardhwaj, an undergraduate student at UC Berkeley. As part of Micro Architecture Santa Cruz (MASC), my proposal, under the mentorship of Jose Renau and Sakshi Garg, aims to create a suite of benchmark programs for HDEval.
Ashwin Bardhwaj
Last updated on Aug 24, 2024
Deriving Realistic Performance Benchmarks for Python Interpreters
Hi, I am Mrigank. I am one of the Summer of Reproducibility fellows for 2024, and I will be working on deriving realistic performance benchmarks for Python interpreters with Ben Greenman from the University of Utah.
Mrigank Pawagi
Last updated on Aug 19, 2024
Final Blog: FEP-Bench: Benchmarking for Enhanced Feature Engineering and Preprocessing in Machine Learning
Hello, I’m Lihaowen (Jayce) Zhu, a 2024 SoR contributor for the FEP-bench project, under the mentorship of Yuyang (Roy) Huang. Before we start, let’s recap the goal of our project and our progress up to the midterm.
Lihaowen (Jayce) Zhu
Last updated on Aug 19, 2024
Final Blog: FSA - Benchmarking Fail-Slow Algorithms
Hello! I hope you’re enjoying the summer as much as I am. I’m excited to join the SoR community as a 2024 contributor. My name is Xikang Song, and I’m thrilled to collaborate with mentors Ruidan Li and Kexin Pei on the FSA-Benchmark project.
Kexin Pei, Ruidan Li, Xikang Song
Last updated on Aug 18, 2024
Data Leakage in Applied ML
Hello everyone! I have been working on reproducing the results from Characterization of Term and Preterm Deliveries using Electrohysterograms Signatures. This paper aims to predict preterm birth using a Support Vector Machine with an RBF kernel.
Shaivi Malik
Last updated on Aug 19, 2024
SoR
[MidTerm] ScaleRep: Reproducing and benchmarking scalability bugs hiding in cloud systems
Hey there, scalability enthusiasts and fellow researchers! I’m excited to share my progress on the ScaleRep project for SoR 2024 under the mentorship of Bogdan "Bo" Stoica and Yang Wang. Here’s a glimpse into how we’re tackling scalability bugs in large-scale distributed systems.
Zahra Nabila Maharani
Last updated on Aug 6, 2024
SummerofReproducibility24
Mid-term Blog: Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon
Hello everyone, I’m Archit from India, an undergraduate student at the Indian Institute of Technology, Banaras Hindu University, IIT (BHU), Varanasi. As part of the Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon project, my proposal, under the mentorship of Raül Sirvent, aims to develop a service that facilitates the automated replication of COMPSs experiments within the Chameleon infrastructure.
Archit Dabral, Raül Sirvent
Last updated on Aug 24, 2024
SoR
Mid Term Blog: FetchPipe: Data Science Pipeline for ML-based Prefetching
Hello, I’m Peiran Qin, a CS student at the University of Chicago, currently working on the project FetchPipe: Data Science Pipeline for ML-based Prefetching under the mentorship of Prof. Haryadi S.
Peiran Qin
Last updated on Jul 27, 2024