LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

When and Where

Friday, July 12, 2024 12:00 pm to 1:00 pm

Speakers

Naman Jain, UC Berkeley

Description

In this talk, we introduce LiveCodeBench, a comprehensive, contamination-free benchmark for code LLMs that continuously collects new problems from LeetCode, AtCoder, and Codeforces. Beyond code generation, LiveCodeBench evaluates a wide range of capabilities, including self-repair, code execution, and test output prediction. It currently hosts 400 coding problems published between May 2023 and May 2024. We evaluated 18 base LLMs and 34 instruction-tuned LLMs, and present findings on contamination, holistic performance comparisons, and potential overfitting.

Please join us. Everyone is welcome; the event is free, and no university affiliation is required.

About Naman Jain

Naman Jain is a CS Ph.D. student at UC Berkeley, where he focuses on using machine learning to enhance developer productivity tools such as program analysis, synthesis, and repair. He also explores how synthesis and verification can improve the generalizability and explainability of algorithms. He holds an undergraduate degree from IIT Bombay, where he researched NLP robustness and computer vision. Before his Ph.D., he was a predoctoral research fellow at Microsoft Research India, working on program repair, improving large language models, and learning decision trees.
