BSCS Bench – College CS Curriculum AI Benchmark

by charlielockyer · Apr 15, 2026 · 1 point · 0 comments

AI Analysis

●●● Banger · Big Brain · Solve My Problem

Real CS coursework beats synthetic coding benchmarks for model evaluation.

Strengths
  • 66 actual assignments across 11 courses provide comprehensive curriculum coverage
  • Cost and time metrics alongside accuracy show practical deployment tradeoffs
  • Live leaderboard with multiple model versions enables direct comparison
Weaknesses
  • Single university curriculum may not generalize to other CS programs
  • No open submission process for other institutions to add their courses
Target Audience

AI researchers, educators, and teams evaluating coding models

Similar To

HumanEval · SWE-bench · LiveCodeBench

Post Description

Can frontier models complete a college CS curriculum? I tested exactly that with BSCS-bench: 66 assignments across 11 core courses in Rice University’s CS curriculum.

I also wrote a companion essay discussing the effects of these results on higher education as a whole: https://www.bscsbench.com/blog/no-calculators-please

Similar Projects

AI/ML · ●●● Banger

Legal RAG Bench

Legal RAG benchmark showing that embedding quality matters more than LLM choice, by a 19-point margin.

Big Brain · Niche Gem · Solve My Problem
beowa
414mo ago