All the LM solutions on SWE-bench are bloated compared to humans

by lieret·Mar 4, 2026·1 point·0 comments

AI Analysis

○ Pass

Twitter thread with a chart; not a product or tool.

Strengths

Weaknesses

• No actionable tool, framework, or code is provided; the post is pure analysis on Twitter.
• No link to a reproducible methodology or dataset beyond the thread.

AI/ML · ●●● Banger

Game-based AI benchmark measuring spatial reasoning against human speedrun records.

Big Brain · Niche Gem

ClassicRob

2013h ago

Transparent proxy cuts Codex context tokens by 87% via working memory.

Big Brain · Niche Gem

george_ciobanu

1021mo ago

AI/ML · ●● Solid

Multilingual tokenization comparison across Arabic, Chinese, and French, which LangSmith ignores.

Big Brain · Niche Gem

lognebudo

103mo ago

AI/ML · ●●● Banger

97% on SWE-bench Verified with full artifact transparency, not just a score claim.

Big Brain · Zero to One

kimjune01

2029d ago

AI/ML · ●● Solid

Beats humans at pronunciation scoring but doesn't ship product integration yet.

Big Brain · Wizardry

fabiosuizu

1314mo ago

Site is currently blocked behind Cloudflare; cannot assess project functionality or merit.

Fubwubs

103mo ago