Back to browse
XML, Markdown, or JSON: Which gives LLMs the most reliable boundaries?

XML, Markdown, or JSON: Which gives LLMs the most reliable boundaries?

by systima·Mar 5, 2026·3 points·2 comments

AI Analysis

●●SolidBig Brain

Settles the delimiter format debate with data—Markdown fails under adversarial inputs on MiniMax.

Strengths
  • Rigorous methodology: 600 model calls, 600 human judge assessments, two rounds of increasing difficulty—cuts through anecdote-based advice.
  • Actionable finding: 'format rarely matters' contradicts confident Anthropic/OpenAI recommendations, plus real exception (MiniMax+Markdown) moves beyond hand-waving.
  • Open-sourced benchmark and raw data on GitHub enables reproducibility and follow-up research.
Weaknesses
  • Audience is narrow: matters mainly to prompt engineers in regulated industries or adversarial threat models; 80% of teams won't change behavior.
  • Tested only four frontier models (missing smaller open-source and reasoning models); MiniMax result feels like an outlier, not a trend.
Category
Target Audience

Prompt engineers, AI reliability researchers, LLM practitioners in regulated industries

Similar To

HELM · MTEB · LMSys Chatbot Arena

Similar Projects

AI/ML●●Solid

AgentNexus – coordinate LLM agents by service boundary, not role

Service boundaries beat agent roles for coordination — 281 tests back the architecture.

Big BrainShip It
dugubuyan
601d ago