Back to browse
sllm – Split a GPU node with other developers, unlimited tokens

sllm – Split a GPU node with other developers, unlimited tokens

by jrandolf·Apr 4, 2026·188 points·104 comments

AI Analysis

●●SolidBold BetBig Brain

Cohort-based GPU sharing cuts DeepSeek V3 costs from $14k to $40/month.

Strengths
  • Novel economic model — nobody charged until cohort fills, reducing individual risk.
  • OpenAI-compatible API via vLLM means zero code changes to existing integrations.
  • Privacy commitment with no traffic logging addresses enterprise security concerns.
Weaknesses
  • Throughput caps at 15-35 tok/s may not suit high-volume production workloads.
  • 3-month commitment on larger models reduces flexibility for experimentation.
Target Audience

Developers needing affordable access to large language models

Similar To

RunPod · Lambda Labs · Together AI

Post Description

Running DeepSeek V3 (685B) requires 8×H100 GPUs which is about $14k/month. Most developers only need 15-25 tok/s. sllm lets you join a cohort of developers sharing a dedicated node. You reserve a spot with your card, and nobody is charged until the cohort fills. Prices start at $5/mo for smaller models.

The LLMs are completely private (we don't log any traffic).

The API is OpenAI-compatible (we run vLLM), so you just swap the base URL. Currently offering a few models.

Similar Projects

AI/ML●●●Banger

Browser-Native GPU Sharing

Browser-based GPU cluster for LLM inference with HTTP API and SSE broker coordination.

WizardryZero to OneBold Bet
bilekas
1212d ago