Try out emotion steering of LLMs here
High-throughput vLLM API for per-request emotion steering beats prompt-only methods.
Extract and serve CAA-style emotion steering vectors for any HF causal LM
Per-request emotion steering that preserves vLLM continuous batching for Qwen3.
ML engineers and researchers experimenting with model steering
LLM-Beamer · Activation Additions
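For readers new to the technique, a CAA-style (Contrastive Activation Addition) steering vector is commonly computed as the mean difference between activations collected on contrasting prompts, then added to a hidden state at inference with a per-request strength. The sketch below illustrates only that arithmetic on toy lists. It is not this project's implementation: the function names (`steering_vector`, `apply_steering`, `steer_batch`) are hypothetical, and a real setup would capture residual-stream activations from a model layer rather than use hand-written numbers.

```python
from statistics import fmean


def steering_vector(pos_acts, neg_acts):
    """Per-dimension mean of positive activations minus mean of negative ones."""
    dim = len(pos_acts[0])
    pos_mean = [fmean(a[i] for a in pos_acts) for i in range(dim)]
    neg_mean = [fmean(a[i] for a in neg_acts) for i in range(dim)]
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(hidden, vector, strength=1.0):
    """Add the scaled steering vector to a single hidden state."""
    return [h + strength * v for h, v in zip(hidden, vector)]


def steer_batch(hidden_batch, vectors, strengths):
    """Steer each row with its own vector and strength (0.0 leaves it unchanged)."""
    return [
        apply_steering(h, v, s)
        for h, v, s in zip(hidden_batch, vectors, strengths)
    ]


# Toy activations (assumed values, e.g. "joyful" vs. "neutral" prompts).
pos = [[1.0, 2.0], [3.0, 4.0]]
neg = [[0.0, 1.0], [1.0, 1.0]]
vec = steering_vector(pos, neg)

# Two requests in one batch: the first unsteered, the second steered.
batch = steer_batch([[0.0, 0.0], [1.0, 1.0]], [vec, vec], [0.0, 1.0])
```

Keeping the strength as a per-row value is one way to let steered and unsteered requests share a batch, which is the property the taglines above describe; how the project itself achieves this inside vLLM is not stated here.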
Local vLLM GUI with layout-aware OCR, but macOS-only in a crowded space.
Anthropic's emotion research replicated open-source with a Flask visualizer for Gemma 4.
Reveals that agents diagnose bottlenecks correctly 87% of the time but fix them only 17% of the time; scaffolding matters more than the model.
Build vLLM from scratch with PagedAttention kernels when llama.cpp already exists.
LLM cost routing with LoRA awareness when LiteLLM already handles basic proxying.