Papers

Research papers and insights

1 paper

Speculative Decoding in the Wild

Summit

How vLLM, SGLang, and TensorRT-LLM implement EAGLE speculative decoding: tree attention, KV cache tricks, and CUDA graph trade-offs.

2 weeks ago
Speculative Decoding, LLMs, Inference

© 2026 Harīṣh Tummalachērla
