phospho-app / fastassert
Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster and cheaper than hosted APIs, with no rate limits. Compare the quality and latency against your current LLM API provider.
Home Page: https://phospho.ai
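The practical benefit of constrained output (JSON mode) is that the raw completion is guaranteed to be valid JSON matching the requested schema, so clients can parse it directly instead of retrying on malformed responses. A minimal sketch of what that guarantee buys the caller; the response payload below is hypothetical and purely illustrative, not actual fastassert output:

```python
import json

# Hypothetical raw completion returned by a JSON-mode endpoint.
# With constrained decoding, the server only samples tokens that keep
# the output valid against the schema, so this string always parses.
raw = '{"name": "Ada Lovelace", "year": 1815}'

# No try/except or regex cleanup needed: json.loads succeeds by construction.
record = json.loads(raw)
print(record["name"])  # Ada Lovelace
print(record["year"])  # 1815
```

Without constrained decoding, the same call would need validation and retry logic, since an unconstrained model can emit prose, markdown fences, or truncated JSON around the payload.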