Inference
Tutorials for serving models and building inference applications as Flyte apps.
Voice customer-service agent
Run a vLLM-served LLM and a browser voice UI as two composed Flyte apps, with switchable text-to-speech and a live latency comparison.