# ONNX Runtime GenAI

## Docs

- [CUDA Execution Provider](https://mintlify.wiki/microsoft/onnxruntime-genai/acceleration/cuda.md): Accelerate inference with NVIDIA GPUs using the CUDA execution provider
- [DirectML Execution Provider](https://mintlify.wiki/microsoft/onnxruntime-genai/acceleration/directml.md): Cross-vendor GPU acceleration on Windows using DirectML
- [OpenVINO Execution Provider](https://mintlify.wiki/microsoft/onnxruntime-genai/acceleration/openvino.md): Intel hardware optimization for CPU, GPU, and NPU acceleration
- [Hardware Acceleration Overview](https://mintlify.wiki/microsoft/onnxruntime-genai/acceleration/overview.md): Choose the right execution provider for optimal performance across different hardware platforms
- [QNN Execution Provider](https://mintlify.wiki/microsoft/onnxruntime-genai/acceleration/qnn.md): Qualcomm Neural Processing SDK for NPU acceleration on mobile and edge devices
- [WebGPU Execution Provider](https://mintlify.wiki/microsoft/onnxruntime-genai/acceleration/webgpu.md): Browser-based GPU acceleration using the modern WebGPU API
- [Generator Functions](https://mintlify.wiki/microsoft/onnxruntime-genai/api/c/generator.md): C API functions for text generation and inference
- [Model Functions](https://mintlify.wiki/microsoft/onnxruntime-genai/api/c/model.md): C API functions for model creation and configuration
- [C API Overview](https://mintlify.wiki/microsoft/onnxruntime-genai/api/c/overview.md): Overview of the ONNX Runtime GenAI C API
- [OgaGenerator](https://mintlify.wiki/microsoft/onnxruntime-genai/api/cpp/generator.md): Generate text using ONNX Runtime GenAI models
- [OgaModel](https://mintlify.wiki/microsoft/onnxruntime-genai/api/cpp/model.md): Load and manage ONNX Runtime GenAI models
- [OgaGeneratorParams](https://mintlify.wiki/microsoft/onnxruntime-genai/api/cpp/params.md): Configure text generation parameters and search options
- [OgaTokenizer](https://mintlify.wiki/microsoft/onnxruntime-genai/api/cpp/tokenizer.md): Encode and decode text for model input and output
- [Generator](https://mintlify.wiki/microsoft/onnxruntime-genai/api/csharp/generator.md): Generate text sequences with ONNX Runtime GenAI models
- [Model](https://mintlify.wiki/microsoft/onnxruntime-genai/api/csharp/model.md): Load and manage ONNX Runtime GenAI models
- [GeneratorParams](https://mintlify.wiki/microsoft/onnxruntime-genai/api/csharp/params.md): Configure text generation parameters for ONNX Runtime GenAI
- [Tokenizer](https://mintlify.wiki/microsoft/onnxruntime-genai/api/csharp/tokenizer.md): Encode and decode text for ONNX Runtime GenAI models
- [Generator](https://mintlify.wiki/microsoft/onnxruntime-genai/api/python/generator.md): Generate tokens using a loaded model
- [Model](https://mintlify.wiki/microsoft/onnxruntime-genai/api/python/model.md): Load and manage ONNX models for text generation
- [GeneratorParams](https://mintlify.wiki/microsoft/onnxruntime-genai/api/python/params.md): Configure generation parameters and search options
- [MultiModalProcessor](https://mintlify.wiki/microsoft/onnxruntime-genai/api/python/processor.md): Process images, audio, and text for multimodal models
- [Tokenizer](https://mintlify.wiki/microsoft/onnxruntime-genai/api/python/tokenizer.md): Encode text to tokens and decode tokens to text
- [Generation](https://mintlify.wiki/microsoft/onnxruntime-genai/concepts/generation.md): Understanding generation strategies, search methods, and sampling in ONNX Runtime GenAI
- [KV Cache](https://mintlify.wiki/microsoft/onnxruntime-genai/concepts/kv-cache.md): Understanding key-value cache management, optimization strategies, and performance implications in ONNX Runtime GenAI
- [Models](https://mintlify.wiki/microsoft/onnxruntime-genai/concepts/models.md): Supported model architectures, configuration, and management in ONNX Runtime GenAI
- [Architecture Overview](https://mintlify.wiki/microsoft/onnxruntime-genai/concepts/overview.md): Understanding the core architecture of ONNX Runtime GenAI
- [Advanced C/C++ Features](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/c/advanced.md): Multi-turn conversations, custom configurations, and multimodal processing with ONNX Runtime GenAI C++ API
- [Basic C/C++ Usage](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/c/basic.md): Get started with ONNX Runtime GenAI C++ API for text generation
- [C# Chat Example](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/csharp/chat.md): Streaming chat implementation with ONNX Runtime GenAI in C#
- [C# Multimodal Example](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/csharp/multimodal.md): Process images and audio with ONNX Runtime GenAI in C#
- [Interactive Chat](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/python/chat.md): Build an interactive chat application with ONNX Runtime GenAI
- [Text Generation](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/python/generate.md): Generate text with ONNX Runtime GenAI
- [Constrained Generation](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/python/guidance.md): Control output format with grammar-based guidance
- [Multimodal Generation](https://mintlify.wiki/microsoft/onnxruntime-genai/examples/python/multimodal.md): Work with vision and audio models using ONNX Runtime GenAI
- [Build from Source](https://mintlify.wiki/microsoft/onnxruntime-genai/guides/build-from-source.md): Build ONNX Runtime GenAI from source for development or custom configurations
- [Constrained Decoding](https://mintlify.wiki/microsoft/onnxruntime-genai/guides/constrained-decoding.md): Ensure structured outputs with grammar-based decoding, JSON schema constraints, and tool calling support
- [Download Models](https://mintlify.wiki/microsoft/onnxruntime-genai/guides/download-models.md): Learn how to download ONNX Runtime GenAI models from Foundry Local, Hugging Face Hub, or build your own
- [Model Builder](https://mintlify.wiki/microsoft/onnxruntime-genai/guides/model-builder.md): Convert and optimize PyTorch models to ONNX format for ONNX Runtime GenAI
- [Multi-LoRA Support](https://mintlify.wiki/microsoft/onnxruntime-genai/guides/multi-lora.md): Dynamically load and switch between multiple LoRA adapters at runtime
- [Runtime Options](https://mintlify.wiki/microsoft/onnxruntime-genai/guides/runtime-options.md): Configure model behavior dynamically during inference with SetRuntimeOption API
- [Installation](https://mintlify.wiki/microsoft/onnxruntime-genai/installation.md): Install ONNX Runtime GenAI for Python, C#, or C++
- [Introduction](https://mintlify.wiki/microsoft/onnxruntime-genai/introduction.md): Overview of ONNX Runtime GenAI for running generative AI models
- [Gemma Vision Models](https://mintlify.wiki/microsoft/onnxruntime-genai/multimodal/gemma-vision.md): Use Google's Gemma-3 vision models for multi-modal understanding with ONNX Runtime GenAI
- [Multi-Modal Overview](https://mintlify.wiki/microsoft/onnxruntime-genai/multimodal/overview.md): Learn about multi-modal capabilities in ONNX Runtime GenAI including vision and audio model support
- [Phi Vision Models](https://mintlify.wiki/microsoft/onnxruntime-genai/multimodal/phi-vision.md): Use Microsoft's Phi vision models for multi-modal understanding with ONNX Runtime GenAI
- [Qwen Vision Models](https://mintlify.wiki/microsoft/onnxruntime-genai/multimodal/qwen-vision.md): Use Qwen's advanced vision-language models with ONNX Runtime GenAI
- [Whisper Audio Models](https://mintlify.wiki/microsoft/onnxruntime-genai/multimodal/whisper.md): Use OpenAI's Whisper models for speech recognition and transcription with ONNX Runtime GenAI
- [Quickstart](https://mintlify.wiki/microsoft/onnxruntime-genai/quickstart.md): Run your first generative AI model with ONNX Runtime GenAI