Gemma 3 (quantized GGUF): runs local inference when available, otherwise falls back to the Hugging Face Inference API.
Runtime: Hugging Face Inference API
google/gemma-3-4b-it-qat-q4_0-gguf
<not set>
Tips: If you hit out-of-memory (OOM) errors, reduce max tokens. On Spaces, upload a smaller Q4-quantized GGUF file.
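
The local-first, API-fallback behavior described at the top can be sketched roughly as follows. This is an illustrative sketch, not the app's actual code: the function names `choose_runtime` and `generate`, the `local_gguf_path` parameter, and the `n_ctx=2048` and `max_tokens=256` defaults are all assumptions. It also assumes local inference uses `llama-cpp-python` and the fallback uses `huggingface_hub.InferenceClient`; whether the hosted API actually serves this particular repo is not confirmed here.

```python
import importlib.util
import os
from typing import Optional

# Repo id as shown in the panel above.
MODEL_REPO = "google/gemma-3-4b-it-qat-q4_0-gguf"


def choose_runtime(local_gguf_path: Optional[str]) -> str:
    """Pick "local" only if a GGUF file exists and llama-cpp-python is importable.

    Hypothetical helper: any other situation selects the hosted fallback.
    """
    has_file = bool(local_gguf_path) and os.path.isfile(local_gguf_path)
    has_llama = importlib.util.find_spec("llama_cpp") is not None
    return "local" if (has_file and has_llama) else "hf_inference_api"


def generate(prompt: str, local_gguf_path: Optional[str] = None,
             max_tokens: int = 256) -> str:
    """Generate text locally when possible, otherwise via the hosted API."""
    if choose_runtime(local_gguf_path) == "local":
        # Imported lazily so the fallback path works without llama-cpp-python.
        from llama_cpp import Llama

        llm = Llama(model_path=local_gguf_path, n_ctx=2048)  # assumed context size
        out = llm(prompt, max_tokens=max_tokens)
        return out["choices"][0]["text"]

    # Fallback: hosted inference. Needs network access and usually a token.
    from huggingface_hub import InferenceClient

    client = InferenceClient(model=MODEL_REPO)
    return client.text_generation(prompt, max_new_tokens=max_tokens)
```

Lowering `max_tokens` in a call such as `generate("Hello", max_tokens=64)` is the knob the tip above refers to. On the local path, reducing `n_ctx` is another plausible way to cut memory use.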