Gemma 4, Optimized
for Real Hardware

Run powerful AI on your CPU, mobile device, or edge hardware: 51% faster inference, up to 57% smaller models, and fully local, private operation.

51% Faster Inference
57% Size Reduction
100% Local & Private

Choose Your Model Family

From multimodal powerhouses to ultra-lightweight mobile models, there is a variant optimized for every use case.

MULTIMODAL

gemma4-turbo

Full-featured AI with vision and audio support

+51%
Faster
IQ4_XS
Quantization
4.3-18 GB
Model Sizes
17K+
Downloads
  • Vision and audio capabilities
  • 51% faster than stock Gemma 4
  • 4 model sizes (e2b, e4b, 26b, 31b)
  • Tool calling and function support
  • Windows-optimized for CPU inference
ULTRA-LIGHTWEIGHT

gemma4-nano

Mobile-first AI that actually works on real devices

57%
Smaller
Q3_K_S
Quantization
3.1-14 GB
Model Sizes
<1 GB
RAM Usage
  • Text-only, optimized for mobile/edge
  • Sub-1GB RAM usage (891.7 MB total)
  • Stays cool on 8GB RAM phones
  • Full 4.5B params, 128K context
  • 13% faster than turbo on CPU
COMING SOON

More Variants

Pushing the boundaries of AI optimization

🚀
In Progress
Possibilities
  • Experimental quantization methods
  • Specialized task-optimized variants
  • Even smaller mobile models
  • Performance tuning research
  • Community-driven development

Get Started in Seconds

One command to run powerful local AI on your hardware

Option 1: Ollama (Recommended)

# Install Ollama from https://ollama.com

# For multimodal AI with vision:
ollama run ssfdre38/gemma4-turbo:e4b

# For ultra-lightweight mobile/edge:
ollama run ssfdre38/gemma4-nano:e2b

# That's it! Start chatting with local AI 🚀
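Once a model is pulled, the Ollama server also exposes a local HTTP API (port 11434 by default), so the same model can be called programmatically. A minimal sketch, assuming the turbo e4b model from above has been pulled and the server is running; the prompt text is just an example:

```shell
# Query the local Ollama server (started by `ollama serve` or `ollama run`).
# "stream": false returns the full response as a single JSON object
# instead of a stream of partial chunks.
curl http://localhost:11434/api/generate -d '{
  "model": "ssfdre38/gemma4-turbo:e4b",
  "prompt": "Summarize what GGUF quantization does in one sentence.",
  "stream": false
}'
```

This is handy for wiring the model into scripts or apps without any cloud dependency.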

Option 2: Direct GGUF Download

# Download from Hugging Face:
wget https://huggingface.co/ssfdre38/gemma4-turbo-gguf/resolve/main/gemma4-e4b-iq4xs-turbo.gguf

# Use with llama.cpp, Ollama, or any GGUF-compatible tool
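For the llama.cpp route, a minimal invocation might look like the following. This is a sketch that assumes you have built llama.cpp (which provides the `llama-cli` binary) and downloaded the GGUF file above into the current directory; the prompt and token counts are illustrative:

```shell
# Run the downloaded GGUF with llama.cpp's CLI.
# -m: path to the GGUF model file
# -p: prompt text
# -n: maximum number of tokens to generate
# -c: context window size in tokens
# -ngl: number of layers to offload to the GPU (0 = pure CPU inference)
./llama-cli -m gemma4-e4b-iq4xs-turbo.gguf \
  -p "Explain IQ4_XS quantization briefly." \
  -n 256 -c 4096 -ngl 0
```

With `-ngl 0` everything runs on the CPU, which matches the CPU-first focus of these builds; raise it if you have a supported GPU.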

Model Size Guide

# Turbo (multimodal):
e2b   → 4.3 GB   # Smallest with vision
e4b   → 6.1 GB   # Recommended default
26b   → 15 GB    # High capability
31b   → 18 GB    # Maximum performance

# Nano (text-only):
e2b   → 3.1 GB   # Mobile-ready
e4b   → 4.7 GB   # Balanced
26b   → 12 GB    # High quality
31b   → 14 GB    # Best performance