Gemma 4 natively processes images at variable resolutions, adapting to each input without a separate preprocessing or padding step.
Variable-resolution support addresses a long-standing limitation of multimodal models. Traditional approaches force every image into fixed dimensions, which either discards content through cropping or distorts it through resizing. Gemma 4's architecture handles images at their natural resolutions, which preserves visual detail and removes the computational overhead of a separate resize-and-pad step.
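To make the cost of fixed dimensions concrete, here is a minimal sketch of the arithmetic for a portrait document scan. The 896-pixel target and the 1700x2200 page size are illustrative assumptions rather than values taken from any specific model, and the function names are hypothetical.

```python
def squash_distortion(width: int, height: int, target: int) -> float:
    """Ratio between the larger and smaller scale factor when a
    width x height image is stretched to target x target.
    1.0 means the aspect ratio is preserved."""
    scale_x = target / width
    scale_y = target / height
    return max(scale_x, scale_y) / min(scale_x, scale_y)


def center_crop_retained(width: int, height: int) -> float:
    """Fraction of the original pixels that survive a center crop
    to the largest square that fits inside the image."""
    side = min(width, height)
    return (side * side) / (width * height)


# Assumed example: a letter-size page scanned at 200 DPI (1700 x 2200 pixels),
# forced into a hypothetical 896 x 896 model input.
distortion = squash_distortion(1700, 2200, target=896)
retained = center_crop_retained(1700, 2200)
print(f"stretch distortion: {distortion:.2f}x")
print(f"pixels kept by center crop: {retained:.0%}")
```

For this assumed page, squashing stretches one axis about 1.29 times more than the other, and a center crop keeps roughly 77% of the pixels. These are the two losses that processing at native resolution is meant to avoid.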
This capability is particularly valuable for document processing, where charts, diagrams, and text need to be understood at their original resolution. OCR applications benefit from sharper character detail, and screen or UI understanding benefits from layouts that keep their original proportions.
For vector search applications with Zilliz Cloud, this can translate to higher-quality embeddings, because they are generated from original-resolution images. You can store images at their native resolution and pass them directly to Gemma 4 for embedding generation. Zilliz Cloud then indexes these embeddings with full support for filtering, metadata management, and multi-vector retrieval.
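The flow described above can be sketched roughly as follows. This is a sketch under stated assumptions, not a reference implementation: `embed` is a hypothetical stand-in for whatever call produces a Gemma 4 image embedding in your stack, the collection name and the `width >= 1000` filter are illustrative, and the Zilliz Cloud calls use the `pymilvus` `MilvusClient` quick-setup API with its default `id` and `vector` fields.

```python
from typing import Callable, Sequence


def build_row(image_id: int, path: str, width: int, height: int,
              embed: Callable[[str], Sequence[float]]) -> dict:
    """Build one insertable row: the embedding plus metadata for filtering."""
    return {
        "id": image_id,
        "vector": list(embed(path)),  # embed() is a hypothetical embedding call
        "path": path,
        "width": width,    # original dimensions are kept as metadata,
        "height": height,  # not used to resize the image
    }


def index_and_search(rows: list, query_vector: Sequence[float],
                     uri: str, token: str, collection: str = "images"):
    """Insert rows into Zilliz Cloud and run one filtered search."""
    # Imported lazily so the helper above works without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(uri=uri, token=token)
    if not client.has_collection(collection_name=collection):
        # Quick setup: "id" primary key, "vector" field, dynamic fields enabled.
        client.create_collection(
            collection_name=collection,
            dimension=len(rows[0]["vector"]),
        )
    client.insert(collection_name=collection, data=rows)
    return client.search(
        collection_name=collection,
        data=[list(query_vector)],
        limit=5,
        filter="width >= 1000",  # illustrative metadata filter
        output_fields=["path", "width", "height"],
    )
```

Storing `width` and `height` as metadata lets you filter results by original image size at query time, without normalizing the images themselves.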
This approach simplifies your embedding pipeline: there is no need for custom preprocessing, image standardization, or dimension-based quality checks. Gemma 4 handles the variation in image size; Zilliz Cloud handles indexing, retrieval, and availability.
Related Resources