Cohere Unveils Command A Vision, Setting New Multimodal Standard for Enterprise AI

Cohere has announced Command A Vision, a new generative model built for the image-understanding tasks that matter in enterprise applications. The company stated on social media, "Introducing Command A Vision, a state-of-the-art generative model that excels across multimodal image capabilities that matter for enterprises!" The release underscores Cohere's continued focus on AI built for business needs.

Command A Vision, a 112-billion-parameter dense model, is built on Cohere's Command A, which is known for its strong performance in real-world enterprise tasks, including tool use and retrieval-augmented generation (RAG). Cohere reports that the model outperforms GPT-4.1, Llama 4 Maverick, Mistral Medium, and Pixtral Large on established vision benchmarks. Its openly released weights signal a commitment to broader accessibility and innovation within the AI community.

The new model is engineered to tackle complex visual data challenges prevalent in various industries. It can process and interpret diverse visual inputs such as financial tables, healthcare diagrams, and construction PDFs, excelling in optical character recognition (OCR) and real-world image analysis. This capability allows businesses to automate and streamline processes that traditionally relied on manual review or custom solutions.

Cohere has positioned Command A Vision as an enterprise-grade, deployment-ready solution. It can be deployed privately on as few as two A100 GPUs, or on a single H100 when quantized to 4 bits. This efficiency aims to reduce computational costs and latency, making advanced multimodal AI more accessible for corporate environments. The model's integration into Cohere's platform and its availability on Hugging Face further ease adoption.
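
A rough back-of-the-envelope check shows why 4-bit quantization makes a single-GPU deployment plausible. The sketch below estimates the memory taken by the weights alone for a 112-billion-parameter model at different precisions. The function name is hypothetical, the 80 GB per-GPU capacity is an assumption (the article does not say which GPU memory variants are meant), and the estimate ignores activations, the KV cache, and runtime overhead, so it is a lower bound rather than a deployment guide.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 112e9          # parameter count stated in the article
GPU_MEMORY_GB = 80      # assumption: 80 GB cards; not specified in the article

for bits in (16, 8, 4):
    needed = weight_memory_gb(PARAMS, bits)
    # Number of assumed-80-GB GPUs needed just to hold the weights (ceiling division)
    gpus = int(-(-needed // GPU_MEMORY_GB))
    print(f"{bits}-bit weights: ~{needed:.0f} GB, at least {gpus} GPU(s) for weights alone")
```

At 4 bits the weights come to roughly 56 GB, which fits within one 80 GB card with headroom for inference state, consistent with the single-H100 figure Cohere cites.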

The introduction of Command A Vision reinforces Cohere's strategy of providing secure, high-performance models for regulated industries. By focusing on practical business applications and pairing strong text capabilities with its new visual ones, Cohere aims to help enterprises use generative AI for greater productivity and data-driven insights across their operations.