Indian AI Startup Sarvam Debuts 105B Sovereign Model and Indus Beta

Key Takeaway

Indian startup Sarvam has unveiled its 30B- and 105B-parameter large language models trained under the IndiaAI Mission and is expanding into a broader AI stack spanning consumer apps, document intelligence, speech systems, on-device AI, and agent orchestration. Indus, its limited-beta assistant, runs on the 105B sovereign model, while new platforms such as Akshar, Studio, Saaras V3, Edge, and Arya extend the company’s push into sovereign, multilingual, and production-grade AI infrastructure.

Indian AI Startup Sarvam Debuts 105B Sovereign Model and Indus Beta (Credit - ChatGPT, The AI Track)

Sarvam Debuts Indus – Key Points

The Overview

Following the announcement of Sarvam-30B and Sarvam-105B at the India AI Impact Summit 2026, the Bengaluru-based company has detailed a broader product ecosystem, including its Indus assistant (powered by the 105B model), a document intelligence workbench, a streaming speech model, an on-device AI initiative, and an agent orchestration stack.


The Facts

Foundation Models & Indus

  • Indus, launched in limited beta, is powered by the 105B “sovereign model” and focuses on accuracy, efficiency, and Indian alignment before scaling to larger systems.
  • Indus rollout is gradual due to limited compute capacity and includes waitlist-based access.
  • Sarvam-30B and Sarvam-105B were trained from scratch in India under the IndiaAI Mission; both use mixture-of-experts (MoE) architecture.
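Sarvam has not published the internals of its MoE design, but the general idea behind a mixture-of-experts layer can be sketched in a few lines: a router scores each token against a set of expert networks, and only the top-k experts are activated per token, so compute per token stays well below the full parameter count. The sketch below is a toy illustration of that routing pattern, not Sarvam’s architecture; all shapes and names are illustrative.

```python
import numpy as np

def moe_layer(x, gate_w, expert_ws, k=2):
    """Toy mixture-of-experts layer: route each token to its top-k experts.

    x:         (tokens, d) input activations
    gate_w:    (d, n_experts) router weights
    expert_ws: list of (d, d) expert weight matrices
    """
    logits = x @ gate_w                          # (tokens, n_experts) router scores
    top_k = np.argsort(logits, axis=1)[:, -k:]   # indices of the k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = top_k[t]
        # softmax over only the selected experts' logits
        w = np.exp(logits[t, sel] - logits[t, sel].max())
        w /= w.sum()
        for weight, e in zip(w, sel):
            out[t] += weight * (x[t] @ expert_ws[e])  # weighted sum of expert outputs
    return out

rng = np.random.default_rng(0)
tokens, d, n_experts = 4, 8, 4
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_layer(x, gate_w, experts, k=2)
print(y.shape)  # (4, 8)
```

With k=2 of 4 experts active, each token touches only half the expert parameters per forward pass, which is the efficiency argument usually made for MoE at the 30B–105B scale.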

Document Intelligence: Vision & Akshar

  • Sarvam Vision is a 3B vision-language model for document intelligence across English and 22 Indian languages.
  • According to Sarvam’s February 15, 2026 product post, Vision achieved leading scores on benchmarks including olmOCR-Bench and OmniDocBench (English) and the Sarvam Indic OCR Bench, outperforming named frontier models in Indic contexts.
  • Akshar, built atop Vision, adds layout-aware extraction, visual grounding, block-level extraction, and agentic proofreading to reduce manual document validation.
  • Akshar is positioned for historical manuscripts and complex layouts, addressing OCR failures with Indic scripts and diacritics.

Speech Systems: Saaras V3

  • Saaras V3 supports streaming and batch speech recognition across 22 official Indian languages plus English.
  • Trained on 1M+ hours of curated multilingual audio data.
  • Achieves ~19.31% WER on the IndicVoices benchmark (top 10 languages subset), compared with higher WERs reported for Gemini 3 Pro and other named systems in Sarvam’s evaluation.
  • On the Svarah Indian-English benchmark, Saaras V3 reports 6.37% WER.
  • Offers configurable real-time modes (Accurate, Balanced, Fast), with Fast mode targeting sub-150ms time-to-first-token.
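The WER figures above are Sarvam-reported. Word error rate itself is a standard metric: the word-level Levenshtein distance (substitutions + deletions + insertions) divided by the number of reference words. A minimal implementation, independent of Sarvam’s evaluation pipeline:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference words,
    computed via Levenshtein distance over word tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words ≈ 0.1667
```

So a 6.37% WER on Svarah means roughly 6.4 word-level errors per 100 reference words; note that WER can exceed 100% when the hypothesis inserts many spurious words.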

On-Device & Deployment

  • Sarvam Edge focuses on fully on-device inference for speech recognition and synthesis, removing cloud dependency and per-query costs.
  • The Edge speech model is compact, with 74M parameters and a ~294MB on-device footprint, alongside stated latency targets.
  • Unified multilingual on-device model supports 10 popular Indic languages with automatic language detection.
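As a rough sanity check, the ~294MB footprint is consistent with 74M parameters stored as uncompressed 32-bit floats (74M × 4 bytes ≈ 296MB); Sarvam has not stated the actual weight format, so the precisions below are assumptions for illustration.

```python
# Sanity check: model footprint vs. parameter count at common precisions.
# Assumes dense, uncompressed weights; the actual Edge format is not specified.
params = 74_000_000

for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    mb = params * bytes_per_param / 1e6
    print(f"{name}: {mb:.0f} MB")
# fp32: 296 MB  (close to the ~294MB footprint cited above)
```

If the shipped model were fp16 or int8-quantized, the footprint would be nearer 148MB or 74MB, which suggests the published figure reflects full-precision weights plus minor overhead.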

Studio & Multilingual Publishing

  • Sarvam Studio enables document translation, AI dubbing, and multilingual publishing workflows in a single workspace.
  • In a human evaluation study described by Sarvam, Studio achieved higher overall viewer preference than ElevenLabs, Rask AI, and YouTube Dub in side-by-side comparisons.
  • Document translation study reported a 52.8% reader preference rate for Sarvam versus 35.8% for Gemini 3 Pro and lower rates for Claude Opus 4.5 and GPT-5; average publish-ready rate reported at 68.2%.

Agent Infrastructure: Arya

  • Arya is Sarvam’s production agent orchestration stack built around eight composable primitives (LLM, Agent, MCP, Node, Ledger, Task Graph, Code Interpreter, Artefact).
  • Uses an immutable state ledger with schema-enforced deltas to prevent partial state corruption.
  • Declarative configuration (HCL/Terraform-style) separates specification from execution, enabling model swaps and A/B tests via configuration changes.
  • Designed to improve reliability in multi-step agent workflows, where small per-step failure rates compound across long task chains.
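Sarvam has not published Arya’s API, but the “immutable state ledger with schema-enforced deltas” pattern can be sketched generically: state is never mutated in place; each proposed delta is validated against a schema before being appended to an append-only log, and the current state is derived by replaying that log, so a rejected delta leaves no partial state behind. All class and field names below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical sketch of an append-only state ledger with schema-enforced
# deltas (not Arya's actual API). State is derived by replaying the log.
SCHEMA = {"status": str, "retries": int}  # illustrative schema

@dataclass(frozen=True)
class Delta:
    key: str
    value: object

class Ledger:
    def __init__(self):
        self._log: list[Delta] = []  # append-only; entries are immutable

    def apply(self, delta: Delta) -> None:
        expected = SCHEMA.get(delta.key)
        if expected is None or not isinstance(delta.value, expected):
            # reject the whole delta: no partial/corrupt state is recorded
            raise ValueError(f"delta violates schema: {delta}")
        self._log.append(delta)

    def state(self) -> dict:
        snapshot = {}
        for d in self._log:
            snapshot[d.key] = d.value  # replay the log to derive current state
        return snapshot

ledger = Ledger()
ledger.apply(Delta("status", "running"))
ledger.apply(Delta("retries", 1))
try:
    ledger.apply(Delta("retries", "two"))  # wrong type: rejected atomically
except ValueError:
    pass
print(ledger.state())  # {'status': 'running', 'retries': 1}
```

The same separation underlies the HCL/Terraform-style configuration claim: because behavior is specified declaratively and applied transactionally, swapping a model or running an A/B test is a config change replayed through the same validation path, not an in-place mutation.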

Context

Sarvam is positioning itself beyond standalone LLM releases toward a vertically integrated sovereign AI stack: foundation models, consumer interface (Indus), document AI (Vision, Akshar), speech AI (Saaras), on-device deployment (Edge), enterprise publishing workflows (Studio), and agent infrastructure (Arya). The strategy emphasizes multilingual coverage, deployment efficiency, and reduced dependence on foreign cloud AI systems.


Why This Matters

Sarvam is transitioning from a model provider to a full-stack AI platform spanning models, deployment, applications, and infrastructure. If validated independently, the combination of sovereign LLMs, on-device AI, structured agent orchestration, and multilingual tooling could establish an India-centric AI ecosystem with reduced reliance on foreign cloud providers.


This article was drafted with the assistance of generative AI. All facts and details were reviewed and confirmed by an editor prior to publication.

