India’s foundational AI model, which will support reasoning in Indic languages, will be ready in six months: Vivek Raghavan


India is developing a homegrown foundational AI model that will be trained on diverse Indian datasets and support reasoning in Indic languages, said Vivek Raghavan, founder of Sarvam AI, which has been selected by the Government of India to build sovereign Large Language Models (LLMs). “We are actually building reasoning in Indic languages,” Raghavan revealed. “So, you can ask questions in any language, in any script, whether it's Devanagari or Roman-script Hindi, and the model will respond.”

“We are building something which is leveraging various kinds of Indian data... that's the focus of what we are doing,” Raghavan told Moneycontrol on April 26. “It will be a general-purpose, multi-modal model: voice, images, text, all of it.” Of the 67 proposals received since February 15, Sarvam was the first startup selected to develop India’s first indigenous foundational model.

The model aims to compete with global state-of-the-art systems while staying rooted in Indian linguistic and cultural contexts. “The goal is to be competitive with the best in the world, but the intent is that it will be an Indian-focused model,” Raghavan added. Addressing concerns about the availability of high-quality regional language data, Raghavan pointed to ongoing efforts.

“AI4Bharat and Land OS have done data collection in every district. And we’re also finding that it’s now possible to generate data of higher quality than what’s available from existing sources.” "Our model will be able to understand and reason across 22 Indian languages, covering not just major scripts but also regional variations," Raghavan indicated.

"It will support queries in any language, any script, whether typed in Devanagari, Roman, or other forms." When asked about the timeline, Raghavan said: “Six months. That’s the internal target we’ve set.

” On compute resources and funding, Raghavan clarified that his team isn’t directly involved in the financing but has access to necessary GPUs. “Of course, you always want more GPUs, but I think this is good for what we have in.” He emphasized that wide adoption will be key.

“For a model to be useful, it has to be widely used,” he said, though he did not confirm whether the model would be open source.