Project Indus by Tech Mahindra

Tech Mahindra, an Indian IT major and the country's fifth-largest software services firm, has launched Project Indus to build large language models (LLMs) for Indian languages. The 15-member Project Indus team plans to release its first LLM for Hindi and its 37 dialects. The model is expected to be ready by December 2023 or January 2024. The project is the company's attempt to build a foundational model for Indian languages.

Over the past two months, the team has collected a 1.2-terabyte corpus of data in Hindi and related dialects. It plans to develop a refined web text from this corpus by November 2023. It will be open source.

Work on other languages will start later. AI's future growth is expected to be verticalised, with these LLMs serving as the base. Domain-specific models can then be built on top of them: rural finance models, agritech models, healthcare models and so on.
