Tajikistan launches first AI model in national language
SoroLLM built to understand Tajik dialects, marking a major leap in linguistic inclusion and digital innovation
DUSHANBE, Tajikistan (MNTV) — Tajikistan has unveiled its first artificial intelligence language model tailored to the Tajik language, marking a breakthrough in the country’s digital ambitions and linguistic inclusion.
According to Asia Plus, the model—called SoroLLM—was developed by local researchers at zehnlab.ai and is the first neural network specifically trained on Tajik. Unlike global models such as GPT or LLaMA, which offer little to no support for Tajik, SoroLLM is designed from the ground up to understand both standard Tajik and its many regional dialects.
The project was formally presented to President Emomali Rahmon during the recent inauguration of the country’s first AI Computing Resource Center, a milestone in Tajikistan’s growing push toward technological self-reliance.
Developers say the goal is not just language recognition but full cultural representation. “SoroLLM captures everything from northern Tajik speech patterns to the languages spoken in the Pamirs,” the team said.
The model’s next phase will integrate multimodal capabilities, allowing it to handle audio and video inputs in addition to text. Citizens have also been invited to contribute by sharing dialectal samples to improve the model’s accuracy and reach.
SoroLLM positions Tajikistan among a small but growing group of nations creating localized AI tools that prioritize cultural and linguistic heritage—a significant shift away from the one-size-fits-all approach of global tech platforms.