Bangladesh’s endangered languages get a digital lifeline

TBS

14 October, 2025, 11:30 am
Last modified: 14 October, 2025, 01:26 pm
Illustration: TBS

Rengmitca is a language that only six people in the world speak. Five of them are over the age of 60, and the remaining one is almost 50. They live deep in the hill tracts of Bandarban — four in Alikadam and two in Naikhongchari.

When they pass away, Rengmitca will vanish with them; the language, and the unique worldview it carries, could be lost forever.

But thanks to one of Bangladesh’s most ambitious digital preservation efforts in recent times, the language might survive. Multilingual Cloud, a digital repository that houses the voices, words and stories of the country’s ethnic languages, aims to ensure that our languages are not erased by time or modernisation.

Archiving the voices of a nation

Funded by the ICT Division and initiated by the Bangladesh Computer Council (BCC) under a project titled Digitisation of Bangladesh’s Ethnic Languages, Multilingual Cloud represents the first attempt to build a national digital archive of ethnic languages in audio-visual form.

 

So far, the project has digitised 42 languages and five sub-languages or dialects. The platform contains 7,177 topics, each serving as a linguistic preservation sample.

Beyond text, the project has recorded over 12,000 minutes of audio from 216 native speakers and transcribed almost one lakh sentences into the International Phonetic Alphabet (IPA). Thus, the archive captures not only the sound of these languages but also their rhythm and emotion.

Md Mamun Or Rashid, associate professor at Jahangirnagar University’s Department of Bangla and the consultant for the project, explained that the initiative was a sub-project under the BCC, driven by technology and field research.

“We developed a dedicated software system designed to collect linguistic data directly from the field,” he said. “Using the software, our team can record and store all linguistic information on-site.”

From field to archive

The team’s journey began with a series of workshops in eight locations across Bangladesh to determine where and how to conduct fieldwork. They travelled through North Bengal, Sylhet’s tea gardens, the Mymensingh-Sherpur-Netrokona belt, and the Chittagong Hill Tracts. They also collected data from scattered communities like the Bede and Dhakaiya Urdu speakers.

“Many assume Dhakaiya Urdu is thriving, but our findings suggest otherwise,” Mamun said. “It is one of the 41 languages we have documented so far, several of which are endangered.”

Using both mobile and web-based applications, teams visited remote areas to collect data. They were joined by ethnic language experts proficient in Garo, Marma, Chakma, and Santali, as well as linguistics graduates from the University of Dhaka. In some cases, the researchers stayed with local families to understand their way of life and speech patterns.

Charu Haque, the project’s research and content expert, said that there was no existing model for such work in Bangladesh — or even abroad. So, they had to create their own path.

“When the tender for this project was first announced, it was floated five times, yet no bidder showed interest,” he said. “It was not that they were unwilling; they simply could not figure out how to approach such an unfamiliar task.”

Eventually, one firm agreed after being assured of full cooperation from the ICT Division. Even then, success was far from certain. “We had to adapt our methods along the way,” Haque said. “We learned by experimenting, by making mistakes and correcting them and so on.”

This hands-on, ethnographic approach defined the project. “We did not just visit communities and collect data in a formal setting,” he said. “We stayed with them, ate with them, and worked alongside them to truly understand their way of life. Only through that immersion could we capture authentic linguistic data.”

The process

Each language sample went through a multi-layered process. After native speakers recorded their input, a second group — known as validators — reviewed the data to ensure accuracy.

“The validators were often local schoolteachers or people with an advanced understanding of the language,” Mamun explained. “If they identified any errors, corrections were made. This cycle ensured the data was both authentic and reliable.”

Once verified, the data was transferred to a central server. Trained linguists then listened to each recorded word and transcribed it using the IPA. Another team focused on video documentation, recording the stories and expressions of communities. “We now have over 40 such video records,” Mamun said. “Each captures a glimpse of how these people speak and express.”

The task was not without its challenges. “One of the biggest difficulties was the shortage of experts in linguistic data analysis in Bangladesh,” Mamun noted. “We could have achieved even more if we had more trained professionals in this area.”

Haque shared another challenge: gaining the trust of participants. “Since the ICT Division was involved, some people were initially reluctant to share data, fearing how it might be used,” he said. “We had to reassure them that the documentation was only for preservation and research.”

Sometimes, the obstacles were deeply human. “Even simple things like recording devices could disrupt the natural flow of speech,” Haque explained. “People would speak fluently in casual conversation, but as soon as they noticed a microphone or a camera, they became hesitant. It took time to make them comfortable.”

Over the course of the project, the researchers also made some unexpected findings. While Rengmitca was known to be critically endangered, the team discovered other languages in even worse condition.

“The Kharia language, for example, has only two remaining elderly speakers who use this language on a regular basis,” Haque said. “Once they are gone, the language will disappear forever.”

Why preserving languages matters

Language is more than a tool of communication. It carries the worldview, history, and emotions of a community. “Religion, language, customs, food, and belief — these are the foundations of our identity,” Mamun said. “If a language disappears, a part of human diversity disappears with it.”

According to a report by Unesco, the world loses a language roughly every two weeks. Out of more than 7,000 known languages, nearly 2,500 are now considered endangered. In Bangladesh, a survey by the International Mother Language Institute found that 14 of the country’s languages stand on the brink of extinction.

As Haque said, the loss of language also means the loss of knowledge. “Every language carries the collective memory of its people,” he noted. “Judging its worth solely by its economic usefulness would be an injustice. Through language, a community passes down its wisdom and traditions. Once lost, that knowledge cannot be fully recovered in another tongue.”

In many ethnic communities, younger generations are gradually abandoning their mother tongues in favour of standard Bangla or English. “We often talk about intergenerational transmission,” Mamun explained. “If a grandfather speaks a language fluently but his grandson cannot, the language is already fading. Ten years from now, many of these languages may vanish completely.”

Multilingual Cloud aims to slow that process. By archiving pronunciation, structure, and oral traditions, it ensures that even if languages disappear from daily life, they will not be lost to history. Researchers, educators, and language enthusiasts will be able to access the platform to study and revive these languages in future.

For the project team, it is about more than preserving words. It is about dignity, memory, and recognition of the identities of the indigenous groups. “Every community, no matter how small, has the right to protect its traditions and continuity,” Mamun said. “Just as we take pride in our cultural awareness as Bangla speakers, smaller linguistic groups deserve the same recognition.”

From the six surviving speakers of Rengmitca to Veronica and Christina — the elderly sisters keeping Kharia alive — every recording is a story of resilience, a history of humankind’s existence.

“Language is, in essence, human history,” Mamun reflected. “Preserving it is not only about saving words. It is about safeguarding humanity’s collective heritage.”
