This AI Can Automatically Decipher Lost Ancient Languages

Key Summary

This article from InApps Technology, published in 2022 and authored by Phu Nguyen, describes a machine learning model developed at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) to decipher lost ancient languages written in scriptio continua (text with few or no word dividers). According to co-author Jiaming Luo, the model automates decipherment by matching word pairs between a lost language and a known, related one, based on regular sound correspondences (e.g., p to b). It was validated on Gothic and Ugaritic, whose relatives are already known, and then applied to the undeciphered Iberian. By embedding language sounds in a multidimensional space, it copes with unsegmented text, separating words and mapping them to words in the known language. The results suggest that Iberian is likely not related to Basque or to Germanic, Turkic, or Uralic languages, in line with other recent findings. While not as thorough as human analysis, the model is much quicker and requires far less effort, giving linguists a fast first analysis. The team next aims to handle multiple, potentially unrelated languages.

  • Context:
    • Author: Phu Nguyen, summarizing research from MIT CSAIL.
    • Theme: AI-driven decipherment of lost languages preserves cultural knowledge by automating the analysis of unsegmented ancient texts.
    • Source: InApps article, based on a paper by Jiaming Luo and team.
  • Key Points:
    • Significance of Lost Languages:
      • Languages reflect cultural worldviews; their extinction is a collective loss for humanity.
      • Many ancient languages use scriptio continua (no word spaces), complicating decipherment.
    • AI Decipherment Model:
      • Purpose: Automates decoding of undeciphered languages using machine learning.
      • Mechanism: Matches word pairs between an unknown language and a known related language (e.g., Gothic to Proto-Germanic, Ugaritic to Hebrew).
      • Sound Correspondences: Identifies consistent patterns (e.g., p to b) to confirm linguistic relationships.
      • Embedding Framework: Represents language sounds in a multidimensional space, where pronunciation variations are distances, enabling word segmentation in unsegmented text.
    • Case Study:
      • Tested on Iberian, corroborating that it is very likely not related to Basque, Germanic, Turkic, or Uralic languages, in line with other recent linguistic findings.
      • Uses known relationships (e.g., Gothic, Ugaritic) as a baseline to validate the model.
    • Advantages:
      • Speed: Much faster than manual decipherment, requiring less human effort.
      • Utility: Provides linguists with a preliminary analysis tool for assessing language relationships.
    • Limitations and Future Work:
      • Currently evaluates how related two languages are; future work aims to handle multiple, potentially unrelated languages.
      • Less thorough than human analysis but valuable for quick insights.
    • Impact:
      • Preserves cultural heritage by recovering lost languages.
      • Potential applications in linguistic research and historical analysis.

A language can offer meaningful clues into how a culture views the world and its place within it, representing a lived body of knowledge. Every culture has something to say, so understandably, it’s a collective tragedy for the whole of humanity when a language goes extinct, and we lose a part of the beautiful, metaphorical tapestry that is the human experience.

But what if there were a way to automatically recover these lost languages? Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have found a way to use machine learning to help decode dead, “undeciphered” languages, which could finally allow us to understand the grammar, vocabulary, and syntax underlying their written forms. In particular, the research team focused on texts written with few or no spaces between words, a practice known as scriptio continua.

“Our work is about automatic decipherment of lost languages written in an undersegmented or unsegmented script; apparently for some ancient languages, word dividers had not been invented, or not consistently applied,” said Jiaming Luo, co-author of the study. “The significance of our work lies in the fact that ours is the first attempt to do such decipherment automatically using machine learning in such challenging situations.”

Finding Linguistic Cousins

Typically, to crack the code of an unknown language, it helps to know at least one other language that is related to it. For instance, years ago experts were able to decipher Gothic, an extinct East Germanic language, thanks to its relatedness to known languages like Proto-Germanic, Old Norse and Old English. Inspired by this concept, the team developed their decipherment algorithm along similar lines; an earlier version was introduced last year in a previous paper.

“Our machine learning model works by trying to match as many word pairs as possible, between the ancient language and some known one, while handling the uncertainty in segmentation,” explained Luo. “What exactly counts as a matched pair depends on their sound correspondences on the character level, and how regular these correspondences are. For instance, if you find many pairs with a consistent change like p to b, then you are fairly confident that these pairs are truly matched. Why does this work? Because historical linguistics tells us that language changes happen in regular and consistent ways. If two languages are truly related (for example, as Spanish and Italian are), then you would see these patterns emerge over and over again.”
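
The idea of "regular" correspondences in the quote above can be illustrated with a toy sketch. This is not the MIT model: the word pairs are made up, the function names are hypothetical, and the comparison is simplified to equal-length words aligned position by position. It only shows why a change such as p to b that recurs across many pairs raises confidence in each of those pairs.

```python
from collections import Counter


def correspondence_counts(pairs):
    """Count position-wise character substitutions across candidate word pairs."""
    counts = Counter()
    for lost, known in pairs:
        if len(lost) != len(known):
            continue  # toy simplification: only compare equal-length words
        for a, b in zip(lost, known):
            if a != b:
                counts[(a, b)] += 1
    return counts


def regularity_score(pair, counts):
    """Score a pair by how often its substitutions recur in *other* pairs."""
    lost, known = pair
    if len(lost) != len(known):
        return 0
    return sum(counts[(a, b)] - 1 for a, b in zip(lost, known) if a != b)


# Made-up word pairs for illustration only (not real linguistic data).
pairs = [("pata", "bata"), ("lupo", "lubo"), ("pira", "bira"), ("mako", "sent")]
counts = correspondence_counts(pairs)
scores = {pair: regularity_score(pair, counts) for pair in pairs}
```

Here the substitution p to b occurs in three pairs, so each of those pairs scores 2, while the unrelated pair ("mako", "sent") scores 0 because none of its substitutions recurs anywhere else.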

In addition to incorporating these linguistic tendencies, the model handles the uncertainties that come with unsegmented text by “embedding” the language sounds into an imaginary multidimensional space, where variations in pronunciation are represented as distances between points. Using this framework, the model can detect patterns in the evolution of related languages, which allows it to separate out words in undeciphered languages and map them to words in known, related languages.
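
A minimal sketch of that idea follows, under loud assumptions: the two-dimensional coordinates below are hand-picked for illustration (the article does not give the model's actual embedding), the vocabulary and the unsegmented text are invented, and the dynamic-programming segmenter is a simple stand-in rather than the team's method. It shows how distances between sounds can drive both word segmentation and the mapping to known words.

```python
import math
from functools import lru_cache

# Hypothetical 2-D "sound space": (place of articulation, voicing).
# Similar sounds (p/b, t/d, k/g) sit close together; vowels sit far away.
SOUND = {
    "p": (0.0, 0.0), "b": (0.0, 1.0),
    "t": (1.0, 0.0), "d": (1.0, 1.0),
    "k": (2.0, 0.0), "g": (2.0, 1.0),
    "a": (5.0, 0.0), "i": (6.0, 0.0), "u": (7.0, 0.0),
}


def word_distance(chunk, word):
    """Sum of per-sound distances; infinite if the lengths differ (toy rule)."""
    if len(chunk) != len(word):
        return math.inf
    return sum(math.dist(SOUND[x], SOUND[y]) for x, y in zip(chunk, word))


def segment(text, vocab):
    """Split text into chunks minimizing total distance to the nearest known word."""
    lengths = sorted({len(w) for w in vocab})

    @lru_cache(maxsize=None)
    def best(i):
        if i == len(text):
            return 0.0, ()
        options = []
        for n in lengths:
            if i + n > len(text):
                continue
            chunk = text[i:i + n]
            cost, match = min((word_distance(chunk, w), w) for w in vocab)
            rest_cost, rest = best(i + n)
            options.append((cost + rest_cost, ((chunk, match),) + rest))
        return min(options) if options else (math.inf, ())

    return best(0)


# Invented known-language vocabulary and unsegmented "lost" text.
total_cost, mapping = segment("patatiku", ["bata", "digu", "da"])
```

With these toy values the cheapest segmentation is pata|tiku, mapped to "bata" and "digu" at a total cost of 3.0, because each chunk differs from its match only by voicing changes (p to b, t to d, k to g), which are short distances in this space.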

As outlined in the team’s paper, the relatedness between an already deciphered language and its known relatives can be used as a kind of baseline, a “ground truth” to help determine whether such AI-powered decipherment models are actually working. In this study, the team leveraged the known relationships of two deciphered languages, Gothic and Ugaritic (a Semitic language somewhat similar to ancient Hebrew), to test how their model would perform on unknown languages, such as Iberian. Through this process, the team used their machine learning model to corroborate that Iberian is very likely not related to Basque, nor to other proposed candidates such as Germanic, Turkic, and Uralic languages, a conclusion that is supported by other recent findings.
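
The "ground truth" check described above can be sketched as a simple accuracy measure. This is an illustrative assumption about how such a baseline might be scored, not the evaluation used in the paper; the function name and the word pairs are hypothetical placeholders.

```python
def match_accuracy(predicted, gold):
    """Fraction of lost-language words whose predicted match equals the known cognate."""
    if not gold:
        raise ValueError("gold must contain at least one known pair")
    correct = sum(1 for word, cognate in gold.items() if predicted.get(word) == cognate)
    return correct / len(gold)


# Placeholder pairs standing in for an already deciphered language
# whose true cognates are known (not real data).
gold = {"pata": "bata", "tiku": "digu", "kapi": "gabi"}
predicted = {"pata": "bata", "tiku": "digu", "kapi": "dabi"}

accuracy = match_accuracy(predicted, gold)
```

In this toy case two of the three predictions agree with the known cognates, so the accuracy is 2/3. A model that scores well on such known cases gives more reason to trust its output on a language like Iberian, where no answer key exists.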

While the model appears to work well in evaluating how related two languages might be, the team now aims to expand it beyond its current capabilities so that it can juggle multiple, potentially unrelated languages. For now, the team hopes that their model can help automate what is usually a long, tedious process and take some of the guesswork out of it.

“Our work could be useful for linguists to get a quick analysis of the relationship between two languages, especially when one of them is unknown,” said Luo. “It is by no means as adequate or thorough as human analysis, but it’s much much quicker and requires much much less human effort.”

Source: InApps.net
