Artificial Intelligence Inching Closer to Deciphering Long Lost Languages
With new technology available to us, we’re inching closer to the end of the days when deciphering ancient languages is a painstaking task filled with frustration and confusion. Nifty machines following complex algorithms are helping researchers around the globe as they take on the often monumental task of understanding ancient texts and lost languages.
Big Think reports that linguistic experts estimate there have been approximately 31,000 languages spoken throughout human history. Many of them are now dead and forgotten, but a new AI project may be part of the answer to how the writing of ancient languages can be deciphered.
How could something like this work? As Big Think points out:
“While languages change, many of the symbols and how the words and characters are distributed stay relatively constant over time. Because of that, you could attempt to decode a long-lost language if you understood its relationship to a known progenitor language.”
And that knowledge is the basis for the work of a joint team of researchers from MIT’s Computer Science and Artificial Intelligence Lab and the AI project called Google Brain. According to Discover Magazine, the project team has ‘devised an algorithm that can begin to match words from unknown languages to related words, or cognates, in languages that share the same root.’ By drawing on advances in computing and linguistics, the project is making headway in creating algorithms that will help other researchers decipher ancient texts.
And while the algorithm hasn’t been applied to undeciphered languages such as Olmec, Linear A, or Proto-Elamite script yet, the researchers behind it have shown it’s an advance in translating texts that have enough examples to provide a decent dataset for the algorithms to work with. So far their work has focused on training the system with Linear B and Ugaritic – two ancient languages that have mostly been translated by other means in the past.
Working with Linear B and Ugaritic
Linear B is a script that was used by the Mycenaean civilization in the Late Bronze Age, 3000-plus years ago. It was deciphered in 1952 by an architect named Michael Ventris. Ugaritic, on the other hand, is a Semitic language closely related to Hebrew, written in a cuneiform script, that also dates back some 3000 years. It was first identified by French archaeologists in 1929.
Ugaritic script. (Public Domain)
To test the AI system, Big Think reports the researchers “focused on 4 key properties related to the context and alignment of the characters to be deciphered – distributional similarity, monotonic character mapping, structural sparsity and significant cognate overlap.”
It seems the effort was worthwhile because a report on the project states that: “When applied to the decipherment of Ugaritic, we achieve a 5.5% absolute improvement over state-of-the-art results. We also report the first automatic results in deciphering Linear B, a syllabic language related to ancient Greek, where our model correctly translates 67.3% of cognates.”
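Of the four properties quoted above, monotonic character mapping is perhaps the easiest to picture: the characters of a lost word are assumed to line up with the characters of its cognate from left to right, without crossing over. The Python sketch below is only an illustration of that idea and not the researchers' model. It uses plain edit distance as the alignment cost, and the word lists are invented for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: cost of the cheapest left-to-right (monotonic) alignment."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # drop a character of the lost word
                curr[j - 1] + 1,           # insert a character of the known word
                prev[j - 1] + (ca != cb),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


def best_cognates(lost_words, known_words):
    """Pair each lost word with the known word that aligns most cheaply."""
    return {
        lost: min(known_words, key=lambda known: edit_distance(lost, known))
        for lost in lost_words
    }


# Invented toy vocabulary (hypothetical, not data from the study).
LOST = ["kato", "miru", "selan"]
KNOWN = ["kado", "miro", "zelan", "potiva"]

if __name__ == "__main__":
    for lost, known in best_cognates(LOST, KNOWN).items():
        print(lost, "->", known)
```

A real system would also have to learn which symbols of the lost script correspond to which known characters. Here both word lists are assumed to be written in the same alphabet already, which sidesteps the hardest part of the problem.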
Clay tablet inscribed with Linear B script, from the Mycenaean palace of Pylos. (Sharon Mollerus/CC BY 2.0)
That means this could be a helpful tool for researchers who want to speed up their work on these ancient scripts. While creativity and an understanding of past cultures are undoubtedly part of translation work – and something AI isn’t able to supply yet – Big Think says the biggest plus of the new program is that “it can simply take a brute force approach that would be too exhausting for humans.” With the AI’s help, researchers “can attempt to translate symbols of an unknown alphabet by quickly testing it against symbols from one language after another, running them through everything that is already known.”
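To give a rough sense of what testing an unknown script against "one language after another" might look like, here is a small Python sketch. It is an assumption-laden toy and not the method used in the project: it compares only how unevenly symbols are used in each text, ignores which symbol is which, and all sample strings and names such as `language_a` are made up.

```python
from collections import Counter


def frequency_profile(text: str) -> list[float]:
    """Relative symbol frequencies, sorted from most to least common."""
    symbols = [ch for ch in text if not ch.isspace()]
    total = len(symbols)
    return sorted((n / total for n in Counter(symbols).values()), reverse=True)


def profile_distance(p: list[float], q: list[float]) -> float:
    """L1 distance between two profiles, padding the shorter one with zeros."""
    size = max(len(p), len(q))
    p = p + [0.0] * (size - len(p))
    q = q + [0.0] * (size - len(q))
    return sum(abs(x - y) for x, y in zip(p, q))


def rank_candidates(unknown: str, candidates: dict[str, str]) -> list[tuple[str, float]]:
    """Brute force: score every candidate language, closest profile first."""
    target = frequency_profile(unknown)
    scores = {
        name: profile_distance(target, frequency_profile(sample))
        for name, sample in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up samples (hypothetical): the unknown text reuses language_a's
# symbol frequencies under different symbols.
CANDIDATES = {"language_a": "aaaabbbccd", "language_b": "abcdefghij"}
UNKNOWN = "xxxxyyyzzw"

if __name__ == "__main__":
    for name, score in rank_candidates(UNKNOWN, CANDIDATES):
        print(f"{name}: {score:.2f}")
```

The loop itself is trivial for a machine and tedious for a person, which is the point of the brute force remark. Deciding which candidate is actually right still needs far richer evidence than symbol counts.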
What Can AI Translation Teach Us?
By cracking the codes of ancient languages, we will gain much more insight into what life was like in ancient cultures, from social and political affairs to cultural and everyday matters.
As technology keeps improving, it makes sense that researchers want to take advantage of it. Why spend hours upon hours painstakingly trying to compare the letters of the most distant past with something more recognizable today when a machine can accomplish the same task in much less time (and with far less frustration)?
Artificial intelligence may be able to accomplish the same task in much less time, and with far less frustration. (christian42/Adobe Stock)
In December 2018, BBC reported that Émilie Pagé-Perron, a researcher in Assyriology at the University of Toronto, was “coordinating a project to machine translate 69,000 Mesopotamian administrative records from the 21st Century BC.”
As Pagé-Perron explained, even though we have garnered much information through archaeological digs and analysis, there’s still a missing element that can be filled in by translating ancient texts:
“We have information about so many different aspects of the lives of Mesopotamian people, and we can’t really profit from the expertise of people in different fields like economics or politics, who if they had access to the sources, could help us tremendously to understand those societies better.”
Jacob Dahl, a professor of Assyriology at the University of Oxford, says that “We have more sources from Mesopotamia than we have from Greece, Rome and ancient Egypt together.” But only 10% of the thousands of examples of tablets and seals have been deciphered – the problem is not a lack of texts to work with, but a lack of experts who can read them.
The BBC report states that the Pagé-Perron team are “training algorithms on a sample of 4,000 ancient administrative texts from a digitised database. Each records transactions or deliveries of sheep, reed bundles or beer to a temple or an individual.” And while Pagé-Perron admits that the standalone texts are less than exciting, she believes “they’re extremely interesting if you take them as groups of texts.”
Irving Finkel, an Assyriologist at the British Museum, describes what makes such texts so compelling:
“Sumerian is probably the last member of what must have been a large family of languages that goes back thousands and thousands of years. Writing appeared in the world just in time to rescue Sumerian… We’re just lucky that we had some ‘microphone’ that picked it up before it went away with all the others […] It’s actually rather astonishing how interesting it is when you find a human mind across millennia, where it is like talking to them on the telephone. It’s the most exciting thing in the world when you meet one of these people.”
“Sumerian is probably the last member of what must have been a large family of languages that goes back thousands and thousands of years.” (Andrea Izzotti /Adobe Stock)
Finkel’s excitement is easily applied to other ancient cultures as well. Any ancient text translation may be the next big breakthrough in uncovering the biggest secrets in humanity’s past.
But that’s not to say that AI is anywhere near ready to replace the good old-fashioned human creativity and social understanding that’s often needed to make the mental leaps required to interpret old writing. For now, it’s just another amazing tool that can help researchers on the quest to make sense of what our ancient ancestors thought important enough to jot down or laboriously carve into tablets.
Top Image: Closeup of glyphs on a Mayan calendar. Credit: zimmytws / Adobe Stock