Image: Singapore Heritage/Roots

The Language On This 1,000-Year-Old Stone is a “Glyph Breaker’s Nightmare.” Scientists Want to Use AI to Crack It.

For centuries, the Singapore Stone has stood as one of Southeast Asia’s most enigmatic artifacts. Discovered in 1819 at the mouth of the Singapore River, this sandstone slab, inscribed with an unknown script, has baffled historians and linguists alike. Its inscription has been dubbed a “glyph breaker’s nightmare,” and researchers are now turning to artificial intelligence to unlock the secrets of this ancient relic.

The Singapore Stone, originally a large boulder, was partially destroyed by the British in 1843 for use in building a stone fort. The remaining fragment, now housed in the National Museum of Singapore, bears an inscription in a script that has yet to be deciphered. Scholars have speculated that the stone dates back to between the 10th and 14th centuries, possibly linked to the Majapahit empire or a South Indian rajah. Despite various efforts, the script remains a mystery, with no known parallels in other historical records.

Glyph breakers, language experts, and cipher experts have attempted to decode the mystery text, but to no avail.

“To decipher an undeciphered writing system (as well as to crack a cipher we do not have the key for), the essential requirement is to have enough text available. The Ancient Egyptian Hieroglyphs were deciphered because Champollion was a genius, but also because they were everywhere in Egypt. Same for the Cuneiform writing system and Henry Rawlinson and Edward Hincks,” Dr. Francesco Perono Cacciafoco, the lead researcher on this project, told The Debrief.

“With the Singapore Stone, we have just a small fragment, plus the reproductions of two other (for now) lost fragments, plus some reproductions of the whole slab, before it was blown up, with not very clear characters and entire sections missing because of erosion. Therefore, the amount we have is very little. Moreover, its text/script is unique and never found anywhere else in the world.”

In 1837, several years before the British blew the stone up, it was hand-drawn by the politician William Bland and philologist James Prinsep. Sir Stamford Raffles, the British East India Company’s administrator and the “founder” of Singapore, had already attempted to decode it years earlier. After the stone was blown up, however, only three recovered fragments were graphically reproduced before being sent to India.

A map of Singapore in 1825. Rocky Point at the mouth of the Singapore River is where the stone stood. (Image: Wikipedia)

The glyph breakers eventually hit a dead end, the surviving fragment was placed in a museum, and for the better part of two centuries it bided its time, its secrets securely kept. Until now.

Perono Cacciafoco and his team are leveraging the power of AI to tackle this cryptographic challenge and have begun building a tool that may break through the mystery.

“Our work is mainly aimed, for now, at a digital restitution and/or recovery of the full text of the Stone (a possibly reasonable version of it), to have a consistent starting point for frequency analyses, comparisons, and pattern recognitions,” Perono Cacciafoco explained.

Their project, based at Nanyang Technological University and now continuing at Xi’an Jiaotong-Liverpool University in China, employs an AI tool named Read-y Grammarian. This tool is designed to analyze the stone’s text using advanced computational methods, including computer vision, artificial neural networks, and deep learning.

Perono Cacciafoco’s team knows, however, that this is not a simple task. You can’t just ask AI to decode the text. The original stone, when intact, measured approximately 3 meters by 3 meters and contained 50 to 52 lines of text, but the surviving fragments and reproductions are too sparse for comprehensive frequency analyses and pattern recognition. There just isn’t enough stone left, nor are there any other examples of the script out in the world for cross-referencing. So the team needs to feed the AI model known languages and scripts from the regions the stone may be connected to, such as Kawi and Pali, which may or may not be related to the script of the Singapore Stone but can serve as reference points. That adds up to an immense volume of data to process.
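For readers curious what a frequency analysis involves, here is a minimal Python sketch. It is an illustration only, not the team’s code: the sign labels (S1, S2, and so on) and the two-line `fragment` are made up, and the sketch assumes the inscription has already been transcribed into discrete signs, which is itself a hard step for an eroded stone.

```python
from collections import Counter


def symbol_frequencies(lines):
    """Return each sign's share of all signs, most frequent first."""
    counts = Counter(sign for line in lines for sign in line)
    total = sum(counts.values())
    return {sign: n / total for sign, n in counts.most_common()}


# Hypothetical transcription: the labels are placeholders, not real readings.
fragment = [
    ["S1", "S2", "S3", "S1"],
    ["S2", "S1", "S4", "S1"],
]

freqs = symbol_frequencies(fragment)
for sign, share in freqs.items():
    print(f"{sign}: {share:.3f}")
```

With only eight signs, the shares are too coarse to compare against the frequency profile of any known script, which is the practical meaning of “not enough text.”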

Read-y Grammarian, the AI tool at the heart of this project, is a “prediction machine” developed by Colin Loh, an engineer and mathematician currently working at the National University of Singapore. The algorithm has been further refined by Dr. Perono Cacciafoco’s colleagues. To simplify the science here, it analyzes various parameters, including the shape, size, and width of the extant characters, the degree of erosion on the stone’s surface, and the length and position of repeated symbol clusters. By comparing these features with well-known writing systems, the AI tool generates possible lines of missing text, which can then be further analyzed for frequency and pattern recognition.
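One of the parameters mentioned above, the length and position of repeated symbol clusters, can be illustrated with a short Python sketch. This is not Read-y Grammarian’s algorithm, whose internals are not described here. The function name `repeated_clusters`, the sign labels, and the sample lines are all hypothetical, and the sketch assumes the signs have already been transcribed.

```python
from collections import defaultdict


def repeated_clusters(lines, size):
    """Find runs of `size` consecutive signs that occur more than once.

    Returns a dict mapping each repeated run to the list of
    (line index, offset) positions where it starts.
    """
    positions = defaultdict(list)
    for row, line in enumerate(lines):
        for start in range(len(line) - size + 1):
            cluster = tuple(line[start:start + size])
            positions[cluster].append((row, start))
    return {c: p for c, p in positions.items() if len(p) > 1}


# Hypothetical transcription with placeholder sign labels.
lines = [
    ["S1", "S2", "S3", "S1", "S2"],
    ["S4", "S1", "S2", "S5"],
]

clusters = repeated_clusters(lines, size=2)
for cluster, where in clusters.items():
    print(cluster, where)
```

Clusters that recur in similar positions are candidates for words, titles, or grammatical endings, and they also constrain what could plausibly fill a damaged stretch of a line.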

“This process is what Philologists do ‘by hand’ with ancient manuscripts, trying to fill the gaps in text based on the contents of a work and on the lines and writing,” Dr. Perono Cacciafoco said. “The ‘machine’ can produce a mountain of mistakes and negative results, and no text is ‘final’, but its ‘products’ are unbiased and based only on data. This is fundamental in an exercise in Crypto-linguistics like the one we are performing. We cannot allow our own ideas and postulations on the script of the Stone to compromise our analysis.”

Using high-definition 3D scans of the fragments and the drawn images of the stone, they built a digital model that the AI will use to “learn.” As it moves through the process of deciphering, each read of the stone’s text will hopefully improve its ability to make predictions as to what fits where. With a reasonable reconstruction in hand, the team can begin to try to decipher what the symbols mean.

 

Enter the “lucky match.”

One of the most significant hurdles in deciphering the Singapore Stone is that it pairs an “unknown writing system” with an “unknown language.” This combination is a nightmare for crypto-linguists, as it provides no phonetic or linguistic clues. However, history has shown that human ingenuity can prevail in such situations. The decipherment of Egyptian hieroglyphs by Jean-François Champollion and Linear B by Michael Ventris are prime examples of breakthroughs achieved through a combination of genius and fortunate discoveries.

Dr. Perono Cacciafoco’s team hopes for a similar “lucky match”: a recognizable name or phrase that could provide a key to unlocking the script. This could be the name of a king, a deity, or a place, which, once identified, could help decode other parts of the text. That one match then leads to other matches, and a cascade follows. All of this assumes the symbols on the stone aren’t completely unique.
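The search for a “lucky match” can be pictured with a small Python sketch. It rests on a strong simplifying assumption that may not hold for the Stone: each sign stands for exactly one sound unit, so a name with a repeated syllable shows up as a stretch of text with the same repetition pattern. The sign labels, the made-up name units, and the function names are all hypothetical and are not taken from the team’s work.

```python
def repetition_pattern(seq):
    """Replace each item by the index of its first occurrence.

    For example, A B A becomes (0, 1, 0).
    """
    first_seen = {}
    return tuple(first_seen.setdefault(item, len(first_seen)) for item in seq)


def candidate_matches(text, name_units):
    """Return start offsets where `text` repeats signs the way the name does."""
    target = repetition_pattern(name_units)
    width = len(name_units)
    return [
        start
        for start in range(len(text) - width + 1)
        if repetition_pattern(text[start:start + width]) == target
    ]


# Hypothetical data: placeholder signs and an invented three-unit name.
text = ["S3", "S1", "S2", "S1", "S4", "S4"]
name_units = ["ka", "la", "ka"]

matches = candidate_matches(text, name_units)
print(matches)
```

Each hit is only a hypothesis. If one were correct, it would assign tentative values to the signs involved, and those values could be tested wherever else the same signs occur, which is how a single match can turn into a cascade.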

“If the writing system of the Singapore Stone is really unique and developed only for the Stone itself, and if the language hidden behind it is an ‘isolate’, without any link to any other known, deciphered, and attested language, the decipherment would be impossible, but a case like that is very, very, very, very, very unlikely,” Dr. Perono Cacciafoco said, wanting to be crystal clear on the “very unlikely” part.

The implications of this research extend beyond the Singapore Stone. The methodologies and tools developed by Dr. Perono Cacciafoco’s team could be applied to other undeciphered writing systems, such as the Indus Valley Script and the Rongorongo writing system of Easter Island. Read-y Grammarian’s success in analyzing and reconstructing the Singapore Stone’s text could pave the way for similar breakthroughs in other areas of historical linguistics.

But, as with all research, there is always a catch. Building a tool like this is neither easy nor cheap.

“A big limit to our research is ‘external’ – we never had any support,” explained Dr. Perono Cacciafoco unabashedly. 

Dr. Perono Cacciafoco and his team are making strides toward deciphering this ancient script, and the tool they are building could have far-reaching implications for other undeciphered texts. Moreover, for the team, the work speaks to something bigger than money or shiny new tech.

“No one cares, institutionally, about language deciphering,” Perono Cacciafoco lamented in his statements to The Debrief. “And that indifference is a big burden, not only at the level of funds and resources but also at the moral level, for researchers who try to give back voice to our ancestors and our forgotten monuments and historical relics.”

MJ Banias covers space, security, and technology with The Debrief. You can email him at mj@thedebrief.org or follow him on Twitter @mjbanias.