Text To Image AI Has Created Its Own Secret Language, Researcher Claims

Here’s something reassuring to think about: researchers using machine-learning artificial intelligence (AI) often don’t know precisely how their algorithms are solving the problems they are tasked with.

Take, for instance, the AI that can identify race from X-rays where no human can see how, or the Facebook AI that began to develop its own language. Joining these may be everyone’s favorite text-to-image generator, DALLE-2.

Computer Science PhD student Giannis Daras noticed that the DALLE-2 system, which creates images based on a text input prompt, would return nonsense words as text under certain circumstances.

“A known limitation of DALLE-2 is that it struggles with text,” he wrote in a paper published on the preprint server arXiv. “For example, text prompts such as: ‘An image of the word airplane’ often lead to generated images that depict gibberish text.”

“We discover that this produced text is not random, but rather reveals a hidden vocabulary that the model seems to have developed internally. For example, when fed with this gibberish text, the model frequently produces airplanes.”

In one illustration posted to Twitter, Daras explains that when DALLE-2 is asked to subtitle a conversation between two farmers, it shows them talking, but the speech bubbles are filled with what looks like complete nonsense.

However, Daras thought to feed these nonsense words back into the system to see whether the AI had assigned its own meanings to them. When he did, he found that the words did appear to mean something to the AI: the farmers were talking about vegetables and birds.

Daras believes that, if he is correct, the finding has security implications for the text-to-image generator.

“The first security issue relates to using these gibberish prompts as backdoor adversarial attacks or ways to circumvent filter,” he wrote in his paper. “Currently, Natural Language Processing systems filter text prompts that violate the policy rules and gibberish prompts may be used to bypass these filters.”

“More importantly, absurd prompts that consistently generate images challenge our confidence in these big generative models.”

However, although other algorithms have been shown to create their own languages, this paper has not yet been peer-reviewed, and other researchers are questioning Daras’ claims. Research analyst Benjamin Hilton asked the generator to show two whales talking about food, with subtitles. The first few results did not return decipherable text, gibberish or not, so he kept going until one did.

“What do I think?” Hilton wrote on Twitter. “‘Evve waeles’ is either nonsense, or a corruption of the word ‘whales’. Giannis got lucky when his whales said ‘Wa ch zod rea’ and that happened to generate pictures of food.”

Moreover, adding modifiers such as “3D render” to some of the phrases gives different results, suggesting that the phrases do not consistently mean the same thing.

It could be that the language is more along the lines of noise, at least in some cases. We will know more when the paper is peer-reviewed, but there could still be something going on that we don’t know about.

Hilton added that the phrase “Apoploe vesrreaitais” does return images of birds every time, “so there’s for sure something to this”.
