Large language models, most notably OpenAI’s ChatGPT, have transformed how we interact with software that understands and produces human-like text. Yet each of these models has its own quirks. The most amusing one to circulate on social media recently is these enormous language models’ inability to count the letters in a word correctly.
A common example is the word “strawberry,” where the AI frequently miscounts the number of times the letter “r” appears. But why does this happen? The answer lies deep in how these models process and generate text.
The Tokenization Process
One of the primary reasons AI struggles with tasks such as letter counting is the way it processes language. Language models like GPT-3 and GPT-4 do not treat words as collections of individual letters. Instead, they split text into smaller units called “tokens.” A token can be as short as a single character or as long as a full word, depending on the model’s vocabulary and the word in question.
For example, the word “strawberry” is most likely split into two tokens, sub-word fragments the model learned to recognize during training. The problem is that these fragments rarely line up with individual letters. Rather than a letter-by-letter breakdown, the model may see “strawberry” as two token IDs, say 496 and 675. When later asked to count a specific letter, it cannot easily map those opaque tokens back to the number of times the letter occurs.
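You can watch this splitting happen with OpenAI’s open-source tiktoken library. The sketch below simply loads a tokenizer and prints the pieces; the exact fragments and IDs depend on which encoding you load, so they may differ from the illustrative numbers above.

```python
# A minimal sketch using the tiktoken library to show that a model sees
# token IDs, not letters. Splits and IDs vary by encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # tokenizer used by GPT-4-era models
ids = enc.encode("strawberry")

print(ids)                             # a short list of integer token IDs
print([enc.decode([i]) for i in ids])  # the sub-word fragments those IDs represent
```

Notice that nothing in this output tells the model how many “r”s each fragment contains.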
Prediction Mechanism of Language Models
Language models predict the next token in a sequence based on the context supplied by the preceding tokens. This works remarkably well for generating text that is coherent and context-aware. It is far less useful for tasks that require exact counting or reasoning about individual characters.
If you ask the AI to count the instances of the letter “r” in the word “strawberry,” it has no precise representation of the word from which to work out how many times, and where, the letter occurs. Instead, it answers by predicting what a plausible response to a request of that shape looks like.
Naturally, that prediction can be wrong, because the training data contains little text about letter counting, let alone the kind of examples needed to trace every “r” in our sample word.
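To make this concrete, here is a toy illustration, not a real model: the probabilities are invented purely for demonstration, but they show how a prediction-driven system can confidently return a wrong count.

```python
# A toy illustration (NOT a real model): next-token prediction returns the
# highest-scoring continuation from learned statistics, with no actual
# counting involved. These probabilities are made up for demonstration.
next_token_probs = {
    " two": 0.45,    # a frequent-but-wrong pattern
    " three": 0.40,  # the correct count
    " four": 0.15,
}

prediction = max(next_token_probs, key=next_token_probs.get)
print(prediction)  # prints " two" because it scored highest, right or wrong
```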
The Limitations of Pure Language Models
It’s also crucial to remember that language models, the foundation of most chatbots, are not built for explicit arithmetic or counting. Put another way, a pure language model is essentially a sophisticated predictive-text engine: it performs tasks by weighting probabilities over the patterns it has identified, which makes it weak at activities, such as counting, that call for rigorous logical reasoning.
The AI tends to be more accurate when asked to spell a word out or break it into its individual letters, since that is closer to the task it was trained on: text production.
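That observation suggests a simple trick: once the word is laid out character by character, the count falls out immediately. A minimal Python illustration:

```python
# Once "strawberry" is broken into individual characters (the kind of
# breakdown the model handles more reliably), counting is trivial.
letters = list("strawberry")
print(letters)             # ['s', 't', 'r', 'a', 'w', 'b', 'e', 'r', 'r', 'y']
print(letters.count("r"))  # 3
```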
Workarounds and Enhancements
Despite these drawbacks, AI performance on such tasks can be improved. One effective workaround is to ask the AI to do the counting in code, for example in Python.
You could, for instance, ask the AI to write a Python function that counts the “r”s in the word “strawberry,” and it would most likely succeed.
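Here is a minimal sketch of the kind of function the model can usually produce correctly, since writing code sits squarely inside its training task:

```python
# A simple, deterministic letter counter of the sort an LLM can usually
# write correctly when asked, even though it miscounts in plain chat.
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of `letter` in `word`."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # 3
```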
This strategy works because it leverages the AI’s ability to understand and produce code that carries out the task correctly. In addition, newer generations of language models are combined with external tools and algorithms that improve their performance on more structured tasks, such as maths and counting.
An AI system that overcomes these drawbacks would use symbolic reasoning or pair the LLM with an external reasoning engine.
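Below is a simplified sketch of that routing idea. Everything in it is illustrative: the regex, the routing rule, and the call_language_model placeholder are hypothetical stand-ins for a real tool-calling mechanism, not any particular vendor’s API.

```python
import re

# Route questions that need exact counting to deterministic code; leave
# open-ended text to the language model. The pattern below is a crude,
# illustrative stand-in for real tool-calling logic.
COUNT_PATTERN = re.compile(r'"(\w)"s?.*"(\w+)"', re.IGNORECASE)

def call_language_model(question: str) -> str:
    # Hypothetical placeholder for an actual model call.
    return "(model-generated answer)"

def route(question: str) -> str:
    match = COUNT_PATTERN.search(question)
    if match and "how many" in question.lower():
        letter, word = match.groups()
        return str(word.lower().count(letter.lower()))  # exact, symbolic path
    return call_language_model(question)                # probabilistic path

print(route('How many "r"s are in "strawberry"?'))  # 3
```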
Language Models Nature and “Collective Stupidity” 😅
The difficulty of counting letters in words such as “strawberry” points to a broader problem: the “collective stupidity” of these trained models. Because they are trained on large datasets, they can generate text at a highly sophisticated level, yet they occasionally make remarkably silly mistakes that a small child would easily avoid.
This happens because the model’s “knowledge” consists of statistical correlations and pattern recognition rather than genuine comprehension of the real world or logical inference.
The AI can also be stubborn about incorrect answers, sticking to them even when given explicit instructions or placed in a setup where multiple models validate one another. This behavior highlights how important it is to understand exactly what artificial intelligence (AI) systems can and cannot do, rather than overestimating their abilities on tasks outside their purview.
Conclusion: Developing a Realistic Understanding of AI
The fact that AI cannot count the “r”s in “strawberry” is not a minor bug but a reflection of the fundamental architecture and design philosophy of language models. These models are remarkably effective at producing human-sounding text, grasping context, and simulating dialogue, yet they were never designed for tasks that demand precise character-level detail.
Future models will likely handle such tasks better thanks to advances in AI, such as improved tokenization schemes, the integration of additional reasoning tools, or entirely new approaches to language understanding and manipulation. Until then, AI should be used with an awareness of its limitations, with sensible workarounds in place, and with the recognition that although it can mimic understanding, it does not yet truly “understand” in the way humans do.
Thanks for reading! If you enjoyed this post, please leave a like as encouragement and share it on socials to show your extended support.