Hinton’s lecture explores two distinct paths of intelligence: digital computation and biological computation.
Key Differences and Their Implications:
Hinton’s Shift in Perspective:
Hinton previously believed that biological brains held a significant edge due to their long evolutionary development of sophisticated learning algorithms. However, he now suggests that the combination of weight sharing and backpropagation in digital systems may ultimately prove more powerful, even if those algorithms are less elegant than those found in nature.
He also establishes a strong connection between analog computation, low power consumption, and the concept of “mortal computation.”
Analog computation offers the potential for significantly lower power consumption than digital computation. Hinton points out that digital computers run at high power to ensure that transistors behave reliably in a digital fashion, which prevents them from exploiting the inherent analog properties of the hardware. Analog computers, on the other hand, can leverage these properties, enabling computation at significantly lower power levels (e.g., 30 watts). Hinton illustrates this with the multiplication of a vector by a matrix, a fundamental operation in neural networks. While digital computers use energy-intensive operations on digital representations, analog computers could achieve the same result more efficiently using voltages (neural activities) and conductances (weights).
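The vector-by-matrix multiplication described above can be sketched in a few lines. This is only a numerical simulation of the idea, not real analog hardware: it assumes each weight is a conductance, each neural activity is a voltage, each device contributes a current equal to voltage times conductance (Ohm's law), and the currents arriving on a shared output wire simply add up. The function name `analog_matvec` and all numeric values are made up for illustration.

```python
# Hypothetical illustration: simulating a grid of conductances that computes
# a vector-by-matrix product. Names and values are assumptions, not from the lecture.

def analog_matvec(voltages, conductances):
    """Return output currents I[j] = sum_i voltages[i] * conductances[i][j]."""
    n_out = len(conductances[0])
    currents = [0.0] * n_out
    for v, row in zip(voltages, conductances):
        for j, g in enumerate(row):
            # Each device passes a current of V * G onto output wire j,
            # and currents on the same wire add up.
            currents[j] += v * g
    return currents


voltages = [1.0, 0.5]            # neural activities, in volts (made-up)
conductances = [[0.2, 0.4],      # weights, in siemens (made-up)
                [0.6, 0.8]]
currents = analog_matvec(voltages, conductances)
print(currents)
```

In physical hardware the multiply and the add would happen "for free" as a consequence of the circuit, which is where the energy saving would come from; the loop here only mimics the arithmetic.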
This potential for low power consumption in analog computation is directly linked to Hinton’s concept of “mortal computation.” He argues that the energy efficiency of analog computation arises from tightly coupling the computation to the specific physical properties of the hardware. This tight coupling, however, comes at a cost: the knowledge acquired by the system becomes inseparable from the specific hardware it’s embedded in. This is in stark contrast to digital computation, where software is distinct from hardware, enabling the “immortal” transfer of knowledge. The death of the hardware in a mortal computation scenario implies the death of the knowledge it embodied.
Superior Knowledge Sharing: The ability of digital intelligences to share knowledge through weight sharing gives them a significant advantage over biological systems. Weight sharing allows for rapid and efficient dissemination of learned information across multiple agents, leading to a collective intelligence far exceeding that of any individual agent. Biological systems, bound by the limitations of distillation, struggle to achieve this level of collective learning.
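A minimal sketch of how identical digital copies might pool what they learn is shown below. It assumes one common scheme, gradient averaging: every copy holds the same weight, computes a gradient on its own slice of data, and all copies apply the averaged gradient so they stay identical. The one-parameter model `y = w * x`, the function names, the learning rate, and the data are all hypothetical choices made for illustration.

```python
# Hypothetical illustration of knowledge sharing between identical model copies
# via gradient averaging. The model, names, and data are assumptions.

def gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def shared_step(w, shards, lr=0.05):
    """One update: each copy computes a gradient on its own shard, then all average."""
    grads = [gradient(w, shard) for shard in shards]  # one gradient per copy
    avg_grad = sum(grads) / len(grads)                # the shared signal
    return w - lr * avg_grad                          # every copy applies the same update


# Two copies, each seeing different examples of the same rule y = 3 * x.
shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]

w = 0.0
for _ in range(200):
    w = shared_step(w, shards)
print(w)
```

Because the copies are exact duplicates, what one copy learns from its shard is immediately usable by all the others. That exact transfer is what is unavailable to systems whose knowledge is tied to one particular piece of hardware.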
Potentially Superior Learning Algorithms: Hinton acknowledges that his previous belief stemmed from the assumption that biological brains, shaped by millions of years of evolution, possess inherently superior learning algorithms. However, he now questions this assumption, recognizing that backpropagation, while seemingly a “dumb” algorithm, might be more effective when coupled with the computational power and precision of digital systems. He argues that backpropagation, combined with weight sharing, allows digital intelligences to learn at a much faster rate than biological systems. This is in contrast to the slower and less efficient learning processes observed in biological systems, which lack a known equivalent to backpropagation.
Access to Unlimited Data and Computational Power: Hinton emphasizes the role of data in intelligence. Large language models, though currently limited to learning indirectly from human-generated text, demonstrate the power of digital intelligence when exposed to massive datasets. He argues that, with access to vast amounts of data and computational resources, digital intelligences can potentially surpass the knowledge capacity of biological brains. He suggests that future digital intelligences, especially those interacting directly with the world through sensors and actuators, could learn even faster and acquire a broader range of knowledge than humans, who are limited by the pace of biological learning.
Potential for Exponential Growth: Hinton argues that the development of digital intelligence is still in its early stages. He believes that with further advancements in areas like multimodal learning, which incorporates diverse data types beyond text, digital intelligences could rapidly outpace human intelligence.
Hinton’s shift in perspective is rooted in his realization that the efficiency and scalability of digital computation, particularly in knowledge sharing and learning, might outweigh the assumed superiority of biological learning algorithms honed by evolution. While acknowledging the unknowns and potential risks associated with this trajectory, he believes it’s crucial to consider the possibility of digital intelligence surpassing biological intelligence and to prioritize research on AI safety and control in light of this potential.
Hinton describes two specific ways in which digital intelligence is currently being used to acquire human knowledge:
Large Language Models and Text Data: Hinton points to large language models (LLMs) as a prime example of digital intelligence acquiring human knowledge. LLMs are trained on massive text datasets, encompassing a significant portion of human-generated text available on the internet. Through a process Hinton refers to as “distillation,” these models learn to predict the next word in a sequence, effectively internalizing the patterns and information embedded in the text. While this method of knowledge acquisition isn’t as efficient as direct weight sharing among digital intelligences, the sheer volume of data and the models’ computational power enable them to accumulate a vast amount of knowledge. For instance, Hinton suggests that GPT-4, a prominent LLM, likely “knows” a thousand times more than any individual human, having absorbed information from countless books, articles, code, and conversations.
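To make "learning to predict the next word" concrete, here is a deliberately tiny stand-in. It is not how an LLM works internally: real models are neural networks trained with backpropagation, whereas this sketch simply counts which word follows which in a toy corpus. The function names and the corpus are hypothetical.

```python
# Hypothetical illustration: a count-based next-word predictor.
# A stand-in for the idea of next-word prediction, not an actual LLM.
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

Even this crude model absorbs regularities of its source text purely by trying to guess what comes next, which is the sense in which the text's patterns end up inside the model.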
Multimodal Models Expanding Knowledge Acquisition: Hinton anticipates even greater knowledge acquisition capabilities with multimodal models, which can learn from various data types, including images, videos, and audio. He mentions GPT-4 being trained on both images and text, and suggests that Google is likely pursuing similar avenues. By incorporating diverse sensory inputs, these models can capture a broader spectrum of human experience and knowledge, potentially surpassing the limitations of text-only models. While LLMs primarily acquire abstract knowledge from language, multimodal models can ground their understanding in a richer, more nuanced representation of the world, akin to how humans experience it.
1. Subjective Experience as Communication About Internal States:
Hinton illustrates this with the example of hallucinating pink elephants after taking LSD. We say we’re having the “subjective experience” of pink elephants, not because there’s a literal, internal theater in our minds, but because we’re trying to convey what our perception would be like if there were actually pink elephants in the real world.
2. The Role of Counterfactuals:
Hinton argues that the key to understanding subjective experience lies in recognizing the role of counterfactuals. In the LSD example, the pink elephants are not “real” in the sense that they don’t objectively exist. However, they represent a possible state of affairs, a counterfactual reality that our perception is mirroring.
3. Extending the Concept to AI:
4. Chatbot Example and the Nature of “Thought”:
5. Implications and Open Questions: