The foundations of AI are built on statistical mechanics
The news that the 2024 Nobel Prize in Physics had been awarded to John Hopfield (Princeton University) and Geoffrey Hinton (University of Toronto) surprised the world, as the prize recognized achievements in the field of AI. In an interview with NHK (Japan Broadcasting Corporation) reporting on the news, Kabashima said he was “surprised because, conventionally, algorithm research is not within the scope of the Nobel Prize in Physics.” Yet Kabashima was as delighted as if it had been his own achievement: the two laureates were his heroes, researchers in his own field who had built the pillars of modern AI, such as neural networks, machine learning, and deep learning, from a physics perspective using statistical mechanics.
Kabashima first heard of Hopfield just after entering graduate school.
“It was around 1982 that Hopfield proposed that neural networks could be understood as a form of the Ising model of magnetism. I started college in 1985 and graduate school in 1989. By then, there was already widespread talk in the statistical mechanics community that the theory of spin systems could be applied to neural networks.”
In other words, when the young Kabashima became a researcher, neural networks were already the talk of the town in statistical mechanics.
“As research progressed, it became clear that a common mechanism of information processing was at work, and that Bayesian theory could express many problems in information science. Meanwhile, by the mid-80s, Giorgio Parisi, who won the Nobel Prize in Physics in 2021 along with Syukuro Manabe and Klaus Hasselmann, had created a method for analyzing a magnetic state called ‘spin glass.’ While spending six months in England in 1997, I realized that applying the methods of spin glass analysis could solve many problems in information science. I remember thinking that looking at problems of information and computation as spin glasses would open up a rich field of possibilities,” Kabashima recalls.
Now, let us learn about Kabashima's unique, cutting-edge research and his dreams while explaining some unfamiliar terms. First, statistical mechanics.
The strange phenomena of “many”
Water turns into ice (solid) when the temperature drops to 0 degrees Celsius and into steam (gas) when it rises to 100 degrees Celsius. Or consider a magnet, which suddenly loses its magnetism when heated past a threshold temperature known as the Curie point (about 312 degrees Celsius for neodymium magnets). An abrupt rather than gradual change like this is called a “phase transition.”
“Strange, right? How could such a thing happen? It is the central question of statistical mechanics. The answer is... because there are so many.”
So many? Of what?
“For example, in 22.4 liters of air, there are about 6.02 × 10²³ molecules. We would not see properties such as phase transitions with only a few molecules. However, strange properties emerge when there are as many molecules as Avogadro’s number. Similarly, public opinion can suddenly become inflamed because so many opinions come together. Something that would not amount to more than a small disagreement in a friend group of two or three can stir up strong emotions in a group of many. The same is true for crowded streets. At first, everyone is walking on a path of their choosing, but eventually, as the number of people increases, the flow naturally divides into those on the right and those on the left. So, the simple answer is: because there are so many.”
Kabashima laughs, but the key takeaway is the strangeness of the fact that something happens just because “there are so many.”
“I want to know the laws underpinning this strangeness. That is my motivation.”
In Newtonian mechanics, the second law of motion, F = ma (force = mass × acceleration), governs all objects; the same equation governs the movement of every individual molecule. Thermodynamics, however, revealed laws of its own; a typical example is the ideal gas law, pV = nRT (pressure × volume = amount of substance × gas constant × temperature).
“We are looking at the same gas, but if we look at the individual, microscopic molecules, we see F = ma, and if we look at the macroscopic aggregate, we see pV = nRT. Statistical mechanics is the study of how these two levels are related. In other words, statistical mechanics explains how macroscopic phenomena emerge from microscopic behavior.”
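To make this bridge concrete, here is the standard kinetic-theory argument (a textbook sketch, not part of the interview) that carries F = ma for individual molecules up to the macroscopic gas law. A molecule of mass $m$ bouncing elastically off a wall transfers momentum $2mv_x$ per collision; summing these impulses over all $N$ molecules in a box of volume $V$ gives the pressure:

$$pV = N m \langle v_x^2 \rangle = \frac{1}{3} N m \langle v^2 \rangle$$

Identifying the mean kinetic energy with temperature, $\frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} k_B T$, then yields

$$pV = N k_B T = nRT$$

so the micro level (forces on molecules) and the macro level (the gas law) are two descriptions of the same system.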
“Quantity changes quality”
“More is different,” or “quantity changes quality.” These are the words of Philip Anderson, the 1977 Nobel laureate in physics, and Kabashima's favorite phrase.
“When there are many elements (individual objects, such as molecules), the number of combinations increases and things become more and more complicated. In other words, if we look at things from a microscopic point of view, we encounter a combinatorial explosion and lose sight of what is happening. However, we know by experience what it is like to look at things from the ‘top,’ from the macro level. By reducing the magnification of a microscope, so to speak, a set of simple rules emerges.”
In the metaphor of a street crowded with people, the micro perspective is the set of properties of each person (where they are going, how fast they are walking, whether they are lost in thought or chatting with a companion, and so on). Predicting how each person will walk from this large number of parameters, and repeating the calculation for thousands of people, is of course impossible due to a “combinatorial explosion.” But observed from above with a drone, a regularity emerges: the flow of people.
This is what “more is different” refers to. Microscopic molecular motions too numerous to compute become tractable, obeying different laws, when “the many” is viewed in its entirety from the top. Statistical mechanics investigates this mechanism in the language of probability.
“Even if we knew most of the laws that govern physical phenomena, it would still be difficult to predict what would happen in the future. Statistical mechanics attempts to predict macroscopic properties using probabilistic methods, which is not an easy task. But a model with certain special properties (idealization to make it easier to handle) allows us to see the shift between the micro and macro levels (how what happens at the micro level manifests at the macro level).”
One of these models is the Ising model mentioned previously.
What is the Ising model?
The Ising model (see the illustration) refers to a concept proposed to explain how magnetism emerges in magnets.
“Electrons have a property called spin, which gives each of them a tiny magnetic moment. In ordinary materials, the spins of the electrons point in all directions but align when a magnetic field is applied from outside. This alignment makes the material magnetic, so it is attracted to or repelled by a magnet. Incidentally, a permanent magnet is a material in which the spins are already aligned at room temperature.”
“Why does a magnet undergo a phase transition and suddenly lose its magnetic force at a certain temperature? To explain this, we must consider huge numbers of electrons interacting with each other, which is far too complicated. Therefore, to simplify the problem as much as possible, we represent each spin’s direction, up or down, as +1 or −1, and place the spins on the sites of a square lattice, like pieces on a chessboard. Because the interactions between spins are inherently short-ranged, we let each spin interact only with its neighbors. The result is the Ising model, a simplified mathematical toy of sorts. Even though it is a simplified model, it is still challenging to solve in practice. However, we have managed to get a handle on it and now know exactly how phase transitions occur.”
As the spins align themselves with their neighbors, they create a magnet. When the model is reduced to this single property, “alignment with neighbors,” calculations from a macroscopic point of view can be made for an infinite number of spins, the same way we treat gases.
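The “alignment with neighbors” rule is simple enough to simulate in a few lines. Here is a minimal sketch in Python (illustrative parameters, not Kabashima’s code) using the standard Metropolis algorithm; it shows the phase transition directly: below the critical temperature (about 2.27 in units where the coupling and Boltzmann constant are 1) the spins spontaneously align, and above it the magnetization vanishes.

```python
import numpy as np

def magnetization(T, L=20, sweeps=400, seed=0):
    """Metropolis simulation of the 2D Ising model with periodic
    boundaries: each spin tends to align with its four neighbors."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=(L, L))          # random initial spins
    for _ in range(sweeps * L * L):
        i, j = rng.integers(L), rng.integers(L)   # pick a random spin
        # Local field: the sum of the four nearest neighbors
        h = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
             + s[i, (j + 1) % L] + s[i, (j - 1) % L])
        dE = 2 * s[i, j] * h                      # energy cost of a flip
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            s[i, j] = -s[i, j]                    # accept the flip
    return abs(s.mean())                          # magnetization per spin

# Below T_c ≈ 2.27 the spins align (|m| near 1); above it, |m| drops to ~0.
for T in (1.5, 3.5):
    print(f"T = {T}: |m| ≈ {magnetization(T):.2f}")
```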
The Ising model is regarded as a model for analyzing macroscopic behavior not just of electron spins but of large numbers of “things” of any kind. Hopfield’s proposal mentioned previously was born from the idea that the Ising model could be applied outside of physics.
“Nerve cells (neurons) in the brain are in one of two states: firing or not firing. That is, their states can be represented as 1 or −1. Connecting many such ‘elements’ creates a kind of artificial brain, a ‘neural network.’ There are various ways to connect these elements, but in the early 70s, networks of ‘associative memory,’ which could store and retrieve numerous patterns, were proposed one after another. Japanese researchers such as Kaoru Nakano and Shunichi Amari contributed significantly to these models. Hopfield’s great achievement, after meticulously examining these studies, was to realize that the Ising model underlay these networks.”
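That realization can be made concrete in a few lines of code. The sketch below (illustrative sizes, not the original formulation) implements a Hopfield-style associative memory: Hebbian couplings play the role of Ising interactions, and “recall” is nothing more than spins aligning with their local fields, exactly as in zero-temperature Ising dynamics.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 100, 5                          # 100 "neurons," 5 stored patterns
patterns = rng.choice([-1, 1], size=(P, N))

# Hebbian learning: couplings J_ij play the role of Ising interactions
J = (patterns.T @ patterns) / N
np.fill_diagonal(J, 0)                 # no self-coupling

# Corrupt one stored pattern by flipping 20% of its "spins"
state = patterns[0].copy()
state[rng.choice(N, size=20, replace=False)] *= -1

# Recall: each neuron repeatedly aligns with its local field,
# exactly like zero-temperature Ising dynamics
for _ in range(5):
    for i in rng.permutation(N):
        state[i] = 1 if J[i] @ state >= 0 else -1

print("overlap with stored pattern:", (state @ patterns[0]) / N)  # ~1.0
```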
This is where statistical mechanics connected the world of physics to the world of information.
The spin glass theory Kabashima encountered in the 90s explains why, in certain magnetic materials, the spins freeze in an unaligned state because the interactions at each point in the material are random. The state is called “spin glass” because the disordered spin directions resemble glass that has been cooled too rapidly to crystallize.
The free and fascinating world of information
A major revolution in AI is now underway in various domains. At the root of this revolution is the rapid development of machine learning kickstarted by Hopfield's introduction of the Ising model into neural networks. In other words, it was statistical mechanics that greatly accelerated the evolution of AI. Kabashima's task is to take this revolution further and to more diverse areas.
“What I am working on is not as glamorous as ChatGPT (laughs). I study various important tasks machine learning can perform, such as creating ‘classification’ algorithms, and I construct various ‘learning’ scenarios for algorithms. There are many ways to make learning happen. For example, in a teacher-student scenario, learning progresses quickly and smoothly if the student asks good questions. If we could leverage our data to facilitate this kind of learning, performance relative to the amount of data would increase considerably. We are developing models for this purpose and doing wide-ranging research.”
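The teacher-student scenario can be sketched in code (a hypothetical toy example, not Kabashima’s actual models): a “teacher” perceptron defines the true labels, a “student” learns from labeled examples, and asking “good questions”, here meaning querying the inputs the student is least certain about, typically extracts more performance from the same number of examples.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
teacher = rng.standard_normal(N)           # the "teacher" defining labels
teacher /= np.linalg.norm(teacher)

def train(n_examples, query=False):
    student = np.zeros(N)
    for _ in range(n_examples):
        if query:
            # A "good question": from a pool of candidates, pick the
            # input the student is least certain about
            pool = rng.standard_normal((20, N))
            x = pool[np.argmin(np.abs(pool @ student))]
        else:
            x = rng.standard_normal(N)     # a random, unselected example
        y = np.sign(teacher @ x)           # the teacher supplies the label
        if np.sign(student @ x) != y:
            student += y * x               # perceptron update on mistakes
    return student

for query in (False, True):
    s = train(500, query=query)
    overlap = (s @ teacher) / np.linalg.norm(s)   # 1.0 = perfect learning
    print(f"good questions: {query}, overlap with teacher: {overlap:.3f}")
```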
Wireless communication systems, such as the CDMA technology used in cell phones, are another of Kabashima’s targets.
“Cell phone signals are first sent to the nearest base station. However, the base station receives signals from many cell phones in the surrounding area, which leads to overlapping signals. How do you recover each original signal? As it happens, spin glass analysis is used for this task as well.”
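The setting can be sketched as follows (illustrative parameters; actual CDMA systems, and the spin-glass analysis of their optimal detectors, are more sophisticated). Each user’s bit is spread with a signature code, the base station receives the noisy superposition of all users, and each original bit must be estimated from that single received signal; here the simplest detector, a matched filter, does the untangling.

```python
import numpy as np

rng = np.random.default_rng(2)
K, L = 8, 64                           # 8 users, spreading codes of length 64
S = rng.choice([-1, 1], size=(L, K)) / np.sqrt(L)   # signature codes
bits = rng.choice([-1, 1], size=K)     # one bit per user

# The base station receives every user's signal superposed, plus noise
received = S @ bits + 0.1 * rng.standard_normal(L)

# Simplest detector: correlate with each user's own code (matched filter).
# Spin-glass methods analyze the optimal Bayesian detector for this problem.
estimate = np.sign(S.T @ received)
print("bit errors:", int(np.sum(estimate != bits)), "out of", K)
```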
In this way, Kabashima says, many problems related to information and communication have become subjects of study for statistical mechanics. The same is true of error correction, the technology for detecting and correcting errors (noise) that arise in communication networks.
“It corresponds to creating interactions in the Ising model. So, the signal can be untangled using statistical estimation even when noise is present. The same mechanism can be found in the associative memory model. Any phenomenon that can be expressed in terms of probability distributions is subject to statistical mechanics. That includes things like cryptography.”
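The simplest instance of decoding-as-statistical-estimation is a repetition code, sketched below (chosen purely for illustration; the Ising-interaction correspondence Kabashima describes applies to far more powerful codes). Each bit is transmitted several times, and a majority vote over the noisy copies is precisely the optimal statistical estimate for this code and channel.

```python
import numpy as np

rng = np.random.default_rng(3)
message = rng.choice([-1, 1], size=20)     # message bits as Ising spins

# Encode: transmit each bit 5 times (the simplest error-correcting code)
sent = np.repeat(message, 5)

# Channel noise: each transmitted spin flips with probability 0.1
flips = rng.random(sent.size) < 0.1
received = np.where(flips, -sent, sent)

# Decode by statistical estimation: majority vote per bit is the
# maximum-a-posteriori estimate for this code and channel
decoded = np.sign(received.reshape(-1, 5).sum(axis=1))
print("bit errors after decoding:", int(np.sum(decoded != message)))
```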
Illustrating the breadth of Kabashima’s research interests, a study on extracting information from biological data is also underway.
In physics, describing the motion of three or more interacting objects (or units) is called the many-body problem. Kabashima also considers information a many-body problem in which the objects are bits.
“Information is invisible to the eye. It is an abstract world in which almost all problems are many-body problems. However, unlike physics, the world of information is not restricted by spatial dimensions. In three-dimensional space, there are distances and directions, up-down and left-right. In the world of information, however, everything can be next to everything. Imagine something like an airplane route map. A hub has many routes leading to and from it; in other words, it has many neighbors. A local airport with only one route has only one neighbor. This route map corresponds to a space in which each location has a different number of neighbors. In this way, the world of information gives us the freedom to think about things in many different ways, which makes it easier to explore and find interesting problems. It is a lot of fun. Well, I find it fun.”
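The freedom Kabashima describes can be made concrete with a toy sketch (hypothetical node names): the Ising-style “align with your neighbors” rule runs unchanged on any adjacency structure, whether a spatial lattice or a hub-and-spoke route map, because the world of information only needs to know who is next to whom.

```python
# A hub-and-spoke "route map" as a graph: the hub has many neighbors,
# each local airport has exactly one. No distances, no directions.
neighbors = {
    "HUB": ["A", "B", "C", "D", "E"],
    "A": ["HUB"], "B": ["HUB"], "C": ["HUB"],
    "D": ["HUB"], "E": ["HUB"],
}
spins = {"HUB": -1, "A": 1, "B": 1, "C": 1, "D": -1, "E": -1}

# The same "align with your neighbors" rule used on the square lattice
# runs unchanged on this (or any) graph:
for _ in range(3):
    for node in neighbors:
        field = sum(spins[nb] for nb in neighbors[node])
        spins[node] = 1 if field >= 0 else -1

print(spins)   # the spins settle into a state consistent with the graph
```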
AI cannot make discoveries
Kabashima describes the potential of statistical mechanics as “applying the theory of matter to abstract problems.” By this he means that statistical mechanics can bridge the gap between abstract, irreducibly complex phenomena, spanning from communications technology to animal behavior, and the concrete objects studied by the natural sciences. Or perhaps this bridging ability is the very essence of statistical mechanics.
The new movement currently being promoted under the banner of the “fusion of machine learning and physics,” known as “machine learning physics,” may also be seen as part of “applying the theory of matter to abstract problems.”
“The repeating cycle of creating hypotheses and testing them through experiments brought about the development of modern science. This will not change in the future. Machine learning physics aims to accelerate this cycle by leveraging the power of machine learning. I believe it will become possible in some fields to achieve in a single day what used to take ten years.”
Kabashima adds that the aim is to accelerate research using machine learning, that is, AI, in fields such as astronomy, particle physics, and condensed matter physics. But how does Kabashima personally think about AI?
“AI excels at applications where combinatorial explosion is the problem. Many scientific fields, such as material design and high-performance battery development, would advance dramatically if we could swiftly explore vast numbers of combinations. AI should be used more and more in such fields. Wherever there is a combinatorial explosion, or the data are too vast for humans to handle, machine learning should be used proactively.”
On the other hand, Kabashima also says that AI will never be able to make discoveries such as new laws of nature.
"For a while, there was a lot of angst about the singularity (a prediction that AI surpassing human intelligence would have grave consequences). However, I strongly believe that AI is merely a tool. As such, it should be harmless provided people do not misuse it. We can use it for our convenience. We cannot run faster than a car or fly like an airplane. The same is happening in the realm of knowledge. We will not need to memorize things but learn to use AI well. I even ask AI, “How do I input/output files?” when I code (laughs). But AI probably cannot make discoveries. AI cannot accomplish things never thought of before."
In short, discovery and creation are tasks left for people to do.
Looking for unknown paths
“Since I was an undergraduate student, I have wanted to do something new that other people had not yet done or could not do. I saw statistical mechanics as a field with such potential, and it has allowed me to be free. Because it has no fixed subject, there are no restrictions. That is why I can use it so freely. I like it very much.”
Kabashima says with a joyful smile. He hopes young people will also enjoy the freedom he appreciates so much.
“I think that nowadays we tend to focus on reaching a set goal as fast as possible or on solving difficult problems. I intentionally avoid such things (laughs). I want young people to aim for paths they did not even know existed and to find fields where they can win without fierce competition. Instead of following a predetermined path or competing with everyone else on the same problem, I would like students to put more effort into finding what is fun for them, such as a new problem or a new perspective, even in a small way.”
The charm of science, Kabashima says, lies in the pursuit of fun.
“Compared to engineering and other fields, basic science has fewer restrictions, and research there is freer. Thanks to this freedom, discoveries are made and things never thought of before arise. That is what makes basic science fun, but it also takes a long time to make a solid contribution. Many young people these days say they want to be useful to others. However, I think they do not have to be useful to others; they should feel free to do what they want.”
When asked what motivates his research, Kabashima gives the simple answer of a researcher.
“I want to do fun things. I want to find things that people did not realize existed.”
Incidentally, Kabashima says he has experienced finding something he did not know existed three times, and that each time he was grinning all day long.
“I remember thinking, ‘Could this be possible?!’ and ‘I can solve the problem using this.’ It was too much fun.”
*Bayesian theory: describes the probability of an event based on prior knowledge of the conditions that may be associated with the event
※Year of interview: 2025
Interview/Text: OTA Minoru
Photography: KAIZUKA Junichi