Let me start by asking a controversial question: Can machines think? The question takes us back to Alan Turing’s seminal paper Computing Machinery and Intelligence, where he outlines the so-called imitation game, which would later become known as the Turing test. In his paper, Turing pointed out very cleverly that the answer to that question lies in what we understand as machine and thinking. However, establishing definite boundaries for the meanings of either term has been the endeavor of philosophers and linguists probably since the beginnings of civilization, and to date there is no general consensus regarding what is and what is not a machine, or what is and what is not thinking: Is the human body a machine? Can dogs think?
I first encountered Turing’s question in an artificial intelligence (AI) course during my undergraduate studies, and the last time I came across it was during a meeting at Numenta’s HQ, although on that occasion the question was immersed in the subtext. It is probably well known that one of the missions of Numenta is to reverse engineer the brain in order to reveal the computing principles of the neocortex. Such an objective is driven not only by scientific curiosity to understand the universe within us, which might be the ultimate quest to know what it is that makes us human, what the limits of our mind are, and what our place in nature is; but also by practical restlessness and ingenuity: the creation of intelligent machines.
Possibly, the latter objective is in the mind of every AI practitioner and researcher, from those crunching huge loads of data on GPUs to those studying Monte Carlo techniques to sample efficiently from complicated probability distributions. However, very few would take their inspiration from the only thing we are certain exhibits intelligence, namely, the human brain.
When we consider the development of AI since its origins at the Dartmouth Conference of 1956, it seems almost impossible not to draw analogies between the pursuit of intelligent machines based on studying the human brain, and that other achievement of human engineering which was also inspired by nature: aviation. The analogy allows me to point out that, in the case of AI, we are still at the point where we do not know at what level of detail we should model brain function in order to obtain emergent behavior that can be deemed intelligent. This is one of the open questions in the philosophy of AI. Let’s ask ourselves the following: Is it essential for machine intelligence, and thus for natural intelligence, to model every aspect of the brain cell, such as the (apparently stochastic) opening and closing of synaptic gates? Or is it enough to model, for instance, action potentials in populations of neurons? In other words: at what level of abstraction is it essential to replicate neuronal activity?
The question has been around since the very beginning of AI, and it resulted in two schools of thought that achieved relative success in several applications and techniques that are still around today. The symbolist school of thought (not to be confused with the symbolist art movement that includes figures such as Charles Baudelaire) claimed that intelligence results from manipulations of symbols inside the mind, where a symbol can be thought of as a mental concept, that is, a representation of a physical or abstract object within the mind. This idea can be traced all the way back to the three great philosophers of British empiricism in the 17th and 18th centuries: John Locke, George Berkeley, and David Hume.
The symbolist perspective of human intelligence is situated at a high level of brain processing: right at the top, where concepts and notions reside in the mind. This idea resulted in early successes in AI such as expert systems and computer programs like Terry Winograd’s SHRDLU. However, this paradigm faced difficult obstacles, both practical (in order to achieve decent performance, most programs had to contain vast numbers of if-then-else rules) and theoretical (John Searle’s famous thought experiment, the Chinese room, which would be a perfect title for a thriller directed by Martin Scorsese or Roman Polanski, argued against the possibility of a symbolic system giving meaning to the elements it manipulates, and thus giving rise to true intelligence). As a result, this paradigm has largely fallen out of use in recent times.
The second school of thought that emerged from taking the human brain as inspiration to develop machine intelligence is called connectionism, a term introduced by none other than Donald Hebb, of Hebbian-learning fame. Connectionism aims to model mental and behavioral processes as the emergent phenomenon of the activity of interconnected units, which can be artificial neurons such as the ones used in modern artificial neural networks (ANNs), or cells in a hierarchical temporal memory (HTM) model. Connectionism, however, fell out of favor in the late 60s when Marvin Minsky and Seymour Papert showed that single-layer ANNs (perceptrons) were not able to model a function as simple as the binary exclusive-or (XOR). And, although it was clear that adding more layers to the network would solve the issue, at the time there were no efficient algorithms to train such networks. This ultimately led to the so-called AI winter, which would end with the popularization of the backpropagation algorithm in the mid-80s.
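To make Minsky and Papert’s point concrete, here is a small sketch of my own (in Python with NumPy; the network size, learning rate, and number of iterations are purely illustrative): a single linear unit cannot separate the XOR inputs, but a tiny network with one hidden layer, trained with backpropagation, typically can.

```python
import numpy as np

# Toy illustration (my own sketch): XOR is not linearly separable, so a single
# linear threshold unit fails on it, but a small two-layer network trained with
# backpropagation usually learns it. All parameters here are illustrative.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer with a few units is enough to bend the decision boundary.
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))

lr = 1.0
for _ in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: gradients of the mean squared error
    grad_out = (out - y) * out * (1 - out)
    grad_h = (grad_out @ W2.T) * h * (1 - h)

    W2 -= lr * h.T @ grad_out
    b2 -= lr * grad_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ grad_h
    b1 -= lr * grad_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))  # typically close to [[0], [1], [1], [0]]
```

With no hidden layer, no choice of weights for a single linear unit can reproduce the XOR targets, which is precisely the limitation Minsky and Papert pointed out; the hidden layer, trained by backpropagation, is what removes it.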
Nevertheless, the question remained open, and it has been the subject of debate among adherents of the connectionist paradigm: what are the minimal properties of brain function required to achieve intelligence? As mentioned above, the issue reminds us of the history of aviation, and on this particular point, of the quest for heavier-than-air flight during the Middle Ages, when enthusiasts across Europe and Asia built wings covered with feathers which they believed would allow them to fly after being thrown from dangerous heights. John Damian, an Italian alchemist at the court of King James IV of Scotland, attempted to fly from Stirling Castle all the way to France wearing a pair of wings made of feathers. Naturally, he failed and broke his thigh bone in the attempt, though not without claiming that his failure was due to having used chicken feathers instead of eagle feathers.
Many attempts like this fill the pages of early aviation history, reporting minimal to no success in sustained flight. It was not until Sir George Cayley carried out a systematic and rigorous study of the physics of flight that mechanical flight was on its way to becoming a reality. Cayley studied the principles of bird flight scientifically, which allowed him to successfully identify the four physical forces that influence a flying object: thrust, lift, drag, and weight. This level of abstraction let him design flying machines that would ultimately lead to the modern airplane. Years later, engineers and scientists would extend Cayley’s research to accurately estimate the power needed for sustained flight, as well as to study the wing loading (weight to wing-area ratio) of birds, which would eventually lead to the conclusion that humans are not able to fly under their own power by attaching wings onto their arms. Almost at the same time, mechanical-flight enthusiasts like the Wright brothers would build wind tunnels in order to study the flow of air moving past solid objects: a novel simulation environment that pioneered the study of aerodynamics and implied a paradigm shift. Instead of studying how a flying object behaves when passing through air, the experimenter would study how air affects a static object as it flows past it.
Moving back from aviation to AI, we may wonder whether we have reached a comparable level of abstraction when it comes to replicating or modeling neural function with the purpose of building intelligent machines. The study of AI starting from neuroscience requires its own mathematical tools, and the adoption of paradigms and hypotheses to be tested. One of the main hypotheses of HTM theory is that a cortical circuit has all that is required for sequence learning, invariant representation of objects, and coordinate transformations, among other functions. To perform such functions, a population of cells carries out the operations of spatial pooling, temporal memory, and disjoint pooling, using sparse representations. Moreover, in order to perform such operations, a single cell needs to implement proximal, distal, and apical synaptic connectivity, along with depolarization triggered by NMDA spikes. Other neural network models do not feature these properties of the living neuron: they implement neither neuronal activation nor active dendrites in their inner mechanisms, and instead use individual units as continuous gates of information that bear little resemblance to the action of real neurons. These latter models make use of the backpropagation algorithm to perform supervised learning, and of gradient descent methods to minimize the learning error. From this perspective, learning is seen as an optimization problem.
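To give a flavor of one of those ingredients, sparse distributed representations, here is a toy sketch of my own (in Python with NumPy; the vector sizes are only illustrative of the roughly 2% sparsity often quoted in HTM work): similarity between two SDRs can be measured by the overlap of their active bits, and that overlap degrades gracefully under noise.

```python
import numpy as np

# Toy sketch (my own, not Numenta code): sparse distributed representations (SDRs)
# are long binary vectors with only a few active bits; the overlap of active bits
# serves as a simple similarity measure.
rng = np.random.default_rng(42)
n_bits, n_active = 2048, 40  # roughly 2% sparsity, illustrative of HTM-style SDRs

def random_sdr():
    sdr = np.zeros(n_bits, dtype=bool)
    sdr[rng.choice(n_bits, size=n_active, replace=False)] = True
    return sdr

a, b = random_sdr(), random_sdr()
print("overlap of two unrelated SDRs:", int(np.sum(a & b)))  # usually 0-3 bits

# A noisy copy of `a` (drop a few active bits, add a few spurious ones)
# still overlaps strongly with the original.
noisy = a.copy()
noisy[rng.choice(np.flatnonzero(a), size=5, replace=False)] = False
noisy[rng.choice(np.flatnonzero(~a), size=5, replace=False)] = True
print("overlap of a and its noisy copy:", int(np.sum(a & noisy)))  # 35 bits
```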
These two different neural models should make us wonder about the essential properties of the neuron that are responsible for the emergence of intelligent behavior when these units act collectively. More explicitly: is it enough to think of the neuron as a continuous gate of information, or is it essential to capture its spiking properties, to the point of, say, distinguishing between excitatory and inhibitory cells, among other features? What level of abstraction is sufficient to achieve intelligent behavior from a population of neurons? These questions lead almost immediately to another one: how would we know what level of abstraction is right? To answer it, we could ask whether a particular model is able to replicate a particular brain phenomenon, and thus explain its observation and derive further predictions. Or we could build applications based on the model and test whether they achieve human-level performance, or any other benchmark.
To dive a little deeper into this question, let me distinguish between theories-of and applications. Theories-of attempt to provide an explanation, and thus a framework in which to study a particular phenomenon. Examples are theories of mind, of speech, of vision, of the brain, of intelligence, etc. For theories to bear that name, they need to explain specific observations of the real world in order to make them understandable. That is how theories are born. This lies at the core of the scientific method.
Applications, on the other hand, are not bound by such a requirement. Nevertheless, they need to achieve a certain level of performance to be deemed useful. For instance, recent AI techniques are being tested and evaluated by their performance in games such as chess, Go, or Atari video games. This could be seen as an extension of Turing’s “imitation game”, in which, if a machine could deceive us into believing it is a human being based on its responses to an interrogator’s questions from another room, then it could be concluded that the machine exhibits a certain degree of intelligence. Turing’s proposal, however, has received a number of objections. First of all, it is a behavioral test, and it can be argued that many aspects of intelligence do not imply an explicit behavioral response. Also, it is an anthropomorphic test of intelligence. If the ultimate goal is to build machines that are more intelligent than human beings, why should we insist that intelligent machines must closely resemble people? Drawing another analogy between the development of AI and the history of aviation: airplanes are judged by how well they fly, not by comparing them to birds.
In sum, an AI model situated within the connectionist paradigm could spring to life from either of two perspectives, which I call the “scientific approach” and the “engineering approach”. In the scientific approach, we start with observations of a particular phenomenon (vision, speech, memory/learning, etc.) and we build a model of it. We assess the quality of the model by evaluating how accurately it is able to replicate the observed phenomenon, and if the resemblance is good enough then we are able to make predictions from the model. This approach leads to theories when the model is based on sound hypotheses, and not on mere trial and error over the model’s parameters or constituents to fit the data. Here, we assess the quality of the assumptions and the level of abstraction of the model by measuring how closely the model resembles the phenomenon, and how accurate its predictions are.
In the engineering approach, we proceed by building a model right away (that is, without having a particular real-world phenomenon to be replicated or explained). Then, we build applications based on the model and test whether they achieve human accuracy or any other benchmark. Take, for example, ANNs for handwritten digit recognition. Here, the task is not to explain the phenomenon of human visual recognition, but to build a machine that achieves accuracy in such a task comparable to that of human beings. Here, we assess the quality of the assumptions and the level of abstraction of the model by measuring how the performance of the model compares to human performance.
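A minimal sketch of this workflow might look as follows (my own illustration in Python, using scikit-learn’s small bundled digits dataset as a stand-in for MNIST; the model and its parameters are arbitrary choices): the only question we ask of the model is how well it performs the task.

```python
# Illustrative sketch of the "engineering approach": train a small neural network
# on a handwritten-digit dataset and judge it purely by its test accuracy,
# not by whether it explains human visual recognition.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()  # 8x8 grayscale digits, a small stand-in for MNIST
X_train, X_test, y_train, y_test = train_test_split(
    digits.data / 16.0, digits.target, test_size=0.25, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))  # the benchmark is all that matters here
```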
Both approaches provide valid methods to test the quality of the assumptions made by a model, and its level of abstraction. However, one is judged by its usefulness (the engineering approach), whereas the other is judged by its certainty (the scientific approach). In the end, a model’s raison d’être should give us a hint about how to evaluate its assumptions and abstractions; for instance, convolutional neural networks might be great at distinguishing between dogs and cats, and may even surpass human performance in such a task, but they would not constitute a theory of vision.
Returning to Sir George Cayley, remember that he correctly identified the four physical forces that act upon a flying object, and that this abstraction ultimately led to the construction of heavier-than-air flying machines. Stretching the analogy a little, let’s ask ourselves the following: have we discovered the physical forces that act upon thoughts, symbols, or groups of neurons, and that are responsible for reason and intelligent behavior? In my opinion, concepts and operations like spatial and disjoint pooling, temporal memory, and sparse distributed representations carried out within neocortical circuits will put us on the right track.
Analogies can be inspiring, and we can learn from them. But they are just that: analogies. The history of AI will go its own way, with its own challenges, obstacles, failures, frustrations, and of course, its own victories. Along the way there is plenty of math, physics, neuroscience, and naturally, philosophy to be developed.