I own an Amazon Echo, which gives me access to the voice assistant Alexa and, through her, to some of Amazon's services. I usually use Alexa to check the day's weather or the news, and to play music from Amazon Music's catalog. Alexa is not perfect. She's far from it (let me use the pronoun "she" to refer to it), but she allows me to use my voice to request things that I would otherwise need to type on a computer. In addition, she also gives me an indirect sense of the current challenges in natural language processing, machine learning, and artificial intelligence (AI) in general.
For me, the use of an assistant such as Alexa is another example of how amazing human intelligence really is. And I am not talking about the talented scientists and engineers who built her. No, I'm talking about us, the users. We, her users, are required to adapt to her; that is, we have to adapt to the sentence structure, speech intonation, and so on that Alexa is capable of understanding. Modify any of those elements a bit and your query, your request, might not succeed: you will not get the piece of information you were looking for or the song you wanted to hear.
Alexa doesn't exhibit any adaptation or learning at all, at least not without an update from her creators pushed from Amazon's servers. She's not adapting to the intricacies of my own way of speaking the way any other human being interacting with me would. She doesn't adapt to the way I talk, the vocabulary I use, my mannerisms, and so on. Rather, it is I who needs to adapt to the things that she will understand. For instance, I once asked her to play the song "Thor, the Powerhead" by the heavy-metal band Manowar. I got the query right once or twice. Afterwards, I don't know whether my own speech changed or something else got in the way between nonsense and comprehension, but she could no longer understand what I was asking for. The same happened with the song "Erotomania" by the American band Dream Theater. She (or should I say "I"?) got it right once, and then I couldn't make her play the song ever again. All she could do was reply, "I cannot find [whatever thing that sounds like the word 'erotomania']." Could it be that both songs disappeared from Amazon Music's catalog, the reader might ask? Not at all! I could still have her play them if I used the app on my phone to look for the tracks and cast them to the Echo. (Yet another example of me adapting to her, rather than the other way around. Where is the intelligence in AI?)
What about giving Alexa some contextual references as cues? Even if she's not able to find the song because of my bad pronunciation, it would be great if I could simply tell her to play the song she played for me last Tuesday around 8 in the morning, that is, give her some contextual information. It would be great if the dialog went along these lines:
ME: Alexa, play “Erotomania” from Dream Theater.
ALEXA: Sorry, I cannot find <whatever word sounds similar to 'erotomania'> from Dream Theater.
ME: OK. Alexa, do you remember the song you played last night?
ALEXA: Which one? I played <song #1>, <song #2>, and <song #3>.
ME: Yes, that last one. Play it for me again.
Unfortunately, Alexa cannot keep track of the context of the queries she responds to, much less understand a question that refers back to that context.
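To make the idea concrete, here is a minimal sketch, in Python, of the kind of bookkeeping such a feature would require. The PlayHistory class and its methods are hypothetical, my own invention for illustration, not anything Alexa actually exposes; the point is simply that resolving "the song you played last night" only needs a timestamped log of past responses and a time window to search in.

from datetime import datetime

# Hypothetical sketch: a tiny "play history" a voice assistant could keep
# so that requests like "play the song you played last night" become answerable.
class PlayHistory:
    def __init__(self):
        self.entries = []  # list of (timestamp, track_title) tuples

    def record(self, title, when=None):
        """Store every track the assistant plays, with the time it was played."""
        self.entries.append((when or datetime.now(), title))

    def played_between(self, start, end):
        """Return the titles played inside a time window (e.g. 'last night')."""
        return [title for ts, title in self.entries if start <= ts <= end]


# Usage: resolve "the song you played yesterday around 9 pm" (dates are made up)
history = PlayHistory()
history.record("Erotomania", datetime(2021, 3, 1, 21, 5))
history.record("Thor, the Powerhead", datetime(2021, 3, 1, 21, 12))

candidates = history.played_between(datetime(2021, 3, 1, 20, 0),
                                    datetime(2021, 3, 1, 23, 0))
print(candidates)  # the assistant could list these and ask "Which one?"

Nothing about this is hard; the missing piece is that the assistant would have to treat its own past answers as part of the conversation, rather than handling every query in isolation.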
Another unfruitful interaction with an AI occurred to me the last time I wanted to switch electricity companies for my house. The company I was trying to switch to has implemented an automated calling service that, in the best of cases, routes customers to the appropriate agent for their particular requests. The AI I was talking to asked me about the purpose of my call and the service I required. I tried my best, that is, once again I adapted to the AI, to make my request understandable, but the artificial agent kept guessing the purpose of my call wrongly. After getting very frustrated with the situation and thinking about the future of humanity once AIs take over many jobs, I pressed several buttons at random on the dial pad until finally a human voice appeared on the other end of the line.
So these experiences got me thinking about the so-called artificial intelligence revolution and its hype. For me, a truly intelligent agent, call it artificial or not, is one that adapts, and does so by learning continuously from its environment. A truly artificial intelligence would be one that adapts to its user. For instance, if Alexa were truly an AI, she would adapt to my particular way of talking: the pauses in my speech, my accent, my stuttering, my filler words and phrases (like the apparently ubiquitous "I mean…" among English speakers), and so on, rather than me adapting to her hearing and understanding capabilities. Another example: every time there is a major update to my computer's or phone's operating system, I need to adapt to the changes the developers made. Wouldn't it be better if it were the other way around, that is, if the software adapted to me?
Having said that, for me the A in the acronym AI should stand for assisted rather than artificial. Despite the great achievements of the past decade in machine learning and reinforcement learning, AI is still only a way to augment or assist our own intelligence. We use complex computer vision models to improve our ability to distinguish, as ridiculous as it might sound, a cat from a dog, or, as promising as it sounds, a malignant tumor from a benign one. AI methods only provide ways to augment our own capacity to classify, recommend, predict, and plan; there is no truly artificial intelligence yet. And here I am not even talking about artificial general intelligence, commonly known as AGI, which would imply that the same system that distinguishes between cats and dogs is also able to learn how to play chess, how to cook an omelette, how to dance, et cetera.
In my humble opinion, the next AI breakthrough will occur after we have given our models the capacity to learn and distinguish cause and effect (as Judea Pearl and others have been advocating), and to form, learn, and combine conceptual representations (as researchers such as Joshua Tenenbaum and Samuel Gershman have been advocating). I also believe that for the next breakthrough to happen, a new hardware revolution will be required, in the form of commercially available neuromorphic chips. Let's not forget that one of the main features of the human brain, the only object in the universe capable of flexible and abstract intelligence that we know of, is its massive parallelism. Today's computers don't come even close to this degree of parallelism, and they must also deal with the overhead of maintaining an operating system for a general-purpose machine. So a dedicated neuromorphic chip sounds like a promising medium for truly intelligent systems that adapt and learn from their environments.