Chatbots: Past, present, & future

Krishna Vadrevu
Published in Chatbots Life
14 min read · Jun 4, 2018


Chatbots are gathering, slowly & surely, on the horizon of a new set of technological advances ready to take human experiences with machines to a new realm.

In 2017, Yogesh Moorjani & I gave a talk at UXPA Toronto about our experiences & thoughts around designing useful chatbots. Yogesh has distilled portions of the talk into a couple of excellent Medium articles, which you can read here & here. In those articles, you will find tips & tricks on approaching your next design problem where you might be considering using a chatbot.

In order to understand how we’ve gotten to today’s environment in which over 2 billion messages are sent a month through Facebook’s bot platform, we decided to also chart a path through computational history that has brought us to the present-day landscape. In this article, I highlight some key landmarks from this history that have served as building blocks for our current ecosystem where bots, increasingly, are more & more integrated within the fabric of our digital experiences.

The following journey is framed through the evolution of two interwoven fields, machine intelligence & human-computer interaction.

Machine intelligence jointly refers to the academic fields of artificial intelligence & machine learning, while human-computer interaction refers to our experiences with & use of these machines. Advances in these domains have shaped our present-day technological reality, and the examples below come from the rich historical backbones of both fields. These examples are not meant to be a comprehensive list; instead, they aim to fill in the frame of reference above.


1920s: ‘Robots’ arrive

‘Robots’ formally enter the English language

The word ‘chatbot’ obviously has two components — ‘chat’ & ‘bot’. ‘Chat’ captures the conversational nature of these tools; we’ve been ‘chatting’ with each other for thousands of years, and this is partly what makes chatbots so inviting. Meanwhile, ‘bot’ has come to represent the digital & computational sub-section of our concept of ‘robots’, a concept that has lingered in the minds of humans since the dawn of civilization but, curiously, was only formulated as an English word relatively recently.

In 1920, Czech playwright Karel Čapek wrote a science-fiction play which introduced the word ‘robot’ into the English lexicon for the first time. The play’s original Czech title Rossumovi Univerzální Roboti translates to Rossum’s Universal Robots in English, & the play centers around these titular factory-made entities called roboti, or robots.

The robots in Čapek’s play look very similar to humans & can often be mistaken for humans (essentially, they are our modern image of clones). Initially, these robots are happy to work for & under the watchful eye of the humans that produced them.

However, in a somber twist of events, what starts off as a nice tale of human-computer interaction devolves (or evolves depending on where your sympathies lie) into a revolt by the roboti that leads to the near complete extinction of human-kind (a plot-line that wouldn’t be all that out of place for an upcoming Black Mirror episode).

Roboti

1956: Dartmouth Summer Research Project on Artificial Intelligence

A ground-breaking proposal

The field of Artificial Intelligence (without which we wouldn’t have chatbots) really took off in the summer of 1956 when a group of luminaries in the fields of computer, information, & cognitive sciences came together for a 2-month research conference at Dartmouth College. Spearheaded by John McCarthy, the Dartmouth Summer Research Project on Artificial Intelligence produced intellectual outputs that laid the groundwork for future research & exploration in the field, the fruits of which we are enmeshed with today. Along with McCarthy, the conference was attended by pioneering figures such as Marvin Minsky, Nathaniel Rochester, Herbert Simon, & Claude Shannon.

Of particular interest for the context of chatbots is the wording & framing in the proposal that McCarthy wrote for this conference (emphasis mine):

“The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.

An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.”

Simulations…machines…language — here, we see the seeds planted for the conjecture that, in order to get machines to simulate intelligence, the use of language is key. Through this conjecture, the concept of ‘chatting’ irreversibly crosses over from the realm of purely human communication into a foundational fabric for the future of machine intelligence.

1950s: The Turing Test

Bridging the digital divide

Alan Turing is, by now, a familiar name in most technological circles, & his namesake Turing Test has been a popular talking point in discussions around machine & human intelligence for the last 60 years — and for good reason. The Turing Test gives us another example of how language & conversation are seen as core indicators of intelligence.

Turing, the famous computer scientist, wrote a paper in 1950 in which he initially started with the question: “Can machines think?” What he surmised quickly, however, was that this is a very difficult — if not impossible — question to answer. Ask 10 people and you’ll get 10 different answers. The words ‘machine’ & ‘think’ have seemingly endless definitions.

So, instead, as any good philosopher would do, Turing posed a thought experiment that he hoped would get to the bottom of his very vexing question. He asked: can a machine be made to successfully play a game he dubbed ‘the Imitation Game’? (Not coincidentally, this is also the title of a biopic on Turing released a few years ago.)

In the first variation of Turing’s ‘imitation game’, there are 3 rooms with computer terminals. In one room there’s a man, in another a woman, & in the 3rd, a judge. The judge’s job is to figure out which room contains the man & which contains the woman. The judge can communicate with the participants using the computer terminals, & the job of the participants is to send the judge hints (or deceits) to try to help (or hinder) the judge in the task at hand.

Turing then makes a slight modification to the game. In this new variation, the 3 rooms now contain a judge, a human (gender neutral), and, in place of the 2nd human from the initial version, a machine that can use language to send messages to the judge just like a human can. Now, the judge’s job is to decide which room contains the human & which one contains the machine. If the machine can successfully fool the judge more than 50% of the time (i.e., the judge can’t figure out whether they are talking to a human or machine), Turing argued that it can be deemed intelligent.

So, what Turing essentially does is turn the question “Can machines think?” into the question “Can machines act like thinking things can?” And in formulating this into the thought experiment above (now known as the Turing Test), he furthers the notion that language, conversation, & interaction are things that machines should strive for if they want to be thought of as thinking, intelligent things.

(As a brief aside, it’s important to recognize that even today, not everyone agrees with the notion that language is a fundamental tenet of intelligence. Debates persist with equal merit on both sides.)

1964: ELIZA

A conversation with ELIZA…or a doctor?

Fast-forwarding to 1964, we find the computer program ELIZA, one of the earliest examples of Natural Language Processing (NLP) in use. NLP is a branch of Artificial Intelligence that allows computers to interpret what humans say to them by parsing out words, phrases, & other grammatical turns, and it is an essential technology for the advancement of chatbots.

Created by professor Joseph Weizenbaum, ELIZA (named after Eliza Doolittle from George Bernard Shaw’s play Pygmalion) used computer ‘scripts’ that allowed it to parse typed user input, interpret those inputs, & respond accordingly. So, as a simplified example, if a user’s typed sentence contains the word “apple”, the script might look through its dictionary to find a pre-programmed reply for any sentence that contains the word apple & send that in response.

The most popular of these scripts was one named DOCTOR, which aimed to simulate the behavior of a Rogerian psychotherapist. Rogerian therapy, also referred to as person-centered therapy, encourages an “approach that allows clients to take more of a lead in discussions so that, in the process, they will discover their own solutions” (Psychology Today). This provided a perfect model for a nascent chatbot because it allowed ELIZA to ask passive, leading questions (“How does that make you feel?”, “Tell me more about that…”, etc.) that made it seem convincingly empathic & engaged in the conversation while actually doing relatively little (by today’s standards at least) ‘intelligent’ computation in the background.
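To make the script idea concrete, here is a minimal sketch of keyword-based lookup in the spirit of DOCTOR. It is purely illustrative — the real ELIZA was written in MAD-SLIP and used ranked keywords, decomposition rules, & reassembly templates, none of which appear here; the keywords and replies below are invented.

```python
import random

# Illustrative keyword -> canned-reply table, in the spirit of ELIZA's DOCTOR script.
RULES = {
    "mother": ["Tell me more about your family.", "How do you feel about your mother?"],
    "sad":    ["I am sorry to hear that. Why do you think you feel sad?"],
    "apple":  ["You mentioned an apple. What does it mean to you?"],
}

# Passive, leading prompts used when no keyword matches.
FALLBACKS = ["How does that make you feel?", "Tell me more about that..."]

def respond(user_input: str) -> str:
    words = user_input.lower().split()
    for keyword, replies in RULES.items():
        if keyword in words:
            return random.choice(replies)
    return random.choice(FALLBACKS)

if __name__ == "__main__":
    print(respond("I had an argument with my mother today"))
```

Even this toy version shows why the trick works: the fallback prompts keep the conversation moving whether or not the program ‘understood’ anything at all.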

What’s curious is that Weizenbaum initially created ELIZA as a way to say “Look, having machines & humans converse with each other is a superficial, silly endeavor!” (my words, not his). However, he was in for a surprise: he quickly found that people really took to the program, & some even thought they were talking to a real doctor! Enthusiasts started envisioning ways that scripts like DOCTOR could be used to actually help people feel better about themselves & open up by getting therapeutic help through conversations with a computer.

Through the example of ELIZA, we start seeing abstract concepts of language & computation come together into tangible use-cases that people could experience. ELIZA is considered by some as the world’s first chatbot.


1960s: IVR

Interactive Voice Response (IVR) systems

Right around the same time as ELIZA, Bell Labs creates the Interactive Voice Response (IVR) technology that has annoyed us all at one time or another. At the time of IVR’s invention, there were already around 160 million phone lines around the world, a number that was rapidly growing. Communication through technology was becoming increasingly commonplace & user expectations began developing around these communications.

While IVR doesn’t actually take off commercially until the 1980s due to complexities in scale, implementation, & cost, it presents us with an early example of a practical ‘conversational user interface’ (a category chatbots fit into) that was widely available for the common person to use outside of an academic or lab-based environment. It was a technology that met people where they were and sought to give them a way to experience automation in an approachable, ‘human’ manner. Since its inception, IVR’s pros & cons have provided us many lessons about what to do & not do with conversational UIs (a topic that warrants its own post).

1980s: Jabberwacky

Jabberwacky.com, still alive & well in 2018

We now jump ahead in time to the 1980s. British programmer Rollo Carpenter set out to create a machine specifically for the purpose of passing the Turing Test & he decided to do so by building what he called a ‘humorous chatterbot’.

There are two aspects of Jabberwacky that make it an interesting road marker to pause at in our journey. For one, it contained a very basic concept of a ‘working memory’. So, if I told it “I love the universe!”, Jabberwacky could immediately reply back with something along the lines of “Wow, I love the universe too! Tell me more about this.” The program obviously didn’t actually ‘understand’ what I just told it, & this is also different from long-term memory, which would mean actually preserving this information over time & building associations around it.

But by storing some simple information from the last thing the user sent, Jabberwacky brought a level of ‘shared context’ into the conversation, which is a crucial component of actual human-to-human communication. Similar to ELIZA’s simple script-based conversation model, Jabberwacky’s working-memory-based model leveraged a well understood trick of human psychology. As Discover Magazine explains it, “[w]e humans tend to attribute much more intelligence to [computer systems] than is actually there. If it seems partly aware, we assume it must be fully so.” By tapping into this aspect of our psychology, programs like ELIZA & Jabberwacky gave us working examples of computer programs that immersed users into the illusion of having an actual conversation with a seemingly intelligent being. The roots established at the Dartmouth AI conference can be seen poking through here.
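As a rough sketch of what such a single-turn ‘working memory’ can look like in code (illustrative only — Jabberwacky’s actual implementation learned from stored human conversations and is not reproduced here):

```python
class WorkingMemoryBot:
    """Toy bot that echoes the user's own words back and holds on to them
    for one turn, creating the illusion of shared context."""

    def __init__(self):
        self.last_topic = None  # the 'working memory': just the previous statement

    def reply(self, message: str) -> str:
        topic = message.rstrip("!.?")
        if self.last_topic:
            response = f"Wow, {topic.lower()}! And earlier you said '{self.last_topic}'. Tell me more."
        else:
            response = f"Wow, {topic.lower()} too! Tell me more about this."
        self.last_topic = topic  # overwrite each turn: no long-term memory or associations
        return response

bot = WorkingMemoryBot()
print(bot.reply("I love the universe!"))
print(bot.reply("Black holes fascinate me"))
```

The bot retains exactly one fact, yet even that is enough to make the exchange feel continuous rather than call-and-response.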

The second important point of note about Jabberwacky is that it was released on the internet in 1997. At that point, there were 70 million people on the internet, or 1.6% of the global population; a far cry from the now close to 50% (about 3.8 billion internet users), but a significant population nonetheless. Following the trajectory of IVR, Jabberwacky was available on a burgeoning medium for human experiences, feeding our curiosities & raising our expectations of a platform where an increasing number of our interactions were taking place. The program remains available at jabberwacky.com.

2001: SmarterChild arrives

Dreaming of a new future

Jabberwacky inspired SmarterChild, which was released on AIM (AOL Instant Messenger) & MSN Messenger in 2001 and reached over 30 million people directly. Its creators helped propel the popularity of chatbots by being adamant that they weren’t going to just build another novelty ‘chatterbot’ but instead would build a tool for utility. They did this by bundling SmarterChild together with databases of information such as movie times, sports scores, stock prices, news, weather, etc. It could convert a Fahrenheit temperature into Celsius so that you could quickly use it in your conversation with a new friend from halfway across the world, and if you asked it for showtimes for Star Wars Episode II, it would (sadly) let you know.
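Here is a minimal sketch of that kind of utility-oriented routing: match a request to a capability, then answer from a formula or a data source. The intents, regex, and hard-coded showtimes below are invented stand-ins for the live feeds SmarterChild actually plugged into.

```python
import re

# Hard-coded stand-in for a live data feed (movies, sports, weather, etc.).
SHOWTIMES = {"star wars episode ii": ["1:00 PM", "4:30 PM", "8:00 PM"]}

def handle(message: str) -> str:
    text = message.lower()

    # Unit conversion: a pure formula, no data feed needed.
    match = re.search(r"(-?\d+(?:\.\d+)?)\s*(?:degrees\s*)?f(?:ahrenheit)?\b", text)
    if match and "celsius" in text:
        f = float(match.group(1))
        return f"{f}\u00b0F is {(f - 32) * 5 / 9:.1f}\u00b0C."

    # Lookup: route the request to a data source keyed on the title.
    if "showtimes" in text:
        for title, times in SHOWTIMES.items():
            if title in text:
                return f"{title.title()} is playing at {', '.join(times)}."

    return "Sorry, I can only do temperature conversions and showtimes in this sketch."

print(handle("What is 75 F in celsius?"))
print(handle("Showtimes for Star Wars Episode II please"))
```

The design point is the shift in emphasis: the conversation is just the interface, and the value comes from the facts or formulas behind it.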

So with SmarterChild, we see a real evolution in bot technology & utility. Instead of simply exhibiting basic intelligence through the ability to carry out a fairly trivial conversation, bots could now help us with practical, day-to-day tasks. This represented a significant step forward in both the trajectories of machine intelligence & human-computer interaction.

2011: Watson takes on Jeopardy

Jumping further into modern times, we get to IBM’s groundbreaking Watson. Introduced to the masses when it took on (& defeated) Jeopardy champions Ken Jennings & Brad Rutter in 2011, Watson represents a significant step up in machine intelligence & Natural Language Processing in particular. Trained on immense amounts of data (both user-generated & historical), Watson showed the world that computers were no longer limited to simple script-based or call-and-response styled behaviors when dealing with natural human language. It could intelligently combine multiple elements within sentences — words, phrasing, & the interplay between them — to come up with appropriate replies.

As our aspirations of what to build with machines continued to soar through our experiences with some of the above examples, Moore’s law found a healthy intersection with Watson to pair the exponentially growing power of computing with our ambitions. IBM has now deployed Watson’s underlying technology into a variety of domains such as healthcare and sports analytics & has also made parts of it available for developers to deploy in their applications.

Jennings must have read Rossum’s Universal Robots

2010s to now: Personal assistants, bot marketplaces, & the future

Popular assistants from Amazon, Microsoft, Google, & Apple

As we come to the end of this winding path, we land in our current landscape, filled with familiar assistants such as Amazon’s Alexa, Apple’s Siri, Google Assistant, & Microsoft’s Cortana — which combine voice & graphical user interfaces to provide natural conduits for our experiences with the technology that increasingly surrounds us — & bot marketplaces from the likes of Facebook, Microsoft, Amazon, & WeChat, where hundreds of thousands of bots are being churned out.

Zooming out and looking at the trajectory laid out here, what becomes clear is that we have gone from creating bots for the purposes of research & novelty to now building them for utility.

In case you didn’t read the sentence above

For those of us who are user-centered practitioners of design & research, this is exciting! Instead of building something to pass a philosophical test, we now see chatbot experiences being crafted with particular user needs & goals in mind. This is particularly important in light of recent research which shows that the top messaging apps — a domain in which many chatbots reside — have surpassed the top social networking apps in popularity.

From http://www.businessinsider.com/the-messaging-app-report-2015-11

In the spirit of meeting people where they are with our technology, this makes it extremely important that we continue to drive towards building chatbots for practical purposes. And as usage of messaging applications increases, so too do user expectations. Relying on old tricks of psychology to make a bot seem aware & intelligent will no longer cut it if a chatbot wants to stand out in a crowded marketplace of thousands. Thoughtful & purposeful design will be the only way forward.

While the question of whether chatbots are here to stay or are simply a passing fad seems to have faded a bit in 2018 (or, more appropriately, has become subsumed into the larger discussion of modern technology’s role in our day-to-day experiences), the debate will surely rage on. I hope that the journey I’ve charted out here has made it clear that this conversation about conversations is most certainly not new & instead is one that has deep roots in our intellectual imaginations, curiosities, & explorations.

I believe we can best create the future we want by examining the thread that ties our past endeavors together; I hope you’ve enjoyed this attempt to do so for chatbots. I’d love to hear your thoughts & look forward to continuing the conversation — get in touch via Twitter, LinkedIn, or the comments section below!

User Experience Designer & Researcher. Forever inquisitive. Sports nut.