NLP in 2020: Modern Applications

Erin Khoo
Chatbots Life


Training computers to understand natural language has been a sought-after application of computer science since the 1950s. A notable early example is ELIZA, one of the first chatbots, which facilitated dialogue between a computer system and a human via a command prompt.

Natural Language Processing (NLP) has come a long way and is an exciting area of machine learning. I am particularly interested in NLP applications in business products and believe NLP will drive a big change in how the labour market gains aggregate efficiency in the future of work.

In this essay I explore a Stanford lecture on NLP with deep learning and unpack some recent related research on neural language models by Microsoft and Google.

In roughly 40 years, NLP has gone from rule-based systems to generative systems with almost human-level accuracy on multiple benchmarks. This is incredible considering how far off we were from naturally talking to a computer system even just ten years ago; now I can tell Google Home to turn off my sitting-room lights.

In the Stanford lecture, Chris Manning introduces a computer science class to what NLP is, its complexity, and specific tools such as word2vec that enable learning systems to learn from natural language. Professor Manning is the Director of the Stanford Artificial Intelligence Laboratory and a leader in applying Deep Learning (DL) to NLP.

The goal of NLP is to allow computers to ‘understand’ natural language in order to perform tasks and support human users in making decisions. For a logic system, understanding and representing the meaning of language is a “difficult goal”. The goal is so compelling that all the major technology firms have put huge investment into the field. The lecture focuses on these areas of the NLP challenge.


Some applications in which you might encounter NLP systems are spell checking, search, recommendations, speech recognition, dialogue agents, sentiment analysis and translation services. One key point Chris Manning explains is that human language (whether text, speech or movement) is unique in that it is produced to communicate something; some ‘meaning’ is embedded in the action. This is not often the case with anything else that generates data. It is data with intent, and extracting and understanding that intent is part of the NLP challenge. Chris Manning also lists reasons “why NLP is hard”, which I think we take for granted.

Language interpretation depends on ‘common sense’ and contextual knowledge; language is ambiguous (computers prefer direct, formal statements!); and language mixes situational, visual and linguistic knowledge from different timelines. The learning systems we have now do not have a lifetime of learned weights and biases, so they can currently only be applied in narrow-AI use cases.

The Stanford lecture also dives into DL and how it differs from a human exploring and designing features or signals to feed into a learning system. The lecture discusses the first spark of DL in speech recognition, the work done by George Dahl, and how the DL approach achieved roughly a 33% improvement in performance compared to traditional feature modelling. Professor Manning also talks about how NLP and DL have added capabilities in three segments: Levels (speech, words, syntax and semantics), Tools (part-of-speech tagging, entities and parsing) and Applications (machine translation, sentiment analysis, dialogue agents and question answering). He argues that NLP + DL have created a ‘few key tools’ with wide applications.

Words as vectors — https://youtu.be/8rXD5-xhemo?t=2346

Towards the end of the lecture we explore how words are represented as numbers in vector spaces and how this applies to NLP and DL. Word vectors can then be used to represent meaning in words, sentences and beyond.
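To make the idea concrete, here is a minimal sketch of learning word vectors with word2vec using the gensim library (gensim is not mentioned in the lecture, and the toy corpus and hyperparameters below are purely illustrative):

```python
# A minimal word2vec sketch with gensim; the tiny corpus is for illustration only.
from gensim.models import Word2Vec

sentences = [
    ["nlp", "maps", "words", "to", "vectors"],
    ["similar", "words", "get", "similar", "vectors"],
    ["vectors", "capture", "meaning", "from", "context"],
]

# Skip-gram (sg=1), 50-dimensional vectors, context window of 2 words.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# Words that appear in similar contexts end up close together in vector space.
print(model.wv.most_similar("words", topn=3))
print(model.wv.similarity("words", "vectors"))
```

With a real corpus of millions of sentences, these learned vectors start to capture the relationships between words that the lecture demonstrates.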

Recently there have been some big advances in neural language model (NLM) research which have pushed the boundaries of the state of the art (SOTA) in NLP.

Turing-NLG, released by Microsoft, is a transformer-based language model built under Project Turing.

It is a huge language model (17 billion parameters), roughly double the size of the previous largest, NVIDIA’s Megatron-LM. There are some advantages cited for the increased size, such as a larger corpus of data to provide direct answers from and ‘zero-shot’ question answering. T-NLG now achieves SOTA results on ROUGE (Recall-Oriented Understudy for Gisting Evaluation), a set of measures for evaluating automatic summarisation and machine translation. Microsoft plan to use this model in their cloud services and enable clients to leverage its technology. This is a strong strategy, since few companies other than Apple, Google, Alibaba, Tencent or Amazon would have the computational resources to have trained the model at all.
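ROUGE itself is conceptually simple: it measures word overlap between a generated summary and a human-written reference. Here is a purely illustrative ROUGE-1 calculation (real evaluations use the official toolkit, which also handles longer n-grams and subsequences):

```python
from collections import Counter

# Illustrative ROUGE-1: unigram overlap between a reference and a candidate summary.
def rouge1(reference: str, candidate: str) -> dict:
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

print(rouge1("the cat sat on the mat", "the cat lay on the mat"))
```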

What is also striking is that, around the same time as T-NLG, Google showcased two optimised NLP models of its own: T5 and Meena. T5, the Text-to-Text Transfer Transformer, takes text input for a variety of goals and the model ‘understands’ the intended output; the same model can summarise text and translate between languages!
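A quick sketch of that text-to-text interface, assuming the Hugging Face transformers library and its public t5-small checkpoint (neither is part of the original T5 release described here), shows how only the task prefix changes:

```python
# T5 treats every task as text in, text out; the prefix tells the model what to do.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def run(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(inputs.input_ids, max_length=60)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# The same weights handle summarisation and translation.
print(run("summarize: NLP has moved from rule-based systems to large neural models ..."))
print(run("translate English to German: The model understands the intended output."))
```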

Google also open-sourced the ‘Colossal Clean Crawled Corpus’ (C4), which was used to train T5, another great contribution to NLP research. The model is particularly good at question-answering and fill-in-the-blank tasks, which are typically tough challenges, as Chris Manning notes in the Stanford lecture. T5 is currently (Feb 2020) ranked #1 on the SuperGLUE leaderboard, a benchmark of hard NLP tasks used to rank models.

This year Google released a paper called Towards a Human-like Open-Domain Chatbot. They created a dialogue agent called Meena which acts as a SOTA conversational agent. Their best model was trained over 30 days on a Google Tensor Processing Unit v3 Pod (2,048 TPU cores), a huge amount of compute only accessible to a major tech company.

Toward a Human-like Open-Domain Chatbot — https://arxiv.org/abs/2001.09977

Conversational agents tend to be domain-specific, and Meena is a model that attempts to better generalise human interactions. What is interesting, and I think points to how NLP models are going to develop, is that they introduced a new metric to improve the model. They call it Sensibleness and Specificity Average (SSA): a human-evaluation metric which they show correlates strongly with perplexity, the quantity the model is trained to minimise. This framing allowed Meena, an open-domain chatbot, to be only 2.6 billion parameters in size, about 15% of T-NLG, while still setting SOTA results in open-domain conversation.
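Perplexity is worth pausing on, since it is the number Meena optimises. It is simply the exponent of the average negative log-probability the model assigns to the true next tokens; a small, purely illustrative calculation (the token probabilities below are made up):

```python
import math

# Perplexity = exp(average negative log-likelihood per token).
def perplexity(token_probs):
    """token_probs: probability the model assigned to each actual next token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# A better model assigns higher probability to the true tokens -> lower perplexity.
print(perplexity([0.9, 0.8, 0.7, 0.85]))   # confident model, low perplexity
print(perplexity([0.2, 0.1, 0.3, 0.25]))   # uncertain model, high perplexity
```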

Very impressive, and it goes to show that more specific system design and signal augmentation can improve real-world model deployment. They used human crowd workers to label conversation snippets of various sizes for ‘sensibleness’ and ‘specificity’, which could then be used to reward the model for specific and interesting responses to the preceding questions.

To dive further into this topic, I think it would be impactful to develop my own model, possibly leveraging the open-source C4 dataset (‘Colossal Clean Crawled Corpus’) in some way. The NLP challenge has a lot of layers to it, and to really understand its applications I would like to investigate going deeper into conversational models applied to recruitment. I know from domain-specific knowledge that there are a lot of inefficiencies in HR processes; simply letting job seekers know where they are in their application, and how well their resume matches the job they are interested in, would improve everyone’s experience.
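As a purely hypothetical first baseline for that matching idea, one could score the overlap between a resume and a job description with TF-IDF vectors and cosine similarity (scikit-learn is an assumed dependency here; a real system would more likely use learned sentence embeddings):

```python
# Hypothetical resume/job matching baseline: TF-IDF + cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

resume = "Python developer with NLP experience: word2vec, transformers, chatbots."
job_ad = "We are hiring an NLP engineer to build transformer-based chatbots in Python."

vectorizer = TfidfVectorizer(stop_words="english")
vectors = vectorizer.fit_transform([resume, job_ad])

# A rough 0..1 score of how closely the resume text overlaps the job description.
score = cosine_similarity(vectors[0], vectors[1])[0, 0]
print(f"match score: {score:.2f}")
```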

NLP is a fascinating subfield of machine learning with high growth potential. Real-world applications are already in the wild and part of our everyday lives, and some of the best-resourced companies in the world are weighing in in a big way to enable innovation and gain market share.

It has been 70 years since Alan Turing posed the idea of the Turing Test. Could 2020 be the year we answer ‘Can machines think?’

I have learned a lot from going deeper into NLP and found some very compelling research that really could save humans a lot of time with emails, communication, discovery and decision making. I found the explanations of how word2vec works, and of early n-gram models, a great introduction to the history of NLP and to how fast the field is moving in 2020 and beyond.

If you would like to explore the wonderful applications of computation and machine learning creating conversations with meaning, I suggest interacting with Talk to Transformer, a web interface to a partial OpenAI GPT-2 model.

You can also challenge the Google T5 model in a Q&A trivia quiz.

