By José Ignacio Orlando, PhD (Subject Matter Expert in AI/ML @ Arionkoder) and Nicolás Moreira (Head of Engineering @ Arionkoder).
Chatbots are natural language tools that can talk with us through text, much as another human would. Although they were invented quite some time ago (the first one was introduced back in 1966!), it is only in recent years that they have gained real ground as a technology. Currently, almost every online customer service uses at least some form of chatbot to handle easy-to-solve client requests and reduce the workload of its human personnel. Similarly, public offices, clinics and educational institutions have started to leverage these tools to bring personalized information to users, for example to explain how to obtain a document, to describe which symptoms and treatment options are related to a specific condition, or to monitor student progress and set goals.
In general, most of the chatbots accessible to us on these kinds of websites are not so smart: they can easily be trapped in conversation loops, or be unable to understand some requests despite our rephrasing efforts. However, thanks to the latest AI improvements in natural language processing (NLP), it seems that we are getting closer to solving these issues once and for all.
Last week, OpenAI, the company behind the AI agent that understands voices in almost every language and one of the best image generators, released ChatGPT, a new and innovative chatbot that uses advanced NLP and machine learning algorithms to provide users with engaging and interactive conversations.
ChatGPT is one of a kind. It can engage in pleasant, realistic conversations about almost any topic, and it can answer follow-up questions even after chatting for a while. It can also adapt its tone to sound different, as in this example by Kate Crawford, in which it was asked to talk about its environmental footprint in Greta Thunberg’s style. If we believe that something it said is wrong, we can let it know, and it will either insist on its answer (if it was actually right) or admit its mistake and correct it. Furthermore, in an effort to avoid misuse, OpenAI gave ChatGPT the ability to reject inappropriate requests, alerting the user to racist or sexist prompts, and to refuse to take part in a biased or discriminatory conversation.
ChatGPT also goes beyond simple chatting. For example, it can solve mathematical questions and explain how it did it, or answer questions about historical figures or geography. And no, it’s not using Google to find the answers: it has no access to the internet whatsoever, so all this information is encoded in the parameters of the model itself. It is such an amazing tool that it can even generate code in almost any programming language.
What, exactly, lies at the core of this fantastic beast? Basically, a fine-tuned version of a large language model (LLM) known as GPT-3.5, trained using reinforcement learning. This process (which we previously discussed in other posts for applications in robotics) mimics the way we train dogs, by giving a “treat” (a reward) to the agent every time it does what we want it to do. In this case, ChatGPT was trained using a modified version of this learning strategy, known as Reinforcement Learning from Human Feedback (RLHF). A first version of the language model was trained on a large dataset of prompts and human-written responses, which were used to “teach” the model which outputs are expected for a given input. This version was then used to collect a comparison dataset consisting of input prompts and multiple model-generated answers, which human labelers ranked from best to worst; these rankings were used to train a reward model. Finally, the LLM was fine-tuned with reinforcement learning to optimize its policy against the reward model.
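To make the reward-model step more concrete, here is a minimal sketch in plain Python. It is not OpenAI's implementation: we assume each answer has already been reduced to a small, hypothetical feature vector, that the reward model is a simple linear function of those features, and that it is trained with a pairwise logistic loss, a common choice for learning from rankings. The function names (`ranking_to_pairs`, `reward`, `pairwise_loss`, `train_reward_model`) and the toy numbers are ours, for illustration only.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() keeps the input order, so the first item of each
    # pair is always the one the labeler ranked higher.
    return list(combinations(ranked, 2))


def reward(weights, features):
    """Toy linear reward model: a weighted sum of the answer's features."""
    return sum(w * x for w, x in zip(weights, features))


def pairwise_loss(weights, pairs):
    """Mean of -log(sigmoid(r_preferred - r_rejected)) over all pairs."""
    total = 0.0
    for better, worse in pairs:
        margin = reward(weights, better) - reward(weights, worse)
        total += math.log1p(math.exp(-margin))
    return total / len(pairs)


def train_reward_model(pairs, n_features, lr=0.5, steps=200):
    """Fit the weights with plain gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(steps):
        grads = [0.0] * n_features
        for better, worse in pairs:
            margin = reward(weights, better) - reward(weights, worse)
            # Derivative of -log(sigmoid(margin)) with respect to margin.
            coeff = -1.0 / (1.0 + math.exp(margin))
            for i in range(n_features):
                grads[i] += coeff * (better[i] - worse[i])
        weights = [w - lr * g / len(pairs) for w, g in zip(weights, grads)]
    return weights


# Hypothetical feature vectors for three answers to the same prompt,
# listed in the order a human labeler ranked them (best first).
ranked_answers = [(0.9, 0.8), (0.5, 0.6), (0.1, 0.2)]
pairs = ranking_to_pairs(ranked_answers)
weights = train_reward_model(pairs, n_features=2)
print([round(reward(weights, answer), 3) for answer in ranked_answers])
```

After training, the reward model should score the answers in the same order the labeler did. In the final RLHF stage, scores like these play the role of the “treat” while the language model is fine-tuned; that stage is left out of this sketch.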
Just as AI generative models have done for artists and creators, tools like ChatGPT open up a myriad of possibilities for almost every human activity. Apart from having fun conversations with someone who doesn’t (doesn’t?) exist, you can use ChatGPT to write and debug code much faster, get quick answers to questions without having to use Google, and get guidance on how to learn something new. Is it perfect? No, it is not. OpenAI has admitted that ChatGPT sometimes writes “plausible-sounding but incorrect or nonsensical answers”, as in the example we’ll see below, or this other example. It is also usually quite verbose (a bias introduced by labelers ranking longer answers above shorter ones), and it still does not ask for clarification when it doesn’t fully understand the prompt. Additionally, while OpenAI made efforts to have it refuse inappropriate requests, some users have already reported that the “bias barriers” can be bypassed by using complicated wording or by asking for inappropriate answers in a sneaky way.
The good news is that, for now, ChatGPT is online and freely available for us to play around with. For how long? We don’t know, since it’s still in a research preview phase. But it offers us a great opportunity to experience this technology for ourselves! If you get interesting answers, feel free to post them on Arionkoder’s LinkedIn and Twitter accounts so we can share them with our community.
Which other applications do you envision for chatbots like ChatGPT? Do you need a chatbot to improve your business processes? Reach out to us so we can help you leverage this technology to your own benefit!
*This title was suggested by ChatGPT after we asked it: “Can you generate a much better title than ‘ChatGPT: an AI-driven chatbot with whom you can talk about (almost) everything’? We want it to engage people to open the blog post and bait as many clicks as possible.”