Did you know that language models can be used by robots to understand instructions and even write their own code?


By Nicolás Moreira

November 11, 2022

Imagine that you’re rushing for a deadline. You’re typing furiously on your keyboard, and you can’t even move your eyes from the screen. You have to finish things, get them done, ASAP. Then you decide that an extra sip of caffeine would be a useful boost, and distractedly extend your hand to grab the coffee right next to your laptop… ending up spilling it all over your desk! Yeah, I know, you’ve been there before.

Now imagine that by simply saying “Can you give me something to clean this mess up?”, a robot reasons by itself about what happened, moves to the kitchen, and comes back with a sponge so you can fix it. Well, as science-fictional as it sounds, this could soon be happening thanks to the very same AI algorithms that students are using to do their homework.

AI-driven large natural language processing (NLP) models such as BERT, GPT, and PaLM have shown great potential for code completion, image prompting, and holding conversations. Robotics experts and data scientists have built upon this technology to create new task-automation applications.

Not so long ago, Google announced PaLM-SayCan, a robotics algorithm that helps people interact with robots more naturally. Since machines struggle with instructions that require a fair amount of reasoning, the researchers proposed leveraging the power of the state-of-the-art NLP model PaLM to translate those instructions into steps that are much easier for robots to act on.

Hence, when a person says something like “I just exercised, can you bring me a drink and a snack to recover?”, PaLM uses a technique known as chain-of-thought prompting to decode the actual request, producing in this case the response “The user has asked for a drink and a snack. I will bring a water bottle and an apple.” Robots can then use this information to score how likely each of their individual skills is to make progress toward completing the high-level instruction, deciding on their own how to get it done.
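To make that scoring step concrete, here is a minimal sketch of how such a selection could work. The function and the numbers are illustrative stand-ins, not Google’s actual implementation: in the real system, the “usefulness” score comes from PaLM and the “feasibility” score from the robot’s learned value functions.

```python
# A minimal, self-contained sketch of SayCan-style skill selection.
# All scores below are made-up toy values for illustration only.

def select_next_skill(llm_scores: dict, affordance_scores: dict) -> str:
    """Pick the skill that maximizes usefulness x feasibility,
    the core decision rule behind SayCan."""
    return max(llm_scores, key=lambda s: llm_scores[s] * affordance_scores[s])

# Request: "I just exercised, can you bring me a drink and a snack?"
llm_scores = {            # does this skill make progress? (language model)
    "find a water bottle": 0.50,
    "find an apple": 0.40,
    "go to the table": 0.10,
}
affordance_scores = {     # can the robot do it right now? (value functions)
    "find a water bottle": 0.90,
    "find an apple": 0.70,
    "go to the table": 0.90,
}

print(select_next_skill(llm_scores, affordance_scores))  # -> find a water bottle
```

After executing a skill, the system appends it to the plan and repeats the scoring step, so the robot builds its plan one feasible, useful action at a time.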

More recently, another team at Google created “Code as Policies”. This goes one step further: rather than generating natural-language steps that are easier for the robot to interpret, the NLP model directly produces blocks of code containing the instructions the robot should follow. Combined with a perception API (driven by computer vision tools) and an action API exposing a pre-defined set of skills, this lets robots do what they are asked without being explicitly programmed for it. Furthermore, requests can be phrased in a much more expressive and natural way, which is a great step toward making robots more usable.
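To give a flavor of what this looks like, here is a hedged sketch: a tiny program of the kind the model might generate for the request “put the fruit in the bowl”. The primitives detect_objects, pick, and place are hypothetical stand-ins I defined so the snippet runs end to end; they are not the actual Google APIs.

```python
# Stand-ins for the robot's perception and action APIs. In a real
# system, detect_objects would call a computer-vision model and
# pick/place would command the robot arm; here they simulate a toy scene.

SCENE = ["apple", "banana", "stapler", "bowl"]

def detect_objects():
    """Stand-in perception API: returns the objects in view."""
    return SCENE

def pick(obj):
    print(f"[action] picking up the {obj}")

def place(obj, target):
    print(f"[action] placing the {obj} in the {target}")

# --- code the model might generate for "put the fruit in the bowl" ---
FRUIT = {"apple", "banana", "orange"}
for obj in detect_objects():
    if obj in FRUIT:
        pick(obj)
        place(obj, "bowl")
```

The key point is that the generated code composes those few primitives with ordinary control flow (loops, conditionals, helper functions), so the robot can handle requests no one wrote a dedicated program for.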

These experiments at the intersection of AI and robotics are becoming more real every day, moving from lab benches into daily routines. For example, Everyday Robots, a US company that partnered with Google to improve its products, is already building helper robots that can learn on their own how to help anyone with almost anything.

Which applications do you envision for large-scale language models in your business? Reach out to us at Arionkoder so that we can help you accomplish them!