By José Ignacio Orlando, PhD (Subject Matter Expert in AI/ML @ Arionkoder) and Nicolás Moreira (Head of Engineering @ Arionkoder).
Artificial intelligence (AI) has allowed us to create automated agents that can reproduce human cognitive tasks with remarkable accuracy. Recently, large language models (LLMs) have shown excellent capabilities in natural language processing (NLP) tasks such as reading, summarizing, translating, and predicting the next words in a sentence, mimicking our abilities to talk and write. In one of our previous posts we even showed how Google used its LLMs to make helper robots write their own code (if you haven’t read it yet, you should take a look!). In parallel, AI agents driven by strategic reasoning algorithms have competed with humans in board games such as Go, chess, and even poker, beating some of the best players in the world.
But what if winning a game requires not just knowing where to place each piece on a board, but explicitly talking with other players to form joint strategies, fool them, or convince them of a secret agenda?
There’s already a game like that: Hasbro’s Diplomacy, a board game in which the great powers of post-Victorian Europe must cooperate and clash with one another to control the continent. In Diplomacy (rumored to be JFK’s and Henry Kissinger’s favorite game), seven players coordinate their actions through natural language negotiation before making any move, secretly plotting against each other in the pursuit of conquering as much territory as possible.
For decades, this has been seen as one of the most challenging scenarios for AI. Just picture the difficulties: the machine must not only comprehensively assess the situation and design a plan to win the game, but also talk with other players and convince them to collaborate with it without revealing its purpose. This last point forced researchers to focus on simplified versions of the game that ignore the natural language communication entirely.
Until now. In a recent paper published in the prestigious journal Science, Meta AI (the AI division of Facebook’s parent company) unveiled CICERO, the first AI agent able to play Diplomacy as it really is, without removing any part of the game. CICERO can infer what the other players are trying to achieve by analyzing the state of the board and previous conversations. At the same time, it can negotiate plans that undermine other players’ goals, suggesting shared objectives that ultimately benefit its own agenda and communicating those suggestions with strategic intent. All of this happens in plain natural language, as humans do, and simultaneously with every other player, in a remarkably realistic way: sometimes reassuring allies about its intentions, sometimes discussing broader strategic dynamics, and even making chit-chat when necessary.
This was possible because Meta AI researchers figured out a way to make an LLM interact with a symbolic reasoning engine. By themselves, NLP algorithms can only reproduce other people’s words by learning from large corpora of text mined from the Internet. But here Meta went one step further: a planning engine driven by a strategic reasoning algorithm designs the plan most likely to bring the agent closer to victory; then a controllable dialogue model receives these strategies and translates them into coherent, understandable, and realistic dialogue. Thus, CICERO foresees potential moves from the other players based on past evidence, creates its own plan, and carefully uses free-form dialogue to negotiate with them. This was achieved by training the agent on data from 125,261 games of Diplomacy played through a website, 40,408 of which contained a total of 12,901,662 messages exchanged between players. To align the messages with strategies, every single message in the training set was annotated with the set of actions corresponding to its content, capturing the intent of the text.
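To make the two-stage idea concrete, here is a minimal sketch of an intent-conditioned pipeline in Python: a planner picks a set of game actions, and a dialogue component turns that intent into a message. All names, data structures, and the scoring logic are our own illustrative assumptions, not Meta’s actual implementation (CICERO’s real planner and dialogue model are large learned systems).

```python
# Illustrative sketch of a CICERO-style "plan, then talk" pipeline.
# All names and logic here are hypothetical stand-ins, not Meta's code.

from dataclasses import dataclass


@dataclass
class Intent:
    """A planned set of actions that a message should convey."""
    power: str
    actions: list  # Diplomacy-style orders, e.g. ["A PAR - BUR"]


def plan_actions(board_state: dict, power: str) -> Intent:
    """Stand-in for the strategic reasoning engine: choose the
    candidate move with the highest estimated value for `power`."""
    candidates = board_state["candidate_moves"][power]
    best = max(candidates, key=lambda move: move["value"])
    return Intent(power=power, actions=best["orders"])


def generate_message(intent: Intent, recipient: str) -> str:
    """Stand-in for the controllable dialogue model: render the
    planned actions as a natural-language proposal."""
    orders = " and ".join(intent.actions)
    return f"{recipient}, I'm planning {orders}. Want to coordinate?"


# A toy board state with pre-scored candidate moves for one power.
board = {
    "candidate_moves": {
        "FRANCE": [
            {"orders": ["A PAR - BUR"], "value": 0.4},
            {"orders": ["A PAR - BUR", "F BRE - MAO"], "value": 0.7},
        ]
    }
}

intent = plan_actions(board, "FRANCE")
message = generate_message(intent, "GERMANY")
print(message)
```

The key design point this toy mirrors is the separation of concerns: the dialogue component never decides strategy on its own, it only verbalizes the intent handed to it by the planner, which is what keeps the generated messages aligned with the agent’s actual plan.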
CICERO was evaluated by playing 40 games against 82 different human players in total. 5,277 messages were exchanged over 72 hours of total gameplay, and CICERO achieved more than double the average score of the other players, ultimately ranking in the top 10% of participants who played more than one game. From not being able to play the full game at all to human-level performance. Remarkable, isn’t it? Of course, the agent is not perfect: it sometimes produces inconsistent dialogue, asking for something in one message that contradicts what it suggested before. But humans make mistakes as well, right?
Yann LeCun, one of the founding fathers of Deep Learning, recently said that “an agent that can play at the level of humans in a game as strategically complex as Diplomacy is a true breakthrough for cooperative AI”. And it certainly is. Not because of the game itself, which is yet another playground for trying and crafting new AI tricks, but because of its implications for creating much more immersive AI tools. An agent trained to pursue a goal, with the ability to realistically interact and negotiate with humans, enables useful applications such as helping people acquire new skills, automating complex tasks, or even negotiating on our behalf! At the same time, there’s a downside we should anticipate, too: what if a model like CICERO were trained with twisted intentions, such as profiting from confused users or convincing them to take harmful actions in the real world? As usual, this technology draws yet another line that responsible AI must not cross.
In any case, we’re seeing yet another field in which science fiction becomes less fiction and more science: we can already create AI agents capable of negotiating in plain language to achieve their own goals. What other applications do you envision for this technology? Can you imagine scenarios in which such AI tools or NLP models could be used in your business? Reach out to us now so we can help you accomplish it together!