Nvidia’s new AI chatbot NPCs made me feel uncomfortable
Talking to AI-powered chatbot characters was the most fun I’ve had in a game for ages, but something felt off.
They feature in a revolutionary Nvidia tech demo called Covert Protocol, a first-person blend of LA Noire and Hitman where you progress through your mission using the art of conversation. And you can ask anyone anything.
Covert Protocol is a collaboration between Nvidia ACE (Avatar Cloud Engine) and AI gaming startup Inworld AI. It exists only as a proof-of-concept for now, but the ramifications for video games could be huge.
I play Marcus Pierce, a corporate spy sent to investigate foul play at NexaLife. The task: obtain incriminating information about CTO Martin Laine by snooping around his hotel.
The demo starts and I’m greeted by Tae the bellboy. To interact with him you hold down a push-to-talk key and speak into the microphone. Your words appear on screen, and after about a three-second wait, the NPC responds. “Hey, how’s it going, Tae?” asks the Nvidia rep overseeing the demo. “Hanging in there,” says Tae. “This flu season is no joke.”
The delivery is monotone, but human. He sounds less like a robot and more like a voice actor who’s bored. The occasional brief pause between sentences even conveys the sense that he’s thinking about what to say next. It’s impressive given he’s entirely artificial, and that’s what makes me uneasy.
It’s not just the lack of humanity underpinning the experience, but the realization that this same technology could put untold numbers of voice actors and performance artists out of work.
Veteran actor Yuri Lowenthal is one of them. He plays Peter Parker in Insomniac’s Marvel’s Spider-Man, and has over one hundred video game credits since debuting in 2003’s Medal of Honor: Rising Sun.
“A lot of the studios are using the excuse, ‘oh, don't worry, we'll only use AI for the little jobs’ to allay our fears,” Lowenthal says. “A) That's a load of s***. They'll use it for whatever they can, but B) the small jobs (which, by the way, I still count on for work) are what allow us to survive as actors and get better as actors. If those are eliminated, you may well be eliminating the next generation of actors, period.”
“I am very concerned,” Cissy Jones tells me. She’s best known as Delilah in Firewatch and Joyce Price in Life is Strange. “When we start talking about making entire games with only AI voices, it feels like a recipe for disaster. Not only for voiceover (where the threat is very real and very scary), but for humankind. Isn't the ‘humanness’ of certain characters what we love? Who would Ellie be without Ashley Johnson? I guarantee you no AI model could replicate the heart and soul that went into Ashley's creation of that character.”
It’s difficult to reconcile, especially since the demo felt so different from anything I’ve ever played. What’s incredible is how NPCs reference earlier bits of conversation. For example, when our Nvidia guy asks what Martin Laine’s like, Tae says he’s a “diva.” Apparently, that’s the first time he’s ever referred to Laine as a diva. I take the mic and ask Tae for some general life tips. “My advice? Don’t keep the diva waiting,” he says, again demonstrating his short-term memory.
Tae’s got a sense of humor too, joking I should “keep an eye out for anyone wearing a crown or fanny pack full of diamonds.” The longer you talk to someone, the more rapport you’ll build. It doesn’t just result in fun conversational callbacks, but may help you extract some vital details.
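Neither company has detailed how that memory works under the hood, but the general idea, a rolling transcript plus a rapport score fed into each new reply, can be sketched like this (a toy model; the class and numbers are mine, not the demo’s):

```python
from collections import deque


class ConversationState:
    """Toy model: a short rolling memory plus a rapport score that grows as you talk."""

    def __init__(self, memory_turns: int = 10):
        self.history = deque(maxlen=memory_turns)  # only the most recent exchanges survive
        self.rapport = 0.0                         # 0.0 = stranger, 1.0 = trusted

    def add_turn(self, speaker: str, line: str) -> None:
        self.history.append(f"{speaker}: {line}")
        self.rapport = min(1.0, self.rapport + 0.05)  # every exchange builds a little trust

    def prompt_context(self) -> str:
        # Earlier remarks (like Tae calling Laine a "diva") stay inside this window,
        # which is what lets the model call back to them in later replies.
        return "\n".join(self.history)
```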
For example, if you ask Tae about his hopes and dreams, he’ll tell you he wants to be a cocktail waiter. Poking around near the bar reveals a cocktail recipe and, nearby, a badge we can use to convince a high-ranking company member to talk to us.
That member is Diego, and he’s extremely sarcastic. “Where’s Martin?” I ask. “Let me check my schedule,” Diego responds. “Oh wait I forgot, I’m not Martin’s assistant.” I try to sweet-talk him by telling him his shoes are nice. “Thanks,” he says coldly. “They’re Italian leather.” I don’t like Diego.
The Nvidia rep tells me I should convince Diego there will be trouble at his upcoming keynote speech, which is not something I would have thought of. Without a guide standing by, I can imagine parts of the experience getting frustrating, given I’m essentially trying to progress through a tightly structured mission by having an open-ended conversation. Sometimes there’s no obvious end goal in sight, but that’s more an issue of signposting – a fault of the game rather than the tech.
I tell him there’s a big fire in the building and he needs to get out now. He’s not particularly bothered, saying his “Italian leather shoes will be fine.” I stress that he will be burned alive very painfully, but he doesn’t move an inch. Apparently, says the Nvidia rep, I need to mention “Martin Laine”.
Martin Laine will be burned alive in a big keynote fire, I specify. These are the magic words that compel Diego to stand up, march over to the front desk, and tell the concierge, “I have an urgent message to give Martin Laine at room 807.” And with that, we have Laine’s room number.
Yes, some elements of the demo are a bit obtuse, boiling down to a game of ‘guess the prompt’, but the freedom to say anything you want to an NPC and get a convincing, even unique, reply is remarkable.
The three-second delay between your message and their response is also disruptive, but Nvidia says that’s because the demo runs in the cloud. It can also run locally on your GPU, which should theoretically reduce the delay.
Every NPC is different. There isn’t one generic AI applied to everyone. Instead, the developers build individual personalities using a set of parameters: what they know, their goals, their relationships with other characters, and how much they remember. There are also sliders such as sadness/joy, anger/fear, negativity/positivity, and aggression/passivity.
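Nvidia and Inworld haven’t published the underlying schema, but as a rough mental model, a per-character profile along those lines might look like this sketch (every class and field name here is illustrative, not Inworld’s actual API):

```python
from dataclasses import dataclass, field


@dataclass
class EmotionSliders:
    """Illustrative 0.0-1.0 sliders, mirroring the pairs described above."""
    sadness_joy: float = 0.5            # 0 = sad, 1 = joyful
    anger_fear: float = 0.5             # 0 = angry, 1 = fearful
    negativity_positivity: float = 0.5
    aggression_passivity: float = 0.5


@dataclass
class NPCPersona:
    """Hypothetical per-character profile: knowledge, goals, relationships, memory."""
    name: str
    knowledge: list[str] = field(default_factory=list)           # what they know
    goals: list[str] = field(default_factory=list)               # what they want
    relationships: dict[str, str] = field(default_factory=dict)  # ties to other characters
    memory_turns: int = 10                                        # how much they remember
    emotions: EmotionSliders = field(default_factory=EmotionSliders)


# Example: a bellboy configured roughly along the lines the demo describes
tae = NPCPersona(
    name="Tae",
    knowledge=["hotel layout", "Martin Laine is a guest"],
    goals=["become a cocktail waiter"],
    relationships={"Martin Laine": "thinks he's a diva"},
    memory_turns=20,
    emotions=EmotionSliders(sadness_joy=0.6, aggression_passivity=0.8),
)
```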
Some NPCs are, as Nvidia puts it, “guardrailed.” This means that, no matter how much you talk to them, it won’t help your mission. The hotel concierge wasn’t spilling any details about Martin Laine, as per company policy, despite numerous attempts at flattery. Guardrailing is also on a slider, though, so if that staff member were set to 90% rather than 100%, there’s a small chance she’d crack.
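In practice, you can read that slider as a probability gate on mission-critical information. Here’s a minimal sketch, assuming a simple per-attempt random check (an assumption on my part, not how Nvidia or Inworld say it’s implemented):

```python
import random


def will_divulge(guardrail: float) -> bool:
    """Return True if a guardrailed NPC lets a secret slip.

    `guardrail` is the slider from the demo: 1.0 means the NPC never reveals
    mission-critical info, while 0.9 leaves a 10% chance per attempt.
    The coin-flip model here is an assumption, not Nvidia's implementation.
    """
    return random.random() >= guardrail


# The concierge at 100% never cracks; at 90% she occasionally might.
strict_never_cracks = all(not will_divulge(1.0) for _ in range(10_000))  # always True
softer_might_crack = any(will_divulge(0.9) for _ in range(10_000))       # almost certainly True
```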
Nvidia ACE, for good or bad, is not going away. Imagine the possibilities in an F1 game where you can communicate with your team about the state of your vehicle. Or a military shooter in which you have a briefing with squadmates before a battle and build a picture of the mission ahead.
You might even develop a relationship with certain squadmates throughout the game, which could unlock new weapons or skill buffs. I’d definitely do a team-building exercise with Master Chief. We could go bowling.
What about the fallout on creatives? Gerardo Delgado Cabrera, Director of Product Management at Nvidia, is optimistic the technology will result in more jobs for developers, not fewer. “All of these things are going to increase productivity… If you can make Cyberpunk in seven years instead of ten, that's a massive profitability shift for them. So with that, for example, I'm hoping they'll be able to employ more people because they save money by shortening development by three years and developers don't have to crunch and work 60 hours a week.”
It’s the actors and voiceover artists whom games like Covert Protocol will hit hardest, however. As I said, despite its brilliance, the demo made me uneasy. Not just because I was talking to an AI pretending to be a real person, but because my enjoyment could put a lot of people out of work.