Artificial intelligence turns a single photo into a “real-life” presenter
Jenny, the presenter in a corporate video, is explaining the onboarding process to new employees. She is professional and engaging. But she’s not real.
She’s a photorealistic avatar, created from a five-minute video session with a real person. That encounter provided enough data for artificial intelligence to mimic her voice and movements so that what you see looks exactly like her. But isn’t.
D-ID, a tech startup in Israel, specializes in “hyper-real AI presenters”. Type in as much text as you want and he or she will present it like a real person, but at a fraction of the price of hiring one.
A live actor, in a studio with a camera operator, lights and sound recording, would cost $1,000 a minute. Avatars are an attractive and affordable alternative.
Anyone who’s endured a “Death by Powerpoint” presentation about safety, compliance or a new company initiative will tell you that a human – or near-human – presenter is more likely to keep you awake than another graph, stock photo or flow-chart.
“With a human face people are more likely to engage with information, to watch the video, to complete the course, and to absorb the information,” Gil Perry, CEO and Co-founder of D-ID, tells NoCamels.
“Our technology cuts through the headache of corporate video production to effortlessly create high-quality, cost-effective, professional videos in any language at the click of a button.”
D-ID has developed technology that allows its clients to create avatar-led training videos quickly, cheaply and efficiently.
In addition to Jenny, it has a whole cast of “actors” – choose whichever one you want, then select one of the 270 voices, 119 languages, and a range of accents.
There’s even a range of presentation styles – angry, cheerful, sad, excited, hopeful, customer service, newscast.
The sophistication of the tech is remarkable. During a Zoom session Perry photographed me, pasted in a paragraph of text describing the NoCamels website, and brought the still picture to life in under a minute.
The voice is not mine (it’s a guy called Eric), so it wouldn’t fool my friends or family, but the AI has added a whole series of facial movements and even filled in missing parts of the background that are exposed when my head moves. That’s all from one low-quality still image, and you can try it yourself by uploading a picture.
The company – founded in 2017 by three veterans of the IDF intelligence corps – started life in facial recognition, using algorithms to “de-identify” photographs (hence the name D-ID) with tiny modifications so that they would remain recognizable to humans but would fool the biometric readers used by Facebook and many others.
The company, based in Tel Aviv, became a world leader in deep learning, computer vision, image processing and computational photography.
But Perry says they then realized the technology they’d developed could be applied elsewhere – to creating narrated content.
“We can radically reduce the cost of the video productions, we can increase the value of their existing boring assets, and we can make personalized and targeted content at scale,” he tells NoCamels.
“The biggest problem is completion rate. People just don’t watch. They don’t read all the stuff about onboarding, cyber-compliance, sexual harassment. They just press next, next, next, next, to complete the course.
“We sell mainly to the corporate training and learning and development departments at larger enterprises. We help them make more engaging content, which is better understood and better remembered.”
“It’s often the case that managers who appear in videos as presenters are just not good actors. And employees are uncomfortable watching them.”
One of the advantages of using avatars rather than people is that it’s much easier to update or add to the script. Giving Jenny some more text to read beats hiring an actor, crew and studio all over again.
“D-ID’s work has already generated more than 100 million videos,” says Perry. The company is now offering a self-service version of its Creative Reality platform to smaller companies, and says the potential for growth is huge.
It has strict policies in place to prevent abusive use of its technology, and ensures all its videos display an AI symbol to indicate that they are computer-generated.
Aside from corporate training videos, D-ID, which has 45 employees and has raised $47 million from investors, is finding additional uses for its technology, which can be scaled up with virtually no limit.
The CEO of one of its client companies was able to send a personalized video to every subscriber – well over 100 million of them – all different, all addressing the subscriber by name, from one still photo.
The future offers even more opportunities. “We are now working on a real-time streaming,” says Perry.
“So you will be able to conduct video calls but without the camera. I could be in the kitchen washing dishes or on the beach, but I’ll be able to choose a better photo of myself and present to you with the camera off.”