AI has the potential to change the world in many amazing ways. But like every revolution, it requires fuel. It’s long been said that “data is the oil of the information age,” and that’s certainly true in many ways. But while data is a less finite resource than actual oil, it does come with some challenges.
People are (rightly) protective of their personal data, and there are compliance and regulatory responsibilities that must be upheld if we’re using that personal data (often the most valuable kind of data) to power AI and generate predictions. Additionally, although data is in abundance everywhere – pretty much everything we do generates data – getting the right sort of data, at the time you need it, isn’t always straightforward. Generating or collecting specific types of data, when you have specialist requirements, can be expensive, time-consuming, and tricky.
As an example, we can look to the work of Affectiva – a leader in the field of “emotional” artificial intelligence. It builds systems that help machines understand the emotional or cognitive states of human beings. A function of one of its core business units helps vehicle manufacturers to create smart in-cabin systems – one such system is designed to detect when we might be feeling drowsy or in danger of falling asleep at the wheel.
It does this by using cameras to track our facial expressions, and analyzing the data with machine learning algorithms designed to spot signs of fatigue over the course of long journeys. Because these systems have to work for anyone who uses them, they need to study a huge number of faces in order to recognize the signs that a person they have never come across before is getting sleepy.
Until recently, Affectiva gathered this data by hiring humans to sit in a driving simulator for periods of up to six hours. When many thousands of faces are needed, clearly this is an expensive undertaking – not to mention very dull for the data subjects themselves!
This is where the concept of synthetic data comes in. Computers are now capable of generating images of faces that are virtually impossible to tell apart from photographs of real people, and that can be made to behave in a completely realistic manner. To solve this data problem, Affectiva partnered with Synthesis AI, a startup that recently closed a $4.5 million funding round based on its ability to create synthetic data, including faces of humans who have never existed. Synthesis’s mission is to reduce the cost of AI by billions per year by reducing the need for companies to collect, store and label “real” data in a legally and morally compliant fashion.
The use cases for synthetic data are practically unlimited. AI algorithms used to pilot self-driving cars can be “driven” in simulated conditions, creating hours of driving experience without any risk to real-world road users. Likewise, algorithms that diagnose illness from medical images can learn to become increasingly accurate in their assessments, from studying computer-generated images of human bodies and organs.
CEO of Synthesis AI, Yashar Behzadi, told me that his business incorporates technology and methods pioneered in the movie industry to create very lifelike images and move them in a realistic way. These assets are then “supercharged” with generative AI modeling that allows any number of variations to be created very easily. Essentially, this can include any combination of age groups, genders, and ethnicities, and all these variables can be tailored to ensure that the result is a dataset that is truly diverse and representative. This has the potential to help iron out some of the issues caused by bias – a major challenge in AI development, and one that can have alarming consequences when it goes wrong.
“We make fair AI systems,” Behzadi tells me. “By being able to declare distributions of data, you make sure you’re representing all the classes you’re interested in, and you’re able to create well-balanced datasets … [and] you can do it in a very privacy-compliant way. With synthetic data, you don’t have to worry about breaching GDPR or regulations. This … democratizes access; smaller companies can compete and win. This has always been fundamentally what’s driven us to build these systems.”
Even seemingly minor details like hairstyles and sunglasses can be modeled in this way so computers can learn to understand how they might impact their ability to understand humans. Camera angles can quickly be changed, too – which was helpful for Affectiva when, having captured a lot of data from cameras facing drivers head-on, from the steering column, it realized that images taken from a rear-view mirror position were actually much more insightful. Rather than having to re-do thousands of photographs, the computer images can be very quickly re-rendered from a different point of view.
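To make the idea of “declaring distributions of data” a little more concrete, here is a minimal Python sketch of how a balanced set of rendering requests could be described before any images are generated. It is purely illustrative and is not how Synthesis AI’s platform works: the attribute names, their values, and the balanced_specs helper are all assumptions made up for this example.

```python
import itertools
import random
from collections import Counter

# Hypothetical attributes and values, chosen only for illustration.
ATTRIBUTES = {
    "age_group": ["18-30", "31-50", "51+"],
    "eyewear": ["none", "glasses", "sunglasses"],
    "camera": ["steering_column", "rear_view_mirror"],
}


def balanced_specs(attributes, per_combination, seed=0):
    """Return shuffled render specs in which every combination of
    attribute values appears exactly `per_combination` times."""
    names = list(attributes)
    specs = [
        dict(zip(names, combo))
        for combo in itertools.product(*(attributes[name] for name in names))
        for _ in range(per_combination)
    ]
    # Shuffle with a fixed seed so the ordering is reproducible.
    random.Random(seed).shuffle(specs)
    return specs


specs = balanced_specs(ATTRIBUTES, per_combination=5)

# Count how often each eyewear value was requested.
eyewear_counts = Counter(spec["eyewear"] for spec in specs)
```

Because every combination is requested the same number of times, no group ends up under-represented by accident, and changing the camera position only means editing one list and generating the specs again rather than repeating a photo shoot.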
Rana el Kaliouby, CEO of Affectiva, told me, “Humans are very complex; we’re interested in the nuanced, complex emotional states, like frustration, confusion, or fatigue. What does fatigue look like? We can combine that with something like object detection – you have a phone in your hand – to augment all of this and to detect activities and behaviors, and it becomes really powerful. The goal is to combine these multiple modalities to get a very holistic understanding of what the person’s state is, then have the technology respond in real-time.”
Synthetic data has the potential to help machines understand human emotions and nuances in many other ways, too. Human-to-machine communication will likely play an increasingly big role in human society as time goes on, and building machines that are capable of developing a deeper understanding of us, beyond merely the buttons we press or the words we use, will help to make that relationship more productive.
“I’m super-excited about the applications in social robotics, conversational interfaces, and the internet of things,” el Kaliouby says. “One of my favorite examples is a smart fridge, and it knows you are stressed because it has a ‘mood chip,’ and it says, ‘you know what, you’re about to have your third tub of ice-cream and I’m not going to let you do that.’ Now that’s going to require human perception AI; human emotion AI to understand your state.”
Greater access to synthetic data is also likely to lower the barrier to entry to AI for smaller businesses, many of which may have the vision and innovation to create truly new applications. Without the expensive production and compliance requirements, the doors are open for new entrants to get involved.
“Synthetic data is a very democratizing force; you can log in and start creating loads of images and train your system,” says Behzadi.
“That really changes who can access, who can benefit and who can contribute to the AI space, and I think that will break down some walls that have existed, where the larger technology companies have built these data models … ultimately to sell advertising. And I think there’re so many better benefits out there if we can unlock those use cases.”
You can watch and listen to all of my conversations with Rana el Kaliouby, CEO of Affectiva, and Yashar Behzadi, CEO of Synthesis AI, here.