Personal assistants like Siri, Alexa, Cortana, or Google Home can parse our spoken words and (sometimes) respond appropriately, but they can’t gauge how we’re feeling—in part because they can’t see our faces. But in the emerging field of “emotion-tracking AI,” companies are studying the facial expressions captured by our devices’ cameras to allow software of all kinds to become more responsive to our moods and cognitive states.
At Affectiva, a Boston startup founded by MIT Media Lab researchers Rosalind Picard and Rana El Kaliouby, programmers have trained machine learning algorithms to recognize our facial cues and determine whether we’re enjoying a video or getting drowsy behind the wheel. Gabi Zijderveld, Affectiva’s chief marketing officer and head of product strategy, tells Business Lab that such software can streamline marketing, protect drivers, and ultimately make all our interactions with technology deeper and more rewarding. But to guard against the potential for misuse, she says, Affectiva is also lobbying for industry-wide standards to make emotion-tracking systems opt-in and consensual.
Business Lab listeners are invited to apply to join the MIT Technology Review Global Panel, our exclusive forum of thought leaders, innovators, and executives. As a member of the global panel you can examine today’s tech trends, see survey and study results, have your say and join your peers at business gatherings worldwide.
SHOW NOTES AND LINKS:
MIT Media Lab Affective Computing Group
FULL TRANSCRIPT:
Elizabeth Bramson-Boudreau: From MIT Technology Review, I’m Elizabeth Bramson-Boudreau, and this is Business Lab, the show that helps business leaders make sense of new technologies coming out of the lab and into the marketplace.
Elizabeth: Before we get started I’d like to invite listeners to join the MIT Technology Review Global Panel, our exclusive forum of thought leaders, innovators, and executives. And as a member of the global panel you can examine today’s tech trends, see survey and study results, have your say and join your peers at business gatherings worldwide. Apply to join the panel at TechnologyReview.com/globalpanel. That’s TechnologyReview.com/globalpanel.
Elizabeth: Now, wouldn’t it be cool if your phone could tell that you’re in a grouchy mood from all the day’s interruptions and hold your calls so that you get some work done? Or wouldn’t it be great if your daughter’s tablet computer could tell when she’s bored with her educational game and increase the challenge level to keep her engaged?
Elizabeth: Well, before our devices can serve us better in ways like this, they’re going to need to understand what we’re actually feeling. And that’s what I’m talking about today with my guest Gabi Zijderveld. She is the chief marketing officer and head of product strategy at Affectiva, a startup in Boston that’s a leader in the new field of emotion-tracking AI. It’s a spinoff from the MIT Media Lab’s Affective Computing Group. That’s Affective with an A. Affectiva builds algorithms that read people’s faces to detect their emotions and other cognitive states.
Elizabeth: The technology is already helping big companies test how audiences react emotionally to their ads. And now Gabi is leading a project to equip cars with software that will monitor drivers’ cognitive and emotional states and help keep them safe and awake. It could all amount to a big leap forward in the way we interact with computing devices. But of course, it also raises some tough questions about how to keep algorithms that can read our emotional states from exploiting our attention or invading our privacy.
Elizabeth: Gabi, welcome, and thank you so much for visiting us.
Gabi: Thank you so much for having me.
Elizabeth: The name of your company, Affectiva, is a play on words, and it’s a play on the term affective computing. Can you define what affective computing is, please?
Gabi: Affective computing is basically designed to bridge the divide between human emotions and technology. And affective computing enables technology to understand human emotions and then adapt and respond to these emotions.
Elizabeth: So Affectiva, as I understand it, spun out of the Media Lab in, what, about 2009?
Gabi: Yes, correct. Almost 10 years ago.
Elizabeth: OK. And the co-founders are Rosalind Picard, who is head of the Media Lab’s Affective Computing Group, and Rana El Kaliouby—I’m not sure if I’m saying that right.
Gabi: Rana El Kaliouby.
Elizabeth: So she was a postdoc at that point in the group, right?
Gabi: Correct.
Elizabeth: What were the big ideas that the two of them were bringing to the table in 2009, and in their view, what was missing from computing, and what did they hope to change?
Gabi: Dr. Rosalind Picard actually started the field of affective computing. She wrote the seminal book about two decades ago, called Affective Computing. So this field really is her brainchild. And today she still runs the group at the MIT Media Lab. So Rana, Dr. Rana El Kaliouby, joined Ros Picard’s group as a postdoc, and together they were building out the idea that technology could have the ability to understand and respond to human emotions, to basically improve human interactions with technology to make them more relevant, more appropriate, but also maybe to help humans get a better grasp or better control over emotions. In the early days especially there was a lot of focus on applications in mental health, especially helping children on the autism spectrum: using technology to teach them how to recognize or understand emotions and then coach them on how to express their own emotions appropriately. So that’s where, really, this idea started in the early days.
Gabi: And then Rana and Ros started getting a lot of interest out of industry. At MIT, of course, there are lots of events and conferences where industry members come to get a sense of what’s new in technology and what’s evolving, and at these demo days they started getting a lot of commercial interest in their technology. Out of a number of different industries, actually, including automotive, which interestingly enough is where we’re very active today. At the time they went to the director of the Media Lab and said, “Hey, we need more budget to hire more researchers,” and aptly he advised them, “Well, it’s time you spin off and start your own company.” And that’s how in 2009 they co-founded Affectiva. Ros Picard is now heading up the group at the MIT Media Lab, so on a daily basis she’s no longer involved with the company. But Dr. Rana El Kaliouby today is our CEO.
Elizabeth: As I understand it you’ve got two main products. You’ve got one product that is focused on market research and another one—you mentioned automotive—it’s about driver safety. Can you say more about those two products? Maybe start with the one that’s focused on market research. Is that called Affdex?
Gabi: So actually there are more than just these two products; there are different ways we’ve packaged up our technology. But those two markets you were describing are really the key markets we’re going after today. So the first one, where we have our technology, Affdex, for market research, is a product. It’s a cloud-based solution that basically enables media companies and advertisers, including the big brands of the world, to test their content, such as video ads and TV programming, with target audiences. And in that market we’ve been the market leader for a good number of years; we’ve had that commercial product out there for close to eight years at this point. And today about one-fourth of the Fortune Global 500 uses our technology to test all their ads around the world. I think as of this month we’ve probably tested more than 40,000 ads in 87 countries, and we’ve analyzed more than seven and a half million faces. So huge amounts of data that we have. And that’s enabled us to build a product that can also help these advertisers predict key performance indicators in advertising. So emotion data, or emotion analytics, can actually help them predict the likelihood of content to go viral, or purchase intent, or sales lift.
Elizabeth: OK. Now help me understand, how does this actually work? Is it taking a video of someone while they’re observing an ad, for instance, and then analyzing the reactions of the face and the eyes?
Gabi: Yeah, essentially, that’s how it’s done. In terms of how we typically work, we work with large insights firms or market research firms, companies like Kantar Millward Brown. They have huge research processes in which they engage with their brand clients to understand how their advertising and go-to-market needs to take place. Now we’re part of their research methodologies, meaning that our technology is integrated into their overarching platforms. And typically how it would work is, they have paid panelists who are recruited to participate in these consumer insights studies. As part of these studies, there might be a survey component, but there’s also a component that says, OK, we’d like you, online, to watch a piece of content, which could be TV programming or a video ad, and we ask you to opt in and consent to us recording and analyzing your face as you watch that content. And that’s where our technology comes in.
Gabi: It’s a cloud-based solution. All we need is to basically take, with permission, access to someone’s camera, and as they watch this content, sitting at home or wherever they happen to be, on their device, we record, kind of unobtrusively in the background, their moment-by-moment reactions to that content. So frame by frame we analyze these responses. And interestingly enough, our research has shown that people quite quickly forget there’s a camera there. They just naturally react to whatever they are viewing. And it’s that kind of unbiased and unfiltered reaction that you want. Because with that insight, if you then accumulate that at scale, you can make really important decisions about your content and even your content placement or how you spend your advertising dollars.
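For readers who want a concrete picture of what frame-by-frame analysis involves, here is a minimal sketch in Python. The EmotionClassifier class and the scores it returns are hypothetical placeholders, not Affectiva’s actual models or API; the only real dependency is OpenCV, used to read the recorded video.

```python
# Illustrative sketch only: scoring each frame of an opted-in, recorded viewing session.
import cv2  # OpenCV, used here just to read video frames


class EmotionClassifier:
    """Hypothetical stand-in for a trained facial-expression model."""

    def predict(self, frame):
        # A real model would score expressions from the pixels; this placeholder
        # returns neutral values so the sketch runs end to end.
        return {"joy": 0.0, "surprise": 0.0, "attention": 0.0}


def analyze_session(video_path: str, model: EmotionClassifier):
    """Return a moment-by-moment timeline of reactions for one panelist."""
    capture = cv2.VideoCapture(video_path)
    timeline = []
    frame_index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break  # end of the recording
        timeline.append((frame_index, model.predict(frame)))
        frame_index += 1
    capture.release()
    return timeline  # individual timelines are later aggregated across panelists
```

Aggregated across many panelists, timelines like this are what let an advertiser see, moment by moment, where attention or enjoyment drops off in a piece of content.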
Gabi: So that essentially is the first market where Affectiva got started. Today we are still very active in this market. Another market we’re going after really with full force right now is actually automotive. In the past year, almost a year ago, we launched a new solution for that market called Affectiva Automotive AI. Basically this is our core technology packaged and tuned for the automotive industry, because the use cases there are very different. They’re twofold. On the one hand, in automotive, as we all know, road safety is a key issue. There are just lots of fatalities and tragic accidents that take place on the roads every single day due to distracted driving and drowsy driving. Now, what if you could detect that a driver was distracted or getting drowsy and have the car intervene in a relevant and appropriate manner? That’s one thing that these automotive manufacturers are all going after. And this is where our technology comes in, because again, just using cameras that are in cars already today, we can quite simply and unobtrusively understand people’s emotional states and complex cognitive states, such as drowsiness and distraction, by analyzing their faces. So that’s one use case in automotive: basically driver monitoring to help improve road safety.
Elizabeth: You must have quite a lot of data that you need to use to train your systems in order to be able to read the faces of a large number of people. Can you talk about where your training data is coming from, and what kind of a boost you’ve gotten from the revolution in machine learning and deep learning over the last five, 10 years? Can you tell us a little bit about your data processes?
Gabi: Yeah, absolutely. So maybe I should start with machine learning and deep learning and why we actually use them. So when you think about human emotions and how they kind of evolve and manifest, human emotions are actually very complex, often extremely subtle and nuanced. And then when you think about complex cognitive states, which technically aren’t emotions, things such as, you know, drowsiness and distraction, those are also things that evolve over time. And it’s rarely prototypical. Rarely in the real world do you see that exaggerated smile or someone falling asleep right away. It’s temporal. And being able to model those complexities, you cannot do that with a rules-based, heuristic approach. You really need to use machine learning to be able to detect those types of complexities.
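To make the temporal point concrete, here is a minimal sketch in Python (using PyTorch) of a sequence model that scores drowsiness over a window of frames rather than applying a single-frame rule such as “eyes closed.” The feature set, dimensions, and architecture are illustrative assumptions, not Affectiva’s actual classifiers.

```python
# Minimal sketch: drowsiness builds up over time, so a sequence model looks at a
# window of per-frame facial features instead of a single-frame rule.
import torch
import torch.nn as nn


class DrowsinessNet(nn.Module):
    """Tiny sequence classifier over per-frame facial features (hypothetical)."""

    def __init__(self, feature_dim: int = 16, hidden_dim: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # one drowsiness score per clip

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, feature_dim), e.g. eye openness, head pose, yawn cues
        _, (last_hidden, _) = self.lstm(frames)
        return torch.sigmoid(self.head(last_hidden[-1]))


# Example: 8 clips of 90 frames, each frame described by 16 assumed features.
scores = DrowsinessNet()(torch.randn(8, 90, 16))  # shape (8, 1), values in [0, 1]
```

The design choice the sketch illustrates is the one Gabi describes: because the state evolves gradually, the model has to see the whole window of behavior, not one exaggerated expression.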
Gabi: So that’s why a good number of years ago, our R&D really shifted to have all of our technology being built with machine learning approaches. Now machine learning and deep learning architectures need to be fueled by massive amounts of data. In addition to that, for us, again when you think about modeling human states, obviously people don’t look the same depending on age, gender, and ethnicity. And then there’s also cultural influences and cultural norms that kind of change sometimes the expression of emotions in human states. So in addition to being able to fuel deep learning, we also need large amounts of data to account for just the diversity that exists in humankind, diversities that exist around the world. So for Affectiva, data is essential to everything we do. And we’ve analyzed massive amounts of data and we’ve collected massive amounts of data. As a matter of fact, we’ve analyzed over 7.6 million faces in 87 countries.
Elizabeth: And where are you getting that data from?
Gabi: In a number of different ways. So first and foremost, what I’d like to say, because this is so important to us: all this data is collected with opt-in and consent. We always either recruit people to have their data collected, or it’s through online mechanisms where we explicitly tell people that we’re collecting data and ask them for permission to do so. Also, that data is for the most part anonymized. So, Elizabeth, if you participated in one of our studies, there’s just no way I could find your face again. Because essentially you’re a face. You’re not a named individual. So we do feel strongly about that.
Gabi: We collect this data in a number of different ways. As I mentioned before, we’re very active in media and advertising, and through our partnerships in that industry we have done a huge number of media tests, and it’s through that that we’ve collected massive amounts of data. There are other client relationships where we have, basically, data-sharing agreements. Not all of our clients want to share their data, but some of them do. So that’s another path through which we get data. And then think, for example, about the automotive industry, and let’s use the example of drowsy driving. We have this massive foundational data set that allows us to build these algorithms, but we don’t necessarily have huge amounts of drowsiness in it. Now, in order to model for that and build algorithms for that, you don’t need only drowsy data, but you certainly do need a certain layer of that data on top of what you have already, so you can tune your algorithms for it.
Elizabeth: So you can discern between a drowsy look and, I don’t know, a bored look.
Gabi: Exactly, exactly. Or distracted, right, because those manifest differently.
Elizabeth: And they have different consequences as a driver.
Gabi: Oh absolutely. Absolutely. And also in terms of how you collect that data in the vehicle, there are some operational challenges as well, depending on camera placement and camera angles. And now of course we need to support the near-infrared cameras that are being used, because when you drive at night or in a tunnel, the lighting conditions aren’t that good. So these are all environmental conditions for which we have had to train our algorithms. But when you think about it, capturing drowsy-driving data, it’s not that easy. Because it’s not like we can keep people up for 48 hours in one of our fantastic sleep labs around Boston and then send them down Memorial Drive in a car and see if they fall asleep. That’s something that we don’t necessarily want to do.
Gabi: So it’s also a matter of collecting massive amounts of data, mining our data for natural occurrences of those states, and then also doing very specific studies targeted at demographics that are inclined to be sleepy when they drive. For example, we’ve done a number of studies with shift workers—people who might work long shifts in, let’s say, a factory, and then have to drive home in the middle of the night. You have more likelihood of capturing drowsy data that way. So there’s a variety of different ways that we’re collecting our data. That gives us a massive data repository, and then a subset of that data is used to train your machine learning classifiers, and you carve out another subset that you use for validation. So you keep those separate. And we’re continuously collecting data, continuously annotating that data. It’s just an ongoing aspect of our R&D efforts, and we’re growing the repository that way.
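As a side note for readers, here is a minimal sketch in plain Python of the kind of train/validation separation Gabi describes. Splitting by participant rather than by individual clip is one common way to keep the two sets genuinely separate; the function and grouping scheme are illustrative assumptions, not a description of Affectiva’s pipeline.

```python
# Illustrative sketch of keeping training and validation data separate. Grouping by
# participant (an assumption about how such a split might be done) prevents the same
# face from appearing in both sets, which would inflate validation accuracy.
import random
from collections import defaultdict


def split_by_participant(samples, validation_fraction=0.2, seed=0):
    """samples: list of (participant_id, features, label) tuples."""
    by_participant = defaultdict(list)
    for sample in samples:
        by_participant[sample[0]].append(sample)

    participants = sorted(by_participant)
    random.Random(seed).shuffle(participants)
    cutoff = int(len(participants) * (1 - validation_fraction))

    train = [s for p in participants[:cutoff] for s in by_participant[p]]
    validation = [s for p in participants[cutoff:] for s in by_participant[p]]
    return train, validation
```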
Elizabeth: Right. So what you’ve just talked about is the ways in which you’ve been engineering the reading of emotions. Now, what about the need to program the computers to interpret and use that information? Isn’t that a lot harder to do?
Gabi: It depends. Whether or not it’s harder to do depends a little bit on what the interactions are. And usually that is the ultimate design decision of our client, but it’s very much also a collaborative process. For us to develop these algorithms that can detect and analyze human emotions, it’s also critically important to understand, what are the use cases? How do they want to use that technology? Because you can’t just build these algorithms in a vacuum. So it’s very much a collaborative process.
Gabi: So I was saying earlier that we’re quite active in the automotive industry right now. So it’s an ongoing dialogue with car manufacturers as to how they use our data to then design adaptations or interventions in a vehicle. And some of this is very much an evolving process. If you can see that someone is getting distracted in a vehicle, you don’t want to necessarily have all these alerts and alarms going off, if it’s just minor distraction, right? It might infuriate people or aggravate them even more and cause even more dangerous driving behavior. You want to be able to understand levels and intensities and frequency of distraction, and then design very subtle, relevant, and appropriate interventions.
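To illustrate the idea of graded interventions, here is a small hypothetical sketch in Python. The thresholds, scores, and intervention names are invented for illustration; in practice, as Gabi notes, they would be designed collaboratively with the car manufacturer.

```python
# Hypothetical sketch of graded in-car interventions. All thresholds and actions are
# invented for illustration and would need to be tuned for a real vehicle.
def choose_intervention(distraction_score: float, seconds_distracted: float) -> str:
    """Map the intensity and duration of distraction to an escalating response."""
    if distraction_score < 0.3:
        return "none"                    # brief glance away: do nothing
    if seconds_distracted < 2.0:
        return "subtle_dashboard_glow"   # gentle visual cue, no alarm
    if seconds_distracted < 5.0:
        return "soft_chime_and_message"  # mild audio prompt
    return "tighten_safety_systems"      # e.g., increase following distance
```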
Gabi: And there’s also a future-state vision, and we’re certainly not there from a technology perspective, but I do think we’re heading there in the future. What if you could personalize that to the individual? So maybe when you get drowsy you would like to listen to hard rock music. Or maybe when I’m drowsy, I just absolutely want to get out of my car and stretch my legs and walk around. And the way that my car or a future robot taxi services my needs...
Elizabeth: Is adapted.
Gabi: Is adapted to my personal needs in the moment, right. So there’s the promise of potentially building this in a personalized fashion. I think we’re heading there in the future, but we’re not there today, and I don’t think we’ll see that, you know, in cars on the road anytime soon.
Elizabeth: I’m interested in the extent to which you all are thinking about the backfire potential of this. So right now, of course, we’re talking a lot about Facebook. We’re talking a lot about the 2016 elections. We’re talking about the manipulation that we feel pretty sure has occurred through social media platforms like Facebook. And I wonder to what extent you worry about what could be done with Affectiva’s technology, through the reading of the way people respond to certain things and therefore the adjustment of that messaging to make it more impactful. Do you worry about what the unforeseen consequences of this technology might be, if it’s not managed properly?
Gabi: Of course we worry about that. But I think every single technology company needs to worry about potential adverse applications of the products they design. Because frankly, every single bit of technology that we use on a daily basis can be used for mal-intent or nefarious purposes. Think about the truck. That’s the transportation mode of choice for terrorists. Or Google Maps, right? Those technologies and those systems weren’t designed for those use cases. So I do think, first and foremost, as technology companies you always need to be mindful of that, especially now that technology has become so accessible, and compute power is so strong, and it’s at every consumer’s fingertips. You have to be mindful of that.
Elizabeth: But when you have a toolkit, do you worry about what happens if that toolkit can be used in ways that you all wouldn’t necessarily be able to guard against?
Gabi: Yeah, absolutely. So there are things that companies can do and things that we have done. I was just speaking in generalities just now as to what I would wish technology companies would continually think about. But to your earlier question, going back to the original question: do we worry about that? Absolutely, yes. And what are we doing about it? A number of different things. So first and foremost, with our technology, we’re very careful as to who we license it to. And we’re getting a lot more strict about that than we were maybe even a few years ago. So it’s not like anyone out there can just grab our technology and build something with it.
Gabi: There are also license agreements, or legal documents, that we have in place that safeguard against that. We also have stated as a company that there are certain types of use cases to which we will just not sell our technology. We believe in opt-in and consent, because things such as human emotions are extremely private, and we do not want to engage in security or surveillance where people do not have the option to opt in or consent to their faces being analyzed. And we have actually turned down business that would have taken us down that path.
Elizabeth: We wouldn’t even be where we are right now if we weren’t all feeling a great deal of cynicism or skepticism about technology’s ability to be harnessed, or kept from unforeseen negative consequences. Right? So in a sense, all of you are falling under closer scrutiny, because we’re feeling gun-shy about technology. And we know that regulatory authorities in Washington are ineffectual in this respect.
Gabi: They are, because they don’t understand it, right? If you have senators asking the leadership of Facebook how they’re making money, because they don’t understand the core concepts of personalized ad targeting, then we have a problem. It’s an education issue as well. But on top of that, there’s an interesting friction, right? Because aside from the huge responsibilities that technology companies have, and where maybe some have been lagging or negligent, what about the consumer, right? Because there is this perceived value to be had. We like using social media platforms and we’re OK sharing our lives there, because we perceive that we get value out of that. And we as consumers don’t ask a lot of questions, and that too worries me. Especially when, you know, I have a daughter who’s about to turn 13 and I suspect will spend a lot more time on devices and social media. How do you educate for that? Even as a consumer, these systems have gotten hugely complex. Just go into your iPhone settings and try to figure out where data is going, and how it’s flowing, and what you want to shut off when, and how do you even do that?
Elizabeth: It’s very hard to decipher.
Gabi: It’s not very intuitive, right? Deliberately so. And you have to make a point of going out there and finding the information and doing it and reversing it, rather than the other way around, where maybe data is kept private all the time and you go in and allow access. So there’s a huge friction there, I think, between the value consumers perceive they get and the value the technology companies actually get from the data, and the transparency around that. So for us as a company, we certainly do worry about that.
Elizabeth: You have these conversations.
Gabi: Oh yeah. Continuously. And also in public fora. We joined the Partnership on AI, which is an industry consortium designed to basically realize fair, accountable, transparent, and ethical AI. And we were one of the few startups invited to be part of that. That’s one way that we’re hoping to drive change. And also we’re lucky in that Rana, our CEO, is very much a thought leader in AI, very much a public persona. She has opportunities to be out there and speak in public settings. And she wants to be very vocal about these issues, because we have a strong opinion on that. And we also feel we have a social responsibility to be transparent about this and to advocate for change, inasmuch as a 50-person startup can do that. But we all need to contribute our share.
Elizabeth: So when I think about what the impact of emotional AI, or emotion AI, could be down the path, does it mean that Siri and Alexa will get better at understanding my emotions and responding to me in accordance with my emotions? And if so, what does that mean for the future? What does it mean if our devices are smart in this way about us as emotional beings?
Gabi: So today, of course, we’re connected by hyper-advanced systems and technologies. Advanced AI. Lots of cognitive capabilities. But really what’s missing is this emotion awareness. These systems for the most part do not understand our states, our reactions, our well-being. And we at Affectiva certainly believe that that makes for very ineffective and superficial interactions with technology. So what if these systems could understand our emotions and our cognitive states and our reactions and our behaviors? How much more effective would our interactions with those technologies be?
Gabi: So in the future I certainly envision a world in which our type of technology, emotion AI, is ingrained in the fabric of the technologies that are at our fingertips every day. It’s unobtrusively in the background, understanding and responding to our emotional well-being. I’ve always had this vision, too, that we as humans would perhaps carry something with us: let’s call it our emotion passport. It’s our emotional digital footprint that we control. We own that data. We manage it. And we allow it, with our permission and according to our wishes, to travel with us from device to digital experience to wherever we’re using technology. Whether we’re sitting in our office working on our laptop, getting in our car, using a ride share, or on our home systems, like a Google Home or an Alexa, you name it. Any type of technology we interact with. There would be this consistent understanding of our well-being, and it would guide and advise us and help us. And I think that’s the critical part. And that’s why I think it’s also so important that this is all done with our own opt-in and consent and control.
Elizabeth: It’s so fascinating, because you can think about this technology being used to kind of create empathy in the devices that we use, and the experiences we have, right? And respond to the way we’re reading or reacting to an advertisement, for instance, and tune that. And you can also think of this technology being used as a way of managing the emotions that we’re feeling. So when you were talking about an emotion passport, you could sort of say, I’m feeling grouchy, I’m feeling under the weather, and I want my devices and my technology to respond to that. Or you could look at it as those devices somehow needing to manage me out of that emotion. And it’s quite interesting to think about. It could go either way. And you know, I suppose I have my own vote as to which way I’d be most comfortable with it going.
Gabi: And ideally the systems would understand you well enough to know what would be appropriate in the moment. Because here we allow for this data to be tracked longitudinally, and maybe in the morning some home device I’m using might say, “Hey Gabi, seems like you’re not as happy as you were yesterday morning. I can also tell that you didn’t really sleep the seven hours that are optimal for you. Would you like me to turn on this music playlist? And maybe you don’t want to drive to work today. Why don’t I order a ride share for you?” Or the coffee machine just starts in the kitchen. Or vice versa: you come home from work and it’s like, “Hey, you had a really rough day at work. I made a restaurant reservation for you and the babysitter is coming for your kid.” And the idea is that, with this, let’s call it an emotion passport, which gives the systems and the technologies that we use insight into our personal state and well-being, it can help guide and advise us and essentially try to make our lives better or more effective. Of course I personally would like that always to be in my control, with my opt-in and consent. Maybe I don’t want my well-being data sent to my doctor or, God forbid, my insurance company. But maybe in some situations that is helpful. And being able to allow our technologies to get a deeper understanding of our well-being and our state can be critically valuable.
Elizabeth: Wonderful. Well thank you, Gabi, this has been very interesting. This is an exciting area of development, and we wish you every success.
Gabi: Thank you so much and thanks for speaking with me. They were such great questions. I really enjoyed talking to you. Thank you.
Elizabeth: That’s it for this episode of Business Lab. I’m your host Elizabeth Bramson-Boudreau. I’m CEO and publisher of MIT Technology Review. We were founded in 1899 at the Massachusetts Institute of Technology. You can find us in print, on the web, at dozens of live events each year, and now in audio form. At our website, TechnologyReview.com you can find out more about us. And don’t forget to apply to join the MIT Technology Review Global Panel, a group of thought leaders, innovators, and executives, where you can learn from your peers and share your expertise on today’s technology and business trends. Apply at TechnologyReview.com/globalpanel. This show is available wherever you get your podcasts. If you enjoyed this episode, we hope you’ll take a moment to rate and review us at Apple Podcasts. Business Lab is a production of MIT Technology Review. The producer is Wade Roush with editorial help from Mindy Blodgett. Special thanks to our guest Gabi Zijderveld. Thank you for listening. We’ll be back soon with a new episode.