A Deep Dive Into Deep Fakes With Dr. Ilke Demir
Charlie Guarino: Hi everybody. This is Charlie Guarino. Welcome to another edition of TechTalk SMB. I’m really thrilled to have joining me somebody that I have been reading so much about and have watched so many videos on, and it’s a real delight to have her here. I’m talking about Dr. Ilke Demir, and let me give you her quick bio: In the overlap of computer vision and machine learning, Dr. Ilke Demir’s research focuses on generative models for digitizing the real world, analysis and synthesis approaches in geospatial machine learning, and computational geometry for synthesis and fabrication. She earned her bachelor’s degree in computer science—computer engineering, I should say—with a minor in electrical engineering, and has master’s and PhD degrees in computer science. After graduating, Dr. Demir joined Facebook as a postdoctoral research scientist, where she and others developed the breakthrough innovation of generative street addresses. Her research further included deep learning approaches for human understanding in virtual reality, geospatial machine learning for map creation, and 3D reconstruction at scale. Currently, Dr. Demir leads the research efforts on 3D vision approaches in the world’s largest volumetric capture stage at Intel Studios. Wow, Dr. Demir. It is a pleasure to have you on our podcast. Thank you so much for joining us today.
Ilke Demir: Yeah, thank you for the invitation. It’s a pleasure to be here.
Charlie: Thank you very much. Thank you. So, Dr. Demir, how we first met: I attended a seminar you were speaking at in California, and I was absolutely riveted by the topic. It really raised my awareness of a subject I didn’t know a lot about to begin with. But I have to say that within a short amount of time—in less than two hours of hearing you speak—it was eye opening to say the least, and that probably doesn’t do it justice. The topic is deep fakes. Years ago, we would hear about individual photos getting retouched—simple retouching—and that was enough to trick people into thinking something wasn’t what it really was; with retouching, it looked better than it did, or it looked different. I’ve seen historic videos and never once given a moment’s thought to their authenticity, but today—especially after hearing you speak—my perspective has completely changed, so much so that I’m now starting to question everything I’m watching. So let’s start with that. Let’s talk about deep fakes and old videos, how they’re being retouched, things like that. First of all, what exactly is a deep fake?
Ilke: Deep fakes are any type of media where the actor or the action of the actor is not real. That can be deep fake videos, synthetic images, synthetic voice or audio, or even 3D content that is faked from some real content. All of these fall within the definition of deep fakes. If they are used for malicious purposes, they are categorized more as deep fakes; but if they are created for good, or if they are synthetic data where you can see that it is synthetic—where you can easily identify that it was created as a derivative of something else—then they fall more under the definition of synthetic media.
Charlie: So is it fair to say that deep fakes have the ability to almost rewrite history by having a new person introduce a new topic or say something that didn’t really happen?
Ilke: Yeah, yeah. They have the power to manipulate anything that we have seen before. To answer your question with an example, I think there was a Nixon deep fake reading the speech that would have been given if the moon landing hadn’t happened. I think there were documents prepared in case things did not go as expected. So someone created a deep fake of him reading that text; it looks very realistic, and you believe it was actually recorded, but it was not. So yes, you can try to mimic history with deep fakes.
Charlie: So that event didn’t happen at all. Zero.
Ilke: No.
Charlie: And there are so many other examples of political, world leaders saying things that they didn’t say.
Ilke: Absolutely. I think the most prominent example showing that deep fakes can be really harmful was an Obama video, an Obama deep fake saying very inappropriate things about other politicians, and that was the first eye-opening example from the research community to the public showing that deep fakes can actually change opinions.
Charlie: Right because for someone who’s not aware of this deep fake technology, they’re simply watching this video as it is being posted on social media and here, they see, as you mentioned, President Obama at the time saying these negative things, and it’s completely believable.
Ilke: Absolutely.
Charlie: What you’re seeing is not what you’re seeing, so to speak.
Ilke: Exactly. So actually, there was a workshop that we attended at UCLA a while ago. In that workshop we were looking at all impacts of deep fakes—not just technical impacts but societal, psychological, legal effects and impacts of deep fakes. And one thing that we, and mostly the social scientists looking at this topic, realized is that it is actually pushing society toward an erosion of trust. This is exactly what you said. Even if something is not a deep fake, the consciousness around deep fakes—the sense that anything can be a deep fake—makes us question anything that we see. It may be super realistic, but we still have that hesitation. The more high-quality deep fakes we see, the more that hesitation increases, and it can also be weaponized against genuine, authentic videos—by claiming, oh, that is a deep fake. So that is a very dystopian future that we see, unfortunately, but hopefully we are developing deep fake detection algorithms and provenance algorithms so we won’t be going toward that dystopia.
Charlie: I hope not [laughs]. You know every year there are surveys done at the executive level—on their top concerns in IT, for example—and the one topic that always comes to the top year after year is security, and that’s always about, for example, data breaches and there’s so many different facets to security. I really think that this should be an equal facet in that discussion. Maybe it’s not because maybe we’re more concerned about encrypting data for example, things like that, and concerned about a data hack, but I think this particular topic is potentially equally if not more important to have in this discussion. What are your thoughts on that?
Ilke: Absolutely. So just imagine doors or systems that we unlock with our fingerprints. A fingerprint is biometric information used by governments and security systems, etc. Our face is very similar biometric information that we use for authentication, for enabling things, for making people believe in us. So when you can seamlessly fake our faces, that is a security concern. That is a very dramatic cyber risk, and it is one we can actually be creating ourselves, because you don’t post pictures of your fingerprint and put them everywhere, right? But you are taking photos of your face and putting them everywhere, so that is actually giving away biometric information, sometimes without consent even being asked. So deep fakes, in that sense, are hopefully bringing some awareness before the doomsday scenarios arrive—hopefully making people aware that they should not share that information so freely, and that they can rely on detection algorithms. Hopefully that awareness comes with it.
Charlie: You bring up an interesting point. I wouldn’t post my fingerprints online—I wouldn’t do that, of course—but my likeness, my photos and videos of myself and many peers in my community do the same thing. I’m certainly no different than many other people who were doing the same thing, but I think in many cases the ship has sailed. I mean the genie is out of the bottle, so to speak. My image is out there like everybody else who is freely putting—you said without consent—freely putting photos of themselves, videos of themselves in social media. So is it too late for some of us already?
Ilke: I wouldn’t be that negative. I think it’s not too late. First of all, as I said, there will be detection algorithms and there will be authentication algorithms to see whether those faces are fake or real. And when those systems are in place, doing automatic detection on any platform, then if someone is using your likeness or your face to make a deep fake, it will be automatically deleted or eliminated, or at least marked as fake. On the other hand, we are actually developing novel algorithms where you can keep ownership of your face on social media. That is a very recent approach that we introduced in a paper called My Face My Choice. If someone uploads a photo with or without your consent, you still have the freedom to tag or untag yourself, but even if you untag yourself, your face is there. Your face lives on the platform, and we want to prevent that. For that we developed the My Face My Choice approach, which looks at the photo, looks at the friendships, looks at the permissions that you give to people, and for those you don’t want to be seen by, the system replaces your face with a synthetic face in their view. The image is still seamless—the image integrity is there—but your face is not there; your face is deep faked with a nonexistent face. In that sense, automatic facial recognition systems—social media platforms that store face embeddings, or Clearview AI running facial recognition on everything—cannot identify you anymore in those existing photos. And the more deep fakes we create this way—the more nonexistent faces we create this way—the more their search space explodes. So instead of looking at, let’s say, two billion faces, they are looking at two trillion or even more faces, and doing face recognition and ID detection is even harder when you blow up that search space. So we are also using deep fakes to protect your privacy and protect your identity, instead of using them for bad.
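[Editor’s note: A minimal sketch of the permission-driven idea Dr. Demir describes, written under stated assumptions: it takes a hypothetical list of detected faces with bounding boxes and matched identities, plus a per-person permission map. The synthetic-face generator below is a random-pixel placeholder; the published My Face My Choice system would instead use a generative model that preserves pose and expression.]

```python
import numpy as np

def generate_synthetic_face(height, width, rng):
    # Placeholder for a generative model: random pixels of the right size.
    return rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)

def anonymize_for_viewer(photo, faces, viewer, permissions, rng):
    """Return a viewer-specific copy of `photo` (an H x W x 3 uint8 array).

    `faces` is a list of dicts with a bounding "box" (x, y, w, h) and an
    "identity" (a user id, or None if unknown). `permissions[identity]` is the
    set of viewers allowed to see that person's real face; everyone else gets
    a synthetic, nonexistent face so the photo still looks natural.
    """
    result = photo.copy()
    for face in faces:
        identity = face["identity"]
        allowed = identity is None or viewer in permissions.get(identity, set())
        if not allowed:
            x, y, w, h = face["box"]
            result[y:y + h, x:x + w] = generate_synthetic_face(h, w, rng)
    return result

# Tiny usage example: only Bob may see Alice's real face, so Carol gets a swap.
rng = np.random.default_rng(0)
photo = np.zeros((64, 64, 3), dtype=np.uint8)
faces = [{"box": (16, 16, 32, 32), "identity": "alice"}]
permissions = {"alice": {"bob"}}
view_for_carol = anonymize_for_viewer(photo, faces, "carol", permissions, rng)
```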
Charlie: Do you think replacing somebody’s face with a synthetic image is—I don’t know, what’s the right word?—more appropriate or better than, let’s just say, pixelating someone’s face out in an image? Is that a better approach, do you think?
Ilke: I would answer that in two ways: a technical way and a societal way. Technically, some of the pixelation approaches or masking approaches are actually reversible with super resolution. With complicated deep neural networks, you can actually reverse that pixelation and reveal that it is you. In a societal sense, seeing a mask, or people putting smileys or emojis on top of faces—you are looking at an obviously engineered photo. It’s not the same experience as looking at a real—
Charlie: It’s not natural. It’s not natural.
Ilke: Yeah, not natural at all. There is no image integrity, so we don’t want that experience. The main concern social media platforms have with a face-masking approach is that if they do that, people won’t look at photos anymore because they don’t enjoy them as much. With our approach, we still keep that naturalness, we still keep that image integrity, but you’re not there anymore.
Charlie: That’s an interesting approach. I had never considered that. So is that already in place? I mean have I already seen photos using this technology and other people in the group are not who they are? Is this that pervasive already?
Ilke: Not yet. Hopefully soon. If anybody is interested in that technology, reach out to us [laughs].
Charlie: I got it. Okay, so let’s go a little deeper into this. I think about Forrest Gump—I think I mentioned it earlier. Forrest Gump was a movie that was amazing at the time, in the 1990s. It put the character in many different places, and I did some research on that. It’s called the Forrest Gump Effect, but that was done with green screen and the Hollywood trickery available at the time. That’s different from today, because in Forrest Gump they put the same person in different situations, whereas now a deep fake can actually make a person say something new, or put a person in a situation saying things they never said, for example. It’s a very different nuance with far-reaching implications, I think is the right phrase here. Without revealing too many secrets—although I did find quite a bit of information online, so I guess it’s available—what’s the process? How does somebody start making a deep fake? What’s the technical process to do that?
Ilke: So there are different ways to make deep fakes, but the very first one was based on generative adversarial networks, which is a generative deep learning approach based on training a huge system of parameters on a huge data set of different people and learning the distribution of that data set, so that whenever you sample from that distribution, it looks like a face very similar to what you had in the data set. So you can say it all started with deep learning and learning distributions and trying to sample from those distributions in a way that makes sense with the whole network. That was back in 2014, when generative adversarial networks were first introduced. Generative adversarial networks are made up of two networks: one of them is trying to generate new faces, and the other is trying to guess whether a face is real or fake. It is the game between those two networks that makes them converge at some point, so they consistently create good faces. There are other approaches—for example, face reenactment or face-swapping approaches. In that case, you are encoding your face into a latent vector and decoding that latent vector with the decoder of another face, so you keep the pose and keep the expression but encode the identity from the other face—this is more toward making my face look like you but still talk like me. There are also approaches based on 3D model fitting, for example. We can parameterize the face in 2D or 3D, and those parametric locations—where your nose is, where your cheeks are, everything on your face—can be parameterized and fit from person A to person B. That is a little more toward the traditional way of doing face modeling and face reconstruction, but now that we have deep learning in those systems, you can do things like face reanimation more accurately than they were done before with blend shapes and parametrized models.
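[Editor’s note: To make the two-network game concrete, here is a minimal toy GAN sketch: a generator learns to mimic a simple 2-D data distribution while a discriminator tries to tell real samples from generated ones. It is a didactic example assuming PyTorch is available, not a face generator and not any specific production model.]

```python
import torch
import torch.nn as nn

latent_dim = 8

# Generator maps random noise to 2-D points; discriminator scores points as real/fake.
generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 2))
discriminator = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def real_batch(n=64):
    # "Real" data: points drawn from a Gaussian centered at (2, 2).
    return torch.randn(n, 2) + 2.0

for step in range(2000):
    # Discriminator step: label real samples 1 and generated samples 0.
    real = real_batch()
    fake = generator(torch.randn(real.size(0), latent_dim)).detach()
    d_loss = (bce(discriminator(real), torch.ones(real.size(0), 1)) +
              bce(discriminator(fake), torch.zeros(fake.size(0), 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the discriminator call fresh fakes "real".
    fake = generator(torch.randn(64, latent_dim))
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```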
Charlie: You made me think of something. You mentioned that this went back to 2014, and I imagine the technology was not nearly as good as it is today—
Ilke: Oh no, not at all.
Charlie: It’s going at such an accelerating pace, and maybe, like all things in technology, there’s always a good side and a bad side. But I think in the early days, another term I encountered in my research was shallow fakes—the video you were watching was more rudimentary, for example. You also mentioned data points on a face—maybe someone’s nose and their chin and their cheeks and their eyes are data points. But I want to share with you something that I find very disconcerting. I have twin sisters, identical twin sisters, and they can open up each other’s phone—
Ilke: Yes, facial recognition. We just talked about it. Their face embeddings are the same, so the facial recognition approach assumes you are the same person.
Charlie: But they’re two different people entirely, my sisters.
Ilke: Yeah, yeah.
Charlie: So that only speaks to—I think as good as this technology is, there’s still a lot of room for improvement.
Ilke: Absolutely yes. Yes.
Charlie: Well, let’s go back to what I said earlier about 2014 and how this technology has progressed from then until today, and how quickly it has progressed. What have you seen on that timeline?
Ilke: Yeah, so if we look back at generative models, for example, there were, as I said, variational autoencoders, and they were learning that distribution, but they were not learning to sample very high-resolution faces. It was always that blurry face that looks like a face from afar but is actually more like the mean image of the faces it has seen in the data set. Then of course came generative adversarial networks, and they brought so many more approaches where you can actually condition face generation on different vectors—create with a new expression, create with a new gender, create a new face. And then, instead of creating a new face, you can actually change existing ones. We have seen this not only in the research space but also in the application space: the aging applications, the making-younger applications, the stylizing applications—all of them mobile apps or web apps where you can just upload your photo and see what you will look like at age 60, or how you would look as another gender, etc. All these applications and all this research come from the same generative networks that have been developed since then. The next step is doing that in 3D, changing the environmental conditions, changing lighting, doing it not only for humans but for scenes and for objects. If you capture my room, for example, I can change the poster right behind me in a way that the reflections are the same, the occlusion doesn’t affect anything, and it looks just as realistic. So all of these are becoming real thanks to advancements in deep learning, the democratization of AI, and the ease of training networks.
Charlie: You know, the things you’re talking about suggest that you need very superior computational power to do this, but I’ve also learned that home computing has become so powerful and the software has become so good that this is becoming easier and easier to do, and nearly anybody can get involved in this technology.
Ilke: Yeah, and you know us researchers are always sharing our code. There are so many GitHub repositories that people can just like go, download the code, and try to run it on their laptop or desktop. And actually my colleague Dr. Ciftci was mentioning that he’s seeing even high school students downloading that code and making deep fakes. So it is becoming very accessible.
Charlie: And there’s always that motivation, that interest to do it just because you can do it, for no other reason of course. I mean a lot of students, it’s like they enjoy the challenge.
Ilke: Yeah. Our curiosity drives us.
Charlie: Absolutely, and that’s not necessarily a bad thing either, by the way.
Ilke: No.
Charlie: So let’s keep going in this conversation then. How would I, as a layman today, with my naked eye, be able to watch a video and possibly say to myself, that can’t be real—there is no way that person would be saying that? Or has the technology gotten so good that I can’t do that anymore—I can’t detect a deep fake with my naked eye?
Ilke: If you look at full productions—even before deep learning, even before deep generative models reached this level, it was being done, right? So there is no limit to human creativity with technology. But in the context of deep neural networks and deep fakes, I think it is approaching the point where we cannot just look at a video and say with our naked eye whether it is fake or not. A couple of years ago, maybe, you could see some symmetry artifacts, some discrepancy at boundaries, or hair on the face that didn’t match at all. In those cases you could see, okay, this is not a real video. But now it is so advanced that you cannot just look at it—maybe you have some intuition, like the mouth doesn’t quite match the speech, or that’s a very unnatural eye blink. But to go beyond that hesitation, you need detection approaches that automatically try to detect whether videos are fake or not. Hopefully they also reason about that detection: not just labeling this is fake, this is real, but supporting that label with a confidence metric, supporting that decision with some explainability based on the features extracted from the deep fake or real video. So hopefully we will not be relying on our naked eyes in the near future—that’s where it’s going.
Charlie: But the casual person who is simply watching videos online doesn’t have this technology to be able to detect deep fakes—and we’ll talk about that in a minute. The videos that I’m watching today are very believable, very believable. You know, one other thing I initially thought of in the use of deep fakes was, again, what we talked about with President Obama—world leaders saying things that didn’t really occur, things like that. But that doesn’t appear to be the main case. What I read was that there’s a very high concentration of deep fakes in the pornography industry—
Ilke: Yes.
Charlie: And maybe that might be the biggest use, perhaps, today of deep fakes. How do you speak to that?
Ilke: I think there was a 2019 report saying that more than 96% of deep fakes appear in adult content, and you know, if a technology is taking off, you can see its future by how it is used for adult content [laughs]. So when deep fakes are blooming there, it means they will bloom in other parts of the population. Mostly, making people do unexpected things is what deep fakes are used for, and especially in adult content. But it’s not only celebrity impersonation or defamation; it also shows up as an unintentional consequence. For example, if your data set has many nudes for a specific category or a specific gender or a specific race, then the images you generate from a network trained on that data set will come out as nudes. I think there was a recent article about this—there is a popular stylization app called Lensa AI, I believe. One person was very upset with it, saying: I did not give any nude photos to the app for stylization, but all of the photos that came back were nudes, and I was shocked that it was creating nudes without my consent. There was also another app—not an app but a Telegram bot—that, for a small amount of money, would send back the nude version of any photo you sent—
Charlie: That’s right.
Ilke: It was reported that it was used to create more than 600,000 nude photos of women, which is unbelievable. I don’t think we even have data sets that large to train with—but that bot actually processed that many images. So that puts more emphasis on what we said before, right? You don’t want to put your face anywhere people can use it for something; you don’t want to put your photo anywhere people can use it for something. Hopefully, when detection algorithms are mainstream, all of those nudification apps and nonconsensual deep fakes will be found out, even in the deep dark web.
Charlie: You know the release of deep fakes can literally ruin a person—ruin their career, reputation, and things like that. It’s pretty far-reaching.
Ilke: And there is the other way around too, like if the release of some truthful, shameful video of someone is out there, they can say that oh, that’s not me. That’s a deep fake. And they can get away with it.
Charlie: Hmm. That’s true. I hadn’t considered that. I think our entire conversation has focused primarily on the negatives of deep fakes, but there has to be a positive side, and that’s always the balance, I think, in the world of technology. There’s always the good side and the bad side. There’s always misuse, but surely there are some positive reasons to use this technology. I found a couple online myself—for example, being able to have one person speak in many languages to reach a much wider audience. But do you have some other examples of how this is being used in a positive way?
Ilke: Absolutely. I mean, deep fake research emerged from providing better digital humans and virtual avatars—how we can represent ourselves as realistically as possible in augmented reality and virtual reality environments. All of the research actually seeded from there; it was not for those evil purposes. In those environments there are very high-end, photorealistic deep fakes, or synthetic content representing us as our digital twins. As you said, there are personal assistants that talk in many languages. That has also been applied to movie translation: it’s not just voice-over or dubbing—they are using lip-sync approaches to make a Japanese movie redubbed and re-deep-faked in English, for example, so you don’t even notice that the characters are no longer speaking their own language. Another good influence of deep fakes is in visual effects: artists, creators, the movie industry, etc. were doing all these hand edits or using motion-capture systems, but now they can basically use deep fakes. I think there was news about Bruce Willis giving his deep fake rights to a company to use for movies. Although this was positive news, there was also a negative side: he then said no, I never signed anything to give away my deep fake rights. So we come back to the negative side a little bit, but I can give a positive example of that. We were shooting an AR short movie at Intel Studios—it’s a 3D capture studio, so we are basically shooting 3D movies. This one was a series, and the main host of the show was coming to the studio to get a new capture done for each episode, but because of Covid, he couldn’t come for one of the episodes. We said, don’t worry: shoot your video in 2D with the script and scenario and everything at home, and then send us your movie. We took his 2D movie and applied it to the 3D capture we had of him before, so we could do a 3D deep fake of him instead of him coming all the way to the studio. So these are all authenticated, consented, created-for-good deep fakes. The hope around that is having more laws and legal consequences that support these agreements and these deep fake rights, you know? Not just, oh, we did it as an ad-hoc agreement, but real deep fake rights in the legal domain—that helps too.
Charlie: There was one other article—it was actually a video that I saw. A couple of years ago the Dali Museum in St. Petersburg, Florida, they actually created deep fake videos of Salvador Dali speaking to people.
Ilke: Yes.
Charlie: You know this one.
Ilke: Yeah, that’s one of our best demo pieces. We show our detection technology, and the Dali Museum so graciously let us use their video as an example of a really high-resolution, believable deep fake. Actually, Dali in that video says something like, I don’t believe in death. Do you [laughs]? I’m like, well, with this deep fake, I don’t believe it either. So just like that, there are also very good uses of deep fakes. You can bring people back to life with deep fakes and make them say things that maybe they did not say before. And the person doesn’t need to be dead for this to be useful. There are privacy-enhancing approaches where deep fakes are used. For example, if you are in therapy over teleconferencing and you want to share your problem and get advice from your therapist, but you don’t want to reveal your identity, there are real-time deep-faking approaches: the patient talks as if he or she is talking to the actual therapist, but the therapist never sees the actual person. She sees the deep fake, but the expressions, micro-expressions, eye gaze—everything is preserved other than the identity. So the therapist can still analyze the behavior, analyze the expressions, etc. That’s another good use. And maybe the last good use is a substitution technique. We talked about the legal domain a little bit, but there are other angles. As you probably know, according to GDPR and new data collection rules, in Europe you cannot just run an autonomous car down the street and collect everything on the street—you cannot do that because of GDPR. So some startups and companies are taking the street-view video collected by the autonomous car, deep faking all the faces on the device without sending it anywhere, and validating this process. They are not sending any biometric information, any face information; they are sending deep fakes to be processed in the cloud. That’s another good way of using deep fakes.
Charlie: Yeah. GDPR is a whole other discussion and I think certainly it’s on its way to the United States as well. California has already adopted some underpinnings of it, I think. All right well, wow. This has been fascinating and just while you were talking, I was just thinking about you had mentioned earlier about the movies. I’m just thinking how interesting it would be to watch a foreign movie without subtitles, you know how great that would be.
Ilke: Yeah. I think there’s a very good effort I also want to mention. There’s a documentary that shares the stories of oppressed LGBTQ communities in a country where they are tortured. The government is being violent toward that minority population, and the documentary wants to share their experiences without sharing their identities, because if their identities were shared, they would face the same treatment. So they used deep fakes for the whole documentary. You can see the expressions, you can see how they are really affected by the situation, but their identities are not revealed. That is maybe my favorite example in this domain of how deep fakes are actually helping people.
Charlie: That’s a great example and everything you talked about, people can google this information and find more examples, I think.
Ilke: Absolutely, yeah.
Charlie: So I would be completely remiss if I didn’t bring up one of the technologies that you were one of the creators of, and that’s called Fake Catcher.
Ilke: Yes.
Charlie: This is really fascinating, and I hope you can expand on it, but it’s a product in its simplest definition to help identify deep fakes. So I’ll leave it there for a minute. What can you tell me about Fake Catcher? I think it’s fascinating technology.
Ilke: So Fake Catcher is a research baby of mine and Dr. Umur Ciftci’s from Binghamton University. When deep fakes were first coming out, I was looking at understanding people—behaviors like gaze and gestures, everything in virtual reality—and how we can build generation approaches on those, how we can apply deep learning approaches to those. My background is more in finding priors and digital representations for any kind of data, so when deep fakes were coming, I thought there must be some priors in humans that we can depend on, some authenticity signal in humans and human videos. At that point I knew the famous MIT paper on photoplethysmography (PPG) from videos, so I said we need to look at PPG signals in those videos. My colleague Dr. Ciftci is an expert on PPG signals, so I brought him on board, and we did very extensive studies on how photoplethysmography behaves in real and fake videos. I keep saying PPG, or photoplethysmography, but what it is is this: when your heart pumps blood, the blood goes to your veins, and the veins change color based on the oxygen content. That color change is of course not visible to our naked eye—I cannot just look at you and see your heart rate—but the change is visible computationally, and that signal is called photoplethysmography. We measure the heart rate and blood flow from the skin on your face, and using that, we find whether a video is real or fake. Of course we are not using just that. We collect that PPG signal from many places on the face, we look at the temporal, spatial, and spectral correlations, and then we feed it to a deep neural network to classify videos as real or fake. So that’s basically Fake Catcher.
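[Editor’s note: A rough, simplified sketch of the PPG idea described here, not Intel’s implementation: average skin color over a few facial regions in each frame to get per-region signals, summarize how consistent they are across regions and over time, and hand those features to a classifier. The region coordinates, the feature choices, and the dummy video below are illustrative assumptions.]

```python
import numpy as np

def region_signals(frames, regions):
    """frames: (T, H, W, 3) video array; regions: list of (x, y, w, h) face patches.
    Returns an array of shape (num_regions, T): one green-channel signal per region."""
    signals = []
    for (x, y, w, h) in regions:
        patch = frames[:, y:y + h, x:x + w, 1]       # green channel tracks blood flow best
        signals.append(patch.reshape(frames.shape[0], -1).mean(axis=1))
    return np.asarray(signals)

def ppg_features(signals):
    """Summarize spatial (cross-region) and spectral consistency of the signals."""
    centered = signals - signals.mean(axis=1, keepdims=True)
    corr = np.corrcoef(centered)                     # cross-region correlation matrix
    spectra = np.abs(np.fft.rfft(centered, axis=1))  # per-region frequency content
    dominant = spectra.argmax(axis=1)                # dominant frequency bin per region
    return np.concatenate([corr[np.triu_indices_from(corr, k=1)],
                           dominant.astype(float),
                           centered.std(axis=1)])

# A real system would feed these features (or richer spatio-temporal PPG maps) to a
# trained neural network; here we just build the feature vector from a dummy clip.
rng = np.random.default_rng(0)
dummy_video = rng.integers(0, 256, size=(90, 128, 128, 3)).astype(float)  # ~3 s at 30 fps
regions = [(20, 40, 30, 30), (78, 40, 30, 30), (49, 80, 30, 30)]          # cheeks and chin
features = ppg_features(region_signals(dummy_video, regions))
```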
Charlie: That’s brilliant. It’s absolutely brilliant. And would you say using this technology gives me 100% accuracy, or is it in the high 90s? I mean, what’s the success rate of this technology?
Ilke: We evaluated it on many state-of-the-art data sets. One of them is Face Forensics; on Face Forensics, we have 96% accuracy. Then we looked at other data sets like Celeb-DF, FaceForensics++, and Deeper Forensics—they are all in the paper, if you search for Fake Catcher—and they are in the high 90s. But I want to emphasize another data set: the Deep Fakes in the Wild data set, which is everything we’ve been talking about—basically all those videos on YouTube, Twitter, research presentations, all the videos whose generator we don’t know. We don’t know what artifacts are there. We don’t know the compression. We don’t know what processing was applied. In all those unknown settings, we can still get 91.07% accuracy with Fake Catcher. If you try to do blind detection, or just train neural networks on that data, it is almost back to having no idea—like 54% accuracy, which is very bad.
Charlie: Let’s start wrapping this up. What’s your final view on this? If I’m watching a news program, or if I’m running a news media organization, how do I trust that the videos I’m getting—the videos I’m broadcasting to my viewers—are accurate? What do you see going forward so that viewers can trust what they’re seeing? Would they employ Fake Catcher and then say yes, it’s gone through Fake Catcher and we can verify the content is authentic? What’s your view on that?
Ilke: So basically, that’s our vision. My team at Intel took Fake Catcher and created a real-time deep fake detection platform where we can process up to 72 concurrent detection streams on Intel platforms, very nicely optimized using Intel hardware and software accelerations for AI. We want news organizations, broadcasters, social media companies—everyone—to adopt Fake Catcher or similar real-time deep fake detection algorithms, so that whenever you are watching a video, there is that little check mark or a little confidence metric saying, okay, with 80% confidence we believe this is fake, or with 90% confidence we believe this is real. And we don’t want to stop there. We have approaches that look at source detection for deep fakes—which generative approach created this deep fake. We can also provide that, and it will help the user, the viewer, make an even more informed decision about whether it is fake or not, knowing the source. We also want to build provenance approaches with deep learning. For example, it may be a deep fake that was done for good—we have talked about that, right? So how do we enable those deep fakes for good? By having creators embed information inside the deep fake saying, okay, this was done by me, with this model, for this purpose, with this consent, from this source image, etc. If we can embed all that information into the deep fake and decode it at view time, at consumption time, then we have that watermark, that authentication saying this is a deep fake, it was done for good, and we can trust it. Even if it is fake, it was done with these sources and with known provenance information. So even though deep fake detection is what’s needed in the short term, in the long term hopefully the provenance approaches will be able to give us all the information about where the media came from.
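[Editor’s note: A loose sketch of the provenance idea, in the spirit of content-credential schemes such as C2PA rather than Intel’s own system: attach a signed manifest describing who made a piece of synthetic media, with what model, purpose, and consent, so a viewer-side check can verify it later. Real deployments would use public-key signatures and bind or embed the manifest in the media itself; the shared HMAC key below is a simplifying assumption.]

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-held-by-the-creator"   # placeholder secret for this sketch only

def make_manifest(media_bytes, creator, model, purpose, consent):
    """Build a provenance manifest for a piece of synthetic media and sign it."""
    manifest = {
        "media_sha256": hashlib.sha256(media_bytes).hexdigest(),
        "creator": creator,
        "model": model,
        "purpose": purpose,
        "consent_obtained": consent,
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return manifest

def verify_manifest(media_bytes, manifest):
    """Check the signature and that the manifest matches this exact media file."""
    claimed = dict(manifest)
    signature = claimed.pop("signature")
    payload = json.dumps(claimed, sort_keys=True).encode()
    untampered = hmac.compare_digest(
        signature, hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest())
    return untampered and claimed["media_sha256"] == hashlib.sha256(media_bytes).hexdigest()

# Usage: a consented "deep fake for good" ships with its manifest; playback verifies it.
video = b"...synthetic video bytes..."
manifest = make_manifest(video, creator="Studio X", model="face-reenactment-v2",
                         purpose="film dubbing", consent=True)
assert verify_manifest(video, manifest)
```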
Charlie: Right and raise the trust in media overall for everybody.
Ilke: Absolutely. That’s the name of our general research group, which is Trusted Media.
Charlie: So I hit it right on the head. There you go. Dr. Demir, I can’t thank you enough, because this is such an important topic, and I think very timely. As I mentioned, I attended one of your seminars last year, and since that day I have been questioning and really thinking about what I’m watching—is it authentic, things like that. I think you’re really onto something, and I see you’re on a path here that’s just going to keep exploding exponentially. I don’t see how it possibly couldn’t. So thank you so, so much for your time today. Truly, I genuinely appreciate it. It was a real pleasure.
Ilke: Likewise, thank you.
Charlie: All right, everybody. I hope you enjoyed today’s podcast. You can look up Dr. Demir’s name online; you’ll find lots of things about her and her Fake Catcher technology. I think you’ll be very interested, and as fascinated as I still am to this day. So thank you again. Please check out other things on TechChannel—there are lots of good podcasts and lots of good content there that’s worth your while. And until next time, everybody, see you soon. Bye now.