"Deepfake Detectors Can Be Defeated, Researchers Show for the First Time"
Deepfakes are synthetic media, including images, audio, and video, that are created or manipulated using Artificial Intelligence (AI). They have received significant attention because they can be used to spread misleading or false information. Beyond spreading disinformation, deepfakes can also undermine security systems such as those that rely on facial recognition for authentication. Although detection techniques have improved, such synthetic media remain a problem.

A team of computer scientists recently demonstrated that systems designed to detect deepfake videos can be deceived. They defeated the detectors with adversarial examples, which are inputs crafted to cause AI systems such as Machine Learning (ML) models to make mistakes. The researchers inserted an adversarial perturbation into every frame of a video and showed that the attacks still succeed after the video has been compressed. The demonstration shows that adversaries can create robust adversarial deepfakes even without knowing the inner workings of the ML model used by the detector. The attacks were tested under two scenarios: a white-box setting, in which the attacker has complete access to the detector model, and a black-box setting, in which the attacker can only query the model for the probability that a frame is classified as real or fake. The white-box attack achieved a success rate of over 99 percent on uncompressed videos and 84.96 percent on compressed videos, while the black-box attack achieved 86.43 percent on uncompressed videos and 78.33 percent on compressed videos.

The researchers recommend adversarial training to make deepfake detectors more robust. This article continues to discuss the spread of deepfake videos, the attacks demonstrated by the researchers that can defeat deepfake detectors, and how these detectors can be improved through adversarial training.
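To make the two attack settings concrete, the sketch below illustrates them with standard adversarial-example techniques: a single FGSM step for the white-box case and an NES-style, query-only gradient estimate for the black-box case. This is a minimal illustration, not the researchers' actual method; the PyTorch detector interface, the `real_prob_fn` score function, the class indices, and the perturbation budget `epsilon` are all assumptions introduced here for demonstration.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb_frame(detector, frame, epsilon=4/255, real_label=0):
    """White-box sketch: one FGSM step against a frame-level detector.

    `detector` is a hypothetical torch.nn.Module mapping a (1, C, H, W)
    frame with values in [0, 1] to logits over {real, fake};
    `real_label` is the assumed index of the "real" class.
    """
    frame = frame.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(detector(frame), torch.tensor([real_label]))
    loss.backward()
    # Step against the loss gradient so the frame scores as "real",
    # keeping the per-pixel change within an L-infinity budget of epsilon.
    adv = frame - epsilon * frame.grad.sign()
    return adv.clamp(0.0, 1.0).detach()

def nes_gradient_estimate(real_prob_fn, frame, sigma=1e-3, n_samples=50):
    """Black-box sketch: estimate the gradient of the detector's "real"
    probability using only score queries (NES-style finite differences).

    `real_prob_fn` is a hypothetical function that queries the detector
    and returns the probability that a frame is real.
    """
    grad = torch.zeros_like(frame)
    for _ in range(n_samples):
        u = torch.randn_like(frame)
        grad += (real_prob_fn(frame + sigma * u) - real_prob_fn(frame - sigma * u)) * u
    return grad / (2 * sigma * n_samples)

def attack_video(detector, frames, epsilon=4/255):
    """Apply the white-box per-frame perturbation to every frame of a video."""
    return [fgsm_perturb_frame(detector, f, epsilon) for f in frames]
```

In both settings the perturbation is applied to every frame, mirroring the per-frame approach described above; surviving video compression would additionally require making the perturbation robust to the compression transform, which this sketch does not attempt.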