Deepfake Video Detection with Convolutional and Recurrent Networks

Demo

Below is a video demonstrating our project. In the demo, we process and classify two videos: one real and one fake. First, we run both videos through our preprocessing code, which samples 32 frames from each video (about 3 frames per second), detects the person's face in each frame, crops the frame to just the face (with a small margin around it), and resizes each crop to 32×32 pixels. The data for each video, a tuple containing the 32 processed frames and the video's label (real or fake), is saved to a pickle file. The pickle file is then uploaded to Google Drive and read by our Google Colab notebook, which contains the code to create, train, and evaluate our models (3D CNN, CNN-LSTM, and an ensemble of the two). Classifying the two videos with our ensemble model, we observe that both videos are classified correctly.
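To make the preprocessing step concrete, here is a minimal sketch of what it might look like, assuming OpenCV's bundled Haar cascade face detector; the actual project may use a different detector, and the function name `preprocess_video`, the file names, and the margin parameter are illustrative, not the repo's exact code.

```python
import pickle

import cv2
import numpy as np

FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess_video(video_path, label, num_frames=32, size=32, margin=0.2):
    """Sample num_frames frames, crop each to the detected face, resize to size x size."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Evenly spaced frame indices across the video (~3 fps for a ~10 s clip).
    indices = np.linspace(0, total - 1, num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            continue
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = FACE_CASCADE.detectMultiScale(gray, 1.1, 5)
        if len(faces) == 0:
            continue  # skip frames where no face is found
        x, y, w, h = faces[0]  # take the first detected face
        # Expand the box by a small margin around the face.
        mx, my = int(w * margin), int(h * margin)
        x0, y0 = max(x - mx, 0), max(y - my, 0)
        crop = frame[y0:y + h + my, x0:x + w + mx]
        frames.append(cv2.resize(crop, (size, size)))
    cap.release()
    return np.stack(frames), label

# Save the (frames, label) tuple to a pickle file for upload to Google Drive.
# "real_video.mp4" is a placeholder; here label 0 stands for real, 1 for fake.
data = preprocess_video("real_video.mp4", label=0)
with open("real_video.pkl", "wb") as f:
    pickle.dump(data, f)
```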
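On the Colab side, reading a pickle file back after mounting Drive could look like the snippet below; the Drive path is a hypothetical placeholder for wherever the files are actually stored.

```python
import pickle

from google.colab import drive

drive.mount('/content/drive')

# Hypothetical path; the actual Drive folder layout will differ.
with open('/content/drive/MyDrive/deepfake_data/real_video.pkl', 'rb') as f:
    frames, label = pickle.load(f)

print(frames.shape, label)  # e.g. (32, 32, 32, 3) and 0
```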
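The demo doesn't show the model code itself, but the two base architectures and the ensemble could be sketched in Keras roughly as follows. Layer counts and sizes here are illustrative assumptions, not the trained configuration; the `ensemble_predict` helper is hypothetical and simply averages the two models' predicted probabilities.

```python
import numpy as np
from tensorflow.keras import layers, models

def build_cnn_lstm(num_frames=32, size=32):
    # Per-frame CNN applied via TimeDistributed, then an LSTM over time.
    frame_cnn = models.Sequential([
        layers.Conv2D(16, 3, activation='relu', input_shape=(size, size, 3)),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation='relu'),
        layers.GlobalAveragePooling2D(),
    ])
    model = models.Sequential([
        layers.TimeDistributed(frame_cnn, input_shape=(num_frames, size, size, 3)),
        layers.LSTM(64),
        layers.Dense(1, activation='sigmoid'),  # 0 = real, 1 = fake
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

def build_3d_cnn(num_frames=32, size=32):
    # 3D convolutions over (time, height, width).
    model = models.Sequential([
        layers.Conv3D(16, 3, activation='relu',
                      input_shape=(num_frames, size, size, 3)),
        layers.MaxPooling3D(),
        layers.Conv3D(32, 3, activation='relu'),
        layers.GlobalAveragePooling3D(),
        layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

def ensemble_predict(model_list, clips):
    # A simple ensemble: average the models' predicted probabilities,
    # then threshold at 0.5.
    probs = np.mean([m.predict(clips) for m in model_list], axis=0)
    return (probs > 0.5).astype(int)  # 1 = fake
```

For example, given a batch of preprocessed clips of shape `(N, 32, 32, 32, 3)`, calling `ensemble_predict([cnn_lstm, cnn_3d], clips)` returns one real/fake prediction per clip, which mirrors the final classification step shown in the demo.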