27rg5/Explicit-Video-Segment-Classifier-and-Summarizer

Explicit Video Segment Classifier and Summarizer

To install whisper, strictly use 'pip install git+https://github.com/openai/whisper.git'

python version = 3.10.0

The objective of this project is to take a video, extract the shorter segments that contain explicit content, and produce a natural-language summary of those segments. We used a subset of two public datasets. For faster training, we pre-encoded the audio and language data as audio and spectrogram encodings. The data is organized as follows:

- cls_data
  - processed_data
    - encoded_videos
      - explicit
        - Video directory
          - audio_encs (Has a file containing audio modality input)
          - spectro_encs (Has a file containing video modality input)

      - non_explicit
        (Same structure as above)

    - non_encoded_videos
      (mp4 files)

  - stitched_videos
    (mp4 files obtained from concatenating smaller video clips. Used during demo)
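
The layout above can be traversed with a few lines of Python. The following is a minimal sketch, not the project's actual data loader: the function name `list_encoded_videos` is hypothetical, and it assumes that the root directory passed in is the `cls_data` folder and that a usable video directory contains both `audio_encs` and `spectro_encs`.

```python
from pathlib import Path

# Label folders as named in the hierarchy above.
LABELS = ("explicit", "non_explicit")


def list_encoded_videos(root_dir):
    """Return (label, video_dir) pairs found under processed_data/encoded_videos.

    Hypothetical helper: the real loader in this repository may differ.
    """
    encoded_root = Path(root_dir).expanduser() / "processed_data" / "encoded_videos"
    samples = []
    for label in LABELS:
        label_dir = encoded_root / label
        if not label_dir.is_dir():
            continue  # tolerate a missing class folder
        for video_dir in sorted(p for p in label_dir.iterdir() if p.is_dir()):
            # Assumption: keep only directories that hold both encoding folders.
            has_audio = (video_dir / "audio_encs").is_dir()
            has_spectro = (video_dir / "spectro_encs").is_dir()
            if has_audio and has_spectro:
                samples.append((label, video_dir))
    return samples
```

Calling `list_encoded_videos("~/cls_data")` would then give one entry per encoded video, which is a quick way to check that the dataset was unpacked into the expected structure.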

To install, create a conda environment with Python version 3.10.0 exactly. Install whisper strictly with pip install git+https://github.com/openai/whisper.git and then run pip install -r requirements.txt

Below are the diagrams of our pipeline:

[Figure: Pipeline]

[Figure: Attention Mechanism]

To train the model run

python -W ignore main.py --n_epochs 20 --learning_rate 1e-3 --optimizer_name SGD  --root_dir ~/cls_data --language_model_name distilbert-base-uncased --video_model_name slowfast_r50 --experiment_name sample_trimodal_test_run --batch_size 1 --print_every 10 --spectrogram_model_name resnet18

root_dir - The root directory of all data

experiment_name - The name under which the model checkpoint, tensorboard logs, and the file containing the validation video names will be saved (the names are later read from this file and the corresponding videos are loaded from the encoded data)

optimizer_name - The optimizer of choice, e.g. SGD, Adam

language_model_name - The pretrained language model of choice which will be trained for language modality

video_model_name - The pretrained video classifier of choice which will be trained for video modality

spectrogram_model_name - The pretrained audio classifier of choice (CNN) which will be trained for audio modality

print_every - Number of iterations or batches after which the running loss will be printed
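
As an illustration of how the flags above fit together, here is a minimal sketch of an argument parser that accepts the same options as the training command. It is not taken from main.py: the function name `build_train_parser` is hypothetical, and the defaults are assumptions copied from the example command rather than the project's real defaults.

```python
import argparse


def build_train_parser():
    """Hypothetical parser mirroring the training flags documented above."""
    parser = argparse.ArgumentParser(description="Sketch of the training CLI")
    parser.add_argument("--root_dir", type=str, required=True,
                        help="Root directory of all data")
    parser.add_argument("--experiment_name", type=str, required=True,
                        help="Name used for checkpoints and logs")
    # Defaults below are assumptions taken from the example command.
    parser.add_argument("--n_epochs", type=int, default=20)
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--optimizer_name", type=str, default="SGD")
    parser.add_argument("--batch_size", type=int, default=1)
    parser.add_argument("--print_every", type=int, default=10,
                        help="Batches between running-loss printouts")
    parser.add_argument("--language_model_name", type=str,
                        default="distilbert-base-uncased")
    parser.add_argument("--video_model_name", type=str, default="slowfast_r50")
    parser.add_argument("--spectrogram_model_name", type=str, default="resnet18")
    return parser


# Parse the same values as the example command, passed as a list for illustration.
example_args = build_train_parser().parse_args([
    "--root_dir", "~/cls_data",
    "--experiment_name", "sample_trimodal_test_run",
    "--n_epochs", "20",
    "--learning_rate", "1e-3",
    "--optimizer_name", "SGD",
])
```

Passing the arguments as a list keeps the sketch runnable without a shell; in a real script `parse_args()` would read them from the command line.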

To run the evaluation for the explicit vs non_explicit classifier, please refer to run_evals.sh, which covers every experiment done so far. An example evaluation command for our best model:

python -W ignore eval.py --root_dir_path ~/cls_data_1_min --experiment_name attention_fusion_default_networks_self_attention_21epochs_caption_modality --get_classified_list --eval_dataset_type val --mlp_fusion --mlp_object_path '<directory containing the sklearn mlp object>/best_model_bertopic_test_f1_0.782608695652174.pkl'

root_dir_path - The root directory of all data

experiment_name - Name of the experiment run, a directory exists of the same name with training logs and model weights

get_classified_list - Optional parameter to get a csv file having video name, predicted label and actual ground truth label

eval_dataset_type - Either 'train' or 'val'; indicates the dataset for which metrics should be computed

mlp_fusion - If set, CaptionNet embeddings are included for late fusion along with the other modalities

mlp_object_path - path to the trained sklearn/pytorch mlp object
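
The csv file produced by get_classified_list can be post-processed to recompute metrics offline. The sketch below computes a macro-averaged F1 score from such a file using only the standard library. The column names `video_name`, `predicted_label` and `actual_label` are assumptions, since the exact csv header is not documented here, and whether eval.py reports exactly this metric is also an assumption.

```python
import csv
import io


def macro_f1(rows, pred_key="predicted_label", true_key="actual_label"):
    """Macro-averaged F1 over all labels seen in the rows (hypothetical helper)."""
    labels = sorted({r[true_key] for r in rows} | {r[pred_key] for r in rows})
    scores = []
    for label in labels:
        tp = sum(1 for r in rows if r[pred_key] == label and r[true_key] == label)
        fp = sum(1 for r in rows if r[pred_key] == label and r[true_key] != label)
        fn = sum(1 for r in rows if r[pred_key] != label and r[true_key] == label)
        denom = 2 * tp + fp + fn
        # Per-class F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Tiny in-memory csv standing in for the real output file (column names assumed).
sample_csv = io.StringIO(
    "video_name,predicted_label,actual_label\n"
    "v1,explicit,explicit\n"
    "v2,explicit,non_explicit\n"
    "v3,non_explicit,non_explicit\n"
    "v4,non_explicit,non_explicit\n"
)
rows = list(csv.DictReader(sample_csv))
score = macro_f1(rows)
```

To use it on a real file, replace the `io.StringIO` object with `open(path, newline="")` and adjust the key arguments to match the actual header.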

To run the demo for the entire pipeline

python demo.py --stitched_videos_path ~/cls_data/stitched_videos/ --experiment_name sgd_lr_1e-3_macro_f1_with_seed_42_feats_200_200

stitched_videos_path - Directory where stitched videos are saved

Below is an example output from our pipeline:

[Figure: Results]
