TVR Dataset

A Large-Scale Dataset for Video-Subtitle Moment Retrieval

What is TVR?

TV show Retrieval (TVR) is a new multimodal retrieval task in which a short video moment must be localized within a large video (with subtitles) corpus, given a natural language query. The associated TVR dataset is a large-scale, high-quality dataset consisting of 108,965 queries on 21,793 videos from 6 TV shows of diverse genres, where each query is associated with a tight temporal alignment. Read our paper

Download

TVR text data files, including train/val/test-public set annotations and subtitles:

We use the same set of videos as the TVQA dataset; click the button below to download 3 FPS video frames. Note that you will be redirected to the TVQA website.

We provide a codebase to get you started, which includes basic data preprocessing and analysis tools, feature extraction tools, as well as our XML baseline model code. You can also find the associated video features in the repo.

Evaluation

The ground-truth video names and timestamp annotations are not released for the test-public set; you need to submit your model predictions to our evaluation server. Follow the instructions below:

Submission Instructions

Fill out the Google Form below if you would like your results shown on our leaderboard:

Acknowledgement

This research is supported by NSF, DARPA, Google, and ARO.

Questions?

Ask us questions: tvr-tvc-unc@googlegroups.com or jielei [at] cs.unc.edu.

TVR Leaderboard

TVR tests a system's ability to localize a moment within a large video (with subtitles) corpus. Performance is measured by R@K (Recall@K, K = 1, 10, 100) with temporal IoU = 0.7: a retrieved moment counts as correct only if it comes from the ground-truth video and its temporal IoU with the ground-truth moment is at least 0.7.
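The metric can be sketched as follows. This is an illustrative Python sketch, not the official evaluation code; the function names and the (video_id, start, end) data layout are our own assumptions.

```python
def temporal_iou(pred, gt):
    """IoU between two (start, end) intervals, in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

def recall_at_k(ranked_preds, gt, k, iou_thd=0.7):
    """1 if any top-k predicted (video_id, start, end) moment hits the
    ground-truth video with temporal IoU >= iou_thd, else 0.
    `ranked_preds` is sorted by model confidence; `gt` is one
    (video_id, start, end) triple. Illustrative layout, not the
    official submission format."""
    gt_video, gt_start, gt_end = gt
    for video_id, start, end in ranked_preds[:k]:
        if video_id == gt_video and \
                temporal_iou((start, end), (gt_start, gt_end)) >= iou_thd:
            return 1
    return 0
```

Averaging `recall_at_k` over all queries gives the R@K numbers reported on the leaderboard.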

Rank | Model      | Team                                                                      | Date         | Links                 | R@1  | R@10  | R@100
-----|------------|---------------------------------------------------------------------------|--------------|-----------------------|------|-------|------
1    | XML        | UNC Chapel Hill                                                           | Jan 20, 2020 | Paper, Code           | 3.32 | 13.41 | 30.52
2    | MEE + CAL  | ENS & INRIA & CIIRC + KAUST & Adobe Research & INRIA (implemented by UNC) | Jan 20, 2020 | MEE Paper, CAL Paper  | 0.66 | 3.09  | 12.03
3    | MEE + ExCL | ENS & INRIA & CIIRC + CMU (implemented by UNC)                            | Jan 20, 2020 | MEE Paper, ExCL Paper | 0.40 | 1.73  | 2.87
4    | Chance     | UNC Chapel Hill                                                           | Jan 20, 2020 | -                     | 0.00 | 0.00  | 0.07