YouCook2 Dataset

Download

Please read the README file before downloading.
Annotations (single .json file): Train+Val, Test (segments only); see the loading sketch below this list.
Recipe ID/type: ID-type pairs
Splits: Train+Val+Test
Scripts: Download Scripts
Frame-wise ResNet features (.csv, 23.9GB): Train+Val+Test
Frame-wise ResNet features (.dat, 11.2GB): Train+Val+Test
All-in-one (35.1GB): YouCook2
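
Once the Train+Val annotation file is downloaded, the sketch below shows one way to read it in Python. It assumes the layout described in the README: a top-level "database" dictionary keyed by YouTube video ID, with per-video fields such as "subset", "recipe_type", and an "annotations" list of segments. The filename and all field names here are assumptions, so check them against the README and the downloaded file.
  import json

  # Load the Train+Val annotation file; the filename is an assumption, so use
  # the name of the file obtained from the link above.
  with open("youcookii_annotations_trainval.json", "r") as f:
      data = json.load(f)

  # Assumed layout: data["database"] maps YouTube video IDs to per-video records.
  for vid_id, video in data["database"].items():
      subset = video.get("subset")        # e.g. "training" or "validation" (assumed values)
      recipe = video.get("recipe_type")   # recipe ID; names are in the ID-type pairs file
      for seg in video.get("annotations", []):
          start, end = seg["segment"]     # segment start/end, assumed to be in seconds
          print(vid_id, subset, recipe, start, end, seg.get("sentence"))
      break  # remove this line to iterate over the full database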

[06/22/2024] Due to requests and the inaccessibility of some online videos, we are sharing the raw video files. By downloading these files, you agree to use them for non-commercial, research purposes only.
Raw Videos: All Videos (144 GB); PartA PartB PartC PartD PartE PartF PartG PartH PartI PartJ

Download (YouCook2-BB)

[05/07/2019] Reformatted the annotation JSON files and augmented the initial release with 3.5% more annotations. See the bottom of the code repo page for more details.
Please read the README file inside the annotation pack.
Annotations (1.3MB): YouCook2-BB
For region proposals and features, see the code repo.

Download (others)

We provide both deep appearance (RGB) and motion (optical flow) features extracted with TSN at our DenseCap repo. See our CVPR 2018 paper for more details.

License

[06/22/2024] Due to requests and the inaccessibility of some online videos, we have begun distributing the raw video files for non-commercial, research purposes only.
[09/09/2019] We only provide the annotations and do not distribute the videos. The licenses for our annotations are as follows: YouCook2, YouCook2-BB

Citation

We introduced the YouCook2 dataset in our AAAI 2018 paper on video procedure segmentation. Please cite the following if you find the dataset useful:
  @inproceedings{ZhXuCoAAAI18,
    author    = {Zhou, Luowei and Xu, Chenliang and Corso, Jason J},
    title     = {Towards Automatic Learning of Procedures From Web Instructional Videos},
    booktitle = {AAAI Conference on Artificial Intelligence},
    pages     = {7590--7598},
    year      = {2018},
    url       = {https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17344}
  }

The bounding box annotations come from our BMVC 2018 paper on weakly-supervised video object grounding. Please cite the following if you find the annotations useful:
  @inproceedings{ZhLoCoBMVC18,
    author    = {Zhou, Luowei and Louis, Nathan and Corso, Jason J},
    title     = {Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction},
    booktitle = {British Machine Vision Conference},
    year      = {2018},
    url       = {http://bmvc2018.org/contents/papers/0070.pdf}
  }