HOLLYWOOD2
Human Actions and Scenes Dataset

We provide a dataset with 12 classes of human actions and 10 classes
of scenes distributed over 3669 video clips and approximately 20.1 hours
of video in total. The dataset intends to provide a comprehensive
benchmark for human action recognition in realistic and challenging
settings. The dataset is composed of video clips from 69 movies
(see the list of movies below). A part of this dataset was originally
used in the paper "Actions in Context", Marszalek et al. in Proc. CVPR'09.

Action samples were collected by means of automatic script-to-video
alignment in combination with text-based script classification following
Laptev at al. CVPR'08. Video samples generated from training movies
correspond to the automatic training subset with noisy action labels.
Based on this subset we also constructed a clean training subset with
action labels manually verified to be correct. We also provide a test
subset with manually checked action labels.

Scene classes are selected automatically from scripts such as to maximize
co-occurrence with the given action classes and to capture action context
as described in Marsza?ek et al. CVPR'09. Scene video samples are then
generated using script-to-video alignment. The labels of test scene
samples are manually verified to be correct.

The following tables provide the numbers of video samples in each of the
subsets as well as the distributions of class instances in each subset.
Note that samples may contain instances of several actions such e.g.
kissing and hugging.

Download details
================

The dataset is split in two parts with actions and scene samples respectively:

http://www.irisa.fr/vista/actions/hollywood2/Hollywood2-actions.tar.gz
(approx. size: 15Gb | md5sum: 55948d0ef45a569a2134ea44e6f8976c)

http://www.irisa.fr/vista/actions/hollywood2/Hollywood2-scenes.tar.gz
(approx. size: 25Gb | md5sum: b77f9ffe18ad5ea04957bb4c7725f5ce)

Action video samples are provided in directory AVIClips for three subsets
according to the table above. The annotation of samples w.r.t. 12 action classes
is located in ClipSets directory. Similarly, the video samples and annotations
for scene samples are located in AVSClipsScenes and ClipSetsScenes directories
respectively.

Example:
The file ClipSets/AnswerPhone_autotrain.txt contains annotation for AnswerPhone
action in the automatic training subset with 810 video clips. Each line of the
annotation file provides a name of a video sample in AVIClips directory as well
as the flag = {1|-1} indicating whether the sample contains AnswerPhone or not.
(Our annotation format is similar to PASCAL VOC annotation format for image
classification task).

We also provide conditional probability tables

http://www.irisa.fr/vista/actions/hollywood2/hollywood2_condprob_scenesgivenact_scripttrain.txt
http://www.irisa.fr/vista/actions/hollywood2/hollywood2_condprob_actgivenscenes_scripttrain.txt

for p(scene|action) and p(action|scene) estimated from an independent set of
movie scripts and used in "Actions in Context" Marszalek et al. CVPR'09 paper:

Source movies
=============

The 69 movies used to generate clips in this dataset were divided into 33 training
movies and 36 test movies as follows.

Training movies:
American Beauty, As Good as It Gets, Being John Malkovich, The Big Lebowski, Bruce
Almighty The Butterfly Effect, Capote, Casablanca, Charade, Chasing Amy, The Cider
House Rules, Clerks, Crash, Double Indemnity, Forrest Gump, The Godfather, The Graduate,
The Hudsucker Proxy, Jackie Brown, Jay and Silent Bob Strike Back, Kids, Legally Blonde,
Light Sleeper, Little Miss Sunshine, Living in Oblivion, Lone Star, Men in Black,
The Naked City, Pirates of the Caribbean: Dead Man’s Chest, Psycho, Quills, Rear
Window, Fight Club.

Test movies:
Big Fish, Bringing Out The Dead, The Crying Game, Dead Poets Society, Erin Brockovich,
Fantastic Four, Fargo, Fear and Loathing in Las Vegas, Five Easy Pieces, Gandhi,
Gang Related, Get Shorty, The Grapes of Wrath, The Hustler, I Am Sam, Independence
Day, Indiana Jones and The Last Crusade, It Happened One Night, It’s aWonderful Life,
LA Confidential, The Lord of the Rings: The Fellowship of the Ring, Lost Highway,
The Lost Weekend, Midnight Run, Misery, Mission to Mars, Moonstruck, Mumford, The
Night of the Hunter, Ninotchka, O Brother Where Art Thou, The Pianist, The Princess
Bride, Pulp Fiction, Raising Arizona, Reservoir Dogs.

Citation
========

Please cite the following paper if using this dataset in your publications:

@InProceedings{marszalek09,
author = "Marcin Marsza{\l}ek and Ivan Laptev and Cordelia Schmid",
title = "Actions in Context",
booktitle = "IEEE Conference on Computer Vision \& Pattern Recognition",
year = "2009"
}