YouCook2 Dataset
Leaderboard

Evaluation Metric
We use the same evaluation code as in here. We evaluate the model on both localizing and describing events. The metric first finds the proposals whose temporal IoU (tIoU) with any ground-truth (GT) segment is higher than a threshold (in our case 0.3, 0.5, 0.7, and 0.9). Then it measures the caption quality against the GT caption (e.g.,

Submission Format

Please create a submission.zip containing only a single file named result.json (the name has to be exact).

An example of the JSON format for the captioning task:

    {
      "version": "VERSION 1.0",
      "results": {
        "v_xHr8X2Wpmno": [
          {
            "sentence": "add some salt and pepper to the bowl and mix",
            "timestamp": [74.59, 109.14]
          }
        ]
      },
      "external_data": {
        "used": true,
        "details": "First fully-connected layer from VGG-16 pre-trained on ILSVRC-2012 training set"
      }
    }

An example of the JSON format for the grounding task:

    {
      "database": {
        "zBexcthy_tA": {
          "recipe_type": "201",
          "segments": {
            "0": {
              "objects": [
                {
                  "label": "chicken",
                  "boxes": [
                    {
                      "occluded": 0,
                      "outside": 0,
                      "xbr": 326,
                      "ybr": 288,
                      "xtl": 38,
                      "ytl": 19
                    }
                  ]
                }
              ]
            }
          }
        }
      }
    }
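The tIoU matching step and the captioning submission layout described above can be sketched in Python as follows. This is a minimal illustration, not the official evaluation code: the function names `tiou`, `is_matched`, and `write_submission` are hypothetical, segments are assumed to be [start, end] pairs in seconds, and the strict "higher than" comparison follows the wording above rather than any confirmed behavior of the evaluation script.

```python
import json
import zipfile

# Thresholds listed in the evaluation description above.
TIOU_THRESHOLDS = (0.3, 0.5, 0.7, 0.9)


def tiou(a, b):
    """Temporal IoU of two [start, end] segments (assumed to be in seconds)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def is_matched(proposal, gt_segments, threshold):
    """True if the proposal's tIoU with any GT segment is higher than the threshold.

    Assumption: strict '>' mirrors the text; the official script may differ
    at the exact boundary.
    """
    return any(tiou(proposal, gt) > threshold for gt in gt_segments)


def write_submission(results, zip_path="submission.zip",
                     used_external=False, details=""):
    """Write result.json (exact name) as the only member of the zip archive.

    `results` maps a video key to a list of
    {"sentence": str, "timestamp": [start, end]} entries.
    """
    payload = {
        "version": "VERSION 1.0",
        "results": results,
        "external_data": {"used": used_external, "details": details},
    }
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # json.dumps never emits trailing commas, so the file is valid JSON.
        zf.writestr("result.json", json.dumps(payload, indent=2))
    return zip_path
```

For example, calling `write_submission({"v_xHr8X2Wpmno": [{"sentence": "add some salt and pepper to the bowl and mix", "timestamp": [74.59, 109.14]}]})` would produce a submission.zip whose single member is result.json. Writing the JSON directly into the archive avoids accidentally including a parent folder or extra files.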