Benchmark for anonymous video analytics

EURASIP Journal on Image and Video Processing

Table 3 The dataset

Video name	Illuminance [lux]	Length [min:s]	Length [frames]	Unique people* (all)	Unique people* (OTS)	People annotations	People per frame	Face annotations	Faces per frame
Airport-1	500	5:21	9629	37	29	22062	2.4 ± 1.0	12832	1.4 ± 1.1
Airport-2	500	5:34	10008	35	29	23600	2.7 ± 1.4	14214	1.6 ± 1.2
Airport-3	500	6:26	11578	47	44	26704	2.4 ± 1.2	17849	1.6 ± 1.0
Airport-4	500	5:08	9247	61	56	43685	4.7 ± 2.0	17792	1.9 ± 1.2
Mall-1	300	4:38	8344	158	111	106852	12.8 ± 2.3	45835	5.5 ± 1.8
Mall-2	300	3:41	6626	145	105	95417	14.4 ± 3.7	42779	6.5 ± 2.3
Mall-3	800	5:25	9740	33	30	37120	3.8 ± 1.5	18906	1.9 ± 1.2
Mall-4	800	6:04	10931	53	50	47113	4.3 ± 1.6	32038	2.9 ± 1.3
Pedestrian-1	60000	5:40	10202	18	17	39680	4.0 ± 1.7	19859	2.0 ± 1.4
Pedestrian-2	40000	6:15	11262	56	40	58477	5.2 ± 1.7	25042	2.2 ± 1.6
Pedestrian-3	7000	5:41	10220	27	25	22738	2.3 ± 1.2	13915	1.4 ± 1.0
Pedestrian-4	5500	4:32	8166	27	25	33031	4.0 ± 1.4	16248	2.0 ± 1.0
Pedestrian-5	250	2:58	5350	11	11	24476	4.6 ± 1.6	13504	2.5 ± 1.8
Subway-1	180	3:13	5795	17	17	36828	6.5 ± 3.1	25884	4.6 ± 2.7
Subway-2	180	2:32	4549	29	28	45125	9.9 ± 2.8	24248	5.3 ± 2.4
Subway-3	200	5:45	10342	31	29	85358	8.5 ± 2.9	35460	3.6 ± 1.7
Overall	[180,60000]	78:53	141989	785	646	748266	5.4 ± 3.9	376405	2.7 ± 2.1

*A person who re-enters the field of view after being out of it for longer than 10 seconds is counted as a new (unique) person.
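
The re-entry rule in the footnote can be illustrated with a minimal sketch. It assumes that each person's visibility is available as (person_id, timestamp in seconds) pairs, and that a gap of more than 10 seconds between consecutive sightings of the same person approximates an absence from the field of view. The function name `count_unique_people` and the input format are hypothetical and are not taken from the benchmark's tooling.

```python
from collections import defaultdict

REENTRY_GAP_S = 10.0  # threshold stated in the table footnote


def count_unique_people(sightings, gap_s=REENTRY_GAP_S):
    """Count unique people from (person_id, timestamp_s) sightings.

    Assumption: a gap longer than `gap_s` seconds between two consecutive
    sightings of the same person means the person left the field of view
    and re-entered, so they are counted again as a new unique person.
    """
    # Group the sighting times by person.
    times_by_person = defaultdict(list)
    for person_id, timestamp_s in sightings:
        times_by_person[person_id].append(timestamp_s)

    unique = 0
    for times in times_by_person.values():
        times.sort()
        unique += 1  # the first appearance always counts
        # Each sufficiently long absence adds one more unique person.
        for previous, current in zip(times, times[1:]):
            if current - previous > gap_s:
                unique += 1
    return unique


# Hypothetical example: "a" disappears for 11.5 s and is counted twice,
# "b" is only absent for 6 s and is counted once.
example = [("a", 0.0), ("a", 1.0), ("a", 12.5), ("b", 3.0), ("b", 9.0)]
total = count_unique_people(example)
```

With this rule, the "Unique people" columns can exceed the number of distinct individuals who appear in a video, because the same individual may be counted more than once.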