BBCnews
| Domain | News Media |
| Media | Image |
| Size | 255 MB |
| Instances | |
| File Format | XHTML, XML |
| Creation Date | |
| Task | retrieval |
| Copyright | |
| URL | http://mlg.ucd.ie/datasets/bbc.html |
Domain
- News media
Comments
- Cross-media dataset combining images and text
- BBC News HTML pages grouped into 11 categories and split into two sets
Media (image, video, mixed, …)
- Images
Size (no images, in GB, …)
- ~255 MB compressed
Source (FlickR, Corel)
- Joao Magalhaes (crawled from the internet)
Annotation type (free text, structured, …)
Ground truth
Event or project
Task (retrieval, recognition, …)
- Retrieval
Format
- XHTML pages, XML files containing metadata and images extracted from the XHTML pages
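As a quick sanity check after unpacking the archive, the mix of file types described above can be tallied with a short script. This is only a sketch: the function name `count_by_extension` is hypothetical, and the actual directory layout and file extensions of the dataset are not documented here, so the code simply counts whatever extensions it finds.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count files under `root`, grouped by lowercased file extension."""
    counts = Counter()
    # rglob("*") walks the directory tree recursively.
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts
```

Calling `count_by_extension` on the folder where the archive was unpacked should show how many XHTML pages, XML metadata files and image files it contains.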
Quality (resolution)
Creation date
Copyright
- Joao Magalhaes