2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, p.6422-6431
2017

Details

Author(s) / Contributors
Title
What's in a Question: Using Visual Questions as a Form of Supervision
Is part of
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, p.6422-6431
Place / Publisher
IEEE
Year of publication
2017
Source
Alma/SFX Local Collection
Descriptions / Notes
  • Collecting fully annotated image datasets is challenging and expensive. Many types of weak supervision have been explored: weak manual annotations, web search results, temporal continuity, ambient sound and others. We focus on one particular unexplored mode: visual questions that are asked about images. The key observation that inspires our work is that the question itself provides useful information about the image (even without the answer being available). For instance, the question "what is the breed of the dog?" informs the AI that the animal in the scene is a dog and that there is only one dog present. We make three contributions: (1) providing an extensive qualitative and quantitative analysis of the information contained in human visual questions, (2) proposing two simple but surprisingly effective modifications to the standard visual question answering models that allow them to make use of weak supervision in the form of unanswered questions associated with images, and (3) demonstrating that a simple data augmentation strategy inspired by our insights results in a 7.1% improvement on the standard VQA benchmark.
Language
English
Identifiers
ISSN: 1063-6919
DOI: 10.1109/CVPR.2017.680
Title ID: cdi_ieee_primary_8100163
