IEEE transactions on pattern analysis and machine intelligence, 2023-09, Vol.45 (9), p.1-16
2023

Details

Author(s) / Contributors
Title
RelTR: Relation Transformer for Scene Graph Generation
Is part of
  • IEEE transactions on pattern analysis and machine intelligence, 2023-09, Vol.45 (9), p.1-16
Place / Publisher
United States: IEEE
Year of publication
2023
Source
IEEE/IET Electronic Library (IEL)
Descriptions/Notes
  • Different objects in the same scene are more or less related to each other, but only a limited number of these relationships are noteworthy. Inspired by Detection Transformer, which excels in object detection, we view scene graph generation as a set prediction problem. In this paper, we propose an end-to-end scene graph generation model Relation Transformer (RelTR), which has an encoder-decoder architecture. The encoder reasons about the visual feature context while the decoder infers a fixed-size set of triplets subject-predicate-object using different types of attention mechanisms with coupled subject and object queries. We design a set prediction loss performing the matching between the ground truth and predicted triplets for the end-to-end training. In contrast to most existing scene graph generation methods, RelTR is a one-stage method that predicts sparse scene graphs directly only using visual appearance without combining entities and labeling all possible predicates. Extensive experiments on the Visual Genome, Open Images V6, and VRD datasets demonstrate the superior performance and fast inference of our model.
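The set prediction loss mentioned in the abstract relies on a one-to-one matching between ground-truth and predicted triplets. The following is a minimal sketch of that matching idea only, not the paper's implementation. The label-only triplet representation, the toy mismatch cost, and the function names `triplet_cost` and `match_triplets` are assumptions made for illustration; the brute-force search stands in for the Hungarian algorithm that DETR-style models typically use.

```python
from itertools import permutations

# Hypothetical representation: a triplet is (subject, predicate, object) labels.
# Bounding boxes and class probabilities are deliberately left out of this sketch.

def triplet_cost(pred, gt):
    """Toy cost (an assumption): number of slots where the labels differ."""
    return sum(p != g for p, g in zip(pred, gt))

def match_triplets(preds, gts):
    """Return (total_cost, assignment) for the cheapest one-to-one matching.

    assignment[i] is the index of the prediction matched to gts[i].
    Brute force over permutations; only practical for very small sets.
    """
    assert len(gts) <= len(preds), "need at least as many predictions as targets"
    best_cost, best_assign = None, None
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(triplet_cost(preds[j], gts[i]) for i, j in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, perm
    return best_cost, best_assign

# A fixed-size set of predictions, larger than the set of ground-truth triplets.
preds = [
    ("dog", "on", "grass"),
    ("man", "holding", "cup"),
    ("man", "wearing", "hat"),
    ("tree", "near", "house"),
]
gts = [
    ("man", "wearing", "hat"),
    ("dog", "sitting on", "grass"),
]

cost, assignment = match_triplets(preds, gts)
print(cost, assignment)
```

In DETR-style set prediction, predictions left unmatched by this step are usually trained toward a background ("no object") class, which is how a fixed-size output can represent a sparse scene graph.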
Language
English
Identifiers
ISSN: 0162-8828
eISSN: 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2023.3268066
Title ID: cdi_pubmed_primary_37074895
