IEEE transactions on pattern analysis and machine intelligence, 2023-09, Vol.45 (9), p.1-16

2023

Author(s) / Contributors

Title

RelTR: Relation Transformer for Scene Graph Generation

Is part of

IEEE transactions on pattern analysis and machine intelligence, 2023-09, Vol.45 (9), p.1-16

Place / Publisher

United States: IEEE

Year of publication

2023

Source

IEEE/IET Electronic Library (IEL)

Descriptions/Notes

Different objects in the same scene are more or less related to each other, but only a limited number of these relationships are noteworthy. Inspired by Detection Transformer, which excels in object detection, we view scene graph generation as a set prediction problem. In this paper, we propose an end-to-end scene graph generation model, Relation Transformer (RelTR), which has an encoder-decoder architecture. The encoder reasons about the visual feature context, while the decoder infers a fixed-size set of subject-predicate-object triplets using different types of attention mechanisms with coupled subject and object queries. We design a set prediction loss that matches the ground-truth and predicted triplets for end-to-end training. In contrast to most existing scene graph generation methods, RelTR is a one-stage method that predicts sparse scene graphs directly, using only visual appearance, without combining entities and labeling all possible predicates. Extensive experiments on the Visual Genome, Open Images V6, and VRD datasets demonstrate the superior performance and fast inference of our model.
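
The matching step behind a set prediction loss can be illustrated with a small sketch. This is not the RelTR implementation: the cost function (a count of mismatched slots), the `NO_RELATION` padding triplet, and the names `triplet_cost` and `match_triplets` are assumptions made for illustration, and the brute-force search stands in for whatever assignment solver a real system would use. It only shows the idea of finding the lowest-cost one-to-one assignment between a fixed-size set of predicted triplets and the ground-truth triplets.

```python
from itertools import permutations

# Hypothetical padding entry for prediction slots with no ground-truth partner.
NO_RELATION = ("none", "none", "none")


def triplet_cost(pred, truth):
    """Toy cost (an assumption): number of mismatched slots among
    subject, predicate, and object."""
    return sum(p != t for p, t in zip(pred, truth))


def match_triplets(predictions, ground_truth):
    """Return (total_cost, assignment), where assignment[i] is the index of
    the (padded) ground-truth triplet matched to predictions[i].

    Brute force over all permutations, so only suitable for tiny sets.
    """
    if len(ground_truth) > len(predictions):
        raise ValueError("need at least as many predictions as ground-truth triplets")
    # Pad the ground truth up to the fixed prediction-set size.
    padding = [NO_RELATION] * (len(predictions) - len(ground_truth))
    padded = list(ground_truth) + padding

    best_cost, best_perm = None, None
    for perm in permutations(range(len(padded))):
        cost = sum(triplet_cost(predictions[i], padded[j])
                   for i, j in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, list(best_perm)


if __name__ == "__main__":
    predictions = [
        ("dog", "on", "grass"),
        ("man", "holding", "leash"),
        NO_RELATION,
    ]
    ground_truth = [
        ("man", "holding", "leash"),
        ("dog", "on", "grass"),
    ]
    print(match_triplets(predictions, ground_truth))
```

Because the assignment is optimal over the whole set, the order in which triplets are predicted does not matter; a training loss would then be computed only between matched pairs.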

Language: English
Identifiers: ISSN: 0162-8828
eISSN: 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2023.3268066
Title ID: cdi_pubmed_primary_37074895

Format: –
Keywords: Coders, Decoding, Encoders-Decoders, Object detection, Object recognition, One-Stage, Predictions, Predictive models, Proposals, scene graph generation, scene understanding, Task analysis, Transformers, visual relationship detection, Visualization