ISPRS journal of photogrammetry and remote sensing, 2020-07, Vol.165, p.108-119
2020

Details

Author(s) / Contributors
Title
UAVid: A semantic segmentation dataset for UAV imagery
Is part of
  • ISPRS journal of photogrammetry and remote sensing, 2020-07, Vol.165, p.108-119
Place / Publisher
Elsevier B.V.
Year of publication
2020
Link to full text
Source
Elsevier ScienceDirect Journals Complete
Descriptions / Notes
  • Semantic segmentation has recently been one of the leading research interests in computer vision. It serves as a perception foundation for many fields, such as robotics and autonomous driving. The rapid development of semantic segmentation owes much to large-scale datasets, especially for deep-learning-based methods. Several semantic segmentation datasets already exist for comparing methods in complex urban scenes, such as the Cityscapes and CamVid datasets, where side views of objects are captured with a camera mounted on a moving car. There are also semantic labeling datasets for airborne and satellite images, where nadir views of objects are captured. However, only a few datasets capture urban scenes from an oblique Unmanned Aerial Vehicle (UAV) perspective, where both the top and side views of objects can be observed, providing more information for object recognition. In this paper, we introduce our UAVid dataset, a new high-resolution UAV semantic segmentation dataset that complements the existing ones and brings new challenges, including large scale variation, moving object recognition and temporal consistency preservation. Our UAV dataset consists of 30 video sequences capturing high-resolution images in oblique views. In total, 300 images have been densely labeled with 8 classes for the semantic labeling task. We provide several deep learning baseline methods with pre-training, among which the proposed Multi-Scale-Dilation net performs best via multi-scale feature extraction, reaching a mean intersection-over-union (IoU) score of around 50%. We have also explored the influence of spatial-temporal regularization for sequence data by leveraging feature space optimization (FSO) and a 3D conditional random field (CRF). Our UAVid website and the labeling tool have been published online (https://uavid.nl/).
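The mean IoU score quoted in the abstract is a standard segmentation metric: per class, the number of correctly labeled pixels divided by the union of predicted and ground-truth pixels, averaged over classes. The following is a minimal sketch of that general definition in Python with NumPy, not the evaluation code used for UAVid. The function names `confusion_matrix` and `mean_iou` are hypothetical, and excluding classes that appear in neither prediction nor ground truth is an assumption made here; the record does not state the benchmark's exact protocol.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels by (true class, predicted class).

    Both inputs are integer label arrays of the same shape with values
    in [0, num_classes). Hypothetical helper for illustration only.
    """
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    # Encode each (true, predicted) pair as a single index, then count.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(cm):
    """Mean intersection-over-union from a confusion matrix.

    Rows are true classes, columns are predicted classes.
    Assumption: classes with an empty union are left out of the mean.
    """
    tp = np.diag(cm).astype(float)   # correctly labeled pixels per class
    fp = cm.sum(axis=0) - tp         # predicted as the class, but wrong
    fn = cm.sum(axis=1) - tp         # belong to the class, but missed
    union = tp + fp + fn
    present = union > 0
    return float((tp[present] / union[present]).mean())


if __name__ == "__main__":
    # Toy 2x2 label maps using 2 of the 8 classes mentioned in the abstract.
    truth = np.array([[0, 0], [1, 1]])
    pred = np.array([[0, 1], [1, 1]])
    print(mean_iou(confusion_matrix(truth, pred, num_classes=8)))
```

Accumulating one confusion matrix over all images before calling `mean_iou` gives a dataset-level score, whereas averaging per-image scores gives a different number; which of the two a given benchmark reports should be checked against its documentation.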
Language
English
Identifiers
ISSN: 0924-2716
eISSN: 1872-8235
DOI: 10.1016/j.isprsjprs.2020.05.009
Title ID: cdi_crossref_primary_10_1016_j_isprsjprs_2020_05_009
