Sie befinden Sich nicht im Netzwerk der Universität Paderborn. Der Zugriff auf elektronische Ressourcen ist gegebenenfalls nur via VPN oder Shibboleth (DFN-AAI) möglich. mehr Informationen...
Ergebnis 12 von 82265

Details

Autor(en) / Beteiligte
Titel
Mitigating the impact of mislabeled data on deep predictive models: an empirical study of learning with noise approaches in software engineering tasks
Ist Teil von
  • Automated software engineering, 2024-05, Vol.31 (1), p.33
Ort / Verlag
New York: Springer US
Erscheinungsjahr
2024
Link zum Volltext
Quelle
Alma/SFX Local Collection
Beschreibungen/Notizen
  • Deep predictive models have been widely employed in software engineering (SE) tasks due to their remarkable success in artificial intelligence (AI). Most of these models are trained in a supervised manner, and their performance heavily relies on the quality of training data. Unfortunately, mislabeling or label noise is a common issue in SE datasets, which can significantly affect the validity of models trained on such datasets. Although learning with noise approaches based on deep learning (DL) have been proposed to address the issue of mislabeling in AI datasets, the distinct characteristics of SE datasets in terms of size and data quality raise questions about the effectiveness of these approaches within the SE context. In this paper, we conduct a comprehensive study to understand how mislabeled samples exist in SE datasets, how they impact deep predictive models, and how well existing learning with noise approaches perform on SE datasets. Through an empirical evaluation on two representative datasets for the Bug Report Classification and Software Defect Prediction tasks, our study reveals that learning with noise approaches have the potential to handle mislabeled samples in SE tasks, but their effectiveness is not always consistent. Our research shows that it is crucial to address mislabeled samples in SE tasks. To achieve this, it is essential to take into account the specific properties of the dataset to develop effective solutions. We also highlight the importance of addressing potential class distribution changes caused by mislabeled samples and present the limitations of existing approaches for addressing mislabeled samples. Therefore, we urge the development of more advanced techniques to improve the effectiveness and reliability of deep predictive models in SE tasks.
Sprache
Englisch
Identifikatoren
ISSN: 0928-8910
eISSN: 1573-7535
DOI: 10.1007/s10515-024-00435-y
Titel-ID: cdi_proquest_journals_3032864977

Weiterführende Literatur

Empfehlungen zum selben Thema automatisch vorgeschlagen von bX