Folia oeconomica stetinensia, 2020-06, Vol.20 (1), p.232-247
2020

Details

Author(s) / Contributors
Title
The Influence of Unbalanced Economic Data on Feature Selection and Quality of Classifiers
Is part of
  • Folia oeconomica stetinensia, 2020-06, Vol.20 (1), p.232-247
Place / Publisher
Szczecin: Wydawnictwo Naukowe Uniwersytetu Szczecińskiego
Year of publication
2020
Source
Business Source Ultimate
Descriptions / Notes
  • Research background: The successful learning of classifiers depends on the quality of the data. Modeling is especially difficult when the data are unbalanced or contain many irrelevant variables, which is the case in many applications. Classifying rare events is often the overarching goal, e.g. in bankruptcy prediction, churn analysis, or fraud detection. The problem of irrelevant variables arises when the model specification is not known a priori, which is the typical situation for data mining analysts.
  • Purpose: The purpose of this paper is to compare combinations of the most popular strategies for handling unbalanced data with feature selection methods representing filters, wrappers, and embedded methods.
  • Research methodology: In the empirical study, we use real datasets with additionally introduced irrelevant variables. This makes it possible to recognize which methods correctly eliminate the irrelevant variables.
  • Results: Having carried out the experiment, we conclude that over-sampling does not work in connection with feature selection. Some recommendations of the most promising methods are also given.
  • Novelty: Many solutions concerning unbalanced data and feature selection have been proposed in the literature. The novel aspect of our work is the examination of their interactions.
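
The experimental design described in the abstract (add irrelevant variables to an unbalanced dataset, rebalance it, then check which variables a feature selection method keeps) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic rather than real, the rebalancing is plain random over-sampling by duplication, and the filter is a simple standardized difference of class means. The function names (`make_data`, `random_oversample`, `filter_scores`, `select_top_k`) are hypothetical.

```python
import random
import statistics


def make_data(n_major=300, n_minor=30, n_relevant=3, n_noise=5, seed=0):
    """Synthetic unbalanced data: relevant columns first, then irrelevant noise columns."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((0, n_major), (1, n_minor)):
        for _ in range(n):
            # Relevant variables shift their mean with the class label.
            relevant = [rng.gauss(2.0 * label, 1.0) for _ in range(n_relevant)]
            # Irrelevant variables are pure noise, independent of the label.
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(relevant + noise)
            y.append(label)
    return X, y


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows (label 1) until both classes are equal in size."""
    rng = random.Random(seed)
    minority = [i for i, lab in enumerate(y) if lab == 1]
    majority = [i for i, lab in enumerate(y) if lab == 0]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def filter_scores(X, y):
    """Filter-style score per column: |mean(class 1) - mean(class 0)| / overall std."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, lab in zip(X, y) if lab == 1]
        neg = [row[j] for row, lab in zip(X, y) if lab == 0]
        spread = statistics.pstdev([row[j] for row in X]) or 1.0
        scores.append(abs(statistics.fmean(pos) - statistics.fmean(neg)) / spread)
    return scores


def select_top_k(scores, k):
    """Return the sorted indices of the k highest-scoring columns."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


X, y = make_data()
X_bal, y_bal = random_oversample(X, y)

# Columns 0..2 are relevant by construction; columns 3..7 are irrelevant.
selected_raw = select_top_k(filter_scores(X, y), k=3)
selected_bal = select_top_k(filter_scores(X_bal, y_bal), k=3)

print("selected without over-sampling:", selected_raw)
print("selected with over-sampling:   ", selected_bal)
print("irrelevant kept (raw, balanced):",
      sum(j >= 3 for j in selected_raw), sum(j >= 3 for j in selected_bal))
```

Because the irrelevant columns are known by construction, counting how many of them survive selection gives a direct measure of how well a combination of rebalancing and feature selection performs. This toy setup is not expected to reproduce the paper's findings; it only shows the shape of such a comparison.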
Language
English
Identifiers
ISSN: 1730-4237, 1898-0198
eISSN: 1898-0198
DOI: 10.2478/foli-2020-0014
Titel-ID: cdi_ceeol_journals_891627
