Sie befinden Sich nicht im Netzwerk der Universität Paderborn. Der Zugriff auf elektronische Ressourcen ist gegebenenfalls nur via VPN oder Shibboleth (DFN-AAI) möglich. mehr Informationen...
Data mining and machine learning approaches for prediction modelling of schistosomiasis disease vectors
Ist Teil von
International journal of machine learning and cybernetics, 2020-06, Vol.11 (6), p.1159-1178
Ort / Verlag
Heidelberg: Springer Nature B.V
Erscheinungsjahr
2020
Link zum Volltext
Quelle
Alma/SFX Local Collection
Beschreibungen/Notizen
This research presents viable solutions for prediction modelling of schistosomiasis disease based on vector density. Novel training models proposed in this work aim to address various aspects of interest in the artificial intelligence applications domain. Topics discussed include data imputation, semi-supervised labelling and synthetic instance simulation when using sparse training data. Innovative semi-supervised ensemble learning paradigms are proposed focusing on labelling threshold selection and stringency of classification confidence levels. A regression-correlation combination (RCC) data imputation method is also introduced for handling of partially complete training data. Results presented in this work show data imputation precision improvement over benchmark value replacement using proposed RCC on 70% of test cases. Proposed novel incremental transductive models such as ITSVM have provided interesting findings based on threshold constraints outperforming standard SVM application on 21% of test cases and can be applied with alternative environment-based epidemic disease domains. The proposed incremental transductive ensemble approach model enables the combination of complimentary algorithms to provide labelling for unlabelled vector density instances. Liberal (LTA) and strict training approaches provided varied results with LTA outperforming Stacking ensemble on 29.1% of test cases. Proposed novel synthetic minority over-sampling technique (SMOTE) equilibrium approach has yielded subtle classification performance increases which can be further interrogated to assess classification performance and efficiency relationships with synthetic instance generation.