Semi-supervised learning using frequent itemset and ensemble learning for SMS classification

Expert systems with applications, 2015-02, Vol.42 (3), p.1065-1073

2015

Details

Author(s) / Contributors

Title

Semi-supervised learning using frequent itemset and ensemble learning for SMS classification

Is part of

Expert systems with applications, 2015-02, Vol.42 (3), p.1065-1073

Place / Publisher

Amsterdam: Elsevier Ltd

Year of publication

2015

Source

Alma/SFX Local Collection

Descriptions/Notes

• We use semi-supervised learning, with the help of frequent itemsets and ensemble learning, to classify SMS data into ham and spam.
• We use the publicly available UCI SMS Spam Collection and the SMS Spam Collection Corpus v.0.1 Small and Big data sets for our experiments.
• We compare our results with the existing semi-supervised learning methods PEBL and SpyEM.
• We obtain good results with a very small amount of positive data and varying amounts of unlabeled data.

Short Message Service (SMS) has become one of the most important communication media, owing to the rapid increase in mobile users and its easy-to-use operating mechanism. This flood of SMS brings with it the problem of spam SMS generated by spurious users. The detection of spam SMS has attracted growing attention from researchers in recent times and has been treated with a number of different machine learning approaches. The supervised machine learning approaches used so far demand a large amount of labeled data, which is not always available in real applications. Traditional semi-supervised methods can alleviate this problem but may not produce good results when they are provided with only positive and unlabeled data. In this paper, we propose a novel semi-supervised learning method that makes use of frequent itemsets and ensemble learning (FIEL) to overcome this limitation. In this approach, the Apriori algorithm is used to find the frequent itemsets, while Multinomial Naive Bayes, Random Forest and LibSVM are used as base learners for ensemble learning, which uses a majority voting scheme. Our proposed approach achieves higher accuracy with a small amount of positive data and varying amounts of unlabeled data. Extensive experiments have been conducted on the UCI SMS Spam Collection data set and the SMS Spam Collection Corpus v.0.1 Small and Big, which show significant improvements in accuracy with a very small amount of positive data. We have compared our proposed FIEL approach with the existing SPY-EM and PEBL approaches, and the results show that our approach is more stable than the compared approaches with minimum support.
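
The abstract names two building blocks: Apriori-style frequent itemset mining and majority voting over three base learners. The following is only a minimal sketch of those two generic techniques, not the authors' implementation. The abstract does not say how FIEL combines the itemsets with the classifiers, so that step is left out. The function names, the toy token sets and the hard-coded predictions are all hypothetical, and plain strings stand in for the outputs of the real classifiers.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Apriori-style search. Returns {itemset: support} for every itemset
    contained in at least a min_support fraction of the transactions."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single token that occurs anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return result


def majority_vote(predictions):
    """Return the label chosen by most base learners for one message."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data: each set holds the tokens of one known spam message.
spam_tokens = [
    {"free", "prize", "call"},
    {"free", "call", "now"},
    {"win", "free", "prize"},
    {"call", "now"},
]
itemsets = frequent_itemsets(spam_tokens, min_support=0.5)

# Hypothetical predictions from three base learners for a single message.
label = majority_vote(["spam", "ham", "spam"])
```

With three base learners and two classes a tie cannot occur, which is one common reason to use an odd number of voters in a majority scheme.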

Language: English
Identifiers: ISSN: 0957-4174
eISSN: 1873-6793
DOI: 10.1016/j.eswa.2014.08.054
Title ID: cdi_proquest_miscellaneous_1660084573
