Sie befinden Sich nicht im Netzwerk der Universität Paderborn. Der Zugriff auf elektronische Ressourcen ist gegebenenfalls nur via VPN oder Shibboleth (DFN-AAI) möglich. mehr Informationen...
Knowledge and information systems, 2020-03, Vol.62 (3), p.951-985
2020

Details

Autor(en) / Beteiligte
Titel
Integrating learned and explicit document features for reputation monitoring in social media
Ist Teil von
  • Knowledge and information systems, 2020-03, Vol.62 (3), p.951-985
Ort / Verlag
London: Springer London
Erscheinungsjahr
2020
Link zum Volltext
Quelle
Alma/SFX Local Collection
Beschreibungen/Notizen
  • Currently, monitoring reputation in social media is probably one of the most lucrative applications of information retrieval methods. However, this task poses new challenges due to the dynamicity of contents and the need for early detection of topics that affect the reputations of companies. Addressing this problem with learning mechanisms that are based on training data sets is challenging, given that unseen features play a crucial role. However, learning processes are necessary to capture domain features and dependency phenomena. In this work, based on observational information theory, we define a document representation framework that enables the combination of explicit text features and supervised and unsupervised signals into a single representation model. Our theoretical analysis demonstrates that the observation information quantity (OIQ) generalizes the most popular representation methods, in addition to capturing quantitative values, which is required for integrating signals from learning processes. In other words, the OIQ allows us to give the same treatment to features that are currently managed separately. Empirically, our experiments on the reputation-monitoring scenario demonstrated that adding features progressively from supervised (in particular, Bayesian inference over annotated data) and unsupervised learning methods (in particular, proximity to clusters) increases the similarity estimation performance. This result is verified under various similarity criteria (pointwise mutual information, Jaccard and Lin’s distances and the information contrast model). According to our formal analysis, the OIQ is the first representation model that captures the informativeness (specificity) of quantitative features in the document representation.

Weiterführende Literatur

Empfehlungen zum selben Thema automatisch vorgeschlagen von bX