Natural Language Processing Journal, 2024-03, Vol.6, p.100044, Article 100044
2024

Details

Author(s) / Contributors
Title
Benchmarking topic models on scientific articles using BERTeley
Is part of
  • Natural Language Processing Journal, 2024-03, Vol.6, p.100044, Article 100044
Place / Publisher
Elsevier B.V.
Year of publication
2024
Source
Alma/SFX Local Collection
Descriptions/Notes
  • The introduction of BERTopic marked a crucial advancement in topic modeling and presented a topic model that outperformed both traditional and modern topic models in terms of topic modeling metrics on a variety of corpora. However, unique issues arise when topic modeling is performed on scientific articles. This paper introduces BERTeley, an innovative tool built upon BERTopic, designed to alleviate these shortcomings and improve the usability of BERTopic when conducting topic modeling on a corpus consisting of scientific articles. This is accomplished through BERTeley’s three main features: scientific article preprocessing, topic modeling using pre-trained scientific language models, and topic model metric calculation. Furthermore, an experiment was conducted comparing topic models using four different language models in three corpora consisting of scientific articles.
  • Provide transformer-based tools to accelerate topic modeling using scientific articles as input.
  • Extract text embeddings using bidirectional encoder transformers and large language models to discover underlying themes and patterns within documents.
  • Create visual summaries of science topics from different publication databases.
Language
English
Identifiers
ISSN: 2949-7191
DOI: 10.1016/j.nlp.2023.100044
Title ID: cdi_crossref_primary_10_1016_j_nlp_2023_100044
