IEEE Access, 2020, Vol. 8, pp. 10896-10906

2020

Author(s) / Contributors

Title

SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-Trained Language Model

Is part of

Place / Publisher

Piscataway: IEEE

Year of publication

2020

Source

EZB Electronic Journals Library

Descriptions/Notes

In the age of social media, with its huge volume of knowledge and information, accurate and effective keyphrase extraction methods are needed for information retrieval and natural language processing. Traditional keyphrase extraction models struggle to incorporate large amounts of external knowledge, but the rise of pre-trained language models offers a new way to solve this problem. Against this background, we propose SIFRank, a new baseline for unsupervised keyphrase extraction based on a pre-trained language model. SIFRank combines the sentence embedding model SIF with the autoregressive pre-trained language model ELMo, and it achieves the best performance in keyphrase extraction for short documents. We speed up SIFRank while maintaining its accuracy through document segmentation and alignment of contextual word embeddings. For long documents, we upgrade SIFRank to SIFRank+ with a position-biased weight, which greatly improves its performance on long documents. Compared with other baseline models, our model achieves state-of-the-art results on three widely used datasets.
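
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors stand in for ELMo contextual embeddings, the function names and the parameters `a` and `mu` are hypothetical, and the inverse-position factor is only a placeholder for the position-biased weight, whose exact formula is not given in this record. The sketch assumes that candidate phrases and the whole document are embedded as SIF-style weighted averages (weight `a / (a + p(w))`) and that candidates are ranked by cosine similarity to the document embedding.

```python
import math


def sif_weight(word, word_freq, a=1e-3):
    """SIF-style weight a / (a + p(w)); frequent words get smaller weights."""
    return a / (a + word_freq.get(word, 0.0))


def weighted_embedding(tokens, vectors, word_freq, a=1e-3):
    """Weighted average of per-token vectors (stand-ins for contextual embeddings)."""
    dim = len(vectors[0])
    total = [0.0] * dim
    for tok, vec in zip(tokens, vectors):
        w = sif_weight(tok, word_freq, a)
        for i in range(dim):
            total[i] += w * vec[i]
    n = max(len(tokens), 1)
    return [x / n for x in total]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank_candidates(tokens, vectors, candidates, word_freq,
                    position_bias=False, mu=3.0):
    """Rank candidate spans (start, end) by similarity to the document.

    With position_bias=True, each score is scaled by 1 / (position + mu),
    a hypothetical placeholder that favours phrases appearing early.
    """
    doc_vec = weighted_embedding(tokens, vectors, word_freq)
    scored = []
    for start, end in candidates:
        cand_vec = weighted_embedding(tokens[start:end], vectors[start:end], word_freq)
        score = cosine(cand_vec, doc_vec)
        if position_bias:
            score *= 1.0 / (start + 1 + mu)
        scored.append((" ".join(tokens[start:end]), score))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Called with a token list, one vector per token, a list of candidate spans, and a table of word probabilities, `rank_candidates` returns the phrases ordered from most to least similar to the document. In a real system the vectors would come from a pre-trained language model and the candidate spans from a noun-phrase chunker, neither of which is shown here.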

Language: English
Identifiers: ISSN: 2169-3536
eISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.2965087
Title ID: cdi_proquest_journals_2454736349

Format: –
Keywords: Autoregressive models, Computational modeling, Context modeling, Data mining, Digital media, Feature extraction, Information retrieval, Keyphrase extraction, Language, Natural language processing, position-biased weight, pre-trained language model, Segmentation, sentence embeddings, SIFRank, Solid modeling, Sun, Task analysis
