
Details

Author(s) / Contributors
Title
Analysis of the Mutual Relevance of Topical Corpus Documents in the Problem of Assessing the Proximity of Text to the Semantic Standard
Is part of
  • Pattern recognition and image analysis, 2021-07, Vol.31 (3), p.588-594
Place / Publisher
Moscow: Pleiades Publishing
Year of publication
2021
Source
Alma/SFX Local Collection
Descriptions / Notes
  • The article is devoted to the problem of the unity and integrity of the image of a semantic standard, allocated by phrases for a topical text. Herewith, the proximity of the text to the standard is assessed without searching for paraphrases, and the base for assessing the proximity of the text to the standard is the division of words of each of its phrases into classes according to the value of the TF-IDF measure relative to the texts of the corpus, previously formed by the expert. The analyzed texts are abstracts of scientific articles together with their titles. The core of the problem is as follows: for each phrase, the maximum proximity to the standard is achieved with respect to its corpus document and, as a consequence, it is required to assess the mutual relevance of such documents for different phrases of the analyzed text. In this study, this problem is solved by introducing the distances between the vectors of the values of the TF-IDF measure of the words of a separate phrase with respect to different documents in the corpus. In this case, the distance between documents, relative to which the closest proximity to the standard of phrases of the analyzed text was achieved, should be minimal. Using the Euclidean metric and Manhattan distance as an example, this study illustrates the application of the proposed approach to the problem of choosing a higher-level text for the given one in the hierarchy being formed in terms of semantic standard complementarity.
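The distance idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which TF-IDF variant is used or how words are divided into classes, so the code assumes the common form tf = count / document length and idf = log(N / df). The function names, the toy corpus, and the sample phrase are all hypothetical. For one phrase, it builds a TF-IDF vector relative to each corpus document and compares two such vectors using the Euclidean and Manhattan distances.

```python
import math
from collections import Counter


def tf_idf_vector(phrase_words, doc_index, corpus):
    """TF-IDF value of each phrase word relative to corpus[doc_index].

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    doc = corpus[doc_index]
    counts = Counter(doc)
    n_docs = len(corpus)
    vector = []
    for word in phrase_words:
        tf = counts[word] / len(doc) if doc else 0.0
        # Number of corpus documents that contain the word.
        df = sum(1 for d in corpus if word in d)
        idf = math.log(n_docs / df) if df else 0.0
        vector.append(tf * idf)
    return vector


def euclidean(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical toy corpus: each document is a list of tokens.
corpus = [
    "semantic standard of a topical text".split(),
    "tf idf measure of words in a corpus".split(),
    "image analysis and pattern recognition".split(),
]
phrase = ["semantic", "standard", "corpus"]

# One vector per corpus document, all for the same phrase.
vectors = [tf_idf_vector(phrase, i, corpus) for i in range(len(corpus))]

# Distance between the first two documents with respect to this phrase.
d_euclid = euclidean(vectors[0], vectors[1])
d_manhattan = manhattan(vectors[0], vectors[1])
print(d_euclid, d_manhattan)
```

In terms of the abstract, a small distance between the documents that give different phrases their closest match to the standard would indicate that those documents are mutually relevant. How the best-matching document is selected for each phrase is not specified in this record and is left out of the sketch.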
Language
English
Identifiers
ISSN: 1054-6618
eISSN: 1555-6212
DOI: 10.1134/S1054661821030172
Title ID: cdi_proquest_journals_2575081614
