Empirical software engineering : an international journal, 2022-09, Vol.27 (5), Article 105
2022

Details

Author(s) / Contributors
Title
SeSG: a search string generator for Secondary Studies with hybrid search strategies using text mining
Is part of
  • Empirical software engineering : an international journal, 2022-09, Vol.27 (5), Article 105
Place / Publisher
New York: Springer US
Year of publication
2022
Source
Alma/SFX Local Collection
Descriptions/Notes
  • A Secondary Study (SS) is an important research method used in several areas. A crucial step in the Conduction phase of an SS is the search for studies. This step is time-consuming and error-prone, mainly because the search string has to be refined repeatedly. The objective of this study is to validate the effectiveness of automatically formulating search strings for SS. Our approach, termed Search String Generator (SeSG), takes as input a small set of studies (as a Quasi-Gold Standard) and processes them using text mining. SeSG then generates search strings that deliver a high F1-Score on the start set of a hybrid search strategy. To achieve this objective, we (1) generate a structured textual representation of the initial set of input studies as a bag-of-words using Term Frequency and Document Frequency; (2) perform automatic topic modeling using LDA (Latent Dirichlet Allocation) and enrich the terms with a pre-trained dense language representation (embedding) called BERT (Bidirectional Encoder Representations from Transformers); (3) formulate and evaluate the search string using the obtained terms; and (4) use the developed search strings in a digital library. To validate our approach, we conduct an experiment, using a set of SS as objects, that compares the effectiveness of the search strings formulated automatically by SeSG with the manual search strings reported in these studies. SeSG generates search strings that achieve a better final F1-Score on the start set than the searches reported by these SS. Our study shows that SeSG can effectively replace the manual formulation of search strings in hybrid search strategies, since it removes the need for manual string refinement.
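
The formulate-and-evaluate step described in the abstract can be illustrated with a minimal Python sketch. It is not the SeSG implementation: the LDA topic modeling and BERT enrichment are replaced here by a plain document-frequency ranking, and the Boolean layout (OR within a term group, AND across groups) is an illustrative assumption. All function names, the stopword list, and the sample data are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "on", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords and short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def top_terms_by_document_frequency(docs, n_terms):
    """Rank terms by how many documents contain them (document frequency).

    This stands in for the topic-modeling step; it is NOT LDA.
    """
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    # Highest document frequency first; ties broken alphabetically.
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n_terms]]


def build_search_string(term_groups):
    """Join terms with OR inside a group and AND across groups (assumed layout)."""
    clauses = []
    for group in term_groups:
        clauses.append("(" + " OR ".join(f'"{t}"' for t in group) + ")")
    return " AND ".join(clauses)


def f1_score(retrieved, relevant):
    """Harmonic mean of precision and recall over two sets of study identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical titles standing in for a Quasi-Gold Standard.
    qgs = ["software testing tools", "software testing", "software design"]
    terms = top_terms_by_document_frequency(qgs, 2)
    print(build_search_string([[t] for t in terms]))
    # Hypothetical result set compared against hypothetical relevant studies.
    print(f1_score({"s1", "s2", "s3"}, {"s1", "s2", "s4", "s5"}))
```

In this sketch a candidate string would be scored by comparing the identifiers a digital library returns against the known relevant studies. Choosing among several candidates by their F1-Score mirrors the evaluation the abstract describes, under the simplifications stated above.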
