Sie befinden Sich nicht im Netzwerk der Universität Paderborn. Der Zugriff auf elektronische Ressourcen ist gegebenenfalls nur via VPN oder Shibboleth (DFN-AAI) möglich. mehr Informationen...
Few-shot named entity recognition framework for forestry science metadata extraction
Ist Teil von
Journal of ambient intelligence and humanized computing, 2024-04, Vol.15 (4), p.2105-2118
Ort / Verlag
Berlin/Heidelberg: Springer Berlin Heidelberg
Erscheinungsjahr
2024
Quelle
Alma/SFX Local Collection
Beschreibungen/Notizen
The effective utilization of accumulated forestry science papers is of paramount significance in enhancing our understanding of the current state of forests and the formulation of strategies for forest environmental preservation. However, the present challenge lies in the deficient richness of metadata associated with these pivotal documents, rendering their comprehensive exploitation a formidable endeavor. Metadata from forestry science papers serves as a foundational cornerstone for the efficient management and utilization of these scholarly documents, playing an indispensable role in the advancement of research within the domain of forestry science. Constructing a training corpus and extracting distant semantic relationships is challenging inherent, the utilization of named entity recognition (
NER
) technology for metadata entity identification in forestry science papers remains an unexplored avenue. To overcome these limitations, this paper creates a specialized training corpus and introduces a novel few-shot
NER
framework tailored specifically for metadata extraction from forestry science papers. Within this innovative framework, a data augmentation layer, employing word replacement (
WR
) and enhanced mixup (
EM
), effectively addresses the issue of suboptimal performance resulting from a scarcity of training data. The semantic comprehension layer incorporates a multi-granularity dilated convolution neural network (
MGDCNN
) to capture and extract distant semantic associations. Moreover, a meta-learning-based reweighting layer is introduced to mitigate the adverse effects of low-quality augmented examples on the model. Experimental results conclusively demonstrate the efficacy of the proposed framework, yielding
precision
,
recall
, and
F
1 of 91.08%, 88.96%, and 90.00%, respectively. Compared to traditional models,
precision
,
recall
, and
F
1 can be improved by up to 10.69%, 7.48%, and 9.07%, respectively.