LLM Based Biological Named Entity Recognition from Scientific Literature
Is part of
2024 IEEE International Conference on Big Data and Smart Computing (BigComp), 2024, pp. 433-435
Place / Publisher
IEEE
Year of publication
2024
Source
IEEE Electronic Library Online
Descriptions/Notes
The application of Large Language Models (LLMs) to natural language processing has grown rapidly in recent years, and in bioinformatics it enables the automated extraction of biological entities from scientific literature. This study presents the development and evaluation of a Biological Named Entity Recognizer (BNER) built on a pre-trained LLM refined through prompt engineering. The BNER was tailored to identify proteins, genes, and small molecules in scientific texts, specifically in the context of p53 protein-related research. To assess the BNER's efficacy, we curated a dataset of ten paragraphs extracted from the abstracts and significant sections of five high-relevance scientific papers. Performance was quantified on an entity recognition task, which yielded 51 true positives (TP), 10 false positives (FP), and 3 false negatives (FN). The BNER achieved an F1 score of 0.887 (precision 0.836 and recall 0.944, as computed from these counts). These results validate the utility of LLMs in bioinformatics and highlight the BNER's potential to support and accelerate scientific discovery by providing accurate, structured data outputs suitable for comprehensive analysis.
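The reported F1 score can be checked directly from the TP, FP, and FN counts given in the abstract. The following is a minimal sketch of that arithmetic using the standard definitions of precision, recall, and F1; the function name `precision_recall_f1` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from raw entity counts.

    Hypothetical helper for illustration; uses the standard definitions:
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)
      F1        = harmonic mean of precision and recall
    """
    # Guard against division by zero when no entities were predicted or expected.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts as reported in the abstract: 51 TP, 10 FP, 3 FN.
precision, recall, f1 = precision_recall_f1(tp=51, fp=10, fn=3)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
# prints: precision=0.836 recall=0.944 F1=0.887
```

The counts imply that recall (0.944) is noticeably higher than precision (0.836): with these numbers the recognizer misses few entities, and false positives account for most of its errors.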