Author: Hong, Zhi; Tchoua, Roselyne; Chard, Kyle; Foster, Ian
Title: SciNER: Extracting Named Entities from Scientific Literature
Cord-id: bdkg4co9
Document date: 2020_6_15
ID: bdkg4co9
Document: The automated extraction of claims from scientific papers via computer is difficult due to the ambiguity and variability inherent in natural language. Even apparently simple tasks, such as isolating reported values for physical quantities (e.g., “the melting point of X is Y”) can be complicated by such factors as domain-specific conventions about how named entities (the X in the example) are referenced. Although there are domain-specific toolkits that can handle such complications in certain areas, a generalizable, adaptable model for scientific texts is still lacking. As a first step towards automating this process, we present a generalizable neural network model, SciNER, for recognizing scientific entities in free text. Based on bidirectional LSTM networks, our model combines word embeddings, subword embeddings, and external knowledge (from DBpedia) to boost its accuracy. Experiments show that our model outperforms a leading domain-specific extraction toolkit by up to 50%, as measured by F1 score, while also being easily adapted to new domains.
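The abstract reports results as F1 scores, so a small worked example of that metric may help. The sketch below computes an entity-level F1 from token tags. It is an illustration only: the abstract does not state the evaluation protocol, so the BIO tagging scheme, the exact-match criterion, and the function names `extract_spans` and `f1_score` are all assumptions rather than the authors' method.

```python
def extract_spans(tags):
    """Collect (start, end) token spans from a BIO tag sequence.

    Assumed scheme: "B" begins an entity, "I" continues it, "O" is outside.
    An "I" that follows "O" is leniently treated as the start of an entity.
    """
    spans = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.add((start, i))  # close the entity that was open
            start = i
        elif tag == "O" and start is not None:
            spans.add((start, i))
            start = None
    if start is not None:
        spans.add((start, len(tags)))  # entity runs to the end of the text
    return spans


def f1_score(gold_tags, pred_tags):
    """Entity-level F1 under exact span match (an assumed criterion)."""
    gold = extract_spans(gold_tags)
    pred = extract_spans(pred_tags)
    true_pos = len(gold & pred)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical six-token sentence with two gold entities.
gold = ["O", "B", "I", "O", "B", "O"]
pred = ["O", "B", "I", "O", "O", "B"]
score = f1_score(gold, pred)
```

Here the prediction recovers one of the two gold entities and proposes one spurious entity, so precision and recall are both 0.5 and the harmonic mean is 0.5 as well.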