Author: Heist, Nicolas; Paulheim, Heiko

Title: Entity Extraction from Wikipedia List Pages

Cord-id: p21fpv29

Document date: 2020-05-07

Document: When it comes to factual knowledge about a wide range of domains, Wikipedia is often the prime source of information on the web. DBpedia and YAGO, as large cross-domain knowledge graphs, encode a subset of that knowledge by creating an entity for each page in Wikipedia, and connecting them through edges. It is well known, however, that Wikipedia-based knowledge graphs are far from complete. Especially, as Wikipedia’s policies permit pages about subjects only if they have a certain popularity, such graphs tend to lack information about less well-known entities. Information about these entities is oftentimes available in the encyclopedia, but not represented as an individual page. In this paper, we present a two-phased approach for the extraction of entities from Wikipedia’s list pages, which have proven to serve as a valuable source of information. In the first phase, we build a large taxonomy from categories and list pages with DBpedia as a backbone. With distant supervision, we extract training data for the identification of new entities in list pages that we use in the second phase to train a classification model. With this approach we extract over 700k new entities and extend DBpedia with 7.5M new type statements and 3.8M new facts of high precision.
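
Illustration: the abstract mentions distant supervision as the source of training data for the second phase. The sketch below shows one way such labeling could work in general, and it is not the authors' implementation. All names (ListEntry, distant_labels, list_type, known_types, disjoint) and the toy data are hypothetical. The labeling rule itself is also an assumption: an entry already in the knowledge graph counts as a positive example if it carries the type expected for its list page, as a negative example if it carries a type declared disjoint with it, and is left unlabeled otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListEntry:
    """One linked mention on a list page (hypothetical structure)."""
    list_page: str
    mention: str


def distant_labels(entries, list_type, known_types, disjoint):
    """Label list-page entries using an existing knowledge graph.

    entries:     iterable of ListEntry
    list_type:   list page -> type expected for its entries (assumed given)
    known_types: mention -> set of types already in the knowledge graph
    disjoint:    set of (type_a, type_b) pairs treated as mutually exclusive
    Returns (entry, label) pairs with label 1 (positive) or 0 (negative).
    """
    labeled = []
    for entry in entries:
        types = known_types.get(entry.mention)
        if types is None:
            # Not in the graph yet: a candidate new entity, so no label.
            continue
        expected = list_type[entry.list_page]
        if expected in types:
            labeled.append((entry, 1))
        elif any((expected, t) in disjoint or (t, expected) in disjoint
                 for t in types):
            labeled.append((entry, 0))
        # Entries with unrelated, non-disjoint types stay unlabeled.
    return labeled


# Toy data, invented for this sketch only.
entries = [
    ListEntry("List of jazz albums", "Kind of Blue"),
    ListEntry("List of jazz albums", "Miles Davis"),
    ListEntry("List of jazz albums", "Some Obscure Record"),
]
list_type = {"List of jazz albums": "Album"}
known_types = {"Kind of Blue": {"Album"}, "Miles Davis": {"Person"}}
disjoint = {("Album", "Person")}

labels = distant_labels(entries, list_type, known_types, disjoint)
```

In this toy run the two entries already present in the graph receive labels, while the unknown entry is left for a classifier trained on those labels to decide.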
