A Linked Open Data version of the IsA database
Sven Hertling
Heiko Paulheim

Hypernymy relations are an important asset in many applications and a central ingredient of Semantic Web ontologies. The IsA database is a large collection of such hypernymy relations extracted from the Common Crawl. WebIsALOD is the Linked Open Data version of the IsA database, containing 11.7M hypernymy relations, each provided with rich provenance information and confidence estimates.

Linked Open Data Endpoint

We provide a Linked Data endpoint using dereferenceable URIs. To browse the LOD endpoint, start with, e.g., the concept president.

Content negotiation is also provided. For example, you can retrieve the data for the above concept as N-Quads or CSV:


curl -v -H "Accept: application/n-quads" http://webisa.webdatacommons.org/concept/_president_
curl -v -H "Accept: text/csv" http://webisa.webdatacommons.org/concept/_president_
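The same content negotiation can be done from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_request` and `fetch` are illustrative and not part of WebIsALOD, and the only URL and media types used are the ones from the curl commands above.

```python
import urllib.request

# Concept URI taken from the curl example above.
CONCEPT_URL = "http://webisa.webdatacommons.org/concept/_president_"


def build_request(url: str, mime_type: str) -> urllib.request.Request:
    """Prepare a GET request whose Accept header names the wanted format.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    return urllib.request.Request(url, headers={"Accept": mime_type})


def fetch(url: str, mime_type: str, timeout: float = 30.0) -> str:
    """Send the request and return the response body decoded as text."""
    request = build_request(url, mime_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch(CONCEPT_URL, "text/csv")` or `fetch(CONCEPT_URL, "application/n-quads")` should then return the same representations as the two curl commands, assuming the server is reachable.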

Schema

Example depiction of a hypernymy relation with its metadata:


SPARQL Endpoint

The SPARQL endpoint is available at /sparql.

Patterns

All patterns used for the extraction can be retrieved with the following query (Open results in browser):


PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX isao: <http://isadb.webdatacommons.org/ontology#>

SELECT *
WHERE {
   ?pattern_activity a prov:Activity;
       prov:used ?pattern.
   ?pattern a prov:Entity;
       rdfs:label ?pattern_label;
       rdfs:comment ?pattern_comment;
       prov:wasDerivedFrom ?pattern_source;
       isao:hasRegex ?pattern_regex;
       isao:hasType ?pattern_type.
}
ORDER BY ?pattern_label
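Queries like this one can also be sent programmatically over the standard SPARQL protocol. The sketch below is a hedged example, not an official client: it assumes that /sparql resolves against http://webisa.webdatacommons.org and that the endpoint can return SPARQL JSON results. The helper names are hypothetical, and the embedded query is a trimmed variant of the pattern query that selects only labels and regular expressions.

```python
import json
import urllib.parse
import urllib.request

# Assumption: "/sparql" is relative to the site root used elsewhere on this page.
ENDPOINT = "http://webisa.webdatacommons.org/sparql"

# Trimmed variant of the pattern query: labels and regexes only.
PATTERN_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX isao: <http://isadb.webdatacommons.org/ontology#>

SELECT ?pattern_label ?pattern_regex
WHERE {
   ?pattern a prov:Entity;
       rdfs:label ?pattern_label;
       isao:hasRegex ?pattern_regex.
}
ORDER BY ?pattern_label
"""


def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(results_json: str) -> list:
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    return [
        # Unbound variables are simply absent from a binding, so skip them.
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in doc["results"]["bindings"]
    ]


def run_query(endpoint: str, query: str, timeout: float = 60.0) -> list:
    """Send the query and return one dict per result row."""
    request = build_sparql_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return bindings_to_rows(response.read().decode("utf-8"))
```

With network access, `run_query(ENDPOINT, PATTERN_QUERY)` would yield one dictionary per pattern, keyed by `pattern_label` and `pattern_regex`. Splitting request construction from result parsing keeps both parts usable without contacting the endpoint.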

Dataset description

The VoID file is located at http://webisa.webdatacommons.org/.well-known/void
The dataset is also described on Datahub under the name webisalod.

Type breakdown of the instances linked to DBpedia


Data Dumps

Links to the dumps of the dataset (gzipped n-quads):

Crowdsourcing results

Templates:

Results:

Code Repository

The code repository with all results is hosted on GitHub: sven-h/webisalod

Citing WebIsALOD