AIFB DataSet

From Wikiversity

AIFB DataSet is a Semantic Web (RDF) dataset used as a benchmark in data mining.

Get the data

The dataset is distributed from https://figshare.com/articles/AIFB_DataSet/745364.

1

Download the data file. In which file format is the data encoded?

Notation3
RDF XML
JSON-LD

2

Which ontology does it use?

SWRC
FOAF
SIOC


Get context

The dataset was used in the paper "Kernel Methods for Mining Instance Data in Ontologies". Find and read the description of the dataset on page 10 of the paper.

How many instances of the class "Person" does the paper record?

2,547
1,058
1,232


Python

Set up a Python environment with rdflib installed, load the AIFB file, and count the number of times the "affiliation" property is used:

from rdflib import Graph, URIRef

g = Graph()
# Read the N3 file into the graph
g.parse('aifbfixed_complete.n3', format='n3')
# Count the triples whose predicate is swrc:affiliation
len(list(g.triples((None, URIRef("http://swrc.ontoware.org/ontology#affiliation"), None))))

The URIs of the affiliations can be collected into a set with:

affiliations = g.triples((None, URIRef("http://swrc.ontoware.org/ontology#affiliation"), None))
groups = set(affiliation[2] for affiliation in affiliations)

How many different affiliations are there?

Find the names of the affiliations via the "http://swrc.ontoware.org/ontology#name" property.