Research in programming Wikidata/University

From Wikiversity

This research studies the Wikidata object "university". Using SPARQL queries to Wikidata, the following tasks are solved:

  • building a list of all universities;
  • building a bubble chart showing the ratio of universities in different countries;
  • mapping the universities located in Russia.

In the course of the work, information was added to the Wikidata objects corresponding to the universities of Russia, and conclusions were drawn about the completeness of the data presented in Wikipedia and in Wikidata.

List of universities

Let's create a list of all universities.

#added 2017-02
#List of `instances of` "university" 
SELECT ?university ?universityLabel
WHERE
{
    ?university wdt:P31 wd:Q3918.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

The SPARQL query returns 13298 records.

👍 The most complete and detailed university objects on Wikidata are: University of Tokyo, Massachusetts Institute of Technology, Moscow State University

👎 The university objects that turned out to be almost empty and uninformative are: Moscow State University of Food Production, Technical University of UMMC, Saratov State Socio-Economic University

Completeness of Wikidata: world universities

According to the Webometrics Ranking of World Universities, an international university ranking system, there are more than 19 thousand universities on Earth. Its main list [1] includes about 12 thousand of them.

The non-profit rating [2] includes information on 12,000 universities and colleges with a website.

According to the category Университеты по алфавиту ("Universities in alphabetical order") of the Russian Wikipedia, there are more than two and a half thousand universities. The English Wikipedia has no single list of all universities, but it has per-country lists and the category Universities and colleges by country.

The figures obtained from these categories differ from those used in the international university rankings. The reason is that many university pages are poorly filled in both Wikipedia and Wikidata: they contain too little information and are missing from the categories considered. For example, the English Wikipedia article Lincoln University of Business and Management belongs only to the category Companies based in Sharjah, which has nothing to do with universities.

Completeness of Wikidata: Russian universities

According to the website Statistics of Russian education[3] for 2004, there were 4157 universities in Russia. Let's build a SPARQL query to find out how many universities have information in Wikidata:

#added 2017-03
#Number of universities in Russia
SELECT ?university ?universityLabel
WHERE
{
    ?university wdt:P31 wd:Q3918;   # instance of university
                wdt:P17 wd:Q159;    # with country Russia
                rdfs:label ?item_label.

    # keep only the objects that have an English label
    FILTER (LANG(?item_label) = "en") .
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en".
    }
}

The SPARQL query returns 436 records.
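The query above lists the objects, so the total is read from the size of the result. As a minimal sketch, the number can also be requested directly with an aggregate. The variable name ?count is an illustrative choice, and the English-label restriction is kept only so that the count matches the listing above.

```sparql
#Sketch: count of universities in Russia that have an English label
SELECT (COUNT(DISTINCT ?university) AS ?count)
WHERE
{
    ?university wdt:P31 wd:Q3918;   # instance of university
                wdt:P17 wd:Q159;    # with country Russia
                rdfs:label ?item_label.
    FILTER (LANG(?item_label) = "en")   # same English-label restriction as in the listing
}
```

Dropping the rdfs:label pattern and the FILTER would also count the objects that have no English label.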

In the category Universities in Russia, the Russian Wikipedia has the page Project:Education/Lists/Universities in Russia, which lists more than 350 universities. The category Universities in Russia of the English Wikipedia covers 77 Russian universities. Thus, Wikipedia provides information on only about a tenth of the universities in Russia.

Universities in different countries

Let's build a bubble chart showing the ratio of the number of universities in different countries of the world.

#added 2017-03
#Universities in different countries
#defaultView:BubbleChart
SELECT ?country ?countryLabel (COUNT(*) AS ?count)
WHERE
{
    ?university wdt:P31 wd:Q3918 ;   # instance of university
                wdt:P17 ?country.    # country of the university
    OPTIONAL {
        ?country rdfs:label ?countryLabel
        FILTER (LANG(?countryLabel) = "en")
    }
}
GROUP BY ?country ?countryLabel
ORDER BY DESC(?count)

The SPARQL query returns 199 records.

The result is shown in the screenshot below.

Bubble chart visualization of universities in different countries


Altogether, out of 250 countries, about two hundred have universities. The leaders in the number of universities are the United States of America with 1,608 universities, India with 930, and Japan with 836. Russia is in fifth place with 453 universities.

The map of Russian universities

Let's display the universities located in Russia ("country" property) on a map. If a university has a website ("official website" property), it is shown in the tooltip.

#added 2017-02
#Locations of universities in Russia
#defaultView:Map
SELECT ?universityLabel ?universityDescription ?website ?coord
WHERE {
    ?university wdt:P31 wd:Q3918;   # instance of university
                wdt:P17 wd:Q159;    # with country Russia
                wdt:P625 ?coord.    # coordinate location
    OPTIONAL {
        ?university wdt:P856 ?website   # official website
    }
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en,ru".
    }
}

The SPARQL query returns 205 records.

Incorrect filling

PetrSU

As a result, Petrozavodsk State University (PetrSU) is not displayed on the map. The reason is that the Wikidata object corresponding to PetrSU did not have the property "coordinate location". To fix this, coordinates were added to the PetrSU object.
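Other objects with the same gap can be looked for in one pass. The following is a minimal sketch, not part of the original research: it lists Russian universities that have no "coordinate location" (P625) statement at all, using FILTER NOT EXISTS.

```sparql
#Sketch: universities in Russia without the property "coordinate location"
SELECT ?university ?universityLabel
WHERE
{
    ?university wdt:P31 wd:Q3918;   # instance of university
                wdt:P17 wd:Q159.    # with country Russia
    # keep only the objects that have no coordinate location
    FILTER NOT EXISTS { ?university wdt:P625 ?coord }
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en,ru".
    }
}
```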

TSU

As the screenshot below shows, Tver State University (TSU) is displayed on the map in the Atlantic Ocean. This is also caused by an incorrectly filled "coordinate location" property. This problem was fixed as well.

An incorrectly filled "coordinate location" property caused Tver State University to float in the ocean
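Misplaced points of this kind can also be searched for with a query instead of by eye. The sketch below is an illustration under stated assumptions: it reads latitude and longitude from the full coordinate value and flags points outside a rough bounding box for Russia. The bounds (latitude 41 to 82, longitude east of 19 or west of -169) are approximate values chosen for this example, so the result is only a list of candidates to inspect manually.

```sparql
#Sketch: Russian universities whose coordinates fall outside a rough bounding box
SELECT ?university ?universityLabel ?lat ?lon
WHERE
{
    ?university wdt:P31 wd:Q3918;          # instance of university
                wdt:P17 wd:Q159;           # with country Russia
                p:P625/psv:P625 ?node.     # full value node of coordinate location
    ?node wikibase:geoLatitude ?lat;
          wikibase:geoLongitude ?lon.
    # assumed approximate bounds; Russia crosses the 180th meridian,
    # so valid longitudes are >= 19 or <= -169
    FILTER (?lat < 41 || ?lat > 82 || (?lon > -169 && ?lon < 19))
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en,ru".
    }
}
```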


100 objects

Location data were added to the Wikidata objects of one hundred universities. To check the result, run the map SPARQL query presented above once again. It now returns 308 records.

Map of Wikidata objects corresponding to Russian universities and having the property "coordinate location"


Future work

1. List all universities named after someone ("named after" property). For this list:

  • find out which people universities are named after and what professions they had. The answer should be in the form of a bubble chart.
  • find out whether universities are named after living or dead people. How many of each are there, and in which countries?
  • draw a histogram where the X axis is the number of years from the death of the person who gave the name to the university to the institution's founding, and the Y axis is the number of such universities. For example, if A. A. Ivanov died in 1825 and the university bearing his name was established in 1850, then one unit is added to the histogram bar at 25 on the X axis.

2. Find universities with founders ("founded by" property). Mark them on the map.

3. Make a ranking of universities by the number of awards according to Wikidata ("award received" property).
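As a starting point for the histogram in task 1, the year gaps could be computed as in the sketch below. It is only an illustration under stated assumptions: "inception" (P571) is used as a stand-in for the date the university received its name, the namesake is required to be a human (Q5) with a "date of death" (P570), and ?gap is an illustrative variable name.

```sparql
#Sketch: years between the namesake's death and the university's inception
SELECT ?gap (COUNT(DISTINCT ?university) AS ?count)
WHERE
{
    ?university wdt:P31 wd:Q3918;       # instance of university
                wdt:P138 ?person;       # named after
                wdt:P571 ?inception.    # inception (assumed stand-in for the naming date)
    ?person wdt:P31 wd:Q5;              # the namesake is a human
            wdt:P570 ?death.            # date of death
    BIND (YEAR(?inception) - YEAR(?death) AS ?gap)
}
GROUP BY ?gap
ORDER BY ?gap
```

Negative values of ?gap correspond to universities founded before the namesake's death. They may need to be treated separately, since such a university was probably renamed later.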

Tests

1. Match the names of universities with their logos (logos 1, 2, 3, 4):

  • University of Tokyo
  • International Hellenic University
  • University of Cyprus
  • Tyumen State University

2. Arrange the countries in descending order of the number of universities:

  • Russia
  • Japan
  • India
  • USA

Keys (SPARQL queries):

References