Research in programming Wikidata/Countries

From Wikiversity

This research studies countries using the knowledge base of the international Wikidata project. SPARQL queries were used to analyse and compare "country" objects in Wikidata. Three lists were generated: all currently existing countries, countries ordered by date of creation, and demonyms of countries. A bubble chart of the forms of government of countries and a graph of neighboring countries were also constructed. In addition, conclusions were drawn about the completeness of Wikidata on this topic.

List of countries

Let's list all countries.

#added 2017-02
#List of instances of "country" (Q6256)
SELECT ?country ?countryLabel
WHERE
{
    ?country wdt:P31 wd:Q6256.    #instance of country
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query, 202 results.
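A SPARQL endpoint can return results in the standard SPARQL JSON results format, and the later analyses are easier to run on plain rows. The following is a minimal Python sketch of that flattening step. The `SAMPLE` text is a hand-written two-row response rather than real query output, and the helper names `rows_from_sparql_json` and `qid` are hypothetical.

```python
import json

# Hand-written sample in the SPARQL 1.1 JSON results format.
# It is NOT real query output; it only mirrors the shape of a response.
SAMPLE = """
{
  "head": {"vars": ["country", "countryLabel"]},
  "results": {"bindings": [
    {"country": {"type": "uri", "value": "http://www.wikidata.org/entity/Q159"},
     "countryLabel": {"type": "literal", "xml:lang": "en", "value": "Russia"}},
    {"country": {"type": "uri", "value": "http://www.wikidata.org/entity/Q711"},
     "countryLabel": {"type": "literal", "xml:lang": "en", "value": "Mongolia"}}
  ]}
}
"""


def rows_from_sparql_json(text):
    """Flatten a SPARQL JSON result into a list of plain dicts."""
    data = json.loads(text)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # Unbound variables are absent from a binding, hence .get().
        rows.append({v: binding.get(v, {}).get("value") for v in variables})
    return rows


def qid(entity_uri):
    """Return the trailing item identifier of an entity URI, e.g. 'Q159'."""
    return entity_uri.rsplit("/", 1)[-1]


rows = rows_from_sparql_json(SAMPLE)
for row in rows:
    print(qid(row["country"]), row["countryLabel"])
```

Keeping the parser separate from any network call means the same function works on a saved response file.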

👍 Examples of the most complete and well-developed country objects in Wikidata include United States of America, Canada, and Spain.

👎 Nearly empty and uninformative country objects include Sahrawi Arab Democratic Republic, Transnistria, and Kosovo.

Age of countries

Let's build a list of countries sorted by the date of the country's foundation (the first mention of the country).

Given:

#List of countries sorted by inception
SELECT ?country ?countryLabel ?inception
WHERE
{
    ?country wdt:P31 wd:Q6256.
    ?country wdt:P571 ?inception .
    
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

ORDER BY (?inception)

SPARQL query, 112 results.

The query returns a list of countries with their dates of creation. For example: Russia (Q159) - Jan 1 0862, Kosovo (Q1246) - Feb 17 2008, South Sudan (Q958) - Jul 9 2011.
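To work with the inception values outside the query service, the year can be extracted from each timestamp and used as a sort key. The Python sketch below uses the three examples quoted above. The timestamp layout (ISO 8601 with a zero-padded year and an optional leading sign) is an assumption about the output format, and `inception_year` is a hypothetical helper.

```python
# Sample rows built from the three examples quoted in the text.
# The timestamp layout is an assumption about how the values are serialized.
rows = [
    ("Kosovo", "2008-02-17T00:00:00Z"),
    ("Russia", "0862-01-01T00:00:00Z"),
    ("South Sudan", "2011-07-09T00:00:00Z"),
]


def inception_year(timestamp):
    """Extract the year as an int; a leading '-' marks a BCE year."""
    sign = -1 if timestamp.startswith("-") else 1
    year_text = timestamp.lstrip("+-").split("-", 1)[0]
    return sign * int(year_text)


# Sort by year only; rows within the same year keep their input order.
oldest_first = sorted(rows, key=lambda row: inception_year(row[1]))
for name, stamp in oldest_first:
    print(inception_year(stamp), name)
```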

Completeness of Wikidata

Let's analyze the completeness of Wikidata.

According to the "Russian classification of countries of the world" there are 251 countries on earth.[1]

This task does not take into account ancient states that no longer exist (for example, Assyria (Q41137)), since they are "former country" objects rather than "country" objects. Note that the number of former countries is an order of magnitude greater than the number of existing countries (see the SPARQL query, which returns more than two thousand former countries).

According to the "Alphabetical list of countries and territories" category in the Russian Wikipedia, there are 252 countries.

According to the "List of sovereign states" category in the English Wikipedia, there are 206 countries.

It is not always possible to specify the exact date of a country's foundation, because written sources may be absent, scarce or inconsistent. For example, the foundation of the Old Russian state is associated with the calling of the Varangian prince Rurik in 862, but no exact date is known (object Russia (Q159)). In addition, some modern countries were preceded by a series of earlier states, and it is an open question which of their formation dates should count as the date of creation of the country (for example, Mongolia (Q711)).

Countries with an unfilled inception

Given:

#List of countries without an inception
SELECT ?country ?countryLabel 
WHERE
{
    ?country wdt:P31 wd:Q6256. #country
    
    MINUS { ?country wdt:P571 [] } . #inception of country
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query, 100 results.

So, as of March 6, 2017, 100 of the 198 Wikidata entries for currently existing countries have no year of foundation filled in.
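The MINUS clause in the query above behaves like a set difference: it removes every country that has at least one inception statement. A minimal Python sketch of the same idea follows. The item identifiers appear elsewhere in this text, but which of them carry an inception value here is invented purely for illustration.

```python
# Toy data: the identifiers are real items mentioned in the text, but the
# split into "has inception" / "has none" is invented for illustration only.
countries = {"Q159", "Q1246", "Q958", "Q711"}
has_inception = {"Q159", "Q1246", "Q958"}

# MINUS { ?country wdt:P571 [] } keeps only countries with no inception.
missing_inception = countries - has_inception

# The two groups partition the full set of countries.
assert len(has_inception) + len(missing_inception) == len(countries)

print(sorted(missing_inception))
```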

List of demonyms in English

Let's build a list of countries that have demonyms in English.

Given:

#List of countries with demonyms in English
SELECT ?country ?countryLabel 
WHERE
{
	?country wdt:P31 wd:Q6256.       #country
	?country wdt:P1549 ?demonym .    #demonym
	FILTER((LANG(?demonym)) = "en")
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

GROUP BY ?country ?countryLabel

SPARQL query, 197 results.

As of April 26, 2017, 197 of the 202 countries in Wikidata have a demonym in English.

List of demonyms

Let's build a list of all demonyms in English.

#List of demonyms in English
SELECT ?country ?countryLabel ?demonym
WHERE
{
	?country wdt:P31 wd:Q6256.      #country
	?country wdt:P1549 ?demonym .   #demonym
	FILTER((LANG(?demonym)) = "en")
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query, 237 results.

As of April 27, 2017, Wikidata contains 237 filled English demonyms for countries.

Countries with unfilled demonyms

Let's build a list of countries which do not have demonyms in English.

#List of countries without demonyms in English
SELECT ?country ?countryLabel 
WHERE
{
	?country wdt:P31 wd:Q6256.              # country
	MINUS {
		?country wdt:P1549 ?demonym.        # except with demonyms
		FILTER((LANG(?demonym)) = "en")     # in English
	}

	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?country ?countryLabel

SPARQL query, 5 results.

As of April 27, 2017, 5 of the 202 countries in Wikidata have no demonym in English.
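The demonym counts reported above can be turned into completeness shares with simple arithmetic. The sketch below is only an illustration of that calculation; `share` is a hypothetical helper, and the input numbers are the ones reported in this section.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total_countries = 202      # countries returned by the first query
with_demonym = 197         # countries with an English demonym
without_demonym = 5        # countries without an English demonym

# The two groups should add up to the total number of countries.
assert with_demonym + without_demonym == total_countries

print(share(with_demonym, total_countries), "% filled")
print(share(without_demonym, total_countries), "% unfilled")
```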

Number of completed demonyms in countries

Let's display the list of countries, ordered by the number of demonyms filled in Wikidata.

#Count of demonyms in countries
SELECT  ?country ?countryLabel (count(*) as ?count)
WHERE
{
	?country wdt:P31 wd:Q6256.      #country
	?country wdt:P1549 ?demonym .   #demonym
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

GROUP BY ?country ?countryLabel 
ORDER BY DESC(?count)

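SPARQL query, 199 results.

The United States of America object contains the largest number of demonyms (41), followed by Great Britain (40), Germany (40), Canada (36) and Russia (34). Since the query has no language filter, these counts cover demonyms in all languages.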
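The GROUP BY and COUNT in the query above correspond to counting rows per country. A minimal Python sketch of the same aggregation is shown below. The sample rows are invented for illustration and are much shorter than real results; `demonym_counts` is a hypothetical helper.

```python
from collections import Counter

# Invented sample rows of (countryLabel, demonym); not real query output.
rows = [
    ("Canada", "Canadian"),
    ("Canada", "Canadien"),
    ("Russia", "Russian"),
    ("Canada", "Canadienne"),
]


def demonym_counts(rows):
    """Count demonym rows per country, largest count first."""
    counts = Counter(country for country, _ in rows)
    # Sort by count descending, then by label so that ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


for country, count in demonym_counts(rows):
    print(country, count)
```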

Basic forms of government

Let's construct a bubble chart of the forms of government of countries.

Given:

#basic form of government ranking
#defaultView:BubbleChart
SELECT ?bfog ?form (count(*) as ?count)
WHERE 
{
    ?country wdt:P31 wd:Q6256.
    ?country wdt:P122 ?bfog .
    OPTIONAL {
        ?bfog rdfs:label ?form
        filter (lang(?form) = "ru")    #labels are requested in Russian
    }
}
GROUP BY ?bfog ?form
ORDER BY DESC(?count) ASC(?form)

SPARQL query, 30 results.

Figure: Bubble chart of the forms of government of countries.


As a result of the query, we get a bubble chart of the forms of government of countries. The most common forms are the republic (20 countries), the constitutional monarchy (18 countries), the federal republic (18 countries), the parliamentary republic (17 countries) and the presidential system (11 countries).

Neighboring countries

We will construct a graph of neighboring countries.

Given:

#neighboring countries graph
#defaultView:Graph
SELECT ?country ?countryLabel ?sharesBorderWith ?sharesBorderWithLabel
WHERE
{
    ?country wdt:P31 wd:Q6256.                          #country
    OPTIONAL { ?country wdt:P47 ?sharesBorderWith . }   #shares border with, if any

    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query, 795 results.

Figure: Graph of neighboring countries.


As a result of the query, we get a graph with 787 edges, where each edge links two neighboring countries. The graph consists of several connected components, since island countries (for example, Mauritius, Maldives, Madagascar) have no neighbors.
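The connected components of such a graph can be found with a breadth-first search over the result rows. The Python sketch below is one possible way to do it, under stated assumptions: the border pairs are a small hand-picked sample and not the query output, a `None` neighbor stands for a row where the OPTIONAL part was unbound, and `connected_components` is a hypothetical helper.

```python
from collections import defaultdict, deque

# Hand-picked illustrative rows of (country, neighbor); NOT query output.
# None mimics an unbound ?sharesBorderWith for a country with no neighbors.
rows = [
    ("France", "Spain"), ("Spain", "France"),
    ("Spain", "Portugal"), ("Portugal", "Spain"),
    ("France", "Germany"), ("Germany", "France"),
    ("Canada", "United States of America"),
    ("United States of America", "Canada"),
    ("Mauritius", None), ("Maldives", None), ("Madagascar", None),
]


def connected_components(rows):
    """Group countries into connected components of the border graph."""
    adjacency = defaultdict(set)
    for country, neighbor in rows:
        adjacency[country]  # register the node even if it has no neighbors
        if neighbor is not None:
            adjacency[country].add(neighbor)
            adjacency[neighbor].add(country)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:  # breadth-first search from the start node
            node = queue.popleft()
            component.add(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


components = connected_components(rows)
isolated = sorted(next(iter(c)) for c in components if len(c) == 1)
print(len(components), "components; isolated:", isolated)
```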

Future work

  1. For each country, display its flag and motto.
  2. Draw a map with the capitals of all existing countries marked.
  3. For each continent, find the five countries with the highest population density.
  4. Construct a bar chart showing the distribution of the number of countries by form of government. Estimate whether this distribution is heavy-tailed.
  5. Build a list of countries ordered by the number of neighbors. Which countries have the maximum and minimum number of neighbors, and what is the average number of neighbors? Is there a correlation between this index and some other parameter of the countries?
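As a starting point for the last item, the number of neighbors per country can be computed from a list of border pairs. The Python sketch below shows the idea on a small hand-picked sample. The data is illustrative and not taken from Wikidata, and `neighbor_counts` is a hypothetical helper.

```python
from collections import defaultdict

# Hand-picked illustrative border pairs; NOT output of any query.
border_pairs = [
    ("France", "Spain"),
    ("France", "Germany"),
    ("Spain", "Portugal"),
    ("Canada", "United States of America"),
]
# Full country list, so that countries without neighbors are counted too.
all_countries = [
    "France", "Spain", "Germany", "Portugal",
    "Canada", "United States of America", "Mauritius",
]


def neighbor_counts(countries, pairs):
    """Return {country: number of distinct neighbors}."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {c: len(neighbors[c]) for c in countries}


counts = neighbor_counts(all_countries, border_pairs)
# Most neighbors first; ties are broken alphabetically.
ranking = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
average = sum(counts.values()) / len(counts)

print("max:", ranking[0], "min:", ranking[-1], "average:", round(average, 2))
```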

Tasks

1 Which of today's two hundred existing countries emerged in the years when the most countries were formed?

1821, 1918, 1971, 1991
16 countries: Russia, Moldova, Belarus, Ukraine, Estonia, Slovenia, Republic of Macedonia, Croatia, Azerbaijan, Georgia, Kazakhstan, Uzbekistan, Armenia, Kyrgyzstan, Tajikistan
6 countries: Greece, Peru, Guatemala, Honduras, Costa Rica, Nicaragua
5 countries: Latvia, Lithuania, Poland, Estonia, Georgia
4 countries: Bangladesh, Bahrain, Qatar, Sri Lanka

2 Latvia has 119, Thailand 77, Denmark 5, and Russia 81. What are we talking about?

Number of cities with a population of more than one million
Number of higher education institutions
Number of Administrative Units
Number of official languages

3 Israel: area 20770 square kilometers, population 8463400 people. Mongolia: area 1566000 square kilometers, population 2953190 people. Republic of Korea: area 100295 square kilometers, population 50219669 people. Singapore: area 719.1 square kilometers, population 5781728 people.
Arrange the flags of these Asian countries in order of increasing population density.

1st place, 2nd place, 3rd place, 4th place
Flag of South Korea.svg
Flag of Singapore.svg
Flag of Israel.svg
Flag of Mongolia.svg
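Population density is the population divided by the area, so the ordering asked for in task 3 can be checked with a few lines of Python. The sketch below uses the figures exactly as quoted in the task; `density` is a hypothetical helper.

```python
# Figures as quoted in task 3: (area in square kilometers, population).
countries = {
    "Israel": (20770, 8463400),
    "Mongolia": (1566000, 2953190),
    "Republic of Korea": (100295, 50219669),
    "Singapore": (719.1, 5781728),
}


def density(area_km2, population):
    """Population density in people per square kilometer."""
    return population / area_km2


# Sort country names by increasing population density.
by_density = sorted(countries, key=lambda name: density(*countries[name]))
for name in by_density:
    print(f"{name}: {density(*countries[name]):.1f} people per square kilometer")
```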

4 Which of these languages are official in Russia?

Abaza
Moksha
Erzya
Belarusian

SPARQL queries with answers:

References

  1. Classification of countries, 2016.

Links