Research in programming Wikidata/Cities

This research studies sister cities using Wikidata. With the help of the Wikidata Query Service, city-related items were counted and the following information was gathered:

  • Number of cities with no sister cities
  • List of cities ordered by number of sister cities
  • Number of cities with a given number of sister cities (as a graph)

Item lists

«City»

  • Wikidata item: Q515
SELECT ?city ?cityLabel WHERE {
  ?city wdt:P31 wd:Q515.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query at query.wikidata.org (20800 items as of 3 May 2017)

The most complete items include San Francisco, Berlin, Petrozavodsk, …

Almost empty items include Madinat Zayed, Muzaffarpur, Willow River, …

«Big city»

SELECT ?city ?cityLabel WHERE {
  ?city wdt:P31 wd:Q1549591.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query at query.wikidata.org (198 items as of 3 May 2017)

The most complete items include Bern, Berlin, Geneva, …

Almost empty items include Balanga (Nigeria), Ungaran, Kayes, …

«City» and «Big city»

SELECT ?city ?cityLabel WHERE {
  { ?city wdt:P31 wd:Q515 } UNION
  { ?city wdt:P31 wd:Q1549591 }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

SPARQL query at query.wikidata.org (20998 items as of 3 May 2017)

Tasks

How many cities don't have a single sister city?

Used:

SELECT (COUNT(?city) as ?count) WHERE {                             # Counting items ... 
  ?city wdt:P31 wd:Q515.                                            # ... which are cities ...
  FILTER NOT EXISTS { ?city wdt:P190 [] }                           # ... with unfilled property "sister city"
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query at query.wikidata.org (17823 cities as of 3 May 2017)

List of cities ordered by number of sister cities

All

Used:

SELECT ?cityLabel (COUNT(?item) AS ?count) WHERE {                   # Counting sister cities  ...
  ?city wdt:P31 wd:Q515.                                             # ... of cities ...
  ?city wdt:P190 ?item.                                              # ... with filled property "sister city"
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?city ?cityLabel                                            # Grouping by city
ORDER BY DESC(?count)                                                # Sorting by number of sister cities (descending)

SPARQL query at query.wikidata.org (2973 records as of 12 May 2017)

Russia

Used:

SELECT ?cityLabel (COUNT(?item) AS ?count) WHERE {                   # Counting sister cities  ...
  ?city wdt:P31 wd:Q515.                                             # ... of cities ...
  ?city wdt:P17 wd:Q159.                                             # ... belonging to Russia ...
  ?city wdt:P190 ?item.                                              # ... with filled property "sister city"
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?city ?cityLabel                                            # Grouping by city
ORDER BY DESC(?count)                                                # Sorting by number of sister cities (descending)

SPARQL query at query.wikidata.org (207 records as of 12 May 2017)

Number of cities with a given number of sister cities

All

Used:

#defaultView:LineChart                                                   # Show the result as a line chart
SELECT ?haveNSisterCities (COUNT(?haveNSisterCities) AS ?qty) WHERE {    # For each sister city count N,
                                                                         # count the cities that have exactly N sister cities
  {
    SELECT (COUNT(?item) AS ?haveNSisterCities) WHERE {                  # Count sister cities ...
      ?city wdt:P31 wd:Q515.                                             # ... of cities ...
      ?city wdt:P190 ?item.                                              # ... that do have them
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    GROUP BY ?city                                                       # One row per city
    ORDER BY DESC(?haveNSisterCities)                                    # Order by sister city count (descending)
  }
}
GROUP BY ?haveNSisterCities                                              # Group by sister city count
ORDER BY DESC(?haveNSisterCities)                                        # Order by sister city count (descending)

SPARQL query at query.wikidata.org

Russia

Used:

#defaultView:LineChart                                                   # Show the result as a line chart
SELECT ?haveNSisterCities (COUNT(?haveNSisterCities) AS ?qty) WHERE {    # For each sister city count N,
                                                                         # count the cities that have exactly N sister cities
  {
    SELECT (COUNT(?item) AS ?haveNSisterCities) WHERE {                  # Count sister cities ...
      ?city wdt:P31 wd:Q515.                                             # ... of cities ...
      ?city wdt:P17 wd:Q159.                                             # ... belonging to Russia ...
      ?city wdt:P190 ?item.                                              # ... that do have them
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    GROUP BY ?city                                                       # One row per city
    ORDER BY DESC(?haveNSisterCities)                                    # Order by sister city count (descending)
  }
}
GROUP BY ?haveNSisterCities                                              # Group by sister city count
ORDER BY DESC(?haveNSisterCities)                                        # Order by sister city count (descending)

SPARQL query at query.wikidata.org

Relation between the number of sister cities a city has (S) and the number of cities that have that many sister cities (N)

As the graph above shows, most of the 207 Russian cities that have sister cities (78%, or 162 cities) have one to five sister cities.
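The share quoted above is consistent with the Russian list earlier on this page: 162 of 207 cities is about 78%. The two numbers can be checked with one query, sketched below. It is not the query used by the authors. It reuses the inner grouping of the Russian chart query, and the variable names ?n, ?total and ?oneToFive are introduced here for illustration only. Current results will most likely differ from the May 2017 figures.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT (COUNT(?city) AS ?total)                        # Russian cities with at least one sister city
       (SUM(IF(?n <= 5, 1, 0)) AS ?oneToFive) WHERE {  # ... of which those with one to five
  {
    SELECT ?city (COUNT(?item) AS ?n) WHERE {          # Count sister cities ...
      ?city wdt:P31 wd:Q515.                           # ... of cities ...
      ?city wdt:P17 wd:Q159.                           # ... belonging to Russia ...
      ?city wdt:P190 ?item.                            # ... that do have them
    }
    GROUP BY ?city                                     # One row per city
  }
}
```

Dividing ?oneToFive by ?total gives the share. Every city in the subquery has at least one sister city, so the condition ?n <= 5 selects exactly the range from one to five.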

Wikidata completeness

A city is a type of human settlement whose inhabitants are mostly not engaged in agriculture. The difficulty is that countries use different criteria to designate a settlement as a city, most commonly the number of inhabitants. Some countries do not define the term "city" at all: in France, for example, there is only one settlement unit of this kind, the commune, regardless of population or occupation. It is therefore uncertain whether some settlements should be included in a list of cities.

According to the 2010 Russian Census, there were 1100 cities in Russia.[1]

The number of Russian cities that have an article is 1113 in the Russian Wikipedia and 1110 in the English Wikipedia.

The number of Wikidata items that are Russian cities is 1122.[2] Since this exceeds the census figure, Wikidata appears to cover at least the Russian cities completely.
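A count of this kind can be obtained with a query along the following lines. This is only a sketch: the query behind reference [2] is not reproduced on this page, so the class (Q515) and the restriction to Russia through country (P17, Q159) are assumptions carried over from the queries above. The result today will probably not match the 2017 figure.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT (COUNT(DISTINCT ?city) AS ?count) WHERE {   # Counting distinct items ...
  ?city wdt:P31 wd:Q515.                           # ... which are cities ...
  ?city wdt:P17 wd:Q159.                           # ... belonging to Russia
}
```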

Another problem with analyzing cities in Wikidata is related to the property "instance of" (P31). A given city may be marked as a city, as a big city, or as both. Some cities with more than 100000 inhabitants (such as Petrozavodsk) are marked as "city", which makes it hard to analyze all cities through a single class. The solution is to query items like this:

?X wdt:P31/wdt:P279* ?Y

In this case ?X matches every item that is an instance of ?Y itself or of any direct or indirect subclass of ?Y.
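Applied to cities, the pattern could look like the sketch below, with ?Y fixed to "city" (Q515). It assumes that "big city" (Q1549591) and similar classes are linked to Q515 through "subclass of" (P279) in Wikidata; this has not been verified here. DISTINCT is used because one item can reach Q515 by more than one path and would otherwise be counted several times.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT (COUNT(DISTINCT ?city) AS ?count) WHERE {   # Counting distinct items ...
  ?city wdt:P31/wdt:P279* wd:Q515.                 # ... that are instances of "city" or of any of its subclasses
}
```

The same triple pattern can replace the line ?city wdt:P31 wd:Q515. in the queries above. Broad property paths are more expensive to evaluate, so such a query may run noticeably longer.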

Future work

  1. Plot a diagram of Russian sister cities.
  2. Get a list of Russian cities situated beyond the Arctic Circle.
  3. Find which rivers of Russia have the most cities located on them.

Tests

1

Which of these cities were named after toponyms?

Tolyatti
Tula
Chernyakhovsk
Kurilsk
Vologda
Obninsk

2

Which of these flags belong to the following cities: Nizhnevartovsk, Petropavlovsk-Kamchatsky, Neftekamsk, Karabulak?

Flag of Neftekamsk.svg
Flag of Karabulak (Ingushetia).png
Steag aneniinoi.jpg
Flag of Nizhnevartovsk (Khanty-Mansia).svg
Flag of Pastavy.svg
Flag of Petropavlovsk-Kamchatsky (Kamchatka krai).png

3

Which of these cities were founded more than 400 years ago?

Moscow
Sarov
Kazan
Astrakhan
Samara
Voronezh

Check yourself:

  1. cities named after toponyms
  2. flags of cities
  3. cities founded more than 400 years ago

Addon

  1. Total number of sister city statements per country
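The query behind this item is not shown on this page. One way to obtain such totals, written in the style of the list queries above, is sketched here. It assumes that the country of a city is taken from the property "country" (P17); a city with several P17 values would be counted once for each of them.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?country ?countryLabel (COUNT(?item) AS ?count) WHERE {       # Counting sister city statements ...
  ?city wdt:P31 wd:Q515.                                             # ... of cities ...
  ?city wdt:P17 ?country.                                            # ... for each country ...
  ?city wdt:P190 ?item.                                              # ... with filled property "sister city"
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?country ?countryLabel                                      # Grouping by country
ORDER BY DESC(?count)                                                # Sorting by number of statements (descending)
```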

References

Links