Research in programming Wikidata/Newspapers

From Wikiversity

The newspaper is a printed periodical. This article studies the Wikidata items that are instances of "newspaper". Using SPARQL queries over Wikidata, the following was established: 106 newspapers in Wikidata are geo-referenced (have the "coordinate location" property), and most of them are located in cities in Europe and America. The most popular newspaper genres in the world were found to be satire, information, analytics, scientific journal and omniscience.

Instances of the object "Newspapers"

Let's create a list of newspapers around the world using the following query:

SELECT ?newspaper ?newspaperLabel WHERE {
  ?newspaper wdt:P31 wd:Q11032.   # instance of newspaper
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

SPARQL-query, 14949 results.

List of newspapers that have a label in Russian and English:

SELECT ?newspaper ?label_en ?label_ru WHERE {
  ?newspaper wdt:P31 wd:Q11032.      # instance of newspaper
  ?newspaper rdfs:label ?label_en.
  ?newspaper rdfs:label ?label_ru.
  FILTER(LANG(?label_en) = "en")     # keep the English label
  FILTER(LANG(?label_ru) = "ru")     # keep the Russian label
}

SPARQL-query, 364 results.

The second query returns fewer entries than the first because not every newspaper has labels in both Russian and English.
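To see how many newspapers are affected, one could count those that have an English label but no Russian one. The following is only a sketch: it was not part of the original study, and the variable names are chosen for illustration.

```sparql
# Sketch: newspapers with an English label but without a Russian label
SELECT (COUNT(DISTINCT ?newspaper) AS ?count) WHERE {
  ?newspaper wdt:P31 wd:Q11032.        # instance of newspaper
  ?newspaper rdfs:label ?label_en.
  FILTER(LANG(?label_en) = "en")
  FILTER NOT EXISTS {                  # exclude items that have a Russian label
    ?newspaper rdfs:label ?label_ru.
    FILTER(LANG(?label_ru) = "ru")
  }
}
```

Swapping "en" and "ru" would give the reverse count (Russian label but no English one).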

The most complete and well-developed newspaper items in Wikidata are:

The least informative newspaper items in Wikidata are:

Completeness of Wikidata

Let's analyze the completeness of Wikidata. According to the study guide [1], by 2009 more than 50,000 print media outlets had been registered in the Russian Federation, including 27425 newspapers and weeklies and 20433 journals.

According to the category List of newspapers in Russia in English Wikipedia, there are 16 daily newspapers in Russia, as well as 9 newspapers published one to four times a week.

According to the category List of national newspapers, a national newspaper is one that is distributed throughout the country, unlike a local newspaper, which is published in a particular city or region. There are 87 national newspapers, including those of the capital.

According to the category Russian newspapers in Russian Wikipedia, there are 115 newspapers printed in Russia. Many newspapers have not only a print edition but also a website; see, for example, the website of Russia Beyond the Headlines.

Judging by all the categories above, only 0.8% of the 27425 registered newspapers are represented in Wikidata. This indicates that Wikidata's coverage of newspapers is low.
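The estimate above is based on Wikipedia categories. As a cross-check, newspapers could also be counted directly in Wikidata. The following is a minimal sketch, assuming that country (P17) set to Russia (Q159) is an adequate filter for "Russian newspapers"; this was not part of the original analysis.

```sparql
# Sketch: count newspaper items whose country is Russia
SELECT (COUNT(DISTINCT ?newspaper) AS ?count) WHERE {
  ?newspaper wdt:P31 wd:Q11032;   # instance of newspaper
             wdt:P17 wd:Q159.     # country: Russia (assumed filter)
}
```

Dividing the resulting count by 27425 gives the share of registered newspapers that have a Wikidata item under this assumption.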

The genre of newspapers

Newspaper materials should have a clear focus: they should take careful account of the specific features of the audience of the country or group of countries for which the publication is intended [2]. There are three main genres in newspapers:

  1. informative
  2. analytical
  3. artistic and journalistic

The informative genre includes notes, reports, interviews and reportage. This genre promptly informs the audience about events that have taken place [3].

Analytical genres - correspondence, commentary, article, review, press review, letter, overview - cover a broader time span; they study and analyze a system of facts and situations and offer generalizations and conclusions [4].

Artistic and journalistic genres - sketch, feuilleton, pamphlet - have greater emotional force and use figurative, expressive language [5].

Let us construct a bubble diagram of the distribution of newspapers by genre.

#Distribution of newspapers by genre
SELECT ?genre ?form (COUNT(*) AS ?count) WHERE {
  ?newspaper wdt:P31 wd:Q11032.   # instance of newspaper
  ?newspaper wdt:P136 ?genre.     # genre of newspaper
  ?genre rdfs:label ?form.        # label of the genre
  FILTER(LANG(?form) = "ru")      # keep Russian labels only
}
GROUP BY ?genre ?form
ORDER BY DESC(?count) ASC(?form)

SPARQL-query, 19 results.

Fig. 1. Bubble diagram of newspapers with property genre

The most popular genres were: satire, information, analytics, scientific journal, omniscience.

Newspapers on the map

The property "coordinate location" gives the geographical coordinates of the city in which the newspaper is printed. We will plot the newspapers that have the "coordinate location" property on the world map. For example, the newspaper "Banner" has the coordinates 54°30'34"N, 36°14'59"E in its "coordinate location" property, and these correspond to the city of Kaluga. This means that "Banner" is published in Kaluga.

#Newspapers on the world map
SELECT ?newspaper ?location ?newspaperLabel WHERE {
  ?newspaper wdt:P31 wd:Q11032.   # instance of newspaper
  ?newspaper wdt:P625 ?location.  # coordinate location of newspaper
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

SPARQL-query, 95 results.

Fig. 2. Newspapers with the property coordinate location on the world map

This query shows that most newspapers with the "coordinate location" property have coordinates corresponding to cities in Europe and America.

Filling out the Wikidata

The property "genre" describes the manner and form in which a newspaper conveys information. For example, the newspaper "Work" has "information" as the value of its "genre" property, which means that it belongs to the information genre.

Let's construct a list of newspapers that have neither genre (P136) nor main subject (P921) filled in, to find out which newspapers need the "genre" and "main subject" properties added.

SELECT ?newspaper ?newspaperLabel WHERE {
  ?newspaper wdt:P31 wd:Q11032.           # instance of newspaper
  { ?newspaper wdt:P17 wd:Q34266 } UNION  # Russian Empire
  { ?newspaper wdt:P17 wd:Q15180 } UNION  # Soviet Union
  { ?newspaper wdt:P17 wd:Q159 }          # Russia
  MINUS { ?newspaper wdt:P136 [] }.       # newspaper without a genre
  MINUS { ?newspaper wdt:P921 [] }.       # newspaper without a main subject
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

SPARQL-query, 26 results.

In the course of this work, the genre and main subject properties were filled in for 100 newspaper items.

The last query returned a list of 26 newspapers that lacked the genre and main subject properties, and these properties were filled in for all 26. A further 74 items in the category Russian newspapers were also examined, and their genre and main subject properties were filled in as well, giving 100 items (newspapers) in total.

Let us construct a bubble diagram of the "main subject" property for newspapers worldwide in Wikidata:

SELECT ?subject ?form (COUNT(*) AS ?count) WHERE {
  ?newspaper wdt:P31 wd:Q11032.      # instance of newspaper
  ?newspaper wdt:P921 ?subject.      # main subject of newspaper
  ?subject rdfs:label ?form.         # label of the subject
  FILTER(LANG(?form) = "en")         # keep English labels only
}
GROUP BY ?subject ?form
ORDER BY DESC(?count) ?form

SPARQL-query, 68 results.

Fig. 3. Bubble diagram of newspapers with main subject in the world

This script showed that the most popular topics in the newspapers are:

  • news (66 newspapers),
  • politics (50 newspapers),
  • economic science (26 newspapers),
  • culture (21 newspapers),
  • sport (21 newspapers).

Let us construct a bubble diagram of the "genre" property for newspapers worldwide in Wikidata.
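The query for this diagram is not reproduced on this page. A plausible sketch, assuming it mirrors the main subject query above with genre (P136) in place of main subject (P921) and English labels, might look like this. The first line is the Wikidata Query Service directive that renders the result as a bubble chart.

```sparql
#defaultView:BubbleChart
SELECT ?genre ?form (COUNT(*) AS ?count) WHERE {
  ?newspaper wdt:P31 wd:Q11032.    # instance of newspaper
  ?newspaper wdt:P136 ?genre.      # genre of newspaper
  ?genre rdfs:label ?form.         # label of the genre
  FILTER(LANG(?form) = "en")       # assumption: English labels
}
GROUP BY ?genre ?form
ORDER BY DESC(?count) ?form
```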

SPARQL-query, 20 results.

Fig. 4. Bubble diagram of newspapers with genre in the world

The main newspaper genres are:

  • information (103),
  • satire (18),
  • analytics (5).

The number of genres (20) is much smaller than the number of main subjects (68), since a newspaper can have only one genre but can have several main subjects.

Future work

  1. List 20 newspapers together with their circulation, using the property quantity (Q41792217).
  2. Find the newspaper that has the longest printing history in Russia using the inception (P571) property.
  3. Create a diagram that clearly shows where in the world the most newspapers on political and economic topics are published. Use the main subject (P921) property.
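As a starting point for item 2, here is a minimal sketch. It assumes that "in Russia" can be expressed as country (P17) set to Russia (Q159) and that the earliest inception (P571) date is a reasonable proxy for the longest printing history; it does not check whether a newspaper is still being published.

```sparql
# Sketch: the ten Russian newspapers with the earliest inception date
SELECT ?newspaper ?newspaperLabel ?inception WHERE {
  ?newspaper wdt:P31 wd:Q11032;    # instance of newspaper
             wdt:P17 wd:Q159;      # country: Russia (assumed filter)
             wdt:P571 ?inception.  # inception date
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ASC(?inception)
LIMIT 10
```

Newspapers founded in the Russian Empire or the Soviet Union could be included by adding the corresponding UNION branches, as in the query in the section on filling out Wikidata.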

Test

1 The following newspapers are given: New Look, Prinevsky Krai, Private Correspondent. Their years of foundation are 1919, 1992 and 2008.
Match each newspaper to the year it was founded.

1919, 1992, 2008

2 Select the newspaper(s) that were printed only in Russia.

Bulletin of Manchuria

3 The following newspapers are given: Le Temps, Kyym, Pavlovsky-Posadskie Izvestia and Crimean truth. Their circulations are 29.6 thousand (Le Temps), 23 thousand (Kyym), 4050 (Pavlovsky-Posadskie Izvestia) and 30 thousand (Crimean truth). Le Temps was published in France (population 66.6 million); Kyym, Pavlovsky-Posadskie Izvestia and Crimean truth are Russian newspapers (population of Russia: 146.8 million), with Crimean truth published in Crimea (population 2.3 million). Calculate how many inhabitants there are per copy of each newspaper and give the answers in ascending order.

78.03, 2251.5, 6383, 36248
Le Temps
Pavlovsky-Posadskie Izvestia
Crimean truth

SPARQL queries with answers:

References

  1. Radhenko 2005, p. 5.
  2. Grabelnikov 2009, p. 233.
  3. Bobkov 2005, p. 8.
  4. Bobkov 2005, p. 16.
  5. Bobkov 2005, p. 6.

Literature

  • A.K. Bobkov (2005). "Newspaper genres: a tutorial" (in Russian). Irkutsk State University. p. 64. (PDF-version)