Research in programming Wikidata/Operating systems

From Wikiversity

This article explores the Wikidata item "operating system" and its properties. The following problems are solved with the help of SPARQL queries: finding instances of "operating system", and building lists of operating systems (OSs) by base, by creation time, and by the programming language in which the OS was written. A histogram is also constructed; it shows the number of programs written in each programming language and how many of them run on each OS. Many software items do not specify the programming language in which they were developed, so the property "programming language" was added to several items to improve the results.

Instances of the object "operating system"

Let's build a list of all the operating systems.

#added 2017-03
#List of `instances of` "operating system" 
SELECT ?os ?osLabel
WHERE
{
    ?os wdt:P31 wd:Q9135.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 510 results (January 2018), 1086 results (September 2020).

[+]> The most complete and detailed operating system items on Wikidata are: Linux, Windows, Windows 8

[-]> Nearly empty, less informative operating system items are: SPIN, JavaOS, Atari TOS, Xubuntu

According to ProWD, the only Russian operating system on Wikidata is Miraculix, which has 7 properties. The leaders in the number of properties among operating systems worldwide are Microsoft Windows and Windows 8, with 24 properties each.
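
The "number of properties" comparison can be approximated directly in the query service. The following is a minimal sketch, not the ProWD method: it assumes that the statement count exposed by the query service as wikibase:statements is an acceptable proxy for the number of properties (it counts statements, so a property with several values is counted several times).

```sparql
# Sketch: rank operating systems by the number of statements on the item.
# Assumption: wikibase:statements is used as a rough proxy for "number of properties".
SELECT ?os ?osLabel ?statements
WHERE
{
    ?os wdt:P31 wd:Q9135.                 # ?os is an instance of "operating system"
    ?os wikibase:statements ?statements.  # statement count of the item
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY DESC(?statements)
```

Changing DESC to ASC lists the sparsest items first.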

List of operating systems by base

SELECT ?osLabel ?baseLabel
WHERE
{
    ?os wdt:P31 wd:Q9135.
    ?os wdt:P144 ?base.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLabel ?baseLabel

SPARQL query 159 results (January 2018), 118 results (September 2020).

The query shows the relation between each OS and the OS it is based on (based on (P144)).
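
To see which OS serves as a base most often, the same pattern can be aggregated. This is a minimal sketch, assuming that counting distinct ?os per ?base is the intended measure; the variable name ?descendants is chosen here for illustration.

```sparql
# Sketch: number of operating systems derived from each base.
SELECT ?base ?baseLabel (COUNT(DISTINCT ?os) AS ?descendants)
WHERE
{
    ?os wdt:P31 wd:Q9135.   # ?os is an instance of "operating system"
    ?os wdt:P144 ?base.     # ?os is based on ?base
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?base ?baseLabel
ORDER BY DESC(?descendants)
```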

List of operating systems by creation time

#defaultView:Timeline
SELECT ?osLabel ?time
WHERE
{
    ?os wdt:P31 wd:Q9135. 
    ?os wdt:P571 ?time.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLabel ?time
ORDER BY DESC(?time)

SPARQL query 298 results (January 2018), 238 results (September 2020).

Count of operating systems by programming language

#defaultView:BarChart
SELECT ?lang (count(*) as ?count)
WHERE 
{
    ?os wdt:P31 wd:Q9135.
    ?os wdt:P277 ?langObj .
    OPTIONAL {
        ?langObj rdfs:label ?lang .
        FILTER (LANG(?lang) = "en")
    }
}
GROUP BY ?lang
ORDER BY DESC(?count) ASC(?lang)

SPARQL query 35 results (January 2018), 37 results (September 2020).

The query shows (based only on the items whose "programming language" property is filled in, so the picture may not reflect reality) that operating systems are predominantly written in assembly language. This is plausible, because assembly produces the fastest code while remaining workable for system programming. C and C++ take second and third place; although they are "slower" than assembly, they are considerably more convenient and simpler to use.

The programming languages used to write the operating system

It is also interesting to look at these results in the form of a graph, which clearly shows how many items simply have an empty "programming language" field.

#defaultView:Graph
SELECT ?os ?osLabel ?language ?languageLabel
WHERE
{
    ?os wdt:P31 wd:Q9135.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    OPTIONAL { ?os wdt:P277 ?language . }
}

SPARQL query 533 results (March 2017), 1117 results (September 2020)

If we build the same graph but keep only the languages in which at least two operating systems are written, the result differs significantly from the previous graph.

#defaultView:Graph
SELECT ?os ?osLabel ?language ?languageLabel
WHERE
{
  {
    SELECT ?language ?languageLabel
    WHERE {
      ?os wdt:P31 wd:Q9135. # ?os is an operating system
      ?os wdt:P277 ?language. # ?os is written in ?language
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    GROUP BY ?language ?languageLabel
    HAVING (COUNT(?os) > 1) # keep languages in which more than one OS is written
  }
  ?os wdt:P31 wd:Q9135. # ?os is an operating system
  ?os wdt:P277 ?language. # ?os is written in ?language
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 118 results (October 2020)

Graph of languages used to create operating systems 2020.
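
The share of items with an empty "programming language" field can also be expressed as a number. This is a minimal sketch, assuming that "empty" means the item has no P277 statement at all; the variable name ?withoutLanguage is illustrative.

```sparql
# Sketch: count operating systems that have no "programming language" (P277) statement.
SELECT (COUNT(DISTINCT ?os) AS ?withoutLanguage)
WHERE
{
    ?os wdt:P31 wd:Q9135.                             # ?os is an instance of "operating system"
    FILTER NOT EXISTS { ?os wdt:P277 ?language. }     # no language recorded for ?os
}
```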


Completeness of the Wikidata

According to www.operating-system.org, there are about 611 operating systems [1] (not counting Linux distributions, which outnumber the operating systems themselves). The SPARQL query returned only 510 operating systems (January 2018). Moreover, browsing the items returned by the query shows that many of them are sparsely filled or even completely empty. From this we can conclude that Wikidata is incomplete in this area.

Problem

1. List of OSs and their programming languages (programming language (P277))

SELECT ?osLabel ?langLabel
WHERE 
{
    ?os wdt:P31 wd:Q9135.
    ?os wdt:P277 ?lang .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 147 results (September 2020).

2. List of objects that have the "operating system" property (P306)

SELECT ?soft ?softLabel ?os ?osLabel
WHERE
{
    ?soft wdt:P306 ?os.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

SPARQL query 5738 results (January 2018), 30184 results (September 2020).

3. List of objects that have the "operating system" property and their programming languages (programming language (P277))

SELECT ?soft ?softLabel ?os ?osLabel (count(*) as ?count)
WHERE
{
    ?soft wdt:P306 ?os.
    ?soft wdt:P277 ?lang .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?soft ?softLabel ?os ?osLabel
ORDER BY DESC(?count) ASC(?softLabel)

SPARQL query 2259 results (December 2018), 6883 results (September 2020).

4.1. Join of software and its programming languages with the OSs it runs on and their programming languages.

SELECT ?soft ?softLabel ?os ?osLabel ?lang1Label ?lang2Label
WHERE
{
    ?soft wdt:P306 ?os.
    ?soft wdt:P277 ?lang1 .
    ?os wdt:P277 ?lang2 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?soft ?softLabel ?os ?osLabel ?lang1Label ?lang2Label
ORDER BY DESC(?softLabel)

SPARQL query 5336 results (January 2018), 18976 results (September 2020).

4.2. The query shows how many software products were written in lang1 for an OS written in lang2.

SELECT ?lang1Label ?lang2Label (count(*) as ?count)
WHERE
{
    ?soft wdt:P306 ?os.
    ?soft wdt:P277 ?lang1 .
    ?os wdt:P277 ?lang2 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?lang1Label ?lang2Label
ORDER BY DESC(?count) DESC(?lang1Label) DESC(?lang2Label)

SPARQL query 418 results (January 2018), 829 results (September 2020). The query shows that most of the software written for OSs written in C/C++ is also written in C/C++. On the whole, most of the software is written in C, C++, Python, Java and Objective-C.

5. How many software products were written for each operating system in each language

SELECT ?osLabel ?lang1Label (count(*) as ?count)
WHERE
{
    ?soft wdt:P306 ?os.
    ?soft wdt:P277 ?lang1 .
    ?os wdt:P277 ?lang2 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?os ?osLabel ?lang1Label
ORDER BY DESC(?count) DESC(?osLabel)

SPARQL query 378 results (January 2018), 671 results (September 2020). The query shows that most of the software for macOS is written in C++, C and Python; for Android, in C++ and Java; for iOS, in C++.

6. The histogram shows how many software products were written in each programming language, and how many of them run under each OS

#defaultView:BarChart
SELECT ?lang1Label (count(*) as ?count) ?osLabel
WHERE
{
    ?soft wdt:P306 ?os.
    ?soft wdt:P277 ?lang1 .
    ?os wdt:P277 ?lang2 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?lang1Label ?osLabel
ORDER BY DESC(?count) DESC(?osLabel)

SPARQL query 378 results (January 2018), 671 results (September 2020).


Programming languages and the number of programs written in them, broken down by OS (2018).
Programming languages and the number of programs written in them, broken down by OS (2020).



The histogram allows you to see, for each programming language, the number of programs written in it and the operating systems these programs run on. The graph shows that the largest numbers of programs are written in C (1084), C++ (1598), Java (526), JavaScript (242), Objective-C (252) and Python (454).

Let's look at each of these languages in more detail.

Most of the programs written in C are for macOS (472) and Linux (235). The language was developed in 1972, but it has not lost its popularity, probably because it is used to write low-level applications.

Most of the programs written in C++ are for macOS (780), Linux (265) and Android (264). C++ will probably remain a leader for a long time, because at the moment it is used for solutions that require high performance, which higher-level languages like Java or C# do not provide.

Most of the programs written in Java are for macOS (196) and Android (156). Java is probably popular due to code portability: Java code runs on any machine on which the JVM is installed.

Most of the programs written in JavaScript are for macOS (100), Android (60) and iOS (40). It is used to write the client side of web applications, which reduces server load and increases application speed.

Most of the programs written in Objective-C are for macOS (112) and iOS (72). For some time it was used mainly by Apple.

Most of the programs written in Python are for macOS (212) and Linux (107). It is a high-level language with a low entry threshold. It is used, for example, for web applications and data analysis.

Looking at the histogram, we can conclude that each of these languages has found its own niche in software development and is used for a certain range of tasks. It can also be seen that most of the programs are for macOS (2388), Android (908) or Linux (895).
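
Per-OS totals of this kind can be sketched as a query built on the same join as query 6. This is an assumption about how such totals could be obtained, not necessarily how the figures above were computed. Like query 6, it counts result rows, so a program or an OS with several languages is counted more than once.

```sparql
#defaultView:BarChart
# Sketch: total number of (software, language) rows per operating system,
# restricted, as in query 6, to OSs that have a programming language recorded.
SELECT ?osLabel (COUNT(*) AS ?count)
WHERE
{
    ?soft wdt:P306 ?os.     # ?soft runs on ?os
    ?soft wdt:P277 ?lang1.  # ?soft is written in ?lang1
    ?os wdt:P277 ?lang2.    # ?os is written in ?lang2
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?os ?osLabel
ORDER BY DESC(?count)
```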

Completeness of the Wikidata

Let's compare queries 2 and 3 (30184 versus 6883 results in September 2020). Clearly, many software products lack the "programming language" property.
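
The comparison can also be made in a single query. This is a minimal sketch, assuming that distinct software items are the unit of interest (queries 2 and 3 count software/OS pairs instead, so the numbers will differ from theirs); ?withLang, ?total and ?withLanguage are names introduced here.

```sparql
# Sketch: how many software items with an "operating system" (P306) statement
# also have a "programming language" (P277) statement.
SELECT (COUNT(DISTINCT ?soft) AS ?total)
       (COUNT(DISTINCT ?withLang) AS ?withLanguage)
WHERE
{
    ?soft wdt:P306 ?os.
    OPTIONAL
    {
        ?soft wdt:P277 ?lang.
        BIND(?soft AS ?withLang)  # bound only when a language is recorded
    }
}
```

The difference between ?total and ?withLanguage is the number of software items still missing the property.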

Filling in the Wikidata

After the "programming language" property was filled in for 100 software products, query 3 returned 2502 results (06.11.2017, 01:40).

Future work

  1. Show all those OSs that have a "logo image (P154)" property.
  2. Construct a diagram showing how many OSs were created in each country. It is permitted to use the properties developer, country or headquarters location if the creator is a company, or country of citizenship if the creator is an individual developer.
  3. Count how many OSs were created (inception (P571)) in 1995.
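
The country diagram task could be approached as follows. This is a sketch under assumptions, not a finished solution: it takes the creator from developer (P178) and reaches a country through country (P17), headquarters location (P159) followed by country (P17), or country of citizenship (P27). An OS whose developers are in several countries is counted once for each of them.

```sparql
#defaultView:BarChart
# Sketch: number of operating systems per country of their developer.
SELECT ?countryLabel (COUNT(DISTINCT ?os) AS ?count)
WHERE
{
    ?os wdt:P31 wd:Q9135.        # ?os is an instance of "operating system"
    ?os wdt:P178 ?developer.     # developer of the OS (company or person)
    # country of a company, country of its headquarters, or citizenship of a person
    ?developer wdt:P17|wdt:P159/wdt:P17|wdt:P27 ?country.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?country ?countryLabel
ORDER BY DESC(?count)
```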

Exercises

1 Specify the relation between the OS and its developer:

Developers: Apple, Sun Microsystems, Canonical Ltd.
Operating systems: Newton OS, JavaOS, Ubuntu Touch

2 Select the desktop image of the Fuduntu OS:

Ubuntu-smartphone.png
Linux Mint 17 (Qiana) Cinnamon.png
Fuduntu14.9-defaultdesktop.png
Kubuntu 16.10.png

3 Select the OS on which the most other OSs are based:

Debian
Android
Ubuntu
Linux kernel


1. SPARQL query, OSs and developers

2. SPARQL query, OSs and logos

3. SPARQL query, OSs and countries

4. SPARQL query, OSs and count of "descendants"

References

  • Sysoev M. "Operating systems in Russia". ProWD. Retrieved 2020-09-28.