Research in programming Wikidata/Operating systems
The article explores the Wikidata object "operating system" and its properties. The following problems were solved with the help of SPARQL queries: finding instances of the object "operating system" and building lists of operating systems (OS) by base, by creation time, and by the programming language in which the OS was written. A histogram is also constructed; it shows the number of programs written in each programming language and the share of them that run on each OS. A lot of software does not specify the programming language in which it was developed. The property "programming language" was added to several objects to improve the results. Wikidata plays a big role in software documentation.
Instances of the object "operating system"
- Objects: operating system (Q9135)
Let's build a list of all the operating systems.
#added 2017-03
#List of `instances of` "operating system"
SELECT ?os ?osLabel
WHERE
{
?os wdt:P31 wd:Q9135.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query 510 results (January 2018), 1086 results (September 2020).
[+]> The most complete and detailed operating systems on Wikidata are: Linux, Windows, Windows 8
[-]> Almost empty and less informative operating system items are: SPIN, JavaOS, Atari TOS, Xubuntu
According to ProWD, the only Russian operating system on Wikidata is Miraculix, which has 7 properties. The leaders by number of properties (24 properties) among operating systems worldwide are Microsoft Windows and Windows 8.
List of operating systems by base
SELECT ?osLabel ?baseLabel
WHERE
{
?os wdt:P31 wd:Q9135.
?os wdt:P144 ?base.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLabel ?baseLabel
SPARQL query 159 results (January 2018), 118 results (September 2020).
The query shows the relation between each OS and its base.
List of operating systems by creation time
#defaultView:Timeline
SELECT ?osLabel ?time
WHERE
{
?os wdt:P31 wd:Q9135.
?os wdt:P571 ?time.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLabel ?time
ORDER BY DESC(?time)
SPARQL query 298 results (January 2018), 238 results (September 2020).
Count of operating systems by programming language
#defaultView:BarChart
SELECT ?lang (count(*) as ?count)
WHERE
{
?os wdt:P31 wd:Q9135.
?os wdt:P277 ?langObj .
OPTIONAL {
?langObj rdfs:label ?lang
filter (lang(?lang) = "en")
}
}
GROUP BY ?lang
ORDER BY DESC(?count) ASC(?lang)
SPARQL query 35 results (January 2018), 37 results (September 2020).
The query suggests (based only on the items that are filled in on Wikidata, so the result is not necessarily accurate) that operating systems are predominantly written in assembly language. This is plausible, because assembly gives the most direct control over hardware and performance. C and C++ take second and third place; they are a reasonable alternative because, despite their relative "slowness", they are considerably more convenient to write in.
The programming languages used to write the operating system
It is also interesting to view the results of this query as a graph, which makes it clearly visible how many objects simply have an empty "programming language" field.
#defaultView:Graph
SELECT ?os ?osLabel ?language ?languageLabel
WHERE
{
?os wdt:P31 wd:Q9135. # os is an instance of operating system
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
OPTIONAL { ?os wdt:P277 ?language . } # programming language, if specified
}
SPARQL query 533 results (March 2017), 1117 results (September 2020)
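The number of operating systems with an empty "programming language" field can also be counted directly instead of being read off the graph. The following is a minimal sketch, assuming the same modelling as the queries above (instance of Q9135, programming language in P277); the variable name ?withoutLanguage is chosen here only for illustration.

```sparql
# Count operating systems that have no "programming language" (P277) statement
SELECT (COUNT(DISTINCT ?os) AS ?withoutLanguage)
WHERE
{
  ?os wdt:P31 wd:Q9135.                          # os is an instance of operating system
  FILTER NOT EXISTS { ?os wdt:P277 ?language. }  # no programming language is specified
}
```

Comparing this number with the total number of instances gives a rough measure of how incomplete the property is.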
If the same query is restricted to languages in which at least two operating systems are written, the result differs significantly from that of the previous query.
#defaultView:Graph
SELECT ?os ?osLabel ?language ?languageLabel
WHERE
{
{
SELECT ?language ?languageLabel
WHERE {
?os wdt:P31 wd:Q9135. # os is os
?os wdt:P277 ?language. # os written by language
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?language ?languageLabel
HAVING (COUNT(?os) > 1) # keep languages in which more than one OS is written
}
?os wdt:P31 wd:Q9135. # os is os
?os wdt:P277 ?language. # os written by language
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query 118 results (October 2020)
Completeness of the Wikidata
According to the site www.operating-system.org, there are about 611 operating systems [1] (not including Linux distributions, whose number exceeds the number of operating systems themselves). The SPARQL query returned only about 510 operating systems. Looking through a large number of objects from the query also makes it clear that many of them are poorly filled in, or even completely empty. From this observation we can conclude that Wikidata is incomplete.
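How sparsely the items are filled in can be inspected with a query as well. The sketch below assumes that the query service exposes the per-item statement counter wikibase:statements; it lists the operating systems with the fewest statements first, and the limit of 20 is an arbitrary choice.

```sparql
# Operating systems ordered by the number of statements on the item, emptiest first
SELECT ?os ?osLabel ?statements
WHERE
{
  ?os wdt:P31 wd:Q9135.                 # os is an instance of operating system
  ?os wikibase:statements ?statements.  # number of statements on the item (assumed counter)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY ASC(?statements)
LIMIT 20
```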
Programming languages for creating operating systems
List of operating systems and languages in which they are written
To get a list of operating systems (OS) and the programming languages used to create them, you can run the following query:
SELECT ?osLabel ?langLabel
WHERE
{
?os wdt:P31 wd:Q9135. # os is an instance of operating system
?os wdt:P277 ?lang. # os is written in a programming language
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query 147 results (September 2020).
Software and operating systems on which they are used
The amount of software can be regarded as an indicator of the importance of an OS. The more users an OS has, the more software vendors will want to offer their products to that audience. Hence the conclusion: the more software is written for a system, the more significant it is. This query shows which software is supported by which OS.
SELECT ?software ?softwareLabel ?os ?osLabel
WHERE
{
?software wdt:P306 ?os. # OS on which the software runs
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
SPARQL query 5738 results (January 2018), 30184 results (September 2020).
To get the most popular operating systems for software developers, you can modify the previous query as follows:
#defaultView:BarChart
SELECT ?os ?osLabel (COUNT(*) as ?count)
WHERE
{
?software wdt:P306 ?os. # OS on which the software runs
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?os ?osLabel
ORDER BY DESC(?count) # without ordering, LIMIT would return an arbitrary 10 rows
LIMIT 10
As you can see, the priorities for developers are: Linux, Microsoft Windows, Ubuntu.
Number of programming languages used to create software for the operating system
SELECT ?software ?softwareLabel ?os ?osLabel (count(*) as ?count)
WHERE
{
?software wdt:P306 ?os. # OS on which the software runs
?software wdt:P277 ?lang. # programming language in which the software was developed
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?software ?softwareLabel ?os ?osLabel
ORDER BY DESC(?count)
SPARQL query 2259 results (December 2018), 6883 results (September 2020). The query shows, for each software item on each OS, in how many languages it is written.
Cartesian product of OS and languages with software and languages
SELECT ?software ?softwareLabel ?os ?osLabel ?softwareLanguageLabel ?osLanguageLabel
WHERE
{
?software wdt:P306 ?os. # software runs on os
?software wdt:P277 ?softwareLanguage. # software is written in a programming language
?os wdt:P277 ?osLanguage. # os is written in a programming language
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?software ?softwareLabel ?os ?osLabel ?softwareLanguageLabel ?osLanguageLabel
ORDER BY DESC(?softwareLabel)
SPARQL query 5336 results (January 2018), 18976 results (September 2020).
How much software was written using a language for an OS written using a programming language
SELECT (count(*) as ?count) ?osLanguageLabel ?softwareLanguageLabel
WHERE
{
?software wdt:P306 ?os. # software runs on os
?software wdt:P277 ?softwareLanguage. # software is written in a programming language
?os wdt:P277 ?osLanguage. # os is written in a programming language
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?osLanguageLabel ?softwareLanguageLabel
ORDER BY DESC(?count) DESC(?osLanguageLabel) DESC(?softwareLanguageLabel)
SPARQL query 418 results (January 2018), 829 results (September 2020). The query shows that most of the software written for operating systems written in C/C++ is also written in C/C++. On the whole, most of the software is written in C, C++, Python, Java, and Objective-C.
How much software was written for the operating system using a language
SELECT ?osLabel ?softwareLanguageLabel (count(*) as ?count)
WHERE
{
?software wdt:P306 ?os. # software runs on os
?software wdt:P277 ?softwareLanguage. # software is written in a programming language
?os wdt:P277 ?osLanguage. # os is written in a programming language
SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}
}
GROUP BY ?osLabel ?softwareLanguageLabel
ORDER BY DESC(?count) DESC(?osLabel)
SPARQL query 378 results (January 2018), 671 results (September 2020). The query shows that most of the software written for macOS is written in C++, C, and Python; for Android, in C++ and Java; for iOS, in C++.
How much software has been written in one or another programming language, and what part of it works under a particular operating system
The histogram shows how much software was written in a particular programming language, and what part of it works under a particular operating system.
#defaultView:BarChart
SELECT (count(*) as ?count) ?softwareLanguageLabel ?osLabel
WHERE
{
?software wdt:P306 ?os. # software runs on os
?software wdt:P277 ?softwareLanguage. # software is written in a programming language
?os wdt:P277 ?osLanguage. # os is written in a programming language
SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}
}
GROUP BY ?softwareLanguageLabel ?osLabel
HAVING (COUNT(*) > 50)
ORDER BY DESC(?count) DESC(?osLabel)
SPARQL query 378 results (January 2018), 671 results (September 2020).
The histogram allows you to see, for each programming language, the number of programs written in it and the operating systems on which these programs run. The graph shows that the largest numbers of programs are written in C (1084), C++ (1598), Java (526), JavaScript (242), Objective-C (252), and Python (454).
Let's look at each of these languages in more detail.
Most of the programs written in C are for macOS (472) and Linux (235). The language was developed in 1972, but it has not lost its popularity, probably because it is used to write low-level applications.
Most of the programs written in C++ are for macOS (780), Linux (265), and Android (264). C++ will probably remain a leader for a long time, because it is currently used for solutions that require high performance, which higher-level languages like Java or C# do not provide.
Most of the programs written in Java are for macOS (196) and Android (156). Java is probably popular because of code portability, i.e. Java code runs on any machine on which the JVM is installed.
Most of the programs written in JavaScript are for macOS (100), Android (60), and iOS (40). It is used to write the client side of web applications, which reduces server load and increases application speed.
Most of the programs written in Objective-C are for macOS (112) and iOS (72). For some time it was used mainly by Apple.
Most of the programs written in Python are for macOS (212) and Linux (107). It is a high-level language with a low entry threshold. It is used, for example, for web applications and data analysis.
Looking at the histogram, we can conclude that each of these languages has taken its own "region" in the field of software development and is used for a certain range of tasks. It can also be seen that most of the programs are for macOS (2388), Linux (895), or Android (908).
Completeness of the Wikidata
Let's compare queries 2 and 3. Obviously, a lot of software products don't have the "programming language" property.
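The size of this gap can be estimated with a single query. The sketch below is not one of the numbered queries of this article; it assumes that a software product is any item with an "operating system" (P306) statement and counts those that lack "programming language" (P277). The variable name ?withoutLanguage is illustrative.

```sparql
# Count software items that name an OS but have no programming language
SELECT (COUNT(DISTINCT ?software) AS ?withoutLanguage)
WHERE
{
  ?software wdt:P306 ?os.                           # software runs on some os
  FILTER NOT EXISTS { ?software wdt:P277 ?lang. }   # no programming language is specified
}
```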
Filling in the Wikidata
After filling in the "programming language" field for 100 software products, query 3 showed 2502 results (06.11.2017, 01:40).
Software documentation
Wikidata plays a big role in software documentation. This is illustrated by the programs included in GNOME and KDE[1]. The cited article shows that while the English Wikipedia describes almost all the programs included in GNOME and KDE, the Italian and French ones contain only a subset of the articles. Documenting large projects is a well-known and difficult task. Solving it requires a centralized system. The combination of Wikipedia and Wikidata plays this role[1].
Future work
- Show all OSs that have a "logo image (P154)" property.
- Construct a diagram showing how many OSs were created in each country. It is permitted to use the properties developer, country, or headquarters location if the creator is a company, or country of citizenship if the creator is an individual developer.
- Count how many OSs were created (inception (P571)) in 1995.
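As a starting point for the last item, the count could be obtained roughly as follows. This is only a sketch under the assumption that the inception date (P571) is stored as a date-time value, so that YEAR() can be applied to it; items without an inception statement are not counted.

```sparql
# Count operating systems whose inception (P571) falls in 1995
SELECT (COUNT(DISTINCT ?os) AS ?count)
WHERE
{
  ?os wdt:P31 wd:Q9135.         # os is an instance of operating system
  ?os wdt:P571 ?time.           # inception date
  FILTER (YEAR(?time) = 1995)   # keep only items created in 1995
}
```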
Exercises
1. SPARQL query, OSs and developers
2. SPARQL query, OSs and logos
3. SPARQL query, OSs and countries
4. SPARQL query, OSs and count of "descendants"
References
- Sysoev M. "Operating systems in Russia". ProWD. Retrieved 2020-09-28.
Links
- John Samuel (2020-12-13). "Documenting Software Applications on Wikidata". Retrieved 2021-01-23.