BIM-224 Research Infrastructures 23
Materials and Tasks for the module "BIM-224, SoSe 2023, Blümel/Rossenova" for students at Hochschule Hannover. The materials are prepared with several colleagues from the Open Science Lab at TIB Hannover.
Session 1: Data harvesting interfaces / data collection
Slides are available here: https://docs.google.com/presentation/d/1IxRTQhTY8nwFaijHq78m0NvW6Qw_YAj3YQtyO9Nn6dg/edit?usp=sharing
Student homework task pages
- Gizem Ergün / https://de.wikiversity.org/wiki/Benutzer:ErGiz
- BIM-224, SoSe 2023 - Ahmad Hasan Ahmad
- User:Ahmad.Aroud - Wikiversity
- BIM-224, SoSe 2023 – Lisa Sommer
- Mohammad Darkhbani
- Anna Rahr
- https://beta.wikiversity.org/wiki/BIM-224,_SoSe_2023_–_Josef_Debase
- Memo Loran Tuku
- Marcel Kromm
Group task 1
Platform list
- Radar4Culture
- GNM catalog
- Forschungsbibliothek Gotha der Universität Erfurt
- Datenportal des MfN Berlin
- Herbarium Berolinense
- Sketchfab
- Porta Fontium
- Coding da Vinci
Type of API list
- OAI-PMH, example: https://dhb.thulb.uni-jena.de/oai/prints?verb=Identify
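The example URL above is an OAI-PMH request with the `Identify` verb. As a minimal sketch of how such request URLs are put together and how an `Identify` response could be read, the following uses only the Python standard library. The helper names `build_oai_url` and `parse_identify` and the sample XML are assumptions made for illustration; the sample is hand-written and is not real output from the endpoint.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_oai_url(base_url, verb, **params):
    """Return an OAI-PMH request URL for a verb plus optional arguments."""
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"


def parse_identify(xml_text):
    """Extract the repository name and base URL from an Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    return {
        "repositoryName": identify.findtext("oai:repositoryName", namespaces=OAI_NS),
        "baseURL": identify.findtext("oai:baseURL", namespaces=OAI_NS),
    }


# Hand-written sample response (hypothetical, for illustration only).
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://example.org/oai</baseURL>
  </Identify>
</OAI-PMH>"""

BASE = "https://dhb.thulb.uni-jena.de/oai/prints"
print(build_oai_url(BASE, "Identify"))
print(build_oai_url(BASE, "ListRecords", metadataPrefix="oai_dc"))
print(parse_identify(SAMPLE_IDENTIFY))
```

The same builder covers the other standard verbs (`ListMetadataFormats`, `ListSets`, `ListIdentifiers`, `ListRecords`, `GetRecord`); whether a given repository supports a particular `metadataPrefix` has to be checked against that repository.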
Group task 2
- Student name / dataset link
- Josef Debase / https://creating-new-dimensions.org/Herbarium/
- Gizem Ergün / https://creating-new-dimensions.org/mkg-in-3d/
- Lisa Sommer / Pflanzenbelege aus dem Botanischen Garten Berlin
- Ahmad Aroud / https://codingdavinci.de/node/2284
- Marcel Kromm / Schrott or not? – "Geschmacksverirrungen" aus der Zeit um 1900
- Ahmad Hasan Ahmad / HISTORISCHER PORTRÄTAUFNAHMEN
- Mohammad Darkhbani / https://creating-new-dimensions.org/Gothaer-Kunstkammer/
- Memo Loran Tuku / https://creating-new-dimensions.org/Porta-fontium/
- Jana Cornelius / SchauMichAn - Franz Seraph Stirnbrand (um 1788-1882) und seine Porträts der Stuttgarter Gesellschaft
- Anna Rahr / Raritäten aus der Sammlung der Historischen Kommunikation der Robert Bosch GmbH
Session 2: Data cleaning, reconciliation and enrichment
Slides are available here: https://docs.google.com/presentation/d/1HpXUXYcs-LDOQYuQzFv1SYYutKUYP8qG0mR5BG3fLyw/edit?usp=sharing
OpenRefine official documentation:
https://openrefine.org/docs/manual/facets
https://openrefine.org/docs/manual/transforming
OpenRefine video tutorial:
Homework presentations:
- Gizem Ergün / https://www.canva.com/design/DAFlSBYZBXM/yak3bDV2gREj--Isgx4Vpw/edit?utm_content=DAFlSBYZBXM&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton
- Lisa Sommer / Pflanzenbelege aus dem Botanischen Garten Berlin
- Jana Cornelius / https://de.wikiversity.org/wiki/BIM-224,_SoSe_2023_-_Janacn#Homework_Assignment_19.05.23
- Ahmad Hasan Ahmad / Screenshots of the dataset in OpenRefine - Link
- Memo Loran Tuku / Screenshots - Imgur Link
- Josef Debase / https://imgur.com/a/3cDRlZd
- Ahmad Aroud / https://imgur.com/a/NdYbqEa
- Marcel Kromm / Screenshots
Session 3: Data in Wikidata
Slides are available here: https://docs.google.com/presentation/d/1bCilgycOApKcFjzelntD6zRf5WBU9804t_9Fb-Lc1E8/edit?usp=sharing
Homework presentations:
- Gizem Ergün / https://docs.google.com/presentation/d/1jdnIFpNe_P8d4SlN85J4MGHxTPgyaapMdopFyiwinDY/edit?usp=sharing
- Anna Rahr / https://imgur.com/a/AzJbEsD
- Memo Loran Tuku / https://miro.com/app/board/uXjVMEpcFxg=/?share_link_id=587248504323
- Josef Debase / https://imgur.com/a/YCJ1RP1
- Ahmad Hasan Ahmad / https://miro.com/app/board/uXjVMEvRMXE=/?share_link_id=326114468201
- Jana Cornelius / https://de.wikiversity.org/wiki/BIM-224,_SoSe_2023_-_Janacn#Homework_Assignment_26.05.23
- Ahmad Aroud / https://imgur.com/a/rRwFtRx
- Lisa Sommer / https://miro.com/app/board/uXjVME1u33Q=/?share_link_id=981671095755
- Marcel Kromm / Presentation
Session 4: Data upload and querying (26.05)
Slides are available here: https://docs.google.com/presentation/d/1ebFJXSKikUSyjjPIsXFwTqVV2-igku6ra5Vm83h5SWQ/edit?usp=sharing
Additional tutorials:
Complete upload pipeline tutorial: https://en.wikiversity.org/wiki/OpenRefine_to_Wikibase%3A_Data_Upload_Pipeline
Upload tutorial for media files in Wikimedia Commons: https://en.wikiversity.org/wiki/Uploading_media_files_to_a_Wikibase_with_OpenRefine
Homework presentations:
- Student name / presentation link (Google Slides, other slide platform, or wiki pages with screenshots)
- Jana Cornelius / https://de.wikiversity.org/wiki/BIM-224,_SoSe_2023_-_Janacn#Homework_Assignment_09.06.23
- Josef Debase / https://imgur.com/a/w6OSArp
- Anna Rahr / Google Slides Link
- Marcel Kromm / https://imgur.com/a/i676VZ5
- ...
Session Workshop: Fermenting Data Workshop (02.06)
Slides are available here: https://docs.google.com/presentation/d/1BHlO17nTTXccoPMgqXZBx46zuDvhnM52h5Wj9X8p36M/edit?usp=sharing
Wikibase instance:
https://fermentingdata.wikibase.cloud/w/index.php?title=Special:CreateAccount&returnto=Main+Page
Session 5: Data upload and querying (cont.) / Data visualisation and presentation (09.06)
Video recording of the lecture: https://drive.google.com/file/d/1q94LdQauMPErzK5Yp2jD1zq0_MjWWgCX/view?usp=sharing
Slides are available here: https://docs.google.com/presentation/d/1T1fPDI2jSQJ1Q6rAaARIgxTbmST5Py_C8pmCBBAlXWQ/edit?usp=sharing
Book an individual feedback session - 15 mins per person:
- 15:00: Lisa Sommer
- 15:15: -
- 15:30: Ahmad Aroud
- 15:45: -
- 16:00: Anna Rahr
- 16:15: Gizem Ergün
- 16:30: Memo Loran Tuku
- 16:45: Josef Debase
- 17:00: Ahmad Hasan Ahmad
- 17:15: Jana Cornelius
- 17:30: -
Session 6: Data publication and review
In this session we will review homework and discuss the requirements for the final assignment submission.
The final submission deadline is July 7th.
Final assignment submission instructions
1) Spreadsheet with data you uploaded to Wikidata
2) Spreadsheet with the data you can download from the SPARQL endpoint with your main data query
3) Publication on GitHub Pages containing:
- your custom query results
- customized title / author / cover image
- customized additional text and optionally embedded data visualization as .svg and/or live results in an iframe.
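For the spreadsheet in item 2, one possible route is to request the query results as JSON and flatten them into CSV. The sketch below assumes the standard SPARQL JSON results layout (`head.vars` and `results.bindings`); the function name `bindings_to_csv` and the sample data are invented for illustration, and downloading the CSV directly from the query service interface works just as well.

```python
import csv
import io


def bindings_to_csv(result):
    """Flatten a SPARQL JSON result into CSV text, one row per binding."""
    columns = result["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for binding in result["results"]["bindings"]:
        # Unbound variables are simply absent from a binding; emit "" for them.
        writer.writerow([binding.get(col, {}).get("value", "") for col in columns])
    return buffer.getvalue()


# Hand-written sample in the SPARQL JSON results format (illustration only).
SAMPLE = {
    "head": {"vars": ["item", "itemLabel"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"},
                "itemLabel": {"type": "literal", "value": "Douglas Adams"},
            },
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1"}},
        ]
    },
}

print(bindings_to_csv(SAMPLE))
```

The resulting text can be written to a `.csv` file and opened in any spreadsheet program.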
Information discussed during today's session
1) Adding proper Wikitext to Images in Commons when Uploading via OpenRefine
- A more detailed tutorial page, if you want to go more in-depth (esp. page 5 & 6): https://docs.google.com/document/d/1ENpZBOHvMESOst4Phh5gSRWlnAdBs-OMZt5j_cL-YGA/edit?usp=sharing
- For quick reference, check the screenshot and try to replicate it in your schema builder when uploading. Make sure the images have all of these statements in addition to the Wikitext. The Depicts / Main subject statements link your image to the main object / artwork you uploaded to Wikidata.
- If you have photos of objects, you can use this simple Wikitext for all your photos (in addition to the statements shown in the screenshot):
== {{int:filedesc}} ==
{{Art photo}}
== {{int:license-header}} ==
{{CC-BY-4.0}}
Make sure to check the license: the above is just an example!
Note that if you copy my screenshot schema, you will also need to update the museum and the license to match the museum you’re working with.
- If you have photos of paintings / artworks, you can use this simple Wikitext for all your photos (in addition to the statements shown in the screenshot):
== {{int:filedesc}} ==
{{Artwork}}
== {{int:license-header}} ==
{{CC-BY-4.0}}
- More details are available in the Google Doc I shared above, but these instructions should be sufficient, too.
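If you prepare the Wikitext column for many files outside OpenRefine, the two snippets above differ only in the template name and the license tag. A small sketch, assuming a hypothetical helper named `build_wikitext` (not part of OpenRefine or Commons), shows the idea; the license default is just the example from above and must be replaced with the license that actually applies.

```python
def build_wikitext(template, license_tag="CC-BY-4.0"):
    """Return the minimal Commons Wikitext block for one uploaded file.

    template:    "Art photo" for photos of objects, "Artwork" for paintings.
    license_tag: example value only; use the license that applies to your files.
    """
    lines = [
        "== {{int:filedesc}} ==",
        "{{" + template + "}}",
        "== {{int:license-header}} ==",
        "{{" + license_tag + "}}",
    ]
    return "\n".join(lines)


print(build_wikitext("Art photo"))
print(build_wikitext("Artwork"))
```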
2) Using OpenRefine online
- There is also an online version of OpenRefine. It is somewhat outdated and lacks some newer functionality (for example, you can’t upload images with it), but it can help when you can’t install OpenRefine on a personal or institutional computer for technical or other reasons. Go to hub-paws.wmcloud.org and log in with your Wikimedia account, then select OpenRefine from the set of available tools.
3) Issues with SPARQL queries, e.g. removing multiple line results for same item, etc.
- You can use the GROUP_CONCAT aggregate to concatenate multiple values into a single column, which avoids repeating the same item over multiple lines; see this example: https://w.wiki/6qbP
- If you need more help customizing your queries, you can ask your peers or ask ChatGPT (but do not rely on it too much: it is still not very good with SPARQL, and it takes very careful prompting to get everything correct). You can also always consult trusted sources like StackOverflow and this very helpful page of SPARQL examples on Wikidata: https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples
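To make the effect of GROUP_CONCAT concrete, here is a minimal sketch. The query string is an illustrative shape only (it is not the query behind the short link above; it assumes the property P186, "made from material", and the item Q12418 as an example), and the Python function `group_concat` is a hypothetical helper that mimics the aggregation on plain rows so the result can be inspected without a network call.

```python
from collections import defaultdict

# Illustrative query shape for the Wikidata Query Service (an assumption,
# not taken from the course materials). It is not executed here.
QUERY = """
SELECT ?item (GROUP_CONCAT(DISTINCT ?materialLabel; separator=", ") AS ?materials)
WHERE {
  VALUES ?item { wd:Q12418 }
  ?item wdt:P186 ?material .
  ?material rdfs:label ?materialLabel .
  FILTER(LANG(?materialLabel) = "en")
}
GROUP BY ?item
"""


def group_concat(rows, key, value, separator=", "):
    """Collapse rows sharing the same key into one row with joined distinct values."""
    grouped = defaultdict(list)
    for row in rows:
        if row[value] not in grouped[row[key]]:
            grouped[row[key]].append(row[value])
    return [{key: k, value: separator.join(v)} for k, v in grouped.items()]


# Invented rows: one line per item/value pair, as a plain SELECT would return.
ROWS = [
    {"item": "item-A", "material": "oil paint"},
    {"item": "item-A", "material": "canvas"},
    {"item": "item-B", "material": "bronze"},
    {"item": "item-A", "material": "oil paint"},
]

for row in group_concat(ROWS, "item", "material"):
    print(row)
```

In SPARQL, every variable in the SELECT clause that is not aggregated has to appear in GROUP BY, and the order of the concatenated values is not guaranteed.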
Final updates regarding publications:
For reference, you can have a look at the publications of your peers, or check my own publication, which exemplifies the different parts of the assignment.
- published view here: https://lozanaross.github.io/catalogue-003/
- GitHub code view here: https://github.com/lozanaross/catalogue-003
FINAL SUBMISSION
Send the spreadsheets to the instructor via email.
Add your name & link to your publication below:
- Jana Cornelius / https://janacnl.github.io/catalogue-003/
- Ahmad Hasan Ahmad / https://ahmad19111.github.io/catalogue-003/
- Ahmad Aroud / https://ahmadaroud.github.io/catalogue-003/
- Anna Rahr / https://calnfynn.github.io/catalogue-003/
- Memo Loran Tuku / https://mloran.github.io/catalogue-003/
- Mohammad Darkhabani / https://mohammad19921991.github.io/catalogue-003/
- Lisa Sommer / https://pgxe9zu1.github.io/catalogue-003/