OpenSpeaks/process

From Wikiversity

Background

The OpenSpeaks project saw its first update in 2017. The initial release was on a standalone website, https://openspeak.com. The content was later migrated to https://theofdn.org/openspeaks. Ongoing development of the project happens here on Wikiversity in an open and collaborative manner. In September 2020, the project received an award from Creative Commons to further develop a chapter on consent, copyright and open licensing, with a translation into one of the indigenous languages of India. Santali, an Adivasi language spoken by about 7.6 million people, was selected based on the current use of the language both orally and in writing. As OpenSpeaks still needs further development to be useful as an oral OER, the current version has its limitations too. This update started with a bilingual (English and Santali) survey in November 2020 and is due by mid-January 2021. The new revision process also extends into an overall update of the OpenSpeaks project by March-April 2021.

Updated 18 January 2021

  • The main project has been updated at OpenSpeaks, along with a Santali version of Chapter 1: Consent, Content Rights and Content Licensing.
  • What was done differently:
    • Unlike the previous Chapters, considerable effort went into using simple English, which makes it easier for non-native speakers to translate the text into their own languages. The first test, with Santali, was still difficult because many terms (e.g. copyright, open licensing) did not previously exist in the language.
    • The three community leaders who were involved -- R. Ashwani Banjan Murmu, Fagu Baskey and Joy Murmu -- took great care to make the Santali version as inclusive as possible by adding newly coined terms alongside transliterated ones.
    • A glossary was created to make the Chapter even simpler for the readers.

Updated 12 January 2021

  • A survey was conducted prior to the development of the chapter to identify areas that need attention in the new version of OpenSpeaks, and the suggestions were incorporated into the new version.
  • A conscious effort was made to write the English version in Simple English. This should help many non-native speakers translate it into their own languages more easily. The first pilot is the Santali version, which is currently being translated from the newly developed Chapter.
  • The translation process encourages the translator to include any additional detail that the English version does not include -- nuanced information relevant to the translated language.
  • All the English-Santali sentence-to-sentence or phrase-to-phrase translation pairs will be donated under a CC0 License so that the corpus can be used by the Santali community for future machine translation projects.
  • The Santali sentences in this project will also be released under a CC0 license so that they can be used to create a pronunciation library at Mozilla Common Voice, which could eventually help future speech synthesis projects in Santali.
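
The translation pairs mentioned above could be stored in a simple tab-separated layout, which many machine translation tools can read. The following is only a minimal sketch under assumptions: the project does not specify a file format, the column names `en` and `sat` (the ISO 639 code for Santali) and the function names are hypothetical, and the sentences are placeholders rather than real project data.

```python
import csv
import io

# Placeholder pairs only; real entries would come from the translated chapter.
PAIRS = [
    ("English sentence 1", "Santali sentence 1 (placeholder)"),
    ("English sentence 2", "Santali sentence 2 (placeholder)"),
]


def pairs_to_tsv(pairs):
    """Serialize (English, Santali) pairs as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["en", "sat"])  # assumed column names, not a project standard
    writer.writerows(pairs)
    return buffer.getvalue()


def tsv_to_pairs(text):
    """Read the tab-separated text back into a list of (English, Santali) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row
    return [tuple(row) for row in reader]


if __name__ == "__main__":
    tsv_text = pairs_to_tsv(PAIRS)
    print(tsv_text)
    # Round trip: reading the file back should give the original pairs.
    print(tsv_to_pairs(tsv_text) == PAIRS)
```

Keeping one pair per row makes it straightforward to split off the Santali column on its own, for example when preparing sentences for a pronunciation library.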