WikiJournal Preprints/OpenSpeaks: Open Toolkit for Multimedia Documentation of Indigenous Languages

From Wikiversity

WikiJournal Preprints
Open access • Publication charge free • Public peer review

WikiJournal User Group is a publishing group of open-access, free-to-publish, Wikipedia-integrated academic journals.

Article information

Author: Subhashish Panigrahi[a][i]

Subhashish Panigrahi, "OpenSpeaks: Open Toolkit for Multimedia Documentation of Indigenous Languages", WikiJournal Preprints, Wikidata Q106806074


OpenSpeaks was designed as an open toolkit for citizen language archivists who document languages in multimedia formats. Though the original focus was on indigenous, endangered, other low-resource and first languages, it can be used by anyone. Since its first publication in 2017 as a standalone website and its subsequent move into Wikiversity, it has been edited multiple times. There are currently four modules in this Open Educational Resource, and each one is designed with beginner- and intermediate-level language archivists in mind. Considering the diversity of cultures, different frameworks, collectively known as the OpenSpeaks Frameworks, are used to accommodate a wide range of environments with a "community-first" approach that puts native language speakers first. Some basic understanding of audiovisual documentation is required to use this toolkit.

Introduction

OpenSpeaks was created in 2017 as an Open Educational Resource. It started as a standalone project hosted on its own website and was subsequently brought into Wikiversity. The project has a core focus on developing a framework, guidelines and resources for citizen archivists who plan to document languages. Although the project covers all languages, it includes specific resources for indigenous, endangered and other low-resource languages.

A language can be documented in many different environments. On the spectrum of digital documentation environments, recording natural, conversational use of a language and recording in a controlled environment (such as a scripted recording inside a studio) lie at the two extremes. The linguistic discipline of documentary linguistics, or language documentation, helps keep historical audiovisual records of a society and protect cultures. Language documentation also furthers the growth of languages through practical use. Recording a language while people are having a conversation helps in understanding its informal use. It can even be useful for designing a better multimedia literacy program. Similarly, recording pronunciations of words and phrases inside a studio with no noise can help create a speech synthesis system. Factors such as the age, gender and socioeconomic stratum of an interviewee, and the influence of non-native languages on their native language, also have a deep impact on a person's overall speech. For a language to survive and grow, it is essential to make a wide range of recordings that keep in mind the aforementioned human and environmental factors.

OpenSpeaks can be useful for framing three major aspects. Each aspect below names who or what will affect, or be affected by, the documentation, followed by potential problems and possible ways to address them:

1. Your own status quo as an archivist (what you can and cannot do)
  • Potential problem: You might not be able to afford a crew.
    Possible way to address: You need to edit the footage yourself.
  • Potential problem: You might not have access to expensive equipment.
    Possible way to address: You need to record using a phone.
2. Your interviewee (their age, gender, mental and socioeconomic state, what kind of questions would be ethical and what you should not ask)
  • Potential problem: Your interviewee is grieving the loss of a family member/relative.
    Possible ways to address: You might cancel the recording entirely. You must not schedule the recording of a celebratory custom, which would be disrespectful.
  • Potential problem: Your interviewee has to take time out from their work.
    Possible ways to address: Discuss and find a way to avoid such situations. Try to compensate for their time if possible.
3. Environment (physical or social environment)
  • Potential problem: Your interviewee has small children whom they need to attend to constantly.
    Possible ways to address: You need to work with them around their availability. You should be ready to stop the interview immediately when they need to attend to the children.

The process of documentation varies from archivist to archivist and depends on all the above factors. For instance, documentary linguistics as a practice focuses on collecting the rich linguistic data of a language[B 1], such as recording everyday conversation in a marketplace, whereas documentary filmmakers focus on aesthetics and storytelling. Training volunteers to create informational videos, as in the virALLanguages project that helps create COVID-19 awareness videos in indigenous/minority languages, can also be counted as documentation of language.[B 2]

The resources of this toolkit are divided into four interrelated chapters:

  1. Chapter 1: Consent, Rights, Copyright and Open Licensing
  2. Chapter 2: Multimedia (audiovisual) recording
  3. Chapter 3: Metadata collection and publication
  4. Chapter 4: Accessibility considerations.

Chapter 1: Consent, Content Rights and Content Licensing

While making audio and video documentation of a language, you will encounter questions related to consent, rights and copyright, and licensing. Some of the most frequently asked questions are:

  • How do I take permission for an interview?
  • Do I take permission from an interviewee in writing or verbally?
  • Who owns the rights when I make an audio or video recording?
  • What kinds of ownership rights exist?
  • What is copyright and who owns copyright when I make a recording?
  • Do I need to register for copyright?
  • Is there a license for publishing the recording?

You might not find a direct answer to each of these questions, because none of them has an easy answer and every situation is unique. So, in this chapter, you will find ways to address them. The different contexts and backgrounds provided below will hopefully help you assess your own situation and make a judgement.

Here are some of the common terms you will encounter below:

  • Documentation: Recording any information so that it can be used later. Reciting a poem to someone (who can remember it later), writing it on paper and making an audio/video recording of the narration are different kinds of documentation.
  • Media: Channels and tools for sharing information. The same information can be printed in a book (print media), shared over a chat application like WhatsApp or Signal (digital media), written to a CD (an older digital medium) or recorded on a cassette tape (a much older, analog medium).
  • Content or media content: Information and experiences for end users. Content is often documented in different mediums (e.g. physical mediums include a paper note or a book, and digital mediums include an SD card or the internal memory of a phone).
  • Consent: Voluntary agreement by a person to the proposal of another person. It can be given verbally, through physical gestures or in writing. Consent is generally sought in legal, medical (e.g. vaccination), research and sexual contexts.
  • License or licence: An official permission or permit, or the proof of the same, that allows someone to use or own something. That "something" can be the manuscript of a piece of writing, the recording of a narration or even a doctor's permission to practice medicine. A software developer (an individual) can provide a license for new software they create, and a medical board (an organization) can provide a license to a doctor.
  • There are more terms, such as copyright, moral rights and open licensing. You will learn about them in detail below.

Consent for documentation

In the context of language documentation, consent is given voluntarily by the interviewee to the interviewer. It indicates prior approval for the recording. The interviewer requests the interviewee's permission for the recording. The interviewee needs to understand the request. They then give explicit permission for the recording and the subsequent publication of the recorded media content. This chapter discusses the how, when, where and who of acquiring consent.

In many interviews, indirect consent is assumed when the archivist sets up recording equipment such as a camera and microphone. It is assumed that the person being interviewed becomes aware of the recording by seeing that equipment. But this is not universally applicable. The interviewee might be visually impaired, or they might be unaware of the recording process. Also, in case of a conflict, it is hard to prove legally or ethically at a later date that such consent was good enough.

Written consent

It is always recommended to use a written or printed consent form. If using a printed form, make two copies: one for the interviewer and the other for the interviewee. To provide consent by signing, the interviewee should be a literate adult who can understand the text in the form. If the interviewee cannot provide consent, please discuss who should provide consent on their behalf. A parent or guardian can provide consent for a minor. A caretaker or a family member can provide consent for an interviewee with a physical disability. It is strongly advised not to interview a person with a mental disability, for ethical reasons.

See an example form below. You can also copy the content, modify if needed and translate into your preferred language (preferably an official language used in the respective jurisdiction).

                            RELEASE FOR CONSENT AND RIGHTS


• I agree to publish the "WORK" under the LICENSE below.

• LICENSE: I acknowledge that by doing so, I grant anyone the right to use the work in all ways that the chosen License (Creative Commons Attribution-ShareAlike 4.0 International, CC BY-SA 4.0) allows, which might include use in a commercial product or otherwise, and to modify it according to their needs, provided that they abide by the terms of the license and any other applicable laws.

• I am aware that this agreement is not limited to the "PRODUCER".

• I am aware that the copyright holder always retains ownership of the copyright as well as the right to be attributed in accordance with the license chosen.

• I acknowledge that I cannot withdraw this "Agreement".

• I am aware and I agree that this Agreement document can be annotated, subtitled, translated, published, distributed and broadcast, without any further approval from myself or my representatives.

• I affirm that: (a) I have the full power and authority to grant the rights and releases set forth in this Release. If any third party claims that the use of the Content violates its rights, I agree to cooperate fully with the PRODUCER to defend against or otherwise respond to such claim.

Share the name and/or explain about the "WORK" *
(If it is an interview, you can write "My interview on TOPIC-NAME", where "TOPIC-NAME" is what you spoke about. If you submitted a video/audio/picture, you can share the name of the file or describe what the video/audio/picture is about.)

• Place where this was signed:
• Date of signing*
• Language(s) spoken:
• Media kind (digital audio, video, photograph):

                        INTERVIEWEE DETAILS

• Full name:

• Email or other contact information: 

• Do you agree to the aforementioned CLAUSES explained?*
- Yes, I AGREE to all the above ☐

• Signature:

                        INTERVIEWER DETAILS
• Full name:

• Email or other contact information:

• Do you agree that:
a. you will provide open access to the content, especially to the native speakers?
b. you will use the content primarily for the promotion, protection and preservation of the language?
c. you will never claim ownership over the wisdom shared in the content but will attribute the interviewee and/or their community?

- Yes, I AGREE to all the above ☐

• Signature:

A document like the one above becomes a mutual agreement of consent. It might not be a legal document. It also contains terms like "Open Access" and "Creative Commons", which are explained later in this chapter. If written consent is not feasible for any reason, verbal consent in a recorded form can be requested.

Verbal consent

It is not always possible to acquire written consent. For instance, the interviewer and the interviewee might speak different languages. The interviewee might be illiterate or have a disability that prevents them from understanding a written consent form. In such a case, it would not be ethical to acquire written consent even if the interviewee is willing to sign. Let us look at some scenarios that will help while acquiring consent.

For each recording scenario, the list below shows how consent is acquired and which consent type (verbal or written) applies:

1. Large group activity, either in public or in a closed space (more than four or five people), such as singing, dancing or even having a meal
  • How: The interviewer can plainly ask if it is okay to record and publish, and record the collective verbal agreement.
  • Consent type: Verbal, recorded (optional if the group is very large and the activity is public)
2. Small group activity (four or five people, or fewer)
  • How: Each consenting adult has to provide consent.
  • Consent type: Verbal (recorded) or signed written consent
3. Individual interview
  • How: a) the interviewee, if they are an adult and can consent; b) a parent, or a guardian when a parent is unavailable, if the interviewee is a minor; c) a guardian, if the interviewee cannot consent themselves because of a physical/mental disability
  • Consent type: Written preferred
4. Video/image not showing human faces or exposing any personal information (name, address, location and other personal data; also applicable to audio exposing personal information)
  • How: Consent is generally not required.
  • Consent type: A verbal agreement (good to keep on record) is recommended if the recording includes religious or culturally sacred sites, rituals and other such elements of a society that are guarded carefully.
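
The scenarios above can also be read as a simple lookup. The following Python sketch is only an illustration: the function name `recommend_consent`, its parameters and the group-size threshold of five are assumptions made for this example and are not part of the toolkit. It does not replace your own judgement about an actual situation.

```python
from dataclasses import dataclass


@dataclass
class ConsentAdvice:
    who_consents: str
    consent_type: str


def recommend_consent(participants: int,
                      shows_personal_info: bool = True,
                      is_minor: bool = False,
                      can_consent: bool = True) -> ConsentAdvice:
    """Map a recording scenario to the guidance listed above.

    All parameter names and the threshold of five people are
    hypothetical choices made for this illustration.
    """
    if participants < 1:
        raise ValueError("participants must be at least 1")
    if not shows_personal_info:
        # No faces and no personal data appear in the recording.
        return ConsentAdvice(
            "generally not required",
            "verbal agreement recommended for sacred or guarded sites and rituals")
    if participants > 5:
        # Large group activity.
        return ConsentAdvice("the group, collectively", "verbal, recorded")
    if participants > 1:
        # Small group activity.
        return ConsentAdvice("each consenting adult",
                             "verbal recorded or signed written")
    # Individual interview.
    if is_minor:
        return ConsentAdvice("a parent or guardian", "written preferred")
    if not can_consent:
        return ConsentAdvice("a guardian", "written preferred")
    return ConsentAdvice("the interviewee", "written preferred")
```

For example, `recommend_consent(3)` returns the advice for a small group, and `recommend_consent(1, is_minor=True)` points to a parent or guardian.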

There can be endless scenarios beyond the above. There can also be situations where acquiring prior consent is not possible. In such a case, keep the recording strictly private and acquire consent as soon as possible. If needed, mention the delay in the consent form for transparency. If you cannot acquire consent, you should not use the content in your final production. You should destroy the recording immediately instead. If the recording is vital to your production, then you have to ensure all personal information is redacted. With many modern tools, it is now easier to find personal information in digital content. So, it is strongly advised not to publish content that is acquired without consent.

It is important to note that consent is not just a legal matter, but it is very much a social, ethical and moral subject. It has to be done in a careful manner with mutual agreement.

Rights: Copyright, Moral Right and Other Ownership Rights

Copyright is the legal ownership of content. It empowers the legal owner of any work (e.g. text, image, audio and video, data and software) to decide how others can use that work. In simple words, copyright protects the content from unlawful use. Copyright is a complex subject. In language documentation, there is often confusion about the ownership of the documented content. There are different levels of rights. Moral right and copyright are the two levels that are most often discussed. Moral right is the right that the original creator of content has over their work.

Case study: Recording of folk songs in Colombia

“Who Owns The Content” is a short film which revolves around a central question on the ownership of oral culture.

Let's take the example of an incident that happened in Colombia to understand more about the rights over a work. A young man discovered cassette tapes that included recordings of folk songs sung by his late father. A European researcher had made the recordings. The songs are known to his local community. After finding the tapes, the young man made digital versions of the songs and uploaded them to the internet. But this made his siblings furious. There are three to four levels of "ownership" here:

a) moral right co-owned by the community where the folk songs are sung and the late father (as the singer/narrator of the songs that provide evidence of the songs)

b) copyright of only the recordings owned either by the researcher (if he self-sponsored) or by the late father (who might have commissioned the recording) or by the institution that might have employed the researcher

c) the physical copies (cassette tapes) of the recording owned by the young man and his siblings, who are the legal heirs of the father

In the above case, the copyright is unknown. One has to investigate further to find out any evidence to support the copyright claim.

Short film "Public Domain Day" explaining how Public Domain works.

Copyright generally lasts for the lifetime of the original creator and a certain number of years after their death. After that, the work enters the Public Domain. Different countries have different numbers of years as their respective copyright terms (see list here). No registration is required to acquire copyright. Any work with originality becomes a copyrighted work by default. The symbol "©" is usually used to denote a copyrighted work.

A legal contract or agreement helps decide who the copyright owner will be. Simply put: if an employee of an organization is paid by the employer to do a certain original work, then the employee has a moral right over the produced work whereas the employer has the copyright. Most organizations include a blanket agreement with their employees for a right over all the work produced by the employee in the latter's official capacity. Similarly, a commissioned work would be copyrighted by the individual/organization who commissions the work.

Originality is a grey area. Taking a picture of a natural landscape results in an "original work", which is hence copyrighted, whereas taking a picture of a painting is not counted as an original work. Look at the examples below to get some insights on copyright in the context of language documentation.

Content and its owners (both copyright and moral rights):

  • Video/audio recording of an individual singing a folk song: multiple owners
    a) if the original author of the song is unknown, the song itself is assumed to be in the Public Domain
    b) the narrator or singer has a moral right
    c) the copyright of only the recording is owned by the archivist (in case of non-commissioned work) or by the individual/organization that commissioned it
  • Video/audio recording of a cultural or social event
    a) the people involved in the event and/or the larger community have a moral right
    b) the copyright of only the recording is owned by the archivist or the commissioning individual/organization
  • Photograph of an artwork, painting, mural, etc.: the original artist of the artwork

During the production of audio or video, supporting content (e.g. newspaper clippings, stock images, audio or video footage) is generally used. You must acquire permission from the copyright holder to use such works if you are creating something for commercial purposes. Many use supporting works copyrighted by others under "fair use" (in the United States) or "fair dealing" (in some other jurisdictions). However, this is a considerable grey area and involves the risk of copyright infringement.

The sample release in this chapter has a provision to include clarity on copyright. While recording audio/video in a language, you can seek written permission (or verbal permission, preferably recorded as video) to use and distribute the work. The form includes the Creative Commons Attribution-ShareAlike 4.0 International (CC-BY-SA 4.0) License as a suggestion. But you can use a license that works in your particular case. More details on "open" licenses (meaning licenses that allow for a wider and open dissemination of information) are discussed in the next section.

Licensing: Public Licenses and Open Licenses

In the previous section you read about how content is protected by copyright. Generally, you need permission to use copyrighted content. Languages are usually documented for public use. Most importantly, such documentation must be accessible to all native speakers. Imposing strict copyright restrictions might prevent open and public access to recorded content in a language. Not all users know the legal terms of using copyrighted content. Similarly, not all copyright owners (artists, authors, performers, etc.) know which license to use for their work. If a specific license is not mentioned when publishing a work, the work is automatically copyrighted. "All Rights Reserved" is commonly written on such works. However, leaving language documentation under full copyright by accident might prevent many native speakers from accessing it. This would also hamper the growth of low-resource languages. So, you are encouraged to use open licenses whenever possible. There are also some non-open public licenses. But let us learn about these license definitions first.

Public Licenses

Public licenses or public copyright licenses are for the general public. By using such a license, the copyright owner grants a universal permission. Any permission that is specific (for some individuals or some organizations, or only for legal residents of a country) cannot be called a public license. The Open Knowledge Foundation recommends the use of Open Licenses, as guided by the Open Definition, for creative work.

Free/Open Licenses

These licenses are commonly known as open licenses. They are inspired by the Four Essential Freedoms of Free Software. Such licenses allow everyone to use, modify, share and improve a work. Different open licenses grant these four freedoms to different degrees.

Creative Commons Licenses

A set of licenses known as Creative Commons licenses (abbreviated as "CC" licenses) are generally used for copyrighted works. CC licenses apply to all such works including text, images, other multimedia content like audio and video, and even datasets. There are seven main Creative Commons Licenses. The summary below gives more clarity about what these licenses are.

For each license, the entries below give its name and shortened name, what is allowed, what is NOT allowed, and whether commercial use is allowed.

CC-Zero or CC0
  • Allowed: The Copyright Owner allows others to use, distribute and remix this work, and to use it to create other works. A user can re-distribute it for free or for money. The user does not need to give credit to the original Copyright Owner.
  • NOT allowed: There are no such restrictions for users.
  • Commercial use allowed: Yes

CC Attribution or CC-BY
  • Allowed: The Copyright Owner allows others to 1) use the work, 2) distribute it, and 3) adapt or remix it or create derivative works, and re-distribute such works for free or for money. The user MUST credit the original Copyright Owner for the above use. This is the most accommodating of the licenses offered, and is recommended for maximum dissemination and use of licensed materials.
  • NOT allowed: The user cannot claim the original work as their own. This means the user has to give credit to the original Copyright Owner in all places where they use the work. This applies to all the licenses below.
  • Commercial use allowed: Yes

CC Attribution-ShareAlike or CC-BY-SA
  • Allowed: The Copyright Owner allows others to 1) use the work, 2) distribute it, and 3) adapt or remix it or create derivative works, and re-distribute such works for free or for money. The user MUST credit the original Copyright Owner for the above use and must release the new creations under the license the Owner used.
  • NOT allowed: The user cannot release works derived from the original work under a different license (than the one used for the original work).
  • Commercial use allowed: Yes

CC Attribution-NonCommercial or CC-BY-NC
  • Allowed: The Copyright Owner allows others to 1) use the work, 2) distribute it, and 3) adapt or remix it or create derivative works, and re-distribute such works ONLY for non-commercial purposes (the user can NOT earn money from the original work or from new works using the original work). The user MUST credit the original Copyright Owner for the above use.
  • NOT allowed: The user cannot make money from new works derived from the original work.
  • Commercial use allowed: No

CC Attribution-NonCommercial-ShareAlike or CC-BY-NC-SA
  • Allowed: The Copyright Owner allows others to 1) use the work, 2) distribute it, and 3) adapt or remix it or create derivative works, and re-distribute such works ONLY for non-commercial purposes. For the above use, the user MUST credit the original Copyright Owner and MUST share the new creation under the same license the original Copyright Owner used.
  • NOT allowed: The user cannot release works derived from the original work under a different license (than the one used for the original work). The user cannot make money from new works derived from the original work.
  • Commercial use allowed: No

CC Attribution-NoDerivatives or CC Attribution-NoDerivs or CC-BY-ND
  • Allowed: The Copyright Owner allows others to 1) use the work, and 2) re-distribute it for free or for money. The user MUST credit the original Copyright Owner for the above use.
  • NOT allowed: The user cannot re-distribute new creations that use the original work.
  • Commercial use allowed: Yes

CC Attribution-NonCommercial-NoDerivatives or CC-BY-NC-ND
  • Allowed: The Copyright Owner allows others to 1) use the work, and 2) distribute it ONLY for non-commercial purposes. For the above use, the user MUST credit the original Copyright Owner.
  • NOT allowed: The user cannot use the original work to create new creations or re-distribute such creations. The user cannot make money from the original work.
  • Commercial use allowed: No

The above summary is inspired by the "seven regularly used licenses" section of Wikipedia.

Here are some recommended tools and other resources that you can use to identify which Creative Commons License you need to use:

  • CC Chooser (see legacy version): a form where you can fill in the options to find an appropriate license for your work. You can simply copy the License text (or the code, if using it on a website) and use it.
  • Internet Archive: a free repository that is strongly recommended for uploading your language documentation work. It supports Creative Commons Licenses as well as a wide range of file types (images, documents, audio and video) and formats. If you want your files to be used on Wikipedia and other Wikimedia projects, you need to upload them to Wikimedia Commons. Of the Creative Commons licenses, only CC0, CC-BY and CC-BY-SA are allowed there.
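
The license properties described in this section can be summarized as data. The sketch below is a hypothetical helper, not an official Creative Commons tool: the names `LICENSES`, `pick_license` and `WIKIMEDIA_COMMONS_OK` are invented for this illustration, and the CC Chooser remains the recommended way to select a license.

```python
from typing import NamedTuple


class License(NamedTuple):
    attribution: bool   # user must credit the copyright owner
    commercial: bool    # commercial use is allowed
    derivatives: bool   # adapted or remixed works may be shared
    share_alike: bool   # derived works must keep the same license


# The seven main Creative Commons licenses discussed in this section.
LICENSES = {
    "CC0":         License(False, True,  True,  False),
    "CC-BY":       License(True,  True,  True,  False),
    "CC-BY-SA":    License(True,  True,  True,  True),
    "CC-BY-NC":    License(True,  False, True,  False),
    "CC-BY-NC-SA": License(True,  False, True,  True),
    "CC-BY-ND":    License(True,  True,  False, False),
    "CC-BY-NC-ND": License(True,  False, False, False),
}

# Creative Commons licenses accepted on Wikimedia Commons, as noted above.
WIKIMEDIA_COMMONS_OK = {"CC0", "CC-BY", "CC-BY-SA"}


def pick_license(allow_commercial: bool,
                 allow_derivatives: bool,
                 require_share_alike: bool = False,
                 require_attribution: bool = True) -> str:
    """Return the license name that matches the given choices exactly."""
    if require_share_alike and not allow_derivatives:
        raise ValueError("ShareAlike only applies when derivatives are allowed")
    wanted = License(require_attribution, allow_commercial,
                     allow_derivatives, require_share_alike)
    for name, properties in LICENSES.items():
        if properties == wanted:
            return name
    raise ValueError("no Creative Commons license matches these choices")
```

For instance, `pick_license(True, True, require_share_alike=True)` gives `"CC-BY-SA"`, which is also in `WIKIMEDIA_COMMONS_OK`, whereas any choice that forbids commercial use is not.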

Creative Commons Licenses are not the only kind of free/open licenses. The GNU Free Documentation License is a popular free license that is used for many text materials like books and manuals. It generally allows a user to use the original work, make a copy, redistribute, and even modify it. However, the original document or source code MUST be included in the new work if more than 100 copies of the same are published.

Chapter 2: Audiovisual recording

This chapter details the process of making audiovisual recordings of language use.

Module 1: Basics of audio-visual recording

An overview of what the recording process aims to achieve and how to go about it.

Prerequisites

1. Be honest and ask your interviewee to be honest
Language is a very sensitive element of a society. When mistakes such as mispronunciations get recorded and shared publicly, native speakers might take offense. So, please check with your interviewee and document any unintended mistakes in the description of the video/audio when publishing. You might not always be able to delete the portions containing such mistakes, but you can always acknowledge that an unintended mistake was recorded. Similarly, if the interviewee is not a native speaker and is learning the language, you should mention that clearly. Native speakers will welcome such honesty.
2. Imagine yourself out in the field interviewing someone speaking a language that you probably don't understand
Think of the challenges that you might face: what gets lost in translation, and your limited understanding of their cultural/linguistic nuances. Are you going to use a language that is intelligible to you both, get the questions translated, or have a translator along to assist you?
3. Plan in advance and practice well
Planning for a documentation starts with knowing your interviewee(s) well. Do some research about their language and culture, and maybe learn a few commonly used phrases in their language that you can say to delight them during the interview. People generally appreciate it when an outsider makes an effort to speak their language. Use a spreadsheet or even an app to keep a rough and flexible plan. Things might change while interviewing and you need to be prepared for that. Also, have a plan B in case anything fails. If you're someone who gets cold feet while meeting a stranger, write down your questions and practice them with a friend/family member or in front of a mirror.
4. Know your hardware and software
As you are going to rely on your recording equipment and software (you will learn about them in the next module), it's important that you know them well. But how well is well? Well enough to know the ins and outs of your gear and to do some troubleshooting in an emergency. For instance, if you're planning to use your phone for audio and video recording, check which apps are best for your workflow. It's advisable to use apps (e.g. Filmic Pro for iOS devices) that show the audio levels on screen while recording, so you know for sure that the audio is indeed being recorded.
5. Keep a notebook/note-taking app to capture some important data
Physical/digital note-taking while recording always helps during post-production. Also, you need to capture some metadata (more in Module 3), for which you can use your notes or a printed template. But please keep in mind that the noise you make while writing might get recorded, so choose your pen carefully.
6. Ensure you get to record in a quiet place
The most challenging aspect of any recording is finding a quiet place for clean audio and a well-lit place for good-quality video. Check below to know what to avoid:
Noise sources and possible solutions:
  • Ambient noise (audio)
    1. Talk to the interviewee before recording to find the least noisy place to record.
    2. If you can, get a lavalier microphone (also known as a lav mic, lapel mic, clip mic, etc.) so that you get a clean sound, as it is placed close to the interviewee's face.
  • LED and other home electric lights (video): Most home lights flicker distractingly when captured on camera. While you will learn more about solutions for such issues in the next module, avoid home lighting and use lights that are recommended (more here) for filming if you can afford them. Alternatively, if you're filming during the day, you can sit close to a window so that the subject's face is lit by natural light.
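
Prerequisite 4 above recommends confirming that audio is actually being recorded. If you copy a short test recording to a computer, a check like the following Python sketch can flag a file that is silent or clipped. It assumes an uncompressed 16-bit PCM WAV file; the function names `peak_level` and `check_levels` and both thresholds are illustrative assumptions, not values from this toolkit.

```python
import array
import sys
import wave


def peak_level(path: str) -> float:
    """Return the peak amplitude of a 16-bit PCM WAV file, from 0.0 to 1.0."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0
    return max(abs(sample) for sample in samples) / 32768.0


def check_levels(path: str,
                 silent_below: float = 0.01,
                 clipped_above: float = 0.99) -> str:
    """Classify a recording as 'silent', 'clipped' or 'ok' (assumed thresholds)."""
    peak = peak_level(path)
    if peak < silent_below:
        return "silent"
    if peak > clipped_above:
        return "clipped"
    return "ok"
```

A result of "silent" suggests the microphone was muted or not connected, and "clipped" suggests the input level was too high. Either way, it is worth making another test recording before the actual interview.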

Interview process

  • Friendliness and empathy: The best emotion is captured when your interviewee trusts you the most. Try to be empathetic and friendly, relate to them on a human level and keep a check on their comfort level. They will open up to share something they care about only when they think they can trust you. Trust is built over time. How do you build it in a short interview?
  • Ice-breaker questions: You can always ask some trivial ice-breaking questions in the beginning and slowly move towards more personal questions.
  • Body language: In a physical interview, your body language matters much more than in a telephone or voice/video call. Positive body posture can set the mood of the subject entirely. So a rule of thumb is to be a good listener and show curiosity to learn from the interviewee. When you're interviewing someone speaking an endangered language that is unfamiliar to you, you can still start with the same body posture. Even though you won't understand the vocabulary, be empathetic and try to relate by observing the interview's emotional flow. You could reflect that with the right kind of camera moves.
  • Motion is emotion: Documenting a language is not just about placing a camera on a tripod and interviewing someone, though that's a good starting point. You need to capture someone's life on camera if you're recording them talking about their life. If a picture is worth a thousand words, a video is worth a million! So, take ample time to shoot some b-roll. For instance, if your interviewee has narrated a bedtime story during the interview, capture some relevant shots, like kids sitting around an old person, or parents with kids. B-roll clips are generally short, so shoot really short videos (30 seconds to 1 minute at most) and cover a wide range of subjects, because you never know where you can use them. You can use the b-roll as cut shots.

Module 2. Hardware and software for recording, and recording process

Audio recording

A home studio setup consisting of a computer with free and open source audio recording/editing software such as Audacity installed, a professional microphone, and monitoring headphones. Read more in our Pronunciation Toolkit.

Different scenarios:

  1. Home studio: If you're recording at home, try to create a minimal setup. You need a microphone to be able to record the audio. If you can, I suggest recording in a small home studio setup like the one pictured above (a USB microphone, a computer, and monitoring headphones).
  2. Field recording with a recorder or phone: The recording setup will vary considerably if you are meeting someone outside your home for a field recording. In that case you will need to carry an audio recorder or a smartphone (with a recording app installed) and earphones. If you’re using a portable recorder, cover the top of the mic with a soft cotton cloth or fake fur to a) keep dust out, and b) reduce wind noise during outdoor recording. Use a rubber band to secure the cloth at the base and never touch the cloth/fur while recording. Mics pick up even small movements, which can completely distort the audio.
  3. Recording from phone: Earphones that come with the phones generally work both for phones and computers as compared to the default microphone provided along with . However, avoid sitting in an open space, as a lot of noise is likely to be captured unless you are using a shotgun microphone.
  4. Audio editing software: If you are editing on a computer, Audacity, a free and open source audio editor, is the first choice for many seasoned recording artists. It is robust, easy to use and runs on multiple platforms. If you are using your phone or tablet to record and edit the audio, use your native recording app or try to find a good free alternative in your app store. Ideally the recording/editing app should allow you to record in a decent lossless quality (minimum requirement is 44100 Hz, above 16 bit PCM i.e. 24 or 32 bit, above 220 kbps; check your settings to find these). Save the audio in .WAV or .FLAC (Audacity supports both). If your recorder/phone does not support these formats, try to use an app/online converter like this (MP3→FLAC or M4A→FLAC) to convert the audio into .FLAC. Note that converting a lossy file such as an MP3 into .FLAC does not restore the quality lost in the original compression, so record in a lossless format whenever you can.
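
The minimum quality described in the list above can be checked on a saved .WAV file. The following is a minimal sketch in Python using the standard library's wave module; the function names and the decision to treat 16 bit as the acceptable floor are assumptions made for illustration, not part of the toolkit. The wave module reads integer PCM files only, so it may reject 32-bit floating point recordings.

```python
import wave

# Thresholds taken from the text: at least 44100 Hz, and 16-bit PCM as the
# floor (24 or 32 bit preferred). Treating 16 bit as acceptable is an assumption.
MIN_SAMPLE_RATE_HZ = 44100
MIN_BIT_DEPTH = 16


def describe_wav(path):
    """Return (sample_rate_hz, bit_depth, channels) for an integer-PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Uncompressed bitrate = sample rate x bit depth x channels."""
    return sample_rate_hz * bit_depth * channels / 1000


def meets_minimum(sample_rate_hz, bit_depth):
    """True when a recording reaches the minimum quality described in the text."""
    return sample_rate_hz >= MIN_SAMPLE_RATE_HZ and bit_depth >= MIN_BIT_DEPTH
```

As a worked example, a mono recording at 44100 Hz and 16 bit comes to 44100 × 16 × 1 = 705,600 bits per second, or about 706 kbps, which is well above the 220 kbps figure.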

Video recording

Which camera to use

Frankly speaking, the video is less important here than the audio. With low-quality video, viewers can still manage if the audio is loud and clear. So if you are keen on investing, invest in a good-quality microphone that can either be connected to the camera or be used as a secondary recorder. But do not trust your camera’s default microphone. It can jeopardize your hard work. As far as the camera goes, you can use any camera that lets you record in a decent quality, i.e. above 720p (1280×720 px), from your phone to a point-and-shoot camera to a DSLR.

    a) Using a camera: Use a shotgun microphone that can be connected directly to your camera so that you don’t need to spend much effort on audio syncing during post-production.
    b) Using a phone for recording video: These days most phones come with high-quality hardware that is capable of recording good video. But the real key to recording quality video on a phone lies in stabilizing the shot while recording. You can only do that by investing in a small tripod (they are generally really cheap and do the job) that can hold your phone. For this particular project, tripods will be the best.

How to edit the videos: You need to compress the video using free software such as HandBrake, and upload it to YouTube or a similar service without making it public. We will download it and ask you to delete it so that you don’t have to worry about the space it takes up on your hard drive.
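
The compression step can also be scripted. The text presumably refers to the graphical HandBrake application; the sketch below instead assumes that its command-line build, HandBrakeCLI, is installed and on the PATH. The function names and the choice of the "Fast 1080p30" preset are illustrative assumptions, and preset names can differ between HandBrake versions.

```python
import shutil
import subprocess


def handbrake_command(source, destination, preset="Fast 1080p30"):
    """Build the HandBrakeCLI argument list for one input and one output file."""
    return ["HandBrakeCLI", "-i", source, "-o", destination, "--preset", preset]


def compress(source, destination, preset="Fast 1080p30"):
    """Run HandBrakeCLI, failing early if the program is not installed."""
    if shutil.which("HandBrakeCLI") is None:
        raise FileNotFoundError("HandBrakeCLI was not found on the PATH")
    # check=True raises CalledProcessError if HandBrakeCLI exits with an error.
    subprocess.run(handbrake_command(source, destination, preset), check=True)
```

Calling compress("interview.mov", "interview.mp4") with hypothetical file names would write a smaller copy next to the original, leaving the original recording untouched.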

Chapter 3: Metadata collection and publication

Annotation, subtitling of audio/video, translation of transcription and other content

Download: Content Release form (editable document in .odt and .docx, fillable form in .pdf); Metadata Documentation Sheet (in .ods, .xlsx)

Annotation is the process of collecting additional information that helps provide background to a particular situation. For instance, in an indigenous community a particular alcoholic beverage is offered to the local deity first before drinking. A video that shows people drinking, with subtitles/captions of their conversation, might not provide enough context. Such nuances are generally added in text or audio along with a timestamp (e.g. refer to 01:36: Lakshmi and Babu are showing a gesture of respect to each other before drinking "rasi"). Audio/video content will need subtitles in widely spoken languages like English for wider coverage. Transcriptions are generally created to have a verbatim version of the interview. Ideally, you need to work post-interview with a native speaker to create the transcription to ensure there is no loss of information in the process. However, a transcription is not easily digestible. So you need to create summaries for each section of the interview that capture the highlights and sometimes details (for instance, of a game or a story).
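
One possible way to keep timestamped annotations next to a recording is a timed text file such as SubRip (.srt). The sketch below, in Python, turns (timestamp, note) pairs into SubRip cues. The function names and the fixed five-second display time are assumptions made for illustration; a real workflow would record an end time for each note.

```python
def parse_timestamp(text):
    """Convert 'MM:SS' or 'HH:MM:SS' into whole seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def srt_time(seconds):
    """Format whole seconds as the SubRip time code HH:MM:SS,mmm."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},000"


def to_srt(notes, duration=5):
    """Render (timestamp, text) pairs as numbered SubRip cues, ordered by time."""
    ordered = sorted(notes, key=lambda note: parse_timestamp(note[0]))
    cues = []
    for index, (stamp, text) in enumerate(ordered, start=1):
        start = parse_timestamp(stamp)
        end = start + duration  # assumed display time, not taken from the text
        cues.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(cues)


notes = [("01:36", 'Lakshmi and Babu show a gesture of respect before drinking "rasi".')]
print(to_srt(notes))
```

The resulting text can be saved with an .srt suffix and loaded alongside the video in most players and subtitle editors.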

Chapter 4: Accessibility

Accessibility considerations ensure that everyone can access the published digital media with little or no hassle. The underlying principle of accessibility is ensuring that no one is excluded and making a conscious effort to avoid any critical issues for people with disabilities. Using subtitles/captions in audio and video, using typefaces/fonts in visual media that have proper contrast, size and alignment, and using colors that are friendly to the eyes of people with color blindness are some of the most important considerations. To check whether the media you have published is accessible or not, you could use the checklist below.

Yes/no, How to Recommendations
A. Video captioning

Does your audio/video have subtitles/captions?

Yes Closed captioning (CC) is preferred for web applications as the caption is not "burned in" (hardcoded) on the video but is displayed separately. It also helps with translation of captions if you release them in a timed text format such as SubRip (files ending with a .srt suffix). Open captioning means that the captions appear as images that are "burned in" on the video. You can only watch them as they are, whereas with closed captioning you can select among the available language versions.
No Adding captions to videos is an essential requirement when it comes to linguistic documentation. There are many ways to add captions. For computers, a highly recommended software is Aegisub (user manual) as it supports all major platforms (Windows, Mac and other Unix-like operating systems). Many modern video editors also support captioning. If you are collaborating with remote translators then Amara is a recommended option. It is an Open Source video subtitling platform (learn how to use it from here). Popular platforms like Internet Archive, Vimeo and YouTube are supported on Amara. YouTube also has built-in closed captioning. We strongly recommend the comprehensive guides that the BBC has created (short version here, long version here) to learn how to create accessible captioning.
B. Audio/video transcriptions

Do you upload a transcription file separately along with your audio documentation?

Yes Verbatim transcriptions often retain stutters and fillers such as "umm..", "hmm.." that are a part of human speech. As the primary purpose of transcriptions is accessibility, verbatim transcriptions help. Non-verbatim transcriptions either omit stutters and fillers entirely or replace them with explanatory text. You might have seen in (English-language) movie subtitles how they write [MUSIC][A 1] when there is background music playing. Similarly, you can use different explanatory texts based on the context (see below for how to transcribe).
How to Please see the Transcripts resource page on W3C for more recommendations. Here is a step-by-step guide to creating audio-to-text transcriptions that might be useful in some cases.
No Written languages: You must consider adding transcriptions to your audio and video. Simply put, transcriptions are the text version of what is heard in an audio or video. They are essential for people with full/partial blindness as they use screen reader software to convert text into audio and listen to the audio version to be able to access the content. Transcriptions are also helpful when a particular word is not very clearly pronounced. It is important to note that many written languages might not yet have speech synthesis software, but language documentation has a long life. So, if you transcribe today and upload the transcription, it might be useful some day. A transcription is often uploaded separately as a text file along with an audio file. YouTube shows the transcription separately when the option is selected on the right side of the video (only when the video is captioned). Spoken/oral languages: As oral languages do not have a writing system, you might consider translating the content first into a well-known language that is relevant in your context, and making the transcription available.
C. Color contrast How to High contrast text is easily readable by people with low vision. So, it is always preferred over any aesthetic considerations. In your titles/captions, credits in the case of videos, documents shared along with audio/video, and web displays (websites, blogs, articles), try to use high contrast text. Extremely light-shaded text over a light-shaded background (e.g. grey over a sky background like this) is hard to read for many.
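
For the color contrast item in the checklist above, contrast can be measured rather than judged by eye. The sketch below, in Python, uses the contrast-ratio formula from the W3C's Web Content Accessibility Guidelines (WCAG) 2.x, which the checklist itself does not cite; the function names are illustrative. Under that guideline, a ratio of at least 4.5:1 is the usual target for normal-size text.

```python
def _linear(channel):
    """Convert one 0-255 sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG relative luminance of an (R, G, B) color, from 0 (black) to 1 (white)."""
    r, g, b = (_linear(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two colors, from 1 (identical) up to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# Light grey text on a white background falls well short of 4.5:1.
print(contrast_ratio((200, 200, 200), (255, 255, 255)) >= 4.5)
```

Black text on a white background gives the maximum ratio of 21:1, which is why dark text on a light, plain background is a safe default for captions and credits.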

Additional information

Acknowledgements

OpenSpeaks has been enriched by a range of major projects, readings and interactions. It might not be possible to attribute them all in chronological order, but the individuals and organizations include, but are not limited to:

Competing interests

The author has no competing interests.

Ethics statement

This project draws direct/indirect learning from the documentary films "Gyani Maiya" (2019), "Mage Porob" (2019) and "Remosam" (2019), which were made in collaboration with, respectively, the Kusunda community of Nepal, and the Ho community and the Bonda community of India. The participating members of these communities were interviewed with their consent, in line with the consent guidelines outlined in this project and the National Geographic Society release. Traditional community ethics were respected in all places while working together with indigenous groups, and a high moral and ethical standard was adhered to otherwise.

Notes

  1. The vocabulary, format and style for transcriptions vary from platform to platform. For instance, some use [NAME OF SONG IN BACKGROUND] whereas others use icons such as ♬ NAME OF SONG IN BACKGROUND ♬ for representing the same thing.

References

  1. Seyfeddinipur, Mandana; Rau, Felix (2020-09). "Keeping it real: Video data in language documentation and language archiving". Language Documentation & Conservation 14: 503–519. ISSN 1934-5275. 
  2. Panigrahi, Subhashish (2020-05-11). "Promoting coronavirus education through indigenous languages". Global Voices. Retrieved 2021-05-05.