Digital Archive

Housed as a series of collections at the Internet Archive, ELA’s collections include approximately 5 TB (and growing) of primarily born-digital recordings (video, audio, transcriptions, translations, lexical data, and fieldwork sessions).

Since its founding in 2010, the Endangered Language Alliance (ELA) has been recording a wide range of unique materials with speakers of over 100 minority, endangered, and Indigenous languages. The core of the collection consists of recordings reflecting the immigrant and diaspora communities in and around New York City, a world capital of linguistic diversity. In 2018, ELA began formalizing its collections as the Archive of the Languages of New York (ALNY) with the dual aim of ensuring the preservation of and providing access to this ever-growing collection.

ELA’s collections include approximately 5 TB (and growing) of primarily born-digital recordings (video, audio, transcriptions, translations, lexical data, and fieldwork sessions). Among the materials are oral histories, historical narratives, songs, folktales, and a variety of other linguistic materials representing both a range of distinctive communities from around the world and the linguistic life of one of the world’s most diverse cities. Many recordings were made in New York over the past decade, while others came out of fieldwork by ELA staff or partners in communities in Belize, Mexico, Nepal, Tajikistan, Turkey, Indonesia, and numerous other sites around the world. A small but valuable portion of the recordings were made on tape before the advent of digital recording and have subsequently been digitized.

ELA’s collections foreground the linguistic and cultural contributions of communities that are underrepresented linguistically, culturally, politically, and otherwise, with the ultimate aim of making them maximally accessible and useful both to community members themselves and to a wider public. The significance of this humanities collection goes beyond linguistics: its offerings include unique recipes in the Indigenous Mixtec language of Mexico, arumahani a cappella songs by traditional masters from the linked Garifuna communities in Belize and New York, oral histories from the Himalayan diaspora in Queens, folktales from storytellers in the Pamir mountains of Tajikistan, narratives of cultural survival from the Tsou people of Taiwan, and much more.

For many languages, ELA’s collections represent the only recordings or materials available, or at least the only materials that are public, high quality, and have been through some degree of annotation and analysis. This is true for languages such as Seke (Nepal), Ishkashimi (Tajikistan), and several Iranian Jewish languages including Judeo-Kashani and Judeo-Shirazi.

In other cases, ELA’s collections represent the most complete corpora available anywhere for particular languages (examples include Wakhi, Zaza, Koda, Loke, Gurung, Neo-Mandaic, and Bishnupriya Manipuri) or else represent a unique subset of materials that does not exist elsewhere. Examples of the latter include Ladino recordings about Sephardic Jewish history in New York, recordings of the broadcasts of the NYC Indigenous radio stations Alcal and Kichwa Hatari, materials related to health and community among Indigenous Mexican New Yorkers, over 500 diary entries in a dozen languages about the COVID-19 pandemic, and more.

ALNY is a work in progress, but it is currently being set up as a series of public-facing collections at the Internet Archive, a long-standing non-profit initiative. These collections are organized by language and by project, and are searchable across a range of metadata fields following the Dublin Core-OLAC standard used by many other language archives. Once fully public and linked with ELA’s website and other digital efforts, the archive will be discoverable, searchable, and open to scholars and communities. It will also be integrated with Kratylos, an innovative software tool for analyzing and crowdsourcing linguistic data, which is currently being built at ELA with the support of a National Science Foundation grant.

For more immediate access to a curated set of recordings, visit ELA’s YouTube channel, which has over 1,000 videos in dozens of languages.