Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics
- Author(s):
- Sebastian Bank, Robert Forkel, Russell D. Gray, Simon J. Greenhill, Harald Hammarström, Martin Haspelmath, Gereon A. Kaiping, Johann-Mattis List, Christoph Rzymski
- Date:
- 2018
- Subject(s):
- Linguistics, Historical linguistics, Computational linguistics
- Item Type:
- Article
- Tag(s):
- data management, computer-assisted language comparison
- Permanent URL:
- http://dx.doi.org/10.17613/kpp5-ms42
- Abstract:
- The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of these data are provided in a large variety of formats and are therefore not amenable to comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists and structural datasets) and a framework to incorporate more data types (e.g. parallel texts and dictionaries). The new specification for cross-linguistic data formats comes with a software package for validation and manipulation, a basic ontology which links to more general frameworks, and usage examples of best practices.
- Published as:
- Journal article
- Pub. DOI:
- https://doi.org/10.1038/sdata.2018.205
- Publisher:
- Springer Nature
- Pub. Date:
- 2018-10-16
- Journal:
- Scientific Data
- Volume:
- 5
- ISSN:
- 2052-4463
- Status:
- Published
- Last Updated:
- 5 years ago
- License:
- All Rights Reserved
Downloads
Item Name: forkel-et-al-2018-cross-linguistic-data-formats.pdf