Scalable Named Entity Identification in Classical Studies

    Project Director(s):
    Gregory R. Crane
    Author(s):
    Gregory R. Crane
    Date:
    2011
    Group(s):
    Data Rescue
    Item Type:
    White paper
    Institution:
    Tufts University
    Tag(s):
    NEH White papers, Advancing Knowledge: The IMLS/NEH Digital Partnership, NEH Preservation and Access, Classics
    Permanent URL:
    http://dx.doi.org/10.17613/M6Z07G
    Abstract:
    The Perseus Project and the Collections and Archives of Tufts University propose to develop infrastructure for finding references to particular people and places from classical antiquity, in several ancient and modern languages, across primary and secondary source collections. We will offer and publish open-source, standalone services and Fedora repository disseminators for searching, browsing, and visualizing entities within the Tufts Digital Library. Under a Creative Commons license, we will publish knowledge sources including: linguistic data to identify forms of the 60,000 most common classical proper names in seven languages; a knowledge base of the 30,000 people and places most prominent in texts; indices associating c. 200,000 passages with particular entities, together with an association network of 500,000 tagged names for named entity identification systems; and an automatically generated index of classical people and places identified in a 1-billion-word testbed of both scholarly and general cultural documents.
    Notes:
    Construction of a testbed of scholarly and cultural documents on the ancient world and the development of digital, open-source tools to enable researchers and librarians to utilize contextual materials available in text-based collections.
    Metadata:
    Status:
    Published
    Last Updated:
    6 years ago
    License:
    Attribution-NonCommercial

    Downloads

    Item Name: pk-50022-07.pdf