Quantifying Commentator Interest (as a feature of Biblical text)

    Author(s):
    Joshua Waxman
    Date:
    2021
    Group(s):
    CSDH-SCHN 2021: Making the Network
    Subject(s):
    Philology, Digital humanities, Natural language processing (Computer science)
    Item Type:
    Lecture
    Tag(s):
    Biblical studies, Natural language processing
    Permanent URL:
    http://dx.doi.org/10.17613/v6xy-3g14
    Abstract:
    The Hebrew Biblical corpus is composed of texts of different genres: narrative sections, genealogical accounts, poetry, commands regarding daily and seasonal ritual practices, legal systems, details of sacrifices, and instructions for constructing the Tabernacle. The content, and often the style, of each genre differs from the others, and the genres might be distinguished by features such as tf-idf vectors, sentence length, and grammatical structure. This corpus has been the focus of many Biblical commentators. Focusing on just a few classical Jewish Biblical commentators (Rashi, Ibn Ezra, Ramban, Rashbam, Seforno) who composed commentary across the Pentateuch, we quickly observe that these commentators often pay attention to different subsets of Biblical verses. This attention, or "interest", is influenced by the commentator's purpose and vision. For instance, Rashi is concerned with presenting a consistent and comprehensive interpretation drawn from earlier traditional sources, even where these are not the most literal, straightforward readings. However, there is not much for him to say regarding genealogical verses. Ibn Ezra is concerned with literal interpretation as well as grammatical analysis of difficult words and constructions, and is less concerned with details of sacrifices, either because he agrees with Rashi (who preceded him) or because he does not read as much into every word choice. To quantify commentator interest, for each chapter we divide the number of verses discussed by a particular commentator by the total number of verses in that chapter. Each chapter is thus represented by a vector of commentator interest ratios. We label each chapter with multiple tags corresponding to content type, since different spans of verses within a chapter might correspond to different content, for instance a narrative section ending with a genealogical list. We trained a binary Logistic Regression classifier for each genre using these vectors as features, and saw some good results.
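The pipeline described in the abstract (per-chapter interest ratios, multi-tag labels, one binary classifier per genre) might be sketched roughly as follows. This is an illustrative sketch only, not the author's code: the function names, the toy chapter counts, the tag names, and the hyperparameters are all invented for the example, and the hand-written gradient-descent logistic regression is a stand-in for whatever library implementation was actually used.

```python
import math

# Commentators named in the abstract; the ordering here is an arbitrary choice.
COMMENTATORS = ["Rashi", "Ibn Ezra", "Ramban", "Rashbam", "Seforno"]


def interest_vector(total_verses, verses_discussed):
    """Per-commentator ratio: verses discussed / total verses in the chapter."""
    if total_verses <= 0:
        raise ValueError("a chapter must contain at least one verse")
    return [verses_discussed.get(name, 0) / total_verses for name in COMMENTATORS]


def train_binary_logreg(X, y, lr=0.5, epochs=2000):
    """Fit a binary logistic regression by batch gradient descent on log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """Return 1 if the chapter is predicted to carry the genre tag, else 0."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def train_per_genre(X, tag_sets, genres):
    """One independent binary classifier per genre (chapters may hold many tags)."""
    return {
        g: train_binary_logreg(X, [1 if g in tags else 0 for tags in tag_sets])
        for g in genres
    }


# Invented toy chapters: (total verses, verses discussed per commentator, tags).
toy_chapters = [
    (20, {"Rashi": 18, "Ibn Ezra": 15, "Ramban": 12, "Rashbam": 14, "Seforno": 13},
     {"narrative"}),
    (25, {"Rashi": 20, "Ibn Ezra": 18, "Ramban": 15, "Rashbam": 16, "Seforno": 17},
     {"narrative"}),
    (30, {"Rashi": 15, "Ibn Ezra": 12, "Ramban": 10, "Rashbam": 11, "Seforno": 12},
     {"narrative", "genealogy"}),
    (32, {"Rashi": 4, "Ibn Ezra": 3, "Ramban": 1, "Rashbam": 2, "Seforno": 2},
     {"genealogy"}),
    (26, {"Rashi": 3, "Ibn Ezra": 2, "Ramban": 2, "Seforno": 1},
     {"genealogy"}),
]

X = [interest_vector(total, counts) for total, counts, _ in toy_chapters]
tag_sets = [tags for _, _, tags in toy_chapters]
models = train_per_genre(X, tag_sets, ["narrative", "genealogy"])
```

Training a separate binary classifier per tag, rather than a single multi-class model, matches the multi-tag labeling in the abstract: a chapter that mixes narrative and genealogy can be a positive example for both classifiers.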
    Metadata:
    Status:
    Published
    Last Updated:
    2 years ago
    License:
    All-Rights-Granted

    Downloads

    Item Name: pptx quantifying-commentator-interest.pptx
    Activity: Downloads: 26