• From data to visualisation: Dante’s Divine Comedy as a case study.

    Ginestra Ferraro
    Information visualization, Natural language processing (Computer science)
    Item Type:
    Conference paper
    Conf. Title:
    Conf. Org.:
    Conf. Loc.:
    Ottawa, Virtual
    Conf. Date:
    20-24 July 2020
    Dante Alighieri, modular design, text mining, Data visualization, Natural language processing
    Permanent URL:
    A journey from Hell to Heaven, investigating the computational opportunities of automating text analysis and producing data visualisations. This poster presents the results of exploratory work on a reusable tool that generates data visualisations from automatic text analysis. Its non-functional requirements centre on flexibility (accepting different text inputs) and optimisation (producing rich visualisations with minimal setup). The current version comprises modules (i.e. software components) designed around one selected test case, namely Dante Alighieri’s Divine Comedy, but serves as a blueprint for further modules to be plugged in. The visual outputs allow users to interact with both the content and the metadata. The application performs computational text analysis to produce data visualisations representing the following structural, stylistic and semantic features of the text: (1) a schematic representation of the poem’s structure and rhythm; (2) the distribution of keywords; (3) a visual representation of the sentiment analysis. The application has been developed modularly (Martin and Martin 2006), following the separation of concerns design principle (Dijkstra 1982), to allow for flexibility and scalability. Natural language processing (NLP) and machine learning techniques have been applied to process and transform the data. The Naive Bayes classifier (Perkins 2010) has been chosen for its performance and simple implementation. The poster demonstrates the achievements of this proof of concept and development ideas for the future. The main success lies in its modular design, which makes it amenable to further development (algorithm refinements, visualisation workflows, stylometric analysis). More languages and different text structures will be integrated and a wider range of output visualisations offered, while making use of the same core functionalities for ingesting and processing data.
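    The abstract names a Naive Bayes classifier for the sentiment analysis step but does not show the implementation. The following is a minimal sketch of that technique only, assuming a multinomial model with add-one smoothing; the class name, the function names, the labels and the toy English training phrases are all hypothetical and are not taken from the poster or from the poem.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing. Illustrative only."""

    def __init__(self):
        self.label_counts = Counter()            # training documents per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def train(self, labelled_texts):
        for text, label in labelled_texts:
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)

    def classify(self, text):
        total_docs = sum(self.label_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label, n_docs in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


def sentiment_by_section(model, sections):
    """Label each section of a text, e.g. as input for a visualisation layer."""
    return [model.classify(section) for section in sections]


# Hypothetical toy training data, invented for this sketch.
TRAIN = [
    ("dark forest fear and pain", "neg"),
    ("weeping and torment in the abyss", "neg"),
    ("light love and joy", "pos"),
    ("stars and sweet harmony of heaven", "pos"),
]

model = NaiveBayes()
model.train(TRAIN)
labels = sentiment_by_section(model, ["fear in the dark", "joy and light"])
print(labels)
```

    Keeping the classifier behind a small function such as `sentiment_by_section` is one way to reflect the separation of concerns the abstract describes: the analysis step returns plain labels, and a separate visualisation module can consume them without knowing how they were produced.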
    Last Updated:
    4 years ago


    Item Name: pdf 288-dh2020-from-data-to-visualization-16-9.pdf