Output

This system generates several outputs: some are designed to be displayed immediately to the user (e.g. for diagnostic purposes) and are based on XAML, an XML dialect; others produce richly formatted and detailed HTML files for reporting; and finally, all the data are stored in a standard packed-XML format, ready to be read by the next subsystem.

Observers

Here we find the fourth major subsystem, which reads the data packed in XML and literally observes them to detect any kind of phenomenon we may be interested in, at any level: prosody (e.g. muta cum liquida, doubled consonants, accent distribution...), syntax (distribution of word types and their connections), metrics (verse instances, “laws”, caesurae, bridges, etc.). New observers can be added at any time. At present I have implemented 23 observers, which collect some 200 data items for each line. This means that for a corpus like the one I analyzed with the previous generation of this system I could get more than 1 million data items. All these data are stored in full detail by the observers in a relational database.
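The observer idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual code: the XML layout (`line`, `syl`, `accent`), the two observer functions and the `observation` table are all hypothetical, since the real packed-XML schema and database design are not given here.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout: the real packed-XML schema is not described in the text.
SAMPLE_XML = """
<lines>
  <line n="1"><syl w="long" accent="1"/><syl w="short"/><syl w="short"/></line>
  <line n="2"><syl w="long"/><syl w="long" accent="1"/></line>
</lines>
"""

def observe_syllable_count(line):
    """Hypothetical observer: number of syllables in the line."""
    yield ("syllable_count", len(line.findall("syl")))

def observe_accents(line):
    """Hypothetical observer: 1-based positions of accented syllables."""
    for pos, syl in enumerate(line.findall("syl"), start=1):
        if syl.get("accent") == "1":
            yield ("accent_position", pos)

# New observers are added simply by appending to this list.
OBSERVERS = [observe_syllable_count, observe_accents]

def run_observers(xml_text, conn):
    """Run every observer on every line and store each observation as a row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observation "
        "(line_no INTEGER, name TEXT, value INTEGER)"
    )
    root = ET.fromstring(xml_text.strip())
    for line in root.iter("line"):
        line_no = int(line.get("n"))
        for observer in OBSERVERS:
            for name, value in observer(line):
                conn.execute(
                    "INSERT INTO observation VALUES (?, ?, ?)",
                    (line_no, name, value),
                )
    conn.commit()

conn = sqlite3.connect(":memory:")
run_observers(SAMPLE_XML, conn)
rows = conn.execute(
    "SELECT line_no, name, value FROM observation ORDER BY line_no, name"
).fetchall()
```

Keeping each observer as an independent function that yields (name, value) pairs is one way to make "new observers can be added at any time" cheap: the storage loop never changes.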

Once this database has been filled with observations, another software component allows users to query it in whatever form they prefer. Users can ask the program to provide the details (with the text line by line) for any combination of phenomena, or to generate synthetic reports by aggregating and filtering data as requested. The program's outputs range from formatted HTML with fully highlighted text to detailed data reports in standard formats like XML or CSV. You can interactively try a subset of the query functions for a minimal sample corpus here.
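As a rough illustration of the aggregating kind of query described above, the following Python sketch counts how often each value of one phenomenon occurs and writes the result as CSV. The `observation` table, its columns, the phenomenon name `caesura_position` and the sample rows are all assumptions made up for this example; they do not reflect the actual database or any real corpus.

```python
import csv
import io
import sqlite3

# Hypothetical table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation (line_no INTEGER, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?)",
    [
        (1, "caesura_position", 5),
        (2, "caesura_position", 7),
        (3, "caesura_position", 5),
        (3, "syllable_count", 14),
    ],
)

def report_csv(conn, name):
    """Filter by phenomenon name, aggregate by value, return the report as CSV text."""
    cursor = conn.execute(
        "SELECT value, COUNT(*) FROM observation "
        "WHERE name = ? GROUP BY value ORDER BY value",
        (name,),
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["value", "count"])
    writer.writerows(cursor)
    return buffer.getvalue()

csv_text = report_csv(conn, "caesura_position")
print(csv_text)
```

A report of this shape can be opened directly in a spreadsheet or statistics package for further analysis.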

Finally, these files are imported into third-party software specialized in statistical analysis and charting, such as MS Excel. There we can test the data for significance and formulate hypotheses to explain the phenomena. This in turn will probably require further data, so that we can see our results from a different perspective: we can return at any time to the previous program and ask new questions, which can shed further light on obscure points and on newly arising questions. The system thus provides a fully interactive analysis process, a true laboratory for metrical and linguistic analysis.
