Output
This system generates several outputs. Some are designed to be displayed immediately
to the user (e.g. for diagnostic purposes) and are based on XAML, an XML dialect;
others produce richly formatted, detailed HTML files for reporting. Finally,
all the data are stored in a standard packed-XML format, ready to be read by the
next subsystem.
Observers
Here we find the fourth major subsystem, which reads the data packed in XML and literally
observes them to detect any kind of phenomenon we may be interested in, at any
level: prosody (e.g. muta cum liquida, doubled consonants, accent distribution...),
syntax (word type distribution and connections), metrics (verse instances, “laws”,
caesurae, bridges, etc.). New observers can be added at any time. At present, I have
implemented 23 observers, which collect some 200 data items for each single line. This
means that for a corpus like the one I analyzed using the previous generation of
this system I could get more than 1 million data items. All these data, in full detail,
are stored by the observers in a relational database.
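To make the observer idea concrete, here is a minimal sketch in Python. It is not the system's actual code: the `Line` shape, the two toy observers, the `observation` table and the `run_observers` function are all assumptions introduced for illustration. It only shows how independent observers might each emit a few values per line and have them stored in a relational table.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol

VOWELS = set("aeiouy")


@dataclass
class Line:
    """One verse line; this shape is hypothetical, not the real packed-XML schema."""
    line_id: int
    syllables: list[str]


class Observer(Protocol):
    """Anything with a name and an observe() method can be plugged in."""
    name: str

    def observe(self, line: Line) -> Iterable[tuple[str, float]]: ...


class SyllableCountObserver:
    name = "syllable_count"

    def observe(self, line: Line) -> Iterator[tuple[str, float]]:
        yield ("count", float(len(line.syllables)))


class DoubledConsonantObserver:
    name = "doubled_consonant"

    def observe(self, line: Line) -> Iterator[tuple[str, float]]:
        # Count boundaries where the same consonant closes one syllable
        # and opens the next (a deliberately naive rule for the sketch).
        hits = sum(
            1
            for prev, nxt in zip(line.syllables, line.syllables[1:])
            if prev and nxt and prev[-1] == nxt[0] and prev[-1] not in VOWELS
        )
        yield ("count", float(hits))


def run_observers(lines: Iterable[Line], observers: list[Observer],
                  conn: sqlite3.Connection) -> int:
    """Run every observer on every line and store the results (assumed table layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observation "
        "(line_id INTEGER, observer TEXT, key TEXT, value REAL)"
    )
    rows = [
        (line.line_id, obs.name, key, value)
        for line in lines
        for obs in observers
        for key, value in obs.observe(line)
    ]
    conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

In this sketch a new observer is simply another class with a `name` and an `observe` method, appended to the list passed to `run_observers`; nothing else has to change.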
Once this database has been filled with observations, another software component
lets users query it in whatever form they prefer. Users can ask the
program to provide the details (with text line by line) for any combination of
phenomena, or generate summary reports by aggregating and filtering data as
requested. The program's outputs range from formatted HTML with fully highlighted
text to detailed data reports in standard formats like XML or CSV. You can interactively
try a subset of the query functions for a minimal sample corpus
here.
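As an illustration of the aggregate-and-filter kind of report, the following Python sketch groups stored observations and writes them out as CSV. The `observation` table layout, the function names and the toy rows are assumptions made up for the example; they are neither the real schema nor real results.

```python
import csv
import io
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a few invented rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation "
        "(line_id INTEGER, observer TEXT, key TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?, ?)",
        [
            (1, "caesura", "penthemimeral", 1),
            (2, "caesura", "penthemimeral", 1),
            (3, "caesura", "hephthemimeral", 1),
            (1, "syllable_count", "count", 15),
        ],
    )
    conn.commit()
    return conn


def report_csv(conn: sqlite3.Connection, observer: str) -> str:
    """Filter by observer, count lines per key, and return the report as CSV text."""
    cursor = conn.execute(
        "SELECT key, COUNT(*) FROM observation "
        "WHERE observer = ? GROUP BY key ORDER BY key",
        (observer,),
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["key", "n_lines"])  # header row
    writer.writerows(cursor)             # one row per distinct key
    return buffer.getvalue()
```

Calling `report_csv(make_demo_db(), "caesura")` returns a small CSV text with one row per caesura type, which a spreadsheet can open directly.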
Finally, these files are imported by third-party software specialized for statistical
analysis and charting, such as MS Excel. Here we can test the data for significance
and formulate hypotheses to explain the phenomena. In turn, this will probably
require further data, so that we can see our data from a different perspective: we
can thus return at any time to the previous program and ask new questions, which can
shed light on obscure points and newly arising questions. The system thus provides
a fully interactive analysis process, a true laboratory for metrical and linguistic
analysis.
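As one example of what testing for significance can mean here, the Python sketch below computes Pearson's chi-square statistic for a 2x2 table of counts, such as lines with or without a given phenomenon in two authors. The function name and the counts in the usage note are hypothetical; the document does not say which tests are actually used.

```python
# Critical value of the chi-square distribution for 1 degree of freedom at the 5% level.
CRITICAL_5_PERCENT_DF1 = 3.841


def chi_square_2x2(table: list[list[float]]) -> float:
    """Pearson's chi-square statistic for a 2x2 contingency table of counts."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def is_significant(table: list[list[float]]) -> bool:
    """True when the statistic exceeds the 5% critical value (1 degree of freedom)."""
    return chi_square_2x2(table) > CRITICAL_5_PERCENT_DF1
```

With invented counts such as `[[30, 70], [50, 50]]`, every expected cell is 40 or 60, and the statistic works out to 25/3, which is above the 5% threshold. A result like this would then suggest returning to the query program with a more specific question.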