Classical Projects Overview
My projects in Classics cover several interconnected fields. This page gives a
brief overview of those that are treated, at least in part, on this site.
This picture shows four main families of software components and their links
to some of the topics covered on this site (yellow clouds). These components grew
out of three main areas of interest and work over the past years: Classical
metrics, historical linguistics, and digital publishing in a very broad sense.
The last began with several commercial projects for Italian publishers (paper and
CD-ROM publications, mainly dictionaries and Classics) and was later applied to
collecting open, highly structured XML-based data on ancient inscriptions,
literary passages and non-inscribed objects.
The XML-based corpora (see Cadmus) belong to several epigraphy-centred projects.
These must also hold a large quantity of meta- or extra-textual data, variously
interconnected for specialized purposes, and they need to grow both
quantitatively and qualitatively throughout the data collection stage. The
corpora are plain Unicode XML files that share a similar structure, specialized
for each collection of data: at present Cadmus handles collections of
inscriptions, literary passages and non-inscribed objects. For in-place search
already at edit time, Cadmus uses a Lucene-based textual and metatextual engine,
connected to a more general search framework for textual corpora that is also
used in other projects.
Some helpers for Classical texts and antiquities can be placed in the editor or
in the publication software to provide smarter searches or editing aids.
Metrics and morphology are covered by the language software components family
(Chiron). This large family includes components that offer two main services: a
full metrical analysis of Classical texts and the automatic historical inflection
of all the lemmata of Latin or Greek. Naturally, several of these components are
shared between phonology and morphology, and their architecture is designed to be
open to extension (so that, for instance, the syllabification algorithm can
easily be applied to other languages such as Italian).
Finally, all these components share a set of text encoding converters that can
deal with many specialized or arbitrary encoding systems, such as Beta code,
SAMPA, Unicode and arbitrary font-based encodings (often used for Greek text:
see e.g. Theuth). These converters make it possible to manipulate texts from
various sources for data import/export and also to generate formatted output in
'smart' ways (some sections of this paper explain more).
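
To give a rough idea of what such an encoding converter does, here is a minimal
Python sketch that turns a small subset of Beta code into Unicode Greek. It is
not the converter used by these components: the function name beta_to_unicode,
the two tables and the restriction to letters, basic diacritics and final sigma
are all assumptions made for illustration. Diacritics are emitted as combining
characters and then composed with standard Unicode normalization (NFC).

```python
import unicodedata

# Hypothetical lookup tables for this sketch (basic Beta code subset only).
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "c": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "u": "υ", "f": "φ", "x": "χ", "y": "ψ", "w": "ω",
}
DIACRITICS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex (perispomeni)
    "|": "\u0345",   # iota subscript
    "+": "\u0308",   # diaeresis
}


def beta_to_unicode(text: str) -> str:
    """Convert a basic subset of Beta code to precomposed Unicode Greek.

    Assumption of this sketch: the whole input is Beta code, so every
    parenthesis or slash is read as a diacritic, never as punctuation.
    """
    chars = text.lower()  # Beta code letters are accepted in either case
    out = []
    i, n = 0, len(chars)
    while i < n:
        c = chars[i]
        if c == "*":
            # Capital letter: diacritics may stand between '*' and the letter.
            start = i
            i += 1
            marks = []
            while i < n and chars[i] in DIACRITICS:
                marks.append(DIACRITICS[chars[i]])
                i += 1
            if i < n and chars[i] in LETTERS:
                out.append(LETTERS[chars[i]].upper())
                out.extend(marks)
                i += 1
            else:
                # No letter follows: keep the raw characters unchanged.
                out.append(chars[start:i])
            continue
        if c == "s":
            # Sigma is final when it is not followed by another letter.
            is_final = i + 1 >= n or chars[i + 1] not in LETTERS
            out.append("ς" if is_final else "σ")
        elif c in LETTERS:
            out.append(LETTERS[c])
        elif c in DIACRITICS:
            out.append(DIACRITICS[c])
        else:
            out.append(c)  # spaces, punctuation, digits pass through
        i += 1
    # Compose base letters and combining marks into single code points.
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    for sample in ("lo/gos", "mh=nin", "*)axilleu/s"):
        print(sample, "->", beta_to_unicode(sample))
```

Going through combining characters keeps the tables small: the sketch needs no
entry for each precomposed letter, because normalization builds them. A real
converter would also have to tell punctuation from diacritics and cover the rest
of the Beta code repertoire.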