Other Applications
As already stated, all the projects I'm developing contribute to a bigger picture and
share a large portion of their code. The work behind a project like the inflectional
software is huge, but once it is completed we can expect it to apply to an
even broader scenario. For instance, besides the intrinsic scientific interest of
such a challenge, here are some possible applications of this software:
- in a digital edition it will be possible to automatically
perform a full lemmatization of the whole corpus: for instance, from forms like
laudābat, laudētis, laudātī erimus, etc. a program which can inflect ANY Latin
word will be able to recognize the unique base lemma laudō. Further, given my historical
approach, the benefit will also extend to relatively uncommon forms (archaic,
dialectal, vulgar, late, etc.). It is worth noting that this benefit is not limited
to the creation of 'static' indexes: in a truly digital edition it will be possible
to allow searches of a complexity never attained before with traditional tools.
Most readers probably use digital tools which allow searching for one or more
word forms: in the best cases such tools also offer a 'wildcard' system to
cope with the complexity of heavily inflectional languages like Greek and Latin,
so you can e.g. look for something like ros* to get all the word forms beginning
with ros-: this way you can find the word rosa even if inflected in forms like rosae,
rosam, rosarum, rosis, etc., but the downside is that the results will also include
unwanted forms (for instance rostra). Very few tools adopt smarter technologies
like regular expressions, but even then it's impossible to cope with all the complex
formal changes of a single word. With inflectional software, instead, the search
tool will be much smarter, and allow you to just look for all the inflected forms
of rosa. Going further, much more complex searches will be possible, like looking
for all the forms of rosa in the genitive or accusative plus all the infectum forms
of laudo, or all the nouns in the genitive followed by verbs in the infinitive, and
so forth: it's easy to imagine how far such searches can be pushed.
- specialized applications of the previous point can be imagined for particular
editions: for instance, from an epigraphical edition we could automatically build
a complete index of abbreviations with their expansions and even their frequency of
occurrence in any form: the software will be able to understand that all the (abbreviated)
forms like f(īlius), f(īlī), f(īliī), f(īliae), f(īliīs), f(īliārum) etc. are to
be traced back to the same lemma and thus belong to the same abbreviation F, and
will collect the frequency of all the inflected forms and sum them to get the total
frequency of the abbreviation.
- for texts with a difficult transmission, a full listing of all the inflected forms
of all the Greek or Latin words from their oldest forms up to the classical era
(and possibly beyond) could also prove useful as a practical tool, helping the
editor find the best candidate for filling lacunae or deciding on corrupt
passages.
- a more trivial (yet sometimes useful) application could also be offered in contexts
where spell-checking capabilities are required: of course, such applications require
a full list of all the inflected forms of a language.
- besides the intrinsic linguistic and philological interest of such a computer historical
grammar, its applications in other areas of the system are endless: just imagine
what level of refinement could be attained by the metrics analyzer component once
it gets access to the morphological and syntactical structure of the text being
analyzed...
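
To make the first point more concrete, here is a minimal sketch in Python of the difference between a wildcard search and a lemma-based search. Everything in it is an assumption made for illustration: the ANALYSES table is a tiny hand-written stand-in for the form list that the inflectional software would generate, it records only the grammatical case (number is ignored and the vocative is omitted), and the function names wildcard_search and lemma_search are hypothetical, not part of the actual software.

```python
from fnmatch import fnmatchcase

# Hypothetical toy index: word form -> possible (lemma, case) analyses.
# Hand-written for illustration; the inflectional software would generate
# such a table automatically. Number is ignored and the vocative omitted.
ANALYSES = {
    "rosa": [("rosa", "nominative"), ("rosa", "ablative")],
    "rosae": [("rosa", "genitive"), ("rosa", "dative"), ("rosa", "nominative")],
    "rosam": [("rosa", "accusative")],
    "rosarum": [("rosa", "genitive")],
    "rosis": [("rosa", "dative"), ("rosa", "ablative")],
    "rostra": [("rostrum", "nominative"), ("rostrum", "accusative")],
}


def wildcard_search(tokens, pattern):
    """Return every token matching a shell-style wildcard pattern."""
    return [t for t in tokens if fnmatchcase(t, pattern)]


def lemma_search(tokens, lemma, cases=None):
    """Return tokens that can be a form of `lemma`.

    If `cases` is given, keep only tokens with at least one analysis
    in one of those cases.
    """
    hits = []
    for token in tokens:
        for found_lemma, case in ANALYSES.get(token, []):
            if found_lemma == lemma and (cases is None or case in cases):
                hits.append(token)
                break  # one matching analysis is enough
    return hits


# Arbitrary token list for demonstration, not a real Latin sentence.
TOKENS = ["rosam", "rostra", "rosarum", "rosis", "rosae"]

# The wildcard also returns the unwanted form "rostra".
print(wildcard_search(TOKENS, "ros*"))
# The lemma search returns only forms of rosa.
print(lemma_search(TOKENS, "rosa"))
# Restricting by case keeps only forms that can be genitive or accusative.
print(lemma_search(TOKENS, "rosa", cases={"genitive", "accusative"}))
```

Note that an ambiguous form like rosae is kept by the case filter because one of its possible analyses is genitive; deciding which analysis applies in a given context would require the syntactical information mentioned in the last point.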