[libvoikko] error handling in the grammar checker

Francis Tyers ftyers at prompsit.com
Wed Oct 2 11:30:14 EEST 2013

On Wed, 2 Oct 2013 at 10:07 +0300, Sjur Moshagen wrote:
> On 1 Oct 2013 at 21:53, Sjur Moshagen <sjurnm at mac.com> wrote:
> >> <code>grm-wrong-case</code>
> >> <legacyCode>6</legacyCode>
> >> <descriptions>
> >> <shortDescription xml:lang="fi"/>
> >> <longDescription xml:lang="fi"/>
> >> <shortDescription xml:lang="en"/>
> >> <longDescription xml:lang="en"/>
> >> </descriptions>
> > 
> > Should the format be enhanced with variables or pointers to the error
> > string in the original text? Cf the error messages found in the MS
> > Office grammar checkers, which often refer to a word form (in context),
> > and give a textual explanation for why the given suggestion would be a
> > reasonable correction of the error. Hypothetical example:
> >
> > "Check the word forms ABC and DEF. ABC seems to be a subject in
> > plural, but the main verb DEF is in singular. Either change the
> > subject or the verb:
> >
> > ABC -> ABG
> > DEF -> DEH"
> >
> > It might be that this example is a bit elaborate, but I expect
> > that many users of the Sámi grammar checker will not have Sámi as
> > their first language, and giving a correct and explicit error
> > message would be very useful.
>
> Kaldera has something like the following for their grammar checkers:
> <e n="grm-wrong-case-hc">
>     <s>Wrong case</s>
>     <l>The word "%(err)" is in hypothetative, but the subjunctive expects a counterfactative.</l>
> </e>
> (s=short, l=long)
> Changing a few element and attribute names and adding support for
> multiple languages and legacy IDs, this could look like:
> <error id="grm-wrong-case-hc" legacyid="6">
>     <errtitle xml:lang="en">Wrong case</errtitle>
>     <errtext xml:lang="en">The word "%(err)" is in hypothetative, 
> but the subjunctive expects a counterfactative.</errtext>
> </error>
> (to me the short and long texts in Kaldera look more like a title and 
> a relatively short explanation - the long explanation would be much longer:) )
> Would the above structure be something to adopt? Is the %(err) notation ok?

How about:

<error id="grm-wrong-case-hc" legacyid="6">
    <title xml:lang="en">Wrong case</title>
    <title xml:lang="es">Caso equivocado</title>
    <description xml:lang="en">The word "%(err)" is in hypothetative,
but the subjunctive expects a counterfactative.</description>
</error>
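To make the multi-language idea concrete, here is a minimal Python sketch of how a client might read such an entry and pick the text for a given language. It is only an illustration, not anything libvoikko does: the helper name localized_text and the fall-back-to-English behaviour are assumptions, and the sample assumes the element is closed with </error>.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<error id="grm-wrong-case-hc" legacyid="6">
    <title xml:lang="en">Wrong case</title>
    <title xml:lang="es">Caso equivocado</title>
    <description xml:lang="en">The word "%(err)" is in hypothetative,
but the subjunctive expects a counterfactative.</description>
</error>"""


def localized_text(error, tag, lang, fallback="en"):
    """Return the text of <tag> in `lang`, else in `fallback`, else None.

    Hypothetical helper: the fallback policy is an assumption.
    """
    by_lang = {el.get(XML_LANG): el.text for el in error.findall(tag)}
    text = by_lang.get(lang, by_lang.get(fallback))
    # Collapse the line wrapping inside the element into single spaces.
    return " ".join(text.split()) if text is not None else None


error = ET.fromstring(SAMPLE)
print(error.get("id"), error.get("legacyid"))
print(localized_text(error, "title", "es"))
# No Spanish description exists, so this falls back to the English one.
print(localized_text(error, "description", "es"))
```

Keeping one element per language under a single <error> means a missing translation can degrade to another language instead of producing an empty message.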

The %(err) notation is fine, but we could also go with something like $1,
$2, ... if we wanted to be different and to support more than one word.
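
For comparison, a rough Python sketch of how the two notations could be expanded. Both function names are made up for illustration, and leaving unknown placeholders untouched is an assumed policy, not something decided in this thread.

```python
import re


def fill_named(template, values):
    """Expand %(name) placeholders; unknown names are left untouched."""
    return re.sub(
        r"%\((\w+)\)",
        lambda m: values.get(m.group(1), m.group(0)),
        template,
    )


def fill_positional(template, words):
    """Expand $1, $2, ... (1-based); out-of-range ones are left untouched."""
    def repl(m):
        i = int(m.group(1)) - 1
        return words[i] if 0 <= i < len(words) else m.group(0)
    return re.sub(r"\$(\d+)", repl, template)


print(fill_named('The word "%(err)" is in hypothetative.', {"err": "ABC"}))
print(fill_positional("Check the word forms $1 and $2.", ["ABC", "DEF"]))
```

Named placeholders can also cover more than one word (for instance %(subj) and %(verb), both hypothetical names), so the choice is mostly about readability for whoever writes the messages.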

