[libvoikko] error handling in the grammar checker

Francis Tyers ftyers at prompsit.com
Tue Oct 8 14:18:10 EEST 2013


On Tue, 01 Oct 2013 at 21:42 +0300, Harri Pitkänen wrote:
> On Tuesday 01 October 2013 16:24:17 Francis Tyers wrote:
> > I'd like to get ideas on how people think it would be best to deal with
> > abstracting the grammar/error.hpp grammar/error.cpp code. At the moment
> > there are some hardcoded messages for Finnish. For North Sámi, we have
> > (so far) around 140 error tags: http://pastebin.com/YKGbCxzx
> 
> So one thing to keep in mind is that currently the API for libvoikko has these 
> functions for dealing with the error codes:
> 
> 
> int voikkoGetGrammarErrorCode(const struct VoikkoGrammarError * error);
> 
> const char * voikko_error_message_cstr(int error_code, const char * language);
> 
> 
> Notably, the second function, which is used to get the human-readable error 
> description, does not receive any (even indirect) reference to VoikkoHandle. 
> Thus it cannot be used if the codes have different meaning for different 
> grammar checker implementations.
> 
> I'm open to extending this by adding new functions and deprecating these two. 
> We cannot remove them entirely but it is possible to maintain compatibility by 
> allocating a single code to represent all implementation specific errors. It 
> would need to have a fixed description such as "Language specific grammar 
> error".
> 
> > Rather than just including these in the C++ code, I think it might be
> > better to abstract out into a file which contains the error codes and
> > messages. This could be XML, tab-separated, or some other format.
> 
> The API above makes it essentially impossible to read the strings from a file 
> at runtime. We return a "const char *" and allow the caller to expect that the 
> pointer will always point to a valid string. The only way to do this without 
> breaking anything would be to read the string and just let it leak...
> 
> But we could use such an XML file during compilation to build a static data 
> structure to hold the error codes and descriptions. This would definitely be 
> an improvement over the current situation and would avoid changing the public 
> API for now. I'm sure that it needs to be changed at some point though.
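
A rough sketch of the compile-time table idea, assuming a build step
turns the XML file into a generated array. The struct, the function
name, the codes and the messages below are all hypothetical; only the
fallback string comes from the suggestion above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical entry type; a build-time generator would emit the table below.
struct ErrorDescription {
    int code;
    const char * language;  // language of the message text
    const char * message;
};

// Would be generated from the XML file during compilation.
// The codes and messages here are made up for illustration.
static const ErrorDescription ERROR_TABLE[] = {
    {1000, "en", "Example message A"},
    {1001, "en", "Example message B"},
};

// Fixed description for codes that have no entry in the table.
static const char * const GENERIC_MESSAGE = "Language specific grammar error";

// Returns a pointer into static storage, so it stays valid for the
// lifetime of the program and the caller never has to free it.
const char * lookupErrorMessage(int code, const char * language) {
    const std::size_t count = sizeof(ERROR_TABLE) / sizeof(ERROR_TABLE[0]);
    for (std::size_t i = 0; i < count; ++i) {
        if (ERROR_TABLE[i].code == code &&
            std::strcmp(ERROR_TABLE[i].language, language) == 0) {
            return ERROR_TABLE[i].message;
        }
    }
    return GENERIC_MESSAGE;
}
```

Because every string lives in static storage, this keeps the
"const char *" contract of the existing function without leaking.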

I've started the error file here; it will be expanded as the grammar
checker is developed.

https://victorio.uit.no/langtech/trunk/langs/sme/tools/grammarcheckers/errors.xml

However, I think that in the long term (e.g. before a first release), it
would be good to change the API to allow the strings to come from a
bundled XML file. Otherwise, we will essentially need to have either:

1) All of the errors for all of the languages bundled with libvoikko --
probably with unique codes, or
2) Some kind of shared library that voikko loads from the language
package.
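
As a rough illustration of what a handle-based lookup might look like,
here is a sketch in which the messages are read at runtime and owned by
a handle. None of these names exist in libvoikko: GrammarHandle,
loadErrorMessages and getErrorDescription are hypothetical, and the
input is assumed to be tab-separated "code<TAB>message" lines rather
than XML, to keep the example short.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical handle type; stands in for whatever state a language
// package would own. Not part of the libvoikko API.
struct GrammarHandle {
    std::map<int, std::string> messages;
};

// Reads "code<TAB>message" lines into the handle.
// Lines without a tab or without a numeric code are skipped.
void loadErrorMessages(GrammarHandle & handle, std::istream & in) {
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t tab = line.find('\t');
        if (tab == std::string::npos) {
            continue;
        }
        try {
            const int code = std::stoi(line.substr(0, tab));
            handle.messages[code] = line.substr(tab + 1);
        } catch (const std::exception &) {
            continue;  // code field was not a valid integer
        }
    }
}

// Returns nullptr for unknown codes. The returned pointer stays valid
// until the handle is destroyed or its messages are reloaded.
const char * getErrorDescription(const GrammarHandle & handle, int code) {
    const auto it = handle.messages.find(code);
    return it == handle.messages.end() ? nullptr : it->second.c_str();
}
```

Tying the string lifetime to the handle is what would make reading from
a bundled file possible without leaking, which the current
handle-less function cannot do.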

Fran



