[libvoikko] error handling in the grammar checker

Francis Tyers ftyers at prompsit.com
Wed Oct 2 11:54:21 EEST 2013

On Tue, 01 Oct 2013 at 21:42 +0300, Harri Pitkänen wrote:
> On Tuesday 01 October 2013 16:24:17 Francis Tyers wrote:
> > I'd like to get ideas on how people think it would be best to deal with
> > abstracting the grammar/error.hpp grammar/error.cpp code. At the moment
> > there are some hardcoded messages for Finnish. For North Sámi, we have
> > (so far) around 140 error tags: http://pastebin.com/YKGbCxzx
> So one thing to keep in mind is that currently the API for libvoikko has these 
> functions for dealing with the error codes:
> int voikkoGetGrammarErrorCode(const struct VoikkoGrammarError * error);
> const char * voikko_error_message_cstr(int error_code, const char * language);
> Notably, the second function, which is used to get the human-readable error
> description, does not receive any (even indirect) reference to VoikkoHandle.
> Thus it cannot be used if the codes have different meanings for different
> grammar checker implementations.
> I'm open to extending this by adding new functions and deprecating these two. 
> We cannot remove them entirely but it is possible to maintain compatibility by 
> allocating a single code to represent all implementation specific errors. It 
> would need to have a fixed description such as "Language specific grammar 
> error".

This sounds like a good idea. Perhaps something like '42' for the
"Language specific grammar error" ? :)
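
To make the idea concrete, here is a rough sketch of how a single
catch-all code could sit next to a newer, error-aware lookup. Everything
in the `sketch` namespace is hypothetical and is not part of libvoikko:
the struct, the function names and the value 42 (taken from the joke
above) are assumptions for illustration only.

```cpp
#include <string>

// Hypothetical sketch; none of these names exist in libvoikko.
namespace sketch {

// Placeholder value for the single code shared by all
// implementation specific errors.
const int LANGUAGE_SPECIFIC_ERROR = 42;

// Assumed shape of an error as a new-style API might see it.
struct GrammarError {
    int code;                 // a fixed code, or LANGUAGE_SPECIFIC_ERROR
    std::string description;  // description supplied by the language backend
};

// Old-style lookup: it only sees the code, so the best it can do for
// a language specific error is the fixed, generic description.
const char * legacyMessage(int code) {
    switch (code) {
        case LANGUAGE_SPECIFIC_ERROR:
            return "Language specific grammar error";
        default:
            return "Unknown error";
    }
}

// New-style lookup: it receives the error itself, so it can return the
// description carried by the error when the code is the catch-all one.
std::string describe(const GrammarError & error) {
    if (error.code == LANGUAGE_SPECIFIC_ERROR) {
        return error.description;
    }
    return legacyMessage(error.code);
}

} // namespace sketch
```

Old callers would keep getting a valid (if generic) string for the new
errors, while callers of the new function would see the real description.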

> > Rather than just including these in the C++ code, I think it might be
> > better to abstract out into a file which contains the error codes and
> > messages. This could be XML, tab-separated, or some other format.
> The API above makes it essentially impossible to read the strings from a file 
> at runtime. We return a "const char *" and allow the caller to expect that the 
> pointer will always point to a valid string. The only way to do this without 
> breaking anything would be to read the string and just let it leak...
> But we could use such XML file during compilation to build a static data 
> structure to hold the error codes and descriptions. This would definitely be 
> an improvement over the current situation and would avoid changing the public 
> API for now. I'm sure that it needs to be changed at some point though.

This sounds like a reasonable compromise. It will mean that the linguist
working on the rules does not have to learn to code in C++ :)
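
As a minimal sketch of what the build step might emit, assuming a
generator that turns the XML (or tab-separated) file into a static
table: the struct, the table name, the lookup function and all of the
codes and messages below are made-up placeholders, not existing
libvoikko code or real error descriptions.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch of generated code; not part of libvoikko.
namespace sketch {

struct ErrorEntry {
    int code;
    const char * language;  // language of the description
    const char * message;   // string literal, so it has static storage
};

// Imagine this array being written out by the build from the
// linguist's file. The entries are placeholders.
static const ErrorEntry ERROR_TABLE[] = {
    {1, "fi", "placeholder message one (fi)"},
    {1, "se", "placeholder message one (se)"},
    {2, "fi", "placeholder message two (fi)"},
};

static const std::size_t ERROR_TABLE_SIZE =
    sizeof(ERROR_TABLE) / sizeof(ERROR_TABLE[0]);

// Linear search over the static table. The returned pointer refers to
// a string literal, so it stays valid for the lifetime of the program
// and nothing has to be allocated or leaked.
const char * lookupMessage(int code, const char * language) {
    for (std::size_t i = 0; i < ERROR_TABLE_SIZE; ++i) {
        if (ERROR_TABLE[i].code == code &&
            std::strcmp(ERROR_TABLE[i].language, language) == 0) {
            return ERROR_TABLE[i].message;
        }
    }
    return "Unknown error";
}

} // namespace sketch
```

Because every string lives in the binary, the "const char *" contract of
the current API would still hold, and the rule writer would only ever
edit the data file.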

