[libvoikko] hfst-ospell API

Harri Pitkänen hatapitk at iki.fi
Mon Aug 9 22:31:17 EEST 2010


On Monday 09 August 2010, Sam Hardwick wrote:
> This email is strictly about the spelling library for HFST's
> optimized-lookup transducer format, so if you're not interested in it,
> read no further. By the way, I thought I wrote about this to the list in
> July, but later noticed that it isn't in the archive - is mail only
> accepted from subscribers?

Yes, unfortunately we had to set up the mailing list this way to block spam. 
If I some day have time to set up a good spam filter, so that moderating mail 
from non-subscribers becomes possible, we may change this. For more information 
about our lists, see

  http://voikko.sourceforge.net/mail.html

> The library is not quite ready to be released yet, but if someone would
> like to give input on the API in particular, that would be very welcome.

Here are some random thoughts that came to mind while reading the header and 
example code:

- ospell.h contains "#include <getopt.h>" which is really user interface code. 
It seems that the include could just be removed and added to main.h instead.

- Which headers are supposed to form the stable API that external applications 
would use? I guess both are meant to become stable: hfst-ol.h for all 
applications that need to use optimized-lookup transducers, and ospell.h 
just for spell checkers?

- Generally the headers look good: they keep the namespace clean, etc.

- Many methods are defined in the header file hfst-ol.h. Once the header is 
released you may not be able to change these definitions without making the 
library incompatible with earlier versions. Defining methods in the header 
may be sensible for performance-sensitive parts (to allow inlining), but at 
least the constructor of TransducerHeader is not likely to be called very 
often. I think many of these could be defined in hfst-ol.cc. Alternatively 
the library could be made header-only and hfst-ol.cc removed completely.

- Declaring some data members and methods as "const" could improve performance 
in some places. For example FlagDiacriticOperation seems immutable but there 
are no hints to help the compiler to notice this.

- Instead of working around the compiler warning about not using the return 
value from fread() it would be better to check it to see if the file was 
actually read and fail cleanly if not.
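
A minimal sketch of the last two points, under stated assumptions: 
ImmutableOperation is a hypothetical stand-in for a class like 
FlagDiacriticOperation (the real class in hfst-ol.h may look different), and 
read_ushort_checked is a hypothetical helper that is not part of hfst-ospell. 
It only illustrates const members and accessors, and checking the return value 
of fread() instead of discarding it.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>

// Hypothetical immutable record: both members are const and the accessors
// are const methods, so the compiler knows the object never changes after
// construction.
class ImmutableOperation {
public:
    ImmutableOperation(int feature, int value)
        : feature_(feature), value_(value) {}

    int feature() const { return feature_; }
    int value() const { return value_; }

private:
    const int feature_;
    const int value_;
};

// Hypothetical helper: reads exactly one unsigned short from the stream.
// fread() returns the number of items actually read, so anything other
// than 1 means a short read or an I/O error, which is reported by throwing.
unsigned short read_ushort_checked(std::FILE* f) {
    assert(f != nullptr);  // precondition: caller passes an open stream
    unsigned short v = 0;
    if (std::fread(&v, sizeof v, 1, f) != 1) {
        throw std::runtime_error("unexpected end of file or read error");
    }
    return v;
}
```

One trade-off to keep in mind: const data members also disable copy 
assignment for the class, so objects like this cannot be overwritten in place 
once they are stored somewhere. If that is a problem, const accessors on 
non-const private members give most of the same benefit.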


Are there plans to change the HFST backend code in libvoikko to use this 
library in the near future? It would be interesting to see the performance 
difference.

Harri


