[libvoikko] hfst-ospell API
Harri Pitkänen
hatapitk at iki.fi
Mon Aug 9 22:31:17 EEST 2010
On Monday 09 August 2010, Sam Hardwick wrote:
> This email is strictly about the spelling library for HFST's
> optimized-lookup transducer format, so if you're not interested in it,
> read no further. By the way, I thought I wrote about this to the list in
> July, but later noticed that it isn't in the archive - is mail only
> accepted from subscribers?
Yes, unfortunately we had to configure the mailing list this way to block spam.
If I some day have time to set up a good spam filter, so that moderating mail
from non-subscribers becomes possible, we may change this. For more information
about our lists, see
http://voikko.sourceforge.net/mail.html
> The library is not quite ready to be released yet, but if someone would
> like to give input on the API in particular, that would be very welcome.
Here are some random thoughts that came to my mind while reading the header and
example code:
- ospell.h contains "#include <getopt.h>" which is really user interface code.
It seems that the include could just be removed and added to main.h instead.
- Which headers are supposed to form the stable API that external applications
would use? I guess both are supposed to become stable: hfst-ol.h is for all
applications that need to use optimized-lookup transducers and ospell.h is
just for spell checkers?
- Generally the headers look good, namespace clean etc.
- Many methods are defined in the header file hfst-ol.h. Once the header is
released you may not be able to change the definitions without making the
library backwards incompatible with earlier versions. It may be sensible to do
this for performance sensitive parts (to allow inlining the methods) but at
least the constructor of TransducerHeader is not likely to be called very
often. I think many of these could be defined in hfst-ol.cc. Alternatively
the library could be made header-only and hfst-ol.cc removed completely.
- Declaring some data members and methods as "const" could improve performance
in some places. For example FlagDiacriticOperation seems immutable but there
are no hints to help the compiler to notice this.
- Instead of working around the compiler warning about not using the return
value from fread() it would be better to check it to see if the file was
actually read and fail cleanly if not.
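To illustrate the last point about fread(), here is a minimal sketch of the
kind of check I mean. The helper name read_exact and the error message are
made up for this example and are not part of hfst-ol.h or ospell.h; the real
code might prefer a different error reporting mechanism than exceptions.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical helper (not from hfst-ol.h): read exactly `size` bytes
// from `f` into `dest`. A short read, caused by a truncated file or an
// I/O error, raises an exception instead of being silently ignored.
inline void read_exact(void* dest, std::size_t size, std::FILE* f)
{
    // With an element size of 1, fread() returns the number of bytes read.
    if (std::fread(dest, 1, size, f) != size) {
        throw std::runtime_error("unexpected end of transducer file");
    }
}
```

A caller reading a header field would then write something like
read_exact(&value, sizeof value, f) and let the exception propagate to
whatever code opened the file.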
Are there plans to change the HFST backend code in libvoikko to use this
library in the near future? It would be interesting to see the performance
difference.
Harri