[libvoikko] hfst-ospell API

Sam Hardwick sam.hardwick at gmail.com
Tue Aug 10 20:36:14 EEST 2010

Harri Pitkänen wrote:
> - ospell.h contains "#include <getopt.h>" which is really user interface code. 
> It seems that the include could just be removed and added to main.h instead.

Right, good catch, I moved it to the demo utility.

> - Which headers are supposed to form the stable API that external applications 
> would use? I guess both are supposed to become stable, hfst-ol.h is for all 
> applications that need to use optimized-lookup transducers and ospell.h is 
> just for spell checkers?

Actually I'm slightly on the fence about this. The spelling code works
completely differently from the lookup code, and while the spelling
"way" can also do lookup, it's not (yet?) super fast for it. It's
possible that the hfst_ol namespace will have separate transducer
types for those different use cases (less convenient for users), a
single transducer with different functions and data structures for
different uses (messy, tricky), or that I will manage to make one way of
doing things the best for all uses.

> - Many methods are defined in the header file hfst-ol.h. Once the header is 
> released you may not be able to change the definitions without making the 
> library backwards incompatible with earlier versions. It may be sensible to do 
> this for performance sensitive parts (to allow inlining the methods) but at 
> least the constructor of TransducerHeader is not likely to be called very 
> often. I think many of these could be defined in hfst-ol.cc. Alternatively 
> the library could be made header-only and hfst-ol.cc removed completely.

Yeah, it would probably be best to move things to .cc - the current
division of code doesn't represent any particular design wisdom, it's
more historical / accidental / coding style.
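
As a rough sketch of that split (everything here is hypothetical: the
namespace, the members and the constructor arguments are made up for
illustration and do not mirror the real TransducerHeader), the header
would keep only declarations plus any tiny accessors worth inlining,
and the bodies would move to the .cc file:

```cpp
#include <cstddef>

// ---- would live in hfst-ol.h (hypothetical, simplified) ----
namespace hfst_ol_sketch {

class TransducerHeader {
public:
    // Declaration only: the body is compiled into the library, so it can
    // change later without client code having inlined the old version
    // (as long as the class layout stays the same).
    TransducerHeader(std::size_t symbols, std::size_t states);

    // Small accessor that might sit on a hot path, deliberately left inline.
    std::size_t symbol_count() const { return symbols_; }

    // Rarely called, so declared here and defined out of line.
    std::size_t state_count() const;

private:
    std::size_t symbols_;
    std::size_t states_;
};

} // namespace hfst_ol_sketch

// ---- would live in hfst-ol.cc (hypothetical, simplified) ----
namespace hfst_ol_sketch {

TransducerHeader::TransducerHeader(std::size_t symbols, std::size_t states)
    : symbols_(symbols), states_(states)
{
}

std::size_t TransducerHeader::state_count() const
{
    return states_;
}

} // namespace hfst_ol_sketch
```

Which accessors stay inline is a judgment call; the point is only that
anything left in the header becomes part of what clients compile in.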

> - Declaring some data members and methods as "const" could improve performance 
> in some places. For example FlagDiacriticOperation seems immutable but there 
> are no hints to help the compiler to notice this.

Right, I haven't developed the habit of sprinkling const in useful
places, but that's definitely a good idea and I'll look into it.
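
For illustration, a const-correct version could look roughly like the
sketch below. This is a hypothetical stand-in, not the actual
FlagDiacriticOperation from hfst-ol.h: the enum values, member names and
accessor names are assumptions. Whether it changes performance depends
on the compiler; the certain benefit is that accidental writes stop
compiling.

```cpp
#include <cassert>

// Hypothetical operator set, assumed for this sketch only.
enum FlagOperator { FLAG_P, FLAG_N, FLAG_R, FLAG_D, FLAG_C, FLAG_U };

// Hypothetical stand-in for an immutable flag diacritic operation.
class FlagDiacriticOperation {
public:
    FlagDiacriticOperation(FlagOperator op, short feat, short val)
        : operation_(op), feature_(feat), value_(val)
    {
    }

    // const member functions: callable on const objects and through
    // const references, and they promise not to modify the object.
    FlagOperator operation() const { return operation_; }
    short feature() const { return feature_; }
    short value() const { return value_; }

private:
    // Members are private and have no setters, so the object is
    // effectively immutable after construction. Marking the members
    // themselves const would also work, but it disables assignment,
    // which gets in the way of some std::vector operations.
    FlagOperator operation_;
    short feature_;
    short value_;
};
```

A table of such operations can then be handed around as
"const FlagDiacriticOperation&" without any risk of it being modified.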

> - Instead of working around the compiler warning about not using the return 
> value from fread() it would be better to check it to see if the file was 
> actually read and fail cleanly if not.

Very true.
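
A minimal sketch of what checking could look like, assuming a small
wrapper is acceptable (read_or_throw is a hypothetical helper name, not
something that exists in hfst-ospell, and throwing std::runtime_error is
just one possible way to fail cleanly):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: read exactly `count` items of `size` bytes each
// into `dest`, or throw if the file ends early or a read error occurs.
inline void read_or_throw(void* dest, std::size_t size, std::size_t count,
                          std::FILE* f, const std::string& what)
{
    assert(dest != nullptr && f != nullptr);
    // fread returns the number of complete items read; anything less
    // than `count` means a truncated file or an I/O error.
    if (std::fread(dest, size, count, f) != count) {
        throw std::runtime_error("short read while loading " + what);
    }
}
```

The loader would then call read_or_throw wherever it currently calls
fread, and the demo utility could catch the exception and print a
message instead of continuing with garbage data.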

> Are there plans to change the HFST backend code in libvoikko to use this 
> library in the near future? It would be interesting to see the performance 
> difference.

Indeed, but keep in mind this is a bit prerelease...

Sam Hardwick
