[libvoikko] HFST speller lexicon spec - draft 0.2

Flammie Pirinen flammie at iki.fi
Sat Nov 6 20:50:02 EET 2010

2010-10-13, Sjur Moshagen wrote:

> It would also be nice to know whether there are any concerns for
> adding support for this specification to hfst-ospell.

At the moment I think the spec is good to go and I only have practical
engineering issues at hand. 

First, I am not quite sure where to find a free zip library, even for
the subset of features now specified; zlib, for example, claims support
for the compression algorithm but not for the container format[1].
Ideally there would be some small library available on all systems for
this, so that hfst-ospell does not gain any more dependencies.
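
To illustrate the algorithm/container split mentioned above, here is a
minimal Python sketch (not hfst-ospell code). It assumes a one-entry
archive written without a data descriptor; the entry name "speller.bin"
and both helper functions are made up for the example. The point is that
the fixed 30-byte local file header has to be parsed by hand, and only
after that can zlib inflate the raw deflate stream.

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte zip local file header, little-endian.
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
LOCAL_HEADER_SIGNATURE = 0x04034B50  # the bytes "PK\x03\x04"


def make_archive(name, payload):
    """Build a one-entry deflate-compressed zip in memory (test fixture)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def read_first_entry(archive):
    """Return (name, data) of the first entry using only struct and zlib."""
    (sig, _version, flags, method, _mtime, _mdate,
     crc, csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(archive, 0)
    if sig != LOCAL_HEADER_SIGNATURE:
        raise ValueError("not a zip local file header")
    if flags & 0x08:
        # Sizes and CRC live in a trailing data descriptor; not handled here.
        raise ValueError("data descriptors are outside this sketch")

    name_start = LOCAL_HEADER.size
    raw_name = archive[name_start:name_start + name_len]
    name = raw_name.decode("utf-8" if flags & 0x800 else "cp437")

    body_start = name_start + name_len + extra_len
    body = archive[body_start:body_start + csize]
    if method == 8:
        # Negative wbits selects a raw deflate stream (no zlib header).
        data = zlib.decompressobj(-15).decompress(body)
    elif method == 0:
        data = body  # stored, i.e. not compressed
    else:
        raise ValueError("unsupported compression method %d" % method)

    if zlib.crc32(data) != crc:
        raise ValueError("CRC mismatch")
    return name, data
```

A real reader would also have to walk the central directory at the end
of the file to find entries by name, which is the part zlib itself does
not provide.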

Another practical problem: over the last few weeks I have been
experimenting with different morphologies, and an increase in size quickly
leads to a decrease in processing efficiency; morphologies beyond 100
megabytes of transducer size already take up to a minute to load (in OO.o;
15 seconds with hfst-ospell and 30 with voikkospell on the same system).
If zipping the files increases this time, it is a serious problem,
as end users will not easily tolerate OpenOffice freezing at startup
(current ooovoikko will do just that, with no clues to user of what is

All in all, what I would aim for is to publish my current hfst-ospell
patch against voikko, along with the current hfst-ospell, and then
start developing the hfst-ospell lib and voikko's hfst part to
match the lexicon spec.

By the way, on the voikko side of things, can I expect that spellers can
be dropped into $voikkodir/3/*.zhfst?
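
If that layout holds, discovery on the client side could be as simple as
the following Python sketch. This is an assumption about how a loader
might look, not voikko's actual lookup code; the function name and the
"3" default are only illustrative.

```python
from pathlib import Path


def find_zhfst_spellers(voikkodir, version="3"):
    """List *.zhfst files directly under <voikkodir>/<version>, sorted by name."""
    base = Path(voikkodir) / version
    if not base.is_dir():
        return []
    # Only regular files count; directories named *.zhfst are skipped.
    return sorted(p for p in base.glob("*.zhfst") if p.is_file())
```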

Flammie, computer scientist bachelor, linguist master, free software
Finnish localiser, and more! <http://www.iki.fi/flammie/>
