[libvoikko] Sámi/HFST

Harri Pitkänen hatapitk at iki.fi
Mon Jun 7 10:12:58 EEST 2010


On Monday 07 June 2010, Andris Pavenis wrote:
> You were near. A bit larger swap partition recommended...
> 
> Tried on Linux x86_64 (Fedora 13): memory use of the hfst-compose-intersect
> process grew to 2110 MB (no problem at all with 8 GB of RAM present). I'm
> not sure I did everything correctly, though.

Indeed, after shutting down X there was enough memory for
hfst-compose-intersect to complete. Unfortunately the resulting transducer was
still in the wrong format.

Looking more closely, it seems that I should have run "make hwfst TARGET=sme"
instead of "make hfst ...". That seems to require even more memory, though. The
next thing I tried was to modify HfstSpeller to use unweighted transducers
instead of weighted ones by replacing all references to the HWFST namespace
with HFST and recompiling libvoikko. That solved the problem and I was able to
do some basic spell checking:

$ /home/harri/apps/bin/voikkospell -d fi-x-sme
gabba
C: gabba
reindeer
W: reindeer
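
For scripting checks like the one above, the result lines can be parsed
directly. This is only a minimal Python sketch: it assumes nothing beyond the
"C: word" / "W: word" line format seen in this session, and the function name
parse_voikkospell is hypothetical, not part of libvoikko.

```python
def parse_voikkospell(output):
    """Map each checked word to True (accepted) or False (rejected).

    Assumes the "C: word" / "W: word" line format seen in the session
    above. Any other line (for example the echoed input) is ignored.
    """
    results = {}
    for line in output.splitlines():
        status, sep, word = line.partition(": ")
        if sep and status in ("C", "W"):
            results[word.strip()] = (status == "C")
    return results


# Result lines copied from the session above.
sample = "C: gabba\nW: reindeer\n"
print(parse_voikkospell(sample))
```

With the two result lines from the session, the returned mapping marks
"gabba" as accepted and "reindeer" as rejected.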

In voikko-fi_FI.pro I have the following:

info: Voikko-Dictionary-Format: 2
info: Language-Code: fi_FI
info: Language-Variant: sme
info: Description: Kokeellinen pohjoissaamen morfologia
info: Morphology-Backend: null
info: Speller-Backend: hfst
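
To double-check which backends a dictionary declares, the "info:" lines can be
read into a dictionary. The following Python sketch is an illustration only:
it assumes each field has the form "info: Key: value" as in the listing above,
it is not how libvoikko itself reads the file, and read_info_fields is a
hypothetical helper name.

```python
def read_info_fields(text):
    """Collect "info: Key: value" lines into a dict.

    Assumes the field layout shown in the listing above; lines that do
    not start with "info: " are skipped.
    """
    prefix = "info: "
    fields = {}
    for line in text.splitlines():
        if not line.startswith(prefix):
            continue
        key, sep, value = line[len(prefix):].partition(": ")
        if sep:
            fields[key] = value.strip()
    return fields


# Contents copied from the listing above.
pro_text = """\
info: Voikko-Dictionary-Format: 2
info: Language-Code: fi_FI
info: Language-Variant: sme
info: Description: Kokeellinen pohjoissaamen morfologia
info: Morphology-Backend: null
info: Speller-Backend: hfst
"""

fields = read_info_fields(pro_text)
print(fields["Speller-Backend"], fields["Morphology-Backend"])
```

For the listing above this reports the hfst speller backend together with the
null morphology backend, and the variant "sme" matches the "fi-x-sme" argument
given to voikkospell.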

Harri


