[libvoikko] Sámi/HFST
Harri Pitkänen
hatapitk at iki.fi
Mon Jun 7 10:12:58 EEST 2010
On Monday 07 June 2010, Andris Pavenis wrote:
> You were near. A bit larger swap partition recommended...
>
> Tried on Linux x86_64 (Fedora 13): memory amount for hfst-compose-intersect
> process grow up to 2110Mb (no problem at all when 8Gb RAM present). I'm
> not sure I did all correctly though.
Indeed, after shutting down X there was enough memory to allow
hfst-compose-intersect to complete. Unfortunately the resulting transducer was
still in the wrong format.
Looking more closely, it seems that I should have done "make hwfst TARGET=sme"
instead of "make hfst ...". That seems to require even more memory though. The
next thing I tried was to modify HfstSpeller to use unweighted transducers
instead of weighted ones by replacing all references to the HWFST namespace
with HFST and recompiling libvoikko. That solved the problem and I was able to
do some basic spell checking:
$ /home/harri/apps/bin/voikkospell -d fi-x-sme
gabba
C: gabba
reindeer
W: reindeer
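For anyone who wants to script a check like the one above, here is a minimal Python sketch that reads such output. It assumes only the "C: word" (accepted) and "W: word" (rejected) line format visible in the session. The function name is made up for illustration, and any other lines voikkospell may print are simply skipped.

```python
def parse_voikkospell_output(text):
    """Map each checked word to True ("C:" line) or False ("W:" line)."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("C: "):
            results[line[3:]] = True
        elif line.startswith("W: "):
            results[line[3:]] = False
        # Anything else (echoed input, blank lines) is ignored.
    return results


if __name__ == "__main__":
    # Transcript copied from the session above.
    session = "gabba\nC: gabba\nreindeer\nW: reindeer\n"
    print(parse_voikkospell_output(session))
```

Lines without a "C: " or "W: " prefix are dropped, so the echoed input words do not end up in the result.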
In voikko-fi_FI.pro I have the following:
info: Voikko-Dictionary-Format: 2
info: Language-Code: fi_FI
info: Language-Variant: sme
info: Description: Kokeellinen pohjoissaamen morfologia
info: Morphology-Backend: null
info: Speller-Backend: hfst
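
The info lines above have a simple "info: Key: Value" shape, so they are easy to read back for a quick sanity check of which backends a dictionary selects. This is only a sketch based on the lines shown here, not libvoikko's own parser. The function name is hypothetical, and it assumes every relevant line starts with "info: ".

```python
def parse_pro_info(text):
    """Collect 'info: Key: Value' lines into a dict; other lines are skipped."""
    info = {}
    for line in text.splitlines():
        if not line.startswith("info: "):
            continue
        # Split on the first ": " only, so the value may itself contain colons.
        key, sep, value = line[len("info: "):].partition(": ")
        if sep:
            info[key] = value
    return info


if __name__ == "__main__":
    # Subset of the lines quoted above.
    sample = (
        "info: Voikko-Dictionary-Format: 2\n"
        "info: Language-Variant: sme\n"
        "info: Morphology-Backend: null\n"
        "info: Speller-Backend: hfst\n"
    )
    info = parse_pro_info(sample)
    print(info["Speller-Backend"], info["Morphology-Backend"])
```
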
Harri