[libvoikko] Weighted VFST transducers
Harri Pitkänen
hatapitk at iki.fi
Thu Jul 23 20:09:07 EEST 2015
Hi!
Libvoikko has supported weighted VFST transducers for a few weeks. This
is useful mostly for providing spelling suggestions: the previous
version of our "plain" (non-Finnish) VFST speller backend did not
support them at all.
You can convert an HFST speller to VFST format using the following
commands:
hfst-fst2txt acceptor.default.hfst | sort -n | voikkovfstc -w log -o spl.vfst
hfst-fst2txt errmodel.default.hfst | sort -n | voikkovfstc -w log -o err.vfst
Place these two files (spl.vfst and err.vfst) under
~/.voikko/2/mor-something/
You will also need voikko-fi_FI.pro with the following content:
info: Voikko-Dictionary-Format: 2
info: Language-Code: xy
info: Language-Variant: something
info: Description: Some description
info: Morphology-Backend: null
info: Speller-Backend: vfst
info: Suggestion-Backend: vfst
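
For convenience, the directory setup described above could be scripted
roughly as follows. This is only a sketch: DICT_ROOT is a hypothetical
variable introduced here so that the target location can be overridden,
and the script assumes that spl.vfst and err.vfst are in the current
directory (it only warns if they are missing).

```shell
#!/bin/sh
# Sketch only. DICT_ROOT is a hypothetical override introduced for this
# example; by default the location named in this mail is used.
DICT_ROOT="${DICT_ROOT:-$HOME/.voikko}"
DICT_DIR="$DICT_ROOT/2/mor-something"

mkdir -p "$DICT_DIR"

# Copy the converted transducers if they are in the current directory.
for f in spl.vfst err.vfst; do
    if [ -f "$f" ]; then
        cp "$f" "$DICT_DIR/"
    else
        echo "warning: $f not found, copy it to $DICT_DIR manually" >&2
    fi
done

# Write the dictionary description file with the content listed above.
cat > "$DICT_DIR/voikko-fi_FI.pro" <<'EOF'
info: Voikko-Dictionary-Format: 2
info: Language-Code: xy
info: Language-Variant: something
info: Description: Some description
info: Morphology-Backend: null
info: Speller-Backend: vfst
info: Suggestion-Backend: vfst
EOF
```

Adjust the language code, variant and description to match your own
language before using the result.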
This should be all that is needed. Please note that due to the different
internal representation of weights in the HFST and VFST formats, you may
not get your spelling suggestions in exactly the same order. The
difference did not appear significant to me when I tested it.
Morphological analysis with weighted transducers is also possible:
create an analyzer, save it as mor.vfst, and set Morphology-Backend to
vfst instead of null.
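
The backend switch can of course be made in any text editor. If you
prefer to script it, something like the following should work. The
function name enable_vfst_morphology is made up for this example and is
not part of libvoikko; it only rewrites the Morphology-Backend line of
the .pro file given as its argument.

```shell
# Hypothetical helper (not part of libvoikko): change Morphology-Backend
# from null to vfst in the .pro file given as the first argument.
enable_vfst_morphology() {
    pro_file="$1"
    tmp_file="$pro_file.tmp"
    # Write to a temporary file first; replace the original only if sed succeeds.
    sed 's/^info: Morphology-Backend: null$/info: Morphology-Backend: vfst/' \
        "$pro_file" > "$tmp_file" && mv "$tmp_file" "$pro_file"
}

# Example call, assuming the dictionary location used earlier in this mail:
# enable_vfst_morphology "$HOME/.voikko/2/mor-something/voikko-fi_FI.pro"
```

Remember that mor.vfst has to be placed in the same directory as
spl.vfst and err.vfst.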
Thanks to UiT The Arctic University of Norway for sponsoring the work
on these new features.
Harri