[libvoikko] Weighted VFST transducers

Harri Pitkänen hatapitk at iki.fi
Thu Jul 23 20:09:07 EEST 2015


Hi!

Libvoikko has supported weighted VFST transducers for a few weeks. This
is useful mostly for providing spelling suggestions; the previous
version of our "plain" (non-Finnish) VFST speller backend did not
support these at all.

You can convert an HFST speller to VFST format using the following
commands:

  hfst-fst2txt acceptor.default.hfst | sort -n | voikkovfstc -w log -o spl.vfst
  hfst-fst2txt errmodel.default.hfst | sort -n | voikkovfstc -w log -o err.vfst

Place these two files (spl.vfst and err.vfst) under
~/.voikko/2/mor-something/.
You will also need a file named voikko-fi_FI.pro with the following
content:

  info: Voikko-Dictionary-Format: 2
  info: Language-Code: xy
  info: Language-Variant: something
  info: Description: Some description
  info: Morphology-Backend: null
  info: Speller-Backend: vfst
  info: Suggestion-Backend: vfst
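
The setup step above could be scripted roughly as follows. This is only
a sketch: write_voikko_pro is a hypothetical helper name, and it assumes
that voikko-fi_FI.pro belongs in the same directory as the .vfst files.
The demo writes into a scratch directory rather than ~/.voikko, so it
does not touch an existing installation.

```shell
#!/bin/sh
# Hypothetical helper: create a dictionary directory and write the
# voikko-fi_FI.pro file with the fields listed in this message.
# Assumption: the .pro file sits next to spl.vfst and err.vfst.
write_voikko_pro() {
    dict_dir=$1    # e.g. "$HOME/.voikko/2/mor-something"
    lang_code=$2   # e.g. xy
    variant=$3     # e.g. something
    mkdir -p "$dict_dir" || return 1
    cat > "$dict_dir/voikko-fi_FI.pro" <<EOF
info: Voikko-Dictionary-Format: 2
info: Language-Code: $lang_code
info: Language-Variant: $variant
info: Description: Some description
info: Morphology-Backend: null
info: Speller-Backend: vfst
info: Suggestion-Backend: vfst
EOF
}

# Demo: build the layout under a scratch directory and show the result.
demo_root=$(mktemp -d)
write_voikko_pro "$demo_root/2/mor-something" xy something
cat "$demo_root/2/mor-something/voikko-fi_FI.pro"
```

For a real installation you would pass "$HOME/.voikko/2/mor-something"
as the first argument and copy spl.vfst and err.vfst into that
directory afterwards.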

This should be all that is needed. Please note that because the HFST
and VFST formats represent weights differently internally, you may not
get your spelling suggestions in exactly the same order. The difference
did not appear significant to me when I tested it.

Morphological analysis with weighted transducers is also possible:
create an analyzer, save it as mor.vfst, and set Morphology-Backend to
vfst instead of null.
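
The backend switch can be done by hand in an editor, or with something
like the sketch below. enable_vfst_morphology is a hypothetical helper
name, and the demo works on a shortened sample file in a scratch
directory, not on a real dictionary.

```shell
#!/bin/sh
# Hypothetical helper: change "Morphology-Backend: null" to
# "Morphology-Backend: vfst" in the given .pro file.
enable_vfst_morphology() {
    pro_file=$1
    tmp_file=$(mktemp) || return 1
    sed 's/^info: Morphology-Backend: null$/info: Morphology-Backend: vfst/' \
        "$pro_file" > "$tmp_file" || return 1
    # Write back through the original file so its permissions are kept.
    cat "$tmp_file" > "$pro_file" && rm -f "$tmp_file"
}

# Demo on a shortened sample file.
demo_dir=$(mktemp -d)
printf '%s\n' \
    'info: Voikko-Dictionary-Format: 2' \
    'info: Morphology-Backend: null' \
    'info: Speller-Backend: vfst' > "$demo_dir/voikko-fi_FI.pro"
enable_vfst_morphology "$demo_dir/voikko-fi_FI.pro"
grep 'Morphology-Backend' "$demo_dir/voikko-fi_FI.pro"
```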

Thanks to UiT The Arctic University of Norway for sponsoring the work 
on these new features.

Harri

