[libvoikko] Sámi/HFST

Flammie Pirinen flammie at iki.fi
Thu Jun 24 15:08:42 EEST 2010


2010-06-08, Harri Pitkänen wrote:

> This means we might have a reasonably good speller now. How were you
> planning to implement the spelling suggestions? If there is a
> transducer available that can produce the suggestions, it would be
> nice if it could be made available in binary form in case building it
> requires more than 2 GB of memory.

If you are still missing the error models: I converted the error model
in the hunspell se_FI.aff file from the Ubuntu distribution into a
transducer and put it at
<http://www.helsinki.fi/~tapirine/tmp/se_FI.err.hfst>.
It is an automatically converted version that allows one error of a
kind specified in the KEY, REP or TRY directives of the se_FI.aff file
in the middle of otherwise correctly spelled runs. Since it is a
finite-state automaton, you can extend it pretty much any way you like;
e.g. a model accepting one to three typos or replacements is made by
applying hfst-repeat --from=1 --to=3.
The source code for this error model is there as well:
<http://www.helsinki.fi/~tapirine/se_FI.err.lexc>; it is quite simple,
as you can tell.
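
To illustrate the idea of a one-error model and its repetition, here is
a minimal Python sketch. It is not a conversion of the actual
se_FI.aff file and does not use HFST at all; it only mimics the
behaviour on plain strings. The function names, the reading of TRY as
"characters that may be inserted or substituted" and of REP as
"substring replacement pairs", and the deletion step are all
assumptions made for the example. Applying the one-error step up to n
times is roughly analogous to what hfst-repeat --from=1 --to=3 does to
the transducer.

```python
def one_error(word, try_chars, rep_pairs):
    """Return every string reachable from `word` by exactly one edit.

    Assumed edit types (hypothetical, loosely modelled on hunspell
    directives): insert or substitute a character from `try_chars`,
    delete one character, or apply one (source, target) pair from
    `rep_pairs` at one position.
    """
    out = set()
    # Insert a TRY-like character at every position.
    for i in range(len(word) + 1):
        for c in try_chars:
            out.add(word[:i] + c + word[i:])
    for i in range(len(word)):
        # Delete the character at position i.
        out.add(word[:i] + word[i + 1:])
        # Substitute the character at position i.
        for c in try_chars:
            out.add(word[:i] + c + word[i + 1:])
    # Apply each REP-like pair at each place where its source occurs.
    for src, dst in rep_pairs:
        start = word.find(src)
        while start != -1:
            out.add(word[:start] + dst + word[start + len(src):])
            start = word.find(src, start + 1)
    # An edit that leaves the word unchanged is not an error.
    out.discard(word)
    return out


def up_to_n_errors(word, try_chars, rep_pairs, n=3):
    """Return every string reachable by 1..n successive single edits."""
    reached = set()
    frontier = {word}
    for _ in range(n):
        frontier = {v for w in frontier
                    for v in one_error(w, try_chars, rep_pairs)}
        reached |= frontier
    return reached
```

For a speller you would run this in the other direction: take the
misspelled input, generate its neighbours, and keep only those the
lexicon accepts. With transducers that filtering is a composition of
the error model with the lexicon, which scales far better than
enumerating strings as above.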

-- 
Flammie, computer scientist bachelor, linguist master, free software
Finnish localiser, and more! <http://www.iki.fi/flammie/>


