[libvoikko] Possible encoding bug in latest voikkospell

Sjur Moshagen sjurnm at mac.com
Thu Sep 25 11:03:54 EEST 2014


$ voikkospell -s -d se
giellla
W: giellla
S: giella
S: giellal
S: giellala
S: giellula
S: giell?


hfst-ospell:

$ echo giellla | hfst-ospell -S /usr/local/lib/voikko/3/se.zhfst
"giellla" is NOT in the lexicon:
Corrections for "giellla":
giella    1.000000
giellal    1.100000
giellala    1.100000
giellula    1.100000
giellá    1.950195

env:
LC_ALL=no_NO.UTF-8

$ echo giellla | voikkospell -s -d se | iconv -f Latin1 -t UTF-8
W: giellla
S: giella
S: giellal
S: giellala
S: giellula
S: giellá
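
That the iconv step repairs the last suggestion suggests voikkospell is emitting Latin-1 bytes even though the locale is UTF-8. A minimal sketch of the byte-level difference follows. It assumes the trailing "á" is written as the single Latin-1 byte 0xE1; `latin1_word` is a hypothetical helper that only mimics that output and is not part of voikkospell.

```shell
# Hypothetical stand-in for the suspected voikkospell output:
# "giell" followed by the single byte 0xE1 (octal 341), which is "á" in Latin-1.
latin1_word() { printf 'giell\341\n'; }

# Raw bytes: a lone 0xE1 is not a valid UTF-8 sequence,
# so a UTF-8 terminal cannot display it as "á".
latin1_word | od -An -tx1

# After conversion, "á" becomes the two-byte UTF-8 sequence 0xC3 0xA1.
latin1_word | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1
```

If the bytes really are Latin-1, this would explain why the terminal shows "giell?" without iconv and "giellá" with it.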

I had expected one of three behaviours:
1. either: use UTF-8 as default
2. or: use UTF-8 as the only encoding
3. or: pick the encoding to use from the environment

$ voikkospell --version
voikkospell version 3.7.1
libvoikko version 3.7.1

Sjur


