[libvoikko] Strange bug in the interface between hfst-ospell and libvoikko
Sjur Moshagen
sjurnm at mac.com
Thu Dec 12 17:16:14 EET 2013
On 12 Dec 2013, at 15:52, Harri Pitkänen <hatapitk at iki.fi> wrote:
> On Thursday 12 December 2013 01:57:07 Sjur Moshagen wrote:
>> Configuration:
>> * svn HEAD of hfst-ospell
>> * newest revision of the master branch of libvoikko
>
> I cannot reproduce this bug with this configuration on Linux. Does switching
> to different XML parser make any difference?
Yes:
$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d sms
E: Initialization of Voikko failed: Specified dictionary variant was not found
$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d und
jih
W: jih
S: ai
S: iin
S: ja
S: sij
S: ǩii
^C
$ voikkospell -l -p tools/spellcheckers/fstbased/hfst/
und-x-standard:
That is, the speller lexicon is no longer recognized under its proper language code (sms), and I have to use the code ‘und’ instead. With ‘und’, though, the speller still behaves correctly.
More importantly, both SMA and FAO work now (using ‘und’ as the language code):
$ cd ../sma/
$ voikkospell -l -p tools/spellcheckers/fstbased/hfst/
und-x-standard:
$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d und
jih
W: jih
S: jïh
S: Ujih
S: mih
S: sih
S: Lih
^C
$ cd ../fao/
$ voikkospell -l -p tools/spellcheckers/fstbased/hfst/
und-x-standard:
$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d und
ikkki
W: ikkki
S: ikki
S: Erkki
S: Kilkki
S: Piikki
S: Virkki
So, while both XML libraries are somewhat broken, TinyXML seems to be more broken.
This was all tested on Mac OS X 10.6.
A colleague of mine tested the two XML libraries on Linux and could not reproduce my bug with TinyXML. He could reproduce the ‘und’ bug, though, consistently with both XML libraries.
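For anyone who wants to check for the ‘und’ symptom quickly, here is a minimal sketch of a shell helper. The function name lists_language is made up for illustration and is not part of voikkospell. It only inspects a dictionary listing of the form shown above (lines such as "und-x-standard:") and reports whether a given language code appears at the start of a line.

```shell
# Hypothetical helper (not part of voikkospell): succeeds if the dictionary
# listing read from standard input contains an entry for language code $1.
# Listing lines are assumed to look like "und-x-standard:", as in the output above.
lists_language() {
    # Match the code at the start of a line, followed by "-", ":" or end of line.
    grep -Eq "^$1([-:]|\$)"
}

# Check against the listing seen in this report.
printf 'und-x-standard:\n' | lists_language und && echo "und: listed"
printf 'und-x-standard:\n' | lists_language sms || echo "sms: not listed"
```

Assuming the listing is written to standard output, the real check would be something like: voikkospell -l -p tools/spellcheckers/fstbased/hfst/ | lists_language sms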
Sjur