[libvoikko] Strange bug in the interface between hfst-ospell and libvoikko

Sjur Moshagen sjurnm at mac.com
Thu Dec 12 17:16:14 EET 2013


On 12 Dec 2013, at 15:52, Harri Pitkänen <hatapitk at iki.fi> wrote:

> On Thursday 12 December 2013 01:57:07 Sjur Moshagen wrote:
>> Configuration:
>> * svn HEAD of hfst-ospell
>> * newest revision of the master branch of libvoikko
> 
> I cannot reproduce this bug with this configuration on Linux. Does switching 
> to different XML parser make any difference?

Yes:

$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d sms
E: Initialization of Voikko failed: Specified dictionary variant was not found

$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d und
jih
W: jih
S: ai
S: iin
S: ja
S: sij
S: ǩii
^C

$ voikkospell -l -p tools/spellcheckers/fstbased/hfst/
und-x-standard:

That is, the speller lexicon is no longer recognized under its proper language code, and I have to use the code ‘und’ instead. With ‘und’, however, the speller still behaves correctly.

More importantly, both SMA and FAO work now (using ‘und’ as the language code):

$ cd ../sma/
$ voikkospell -l -p tools/spellcheckers/fstbased/hfst/
und-x-standard:
$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d und
jih
W: jih
S: jïh
S: Ujih
S: mih
S: sih
S: Lih
^C
$ cd ../fao/
$ voikkospell -l -p tools/spellcheckers/fstbased/hfst/
und-x-standard:
$ voikkospell -s -p tools/spellcheckers/fstbased/hfst/ -d und
ikkki
W: ikkki
S: ikki
S: Erkki
S: Kilkki
S: Piikki
S: Virkki

So, while both XML libraries are somewhat broken, TinyXML seems to be more broken.

This was all tested on Mac OS X 10.6.

A colleague of mine tested the two XML libraries on Linux and could not reproduce my bug with TinyXML. He could reproduce the ‘und’ bug, though, and consistently with both XML libraries.

Sjur


