[libvoikko] Another language for voikko: Avar!

Francis Tyers spectre at ivixor.net
Sun Mar 9 14:43:05 EET 2014


Hey guys!

Avar (along with Plains Cree) was recently added to LibreOffice. The
latest nightly build[1] contains the new languages. I built
libreoffice-voikko with 4.3 on my Linux machine and everything works
according to plan: 

 http://i.imgur.com/CoaDk6L.png

Check it out :D (Yes, the spelling correction is probably wrong ;)

HFST Team: It would be cool if HFST did not crash/exit on finding a
non-alphabetic symbol in the error model.

 terminate called after throwing an instance of 'hfst_ol::AlphabetTranslationException'
   what():  I

The problem here is that in many of the languages of the Caucasus you
have this "paločka" (U+04C0) which is pretty much never on keyboard
layouts, so people use either U+0406 (Cyrillic 'I') or U+0049 (Latin
'I'). The Latin and Cyrillic characters are unlikely to be in the
automaton for Avar as they aren't really used. I think that something
similar happens with Komi (e.g. the 'ö'). 

It would be good to be able to release spellcheckers with a kind of
spellrelax where the Latin characters do not cause spelling errors
(really this isn't a spelling error, it's an encoding error).
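As a stopgap outside HFST, something like the following could normalise the
look-alikes before the text reaches the speller. This is only a rough sketch,
not how voikko or HFST do anything: the function name normalize_palochka is
made up, and the rule "only replace when the character touches another
Cyrillic letter" is my own assumption to keep Latin-script words intact. It
would also mangle genuine uses of U+0406 (e.g. Ukrainian), so it only makes
sense for text already known to be Avar or similar.

```python
import re

PALOCHKA = "\u04C0"            # CYRILLIC LETTER PALOCHKA
LOOKALIKES = "\u0406\u0049"    # Cyrillic 'І' and Latin 'I', as typed in practice

# Assumption: any character in the basic Cyrillic block counts as "Cyrillic context".
_CYR = "\u0400-\u04FF"

# Match a look-alike that directly follows or directly precedes a Cyrillic character.
_PATTERN = re.compile(
    rf"(?<=[{_CYR}])[{LOOKALIKES}]|[{LOOKALIKES}](?=[{_CYR}])"
)


def normalize_palochka(text: str) -> str:
    """Replace U+0406 / U+0049 with U+04C0 when adjacent to Cyrillic text."""
    return _PATTERN.sub(PALOCHKA, text)
```

A standalone Latin "I" (as in English text) is left alone because it has no
Cyrillic neighbour. Doing the equivalent inside the automaton would still be
the nicer fix, since it would not depend on preprocessing.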

Any thoughts on how to do this? Most of the errors you see in that
text are because of this problem.

Fran

1.
http://dev-builds.libreoffice.org/daily/master/Linux-rpm_deb-x86_64@46-TDF/current/



