[libvoikko] Skolt Sami upper-case bug in libvoikko?

Trosterud Trond trond.trosterud at uit.no
Wed Mar 11 19:47:40 EET 2015


> On 11 March 2015 at 17:01, Harri Pitkänen <hatapitk at iki.fi> wrote:
>> 
>> There were no case handling rules for Latin Extended-B block that contains ǩ
>> and Ǩ. I have now added rules for characters from 0x01DE to 0x01EF. Let me
>> know if this is sufficient for you or if other character ranges need to be
>> supported.


If I understand you correctly, the upper limit is now 0x01EF.

The characters 0x0218-0x021B are in use for Romanian. This is not exactly our focus, but we have a (bad) analyser (and speller) for it.

0x021E-0x021F are for Finnish Romani. Not on our plate.

There is also a small/capital Skolt Saami pair where one member is in Latin Extended-B and the other one is outside of it:

U+0292 ʒ (small, IPA Extensions) = U+01B7 Ʒ (capital, Latin Extended-B, but below 0x01DE)
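
For what it is worth, here is a quick way to list which of the pairs mentioned in this message fall outside the 0x01DE-0x01EF range. It is only a sketch that relies on Python's built-in Unicode case mappings, not on libvoikko's own rules; the names COVERED and uncovered_pairs are made up for illustration.

```python
# Sketch only: uses Python's Unicode tables, not libvoikko's case rules.
# COVERED is the range quoted above; the name is hypothetical.
COVERED = range(0x01DE, 0x01EF + 1)

def uncovered_pairs(codepoints):
    """Return sorted (lower, upper) pairs with a member outside COVERED."""
    pairs = set()
    for cp in codepoints:
        ch = chr(cp)
        lower, upper = ch.lower(), ch.upper()
        # Skip caseless characters and multi-character mappings.
        if lower == upper or len(lower) != 1 or len(upper) != 1:
            continue
        if ord(lower) not in COVERED or ord(upper) not in COVERED:
            pairs.add((lower, upper))
    return sorted(pairs)

# Code points mentioned in this message.
mentioned = [0x01E8, 0x01E9, 0x0218, 0x0219, 0x021A, 0x021B,
             0x021E, 0x021F, 0x0292, 0x01B7]

for lower, upper in uncovered_pairs(mentioned):
    print(f"U+{ord(lower):04X} {lower} / U+{ord(upper):04X} {upper}")
```

The ǩ/Ǩ pair is not listed because both members are inside the range; the Romanian, Finnish Romani and ʒ/Ʒ pairs are.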

Hmm, then there is Cyrillic, but since that works, you must already have U+0400-U+04FF covered (we certainly do not use all the pairs, but we do use many more than just the Russian ones).

Trond.

