[libvoikko] Skolt Sami upper-case bug in libvoikko?

Harri Pitkänen hatapitk at iki.fi
Thu Mar 12 17:49:59 EET 2015

On Wednesday 11 March 2015 17:47:40 Trosterud Trond wrote:
> If I understand you correctly the upper limit is 0x01EF.

Not everything below 0x01EF is converted, but I believe most of the gaps in 
that range are punctuation or control characters.

> The characters 0x0218-0x021B are in use for Romanian. This is not exactly
> our focus, but we have a (bad) analyser (and speller) for it.
> 0x021E-0x021F are for Finnish Romani. Not on our plate.

Added 0x01F8-0x021F (a continuous range of upper/lower-case pairs). This 
covers both the Romanian and the Finnish Romani characters you mention.
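
To illustrate what a continuous range of pairs means here: in this block each 
even code point is the upper-case letter and the following odd code point is 
its lower-case form. The Python sketch below is only an illustration of that 
layout, not the actual libvoikko code; the names PAIR_RANGE_START, 
PAIR_RANGE_END, simple_lower and simple_upper are made up for the example.

```python
# Hypothetical sketch, not libvoikko's implementation.
# Assumption: inside 0x01F8-0x021F every even code point is an upper-case
# letter and the next (odd) code point is its lower-case counterpart.
PAIR_RANGE_START = 0x01F8
PAIR_RANGE_END = 0x021F


def simple_lower(ch):
    """Map an upper-case letter in the pair range to its lower-case form."""
    cp = ord(ch)
    if PAIR_RANGE_START <= cp <= PAIR_RANGE_END and cp % 2 == 0:
        return chr(cp + 1)
    return ch  # outside the range, or already lower case


def simple_upper(ch):
    """Map a lower-case letter in the pair range to its upper-case form."""
    cp = ord(ch)
    if PAIR_RANGE_START <= cp <= PAIR_RANGE_END and cp % 2 == 1:
        return chr(cp - 1)
    return ch  # outside the range, or already upper case
```

For example, simple_lower("\u0218") gives "\u0219" (Romanian S with comma 
below), and characters outside the range are returned unchanged.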

> There is a small/capital Skolt Saami pair where one is in Latin B and the
> other one outside of it:
> U+0292 ʒ (small) = U+01B7 Ʒ

Added this pair as a special case, since the two code points are not adjacent.
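
A pair like this cannot be handled by a range rule because the two code points 
are far apart, so it needs an explicit entry. A minimal sketch of such a 
lookup in Python follows; it is not the libvoikko code, and the names 
SPECIAL_LOWER, SPECIAL_UPPER, lower_special and upper_special are 
hypothetical.

```python
# Hypothetical sketch, not libvoikko's implementation.
# Explicit table for upper/lower pairs whose code points are not adjacent.
SPECIAL_LOWER = {0x01B7: 0x0292}  # U+01B7 (capital ezh) -> U+0292 (small ezh)
SPECIAL_UPPER = {lower: upper for upper, lower in SPECIAL_LOWER.items()}


def lower_special(ch):
    """Return the lower-case form if ch is a listed special case, else ch."""
    cp = ord(ch)
    return chr(SPECIAL_LOWER.get(cp, cp))


def upper_special(ch):
    """Return the upper-case form if ch is a listed special case, else ch."""
    cp = ord(ch)
    return chr(SPECIAL_UPPER.get(cp, cp))
```

Characters that are not in the table pass through unchanged, so a lookup like 
this can be tried before or after any range-based rules.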

> Hmm, then there is cyrillic, but since that works, it means that you must
> have U+0400 - U+04FF already (we certainly do not have all pairs, but we do
> have much more than the Russian ones).

Yes, most of those are already supported. A few are missing, but they don't 
look like letters to me.


