[libvoikko] Skolt Sami upper-case bug in libvoikko?
hatapitk at iki.fi
Thu Mar 12 17:49:59 EET 2015
On Wednesday 11 March 2015 17:47:40 Trosterud Trond wrote:
> If I understand you correctly the upper limit is 0x01EF.
Not everything below 0x01EF is converted, but I believe most of the gaps in
that range are punctuation or control characters.
> The characters 0x0218-0x021B are in use for Romanian. This is not exactly
> our focus, but we have a (bad) analyser (and speller) for it.
> 0x021E-0x021F are for Finnish Romani. Not on our plate.
Added 0x01F8 - 0x021F (continuous range of upper-lower pairs).
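
For illustration only, the range rule can be sketched in Python as below. This
is not the libvoikko code (which is C++), and the function name
to_lower_latin_ext is hypothetical. It assumes only what is stated above: in
0x01F8 - 0x021F every even code point is an upper-case letter directly
followed by its lower-case partner.

```python
# Sketch of a "continuous range of upper-lower pairs" rule.
# Assumption: in 0x01F8..0x021F each even code point is upper case and
# the next (odd) code point is its lower-case form.
RANGE_START = 0x01F8
RANGE_END = 0x021F


def to_lower_latin_ext(cp: int) -> int:
    """Return the lower-case code point for cp (hypothetical helper)."""
    if RANGE_START <= cp <= RANGE_END and cp % 2 == 0:
        return cp + 1
    # Already lower case, or outside the range: leave unchanged.
    return cp


def to_upper_latin_ext(cp: int) -> int:
    """Return the upper-case code point for cp (hypothetical helper)."""
    if RANGE_START <= cp <= RANGE_END and cp % 2 == 1:
        return cp - 1
    return cp
```

For example, to_lower_latin_ext(0x0218) gives 0x0219 (Romanian S with comma
below), and code points outside the range are returned as they are.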
> There is a small/capital Skolt Saami pair where one is in Latin B and the
> other one outside of it:
> U+0292 ʒ (small) = U+01B7 Ʒ
Added this as a special case.
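
A special case like this does not fit a range rule because the two code points
are far apart, so it needs an explicit entry. A minimal Python sketch, with
hypothetical names (SPECIAL_LOWER, special_to_lower, special_to_upper) and not
taken from the libvoikko source:

```python
# Explicit table for pairs whose halves are not adjacent code points.
# U+01B7 (upper-case ezh) <-> U+0292 (lower-case ezh), as discussed above.
SPECIAL_LOWER = {0x01B7: 0x0292}
SPECIAL_UPPER = {lower: upper for upper, lower in SPECIAL_LOWER.items()}


def special_to_lower(cp: int) -> int:
    """Map cp to lower case if it is a listed special case, else return cp."""
    return SPECIAL_LOWER.get(cp, cp)


def special_to_upper(cp: int) -> int:
    """Map cp to upper case if it is a listed special case, else return cp."""
    return SPECIAL_UPPER.get(cp, cp)
```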
> Hmm, then there is cyrillic, but since that works, it means that you must
> have U+0400 - U+04FF already (we certainly do not have all pairs, but we do
> have much more than the Russian ones).
Yes, most of those are supported already. Some are missing, but those don't
look like letters to me.
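
One way to check which code points in U+0400 - U+04FF are not letters is to
ask a Unicode database. The sketch below uses Python's standard unicodedata
module; the function name cyrillic_non_letters is hypothetical, and the result
depends on the Unicode version bundled with the Python in use, so it is not
necessarily the same set that libvoikko leaves out.

```python
import unicodedata


def cyrillic_non_letters() -> list[int]:
    """List code points in U+0400..U+04FF whose general category is not a letter."""
    return [
        cp
        for cp in range(0x0400, 0x0500)
        # Letter categories all start with "L" (Lu, Ll, Lt, Lm, Lo).
        if not unicodedata.category(chr(cp)).startswith("L")
    ]


if __name__ == "__main__":
    for cp in cyrillic_non_letters():
        print(f"U+{cp:04X} {unicodedata.category(chr(cp))}")
```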