[libvoikko] [Apertium-stuff] Lttoolbox (Apertium) morphology backend
Harri Pitkänen
hatapitk at iki.fi
Mon Mar 1 17:36:12 EET 2010
On Sunday 28 February 2010 22:07:12 Francis Tyers wrote:
> Nah, that doesn't do it :/
>
> The new method should probably just be a copy of the old one, only that
> checks to see if all the input has been consumed.
I came up with an ugly workaround. For every input string we compare the output
of biltransWithoutQueue with its output for the same input with the last
character removed. If removing the character does not change the result, we can
be quite confident that the original input was invalid.
Of course this makes the analysis twice as slow as it used to be, and the
method is, at least in theory, not 100% accurate. But it seems to work. A
new method in Lttoolbox is still needed if we want to do this properly.
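
To make the comparison concrete, here is a minimal sketch in Python rather than the actual libvoikko code. It assumes the analyser returns the analysis of the longest matching prefix and silently ignores any unconsumed trailing characters, which is my reading of the problem and not confirmed behaviour of biltransWithoutQueue. The names toy_analyse and is_probably_valid, the lexicon and the tag strings are all made up for illustration.

```python
# Toy lexicon: surface form -> analysis string. Illustrative data only,
# not taken from any real Icelandic dictionary.
LEXICON = {
    "hest": "hestur<n><acc>",
    "hestur": "hestur<n><nom>",
}


def toy_analyse(word):
    """Hypothetical stand-in for the real analyser call.

    Assumption: it returns the analysis of the longest lexicon entry that
    is a prefix of `word` and ignores whatever input is left over. It
    returns an empty string when no prefix matches.
    """
    for end in range(len(word), 0, -1):
        analysis = LEXICON.get(word[:end])
        if analysis is not None:
            return analysis
    return ""


def is_probably_valid(word, analyse=toy_analyse):
    """Apply the workaround: analyse the word and the word minus its last
    character. An unchanged result suggests the last character was never
    consumed, so the word is treated as invalid."""
    if not word:
        return False
    return analyse(word) != analyse(word[:-1])


if __name__ == "__main__":
    for candidate in ["hest", "hestur", "hestu", "hesturx", "xyz"]:
        print(candidate, is_probably_valid(candidate))
```

Each check calls the analyser twice, which is where the factor of two in running time comes from. In this toy model the check would also wrongly reject a valid word whose analysis string happens to be identical to that of the same word minus its last character, which is one way the method could fail in theory.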
I also fixed a bug in core libvoikko speller code that, due to the way our
Malaga backend is implemented, never showed up in Finnish spell checking but
caused lots of incorrectly rejected words when using Lttoolbox. Now it seems
to me that the remaining issues with Icelandic spell checking are due either
to words missing from the lexicon or to wrong tokenization in OOo. If you find
anything that still needs fixing, let me know.
Harri