[libvoikko] [Apertium-stuff] Lttoolbox (Apertium) morphology backend
Francis Tyers
ftyers at prompsit.com
Mon Mar 1 19:24:47 EET 2010
On Mon, 01 Mar 2010 at 17:36 +0200, Harri Pitkänen wrote:
> On Sunday 28 February 2010 22:07:12 Francis Tyers wrote:
> > Nah, that doesn't do it :/
> >
> > The new method should probably just be a copy of the old one, only that
> > checks to see if all the input has been consumed.
>
> I came up with an ugly workaround. For all input strings we compare the output
> from biltransWithoutQueue with the output for the same input with last
> character removed. If removing a character does not change the result, we can
> be quite confident that the original input was invalid.
>
> Of course this makes the analysis twice as slow as it used to be and the
> method is not 100% accurate at least theoretically. But it seems to work. A
> new method in Lttoolbox is still needed if we want to do this properly.
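
For illustration, the workaround described above could be sketched roughly as follows. This is only a sketch under assumptions: `toyAnalyse` is a hypothetical stand-in for the real analyser call (it is not the Lttoolbox API), and it imitates the problematic behaviour by analysing the longest known prefix and silently ignoring any unconsumed trailing input. `looksFullyConsumed` is likewise a made-up name.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical analyser type: takes a surface form, returns an analysis string
// (empty if nothing matched). Stands in for the real backend call.
using Analyser = std::function<std::string(const std::string&)>;

// Toy analyser (assumption, not Lttoolbox): returns the analysis of the longest
// lexicon entry that is a prefix of the input and ignores whatever follows.
std::string toyAnalyse(const std::string& input) {
    static const std::map<std::string, std::string> lexicon = {
        {"cat", "cat<n><sg>"},
        {"cats", "cat<n><pl>"},
    };
    for (std::size_t len = input.size(); len > 0; --len) {
        auto it = lexicon.find(input.substr(0, len));
        if (it != lexicon.end()) {
            return it->second;
        }
    }
    return "";
}

// Heuristic from the message: analyse the input and the input with its last
// character removed. If both give the same result, the last character was
// probably never consumed, so the input is treated as invalid.
// Note: this removes one byte; real code would need to remove one character
// (e.g. by working on wide strings).
bool looksFullyConsumed(const Analyser& analyse, const std::string& input) {
    if (input.empty()) {
        return false;
    }
    const std::string full = analyse(input);
    if (full.empty()) {
        return false;  // no analysis at all
    }
    const std::string shortened = analyse(input.substr(0, input.size() - 1));
    return full != shortened;
}
```

With this toy lexicon, `looksFullyConsumed(toyAnalyse, "cats")` accepts the word, while `looksFullyConsumed(toyAnalyse, "catx")` rejects it because dropping the final character leaves the analysis unchanged. The analyser runs twice per word, which matches the slowdown mentioned above, and the check can still be wrong in principle whenever two valid inputs that differ only in the last character happen to share an analysis.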
I will look into it. There may be some code in the apertium-service
module that does this by wrapping FILE* streams into C++ memstreams, but
I don't remember exactly. Pasquale?
> I also fixed a bug in core libvoikko speller code that, due to the way our
> Malaga backend is implemented, never showed up in Finnish spell checking but
> caused lots of incorrectly rejected words when using Lttoolbox. Now it seems
> to me that the remaining issues with Icelandic spell checking are either due
> to words not being in the lexicon or wrong tokenization in OOo. If you find
> anything that still needs fixing, let me know.
OK, great.
I wasn't able to get it up and running in OpenOffice; the SDK in
Debian seems to be non-trivial to set up :(. Could you send another
screenshot?
Fran