[libvoikko] [Apertium-stuff] Lttoolbox (Apertium) morphology backend

Francis Tyers ftyers at prompsit.com
Mon Mar 1 19:24:47 EET 2010


On Mon, 01 Mar 2010 at 17:36 +0200, Harri Pitkänen wrote:
> On Sunday 28 February 2010 22:07:12 Francis Tyers wrote:
> > Nah, that doesn't do it :/
> >
> > The new method should probably just be a copy of the old one, except that
> > it checks whether all the input has been consumed.
> 
> I came up with an ugly workaround. For all input strings we compare the output 
> from biltransWithoutQueue with the output for the same input with last 
> character removed. If removing a character does not change the result, we can 
> be quite confident that the original input was invalid.
> 
> Of course this makes the analysis twice as slow as it used to be and the 
> method is not 100% accurate at least theoretically. But it seems to work. A 
> new method in Lttoolbox is still needed if we want to do this properly.
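
For reference, the comparison described above could be sketched roughly as
follows. This is only an illustration under assumptions: `Analyser` and
`looksFullyConsumed` are hypothetical names, and the analyser is passed in as a
plain callable standing in for a wrapper around biltransWithoutQueue, whose
real signature is not assumed here.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for the analyser call (for example a wrapper around
// biltransWithoutQueue); the real Lttoolbox signature is not assumed here.
using Analyser = std::function<std::wstring(const std::wstring&)>;

// Heuristic from the thread: if dropping the last character leaves the
// analysis unchanged, the analyser probably never consumed that character,
// so the full input is treated as invalid.
bool looksFullyConsumed(const Analyser& analyse, const std::wstring& word) {
    if (word.empty()) {
        return false;  // nothing to analyse
    }
    const std::wstring full = analyse(word);
    const std::wstring shorter = analyse(word.substr(0, word.size() - 1));
    // Two analyser calls per word: this is where the 2x slowdown comes from.
    return full != shorter;
}
```

The check is a heuristic, as noted in the quoted text: it can only be as
reliable as the assumption that a consumed final character always changes the
output.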

I will look into it. There may be some code in the apertium-service
module that does this by wrapping FILE* streams into C++ memstreams, but
I don't remember exactly. Pasquale?
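
A minimal sketch of the general idea of a memory-backed FILE* stream, assuming
POSIX fmemopen is available. This is not taken from apertium-service;
`openStringAsStream` and `consumedAll` are hypothetical helper names used only
for illustration.

```cpp
#include <cstdio>
#include <string>

// Opens a read-only FILE* over the bytes of `text` using POSIX fmemopen.
// The caller must keep `text` alive and unmodified until fclose().
// Note: some libc versions reject a zero-length buffer, so pass non-empty text.
FILE* openStringAsStream(const std::string& text) {
    // fmemopen takes a non-const buffer; mode "r" never writes to it.
    return fmemopen(const_cast<char*>(text.data()), text.size(), "r");
}

// Reads up to `limit` bytes, then reports whether the stream is exhausted,
// i.e. whether a consumer that stopped after `limit` bytes saw all the input.
bool consumedAll(FILE* in, std::size_t limit) {
    for (std::size_t i = 0; i < limit; ++i) {
        if (std::fgetc(in) == EOF) {
            return true;  // ran out of input before reaching the limit
        }
    }
    const int next = std::fgetc(in);
    if (next == EOF) {
        return true;
    }
    std::ungetc(next, in);  // put the peeked byte back for the next reader
    return false;
}
```

With a stream like this, code that only accepts FILE* can be fed a single word,
and the caller can check afterwards whether anything was left unread.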

> I also fixed a bug in core libvoikko speller code that, due to the way our 
> Malaga backend is implemented, never showed up in Finnish spell checking but 
> caused lots of incorrectly rejected words when using Lttoolbox. Now it seems 
> to me that the remaining issues with Icelandic spell checking are either due 
> to words not being in the lexicon or wrong tokenization in OOo. If you find 
> anything that still needs fixing, let me know.

Ok great,

I wasn't able to get it up and running in OpenOffice -- the SDK in
Debian seems to be non-trivial to set up :( -- but could you send another
screenshot?

Fran



