[libvoikko] Lttoolbox (Apertium) morphology backend

Francis Tyers ftyers at prompsit.com
Sat Feb 27 23:01:15 EET 2010


On Sat, 27 Feb 2010 at 23:55 +0200, Harri Pitkänen wrote:
> On Wednesday 24 February 2010, Harri Pitkänen wrote:
> > If I remember correctly, Apertium is already in Debian. I could take a
> >  look  next week and see what can be done.
> 
> Well, I didn't wait until next week but implemented the backend already. You 
> can try this by using the sources in SVN:

Wow great!!

> - On Debian unstable install packages lttoolbox, liblttoolbox3-3.1-0, 
> liblttoolbox3-3.1-0-dev
> - Check out SVN sources
> - ./autogen.sh
> - ./configure --prefix=/some/dir --enable-lttoolbox
> - make install
> - Copy stuff from http://www.puimula.org/htp/testing/apertium/ to
>   ~/.voikko/2/mor-apertium/
> 
> Now you should have a dictionary variant "apertium" available. The test 
> dictionary was copied from 
> http://wiki.apertium.org/wiki/Lttoolbox#Using_as_a_library
> and it recognizes the words "car" and "cars". Even the spelling suggestions work:
> 
> $ /some/dir/bin/voikkospell -d apertium -s
> car
> C: car
> cara
> W: cara
> S: cars
> S: car
> 
> 
> I must say I was quite impressed by how easy this was. No compilation 
> problems, no license hassles (Apertium uses exactly the same license as 
> libvoikko), easy to use API. API documentation was a bit hard to find though.

:)

> I also tried to build some of the real-world dictionaries but could not figure 
> out which one would work and how it should be used.

You can grab the dictionary from:

http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-br-fr/apertium-br-fr.br-fr.dix

and

https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-is-en/apertium-is-en.is.dix

for Breton and Icelandic, respectively.

The .bin files are compiled by:

$ lt-comp lr <dix> <bin>

e.g. lt-comp lr apertium-br-fr.br-fr.dix br.bin
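
As a rough sketch, both dictionaries above could be compiled with a small
shell helper like the one below. The function name compile_dix, the LT_COMP
variable and the output name is.bin are made up for illustration; only the
"lt-comp lr" call itself comes from the command above.

```shell
# Hypothetical helper, not part of lttoolbox: it only wraps the lt-comp call.
# LT_COMP can be overridden (e.g. LT_COMP=echo) to preview the command
# without running the compiler.
LT_COMP="${LT_COMP:-lt-comp}"

compile_dix() {
    # $1 = source .dix file, $2 = output .bin file
    # "lr" compiles the left-to-right direction, i.e. the analyser.
    "$LT_COMP" lr "$1" "$2"
}

# Usage, assuming both .dix files were downloaded to the current directory:
#   compile_dix apertium-br-fr.br-fr.dix br.bin
#   compile_dix apertium-is-en.is.dix is.bin
```

The resulting .bin file can then be checked by piping a word into
"lt-proc br.bin" before handing it to libvoikko.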

As far as we know, the compiled binaries are platform-independent. I've
tried them on several architectures and they "just work", although I'm
not 100% sure about Windows.

Fran

PS. (to Apertium folk) Now that we have this, we should perhaps think about
marking standard / non-standard forms in some way, or about being able to
compile dictionaries as spellers (e.g. without <g> and <b/> multiword entries).



