[libvoikko] Lttoolbox (Apertium) morphology backend
Francis Tyers
ftyers at prompsit.com
Sat Feb 27 23:01:15 EET 2010
On Sat, 27 Feb 2010 at 23:55 +0200, Harri Pitkänen wrote:
> On Wednesday 24 February 2010, Harri Pitkänen wrote:
> > If I remember correctly, Apertium is already in Debian. I could take a
> > look next week and see what can be done.
>
> Well, I didn't wait until next week but implemented the backend already. You
> can try this by using the sources in SVN:
Wow great!!
> - On Debian unstable install packages lttoolbox, liblttoolbox3-3.1-0,
> liblttoolbox3-3.1-0-dev
> - Check out SVN sources
> - ./autogen.sh
> - ./configure --prefix=/some/dir --enable-lttoolbox
> - make install
> - Copy stuff from http://www.puimula.org/htp/testing/apertium/ to
> ~/.voikko/2/mor-apertium/
>
> Now you should have a dictionary variant "apertium" available. The test
> dictionary was copied from
> http://wiki.apertium.org/wiki/Lttoolbox#Using_as_a_library
> and it recognizes the words "car" and "cars". Even the spelling suggestions work:
>
> $ /some/dir/bin/voikkospell -d apertium -s
> car
> C: car
> cara
> W: cara
> S: cars
> S: car
>
>
> I must say I was quite impressed by how easy this was. No compilation
> problems, no license hassles (Apertium uses exactly the same license as
> libvoikko), easy to use API. API documentation was a bit hard to find though.
:)
> I also tried to build some of the real-world dictionaries but could not figure
> out which one would work and how it should be used.
You can grab the dictionaries for Breton and Icelandic from:
http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-br-fr/apertium-br-fr.br-fr.dix
and
https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-is-en/apertium-is-en.is.dix
respectively.
The .bin files are compiled with lt-comp. The "lr" direction reads the
dictionary left-to-right, which gives an analyser:
$ lt-comp lr <dix> <bin>
e.g.
$ lt-comp lr apertium-br-fr.br-fr.dix br.bin
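As a rough sketch, the compile step above could be scripted for both
dictionaries along these lines. The helper names bin_name and compile_dix
are hypothetical, and naming the output after the first language code in
the file name is only an assumption modelled on the br.bin example; it is
not something lttoolbox or libvoikko requires.

```shell
# Hypothetical helper: derive an output name from a .dix file name.
#   apertium-br-fr.br-fr.dix -> br.bin
#   apertium-is-en.is.dix    -> is.bin
bin_name() {
    base=${1##*/}        # drop any leading directory
    base=${base%.dix}    # drop the .dix extension
    lang=${base##*.}     # keep the last dot-separated field ("br-fr", "is")
    printf '%s.bin\n' "${lang%%-*}"   # keep only the first language code
}

# Hypothetical helper: compile one dictionary as an analyser (direction "lr").
# If lt-comp is not installed, only print the command that would be run.
compile_dix() {
    out=$(bin_name "$1")
    if command -v lt-comp >/dev/null 2>&1; then
        lt-comp lr "$1" "$out"
    else
        echo "lt-comp not found; would run: lt-comp lr $1 $out"
    fi
}
```

With the two files downloaded into the current directory, running
compile_dix apertium-br-fr.br-fr.dix and compile_dix apertium-is-en.is.dix
would then write br.bin and is.bin.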
As far as we know, the compiled binaries are platform-independent. I've
tried them on several architectures and they "just work", though I'm not
100% sure about Windows.
Fran
PS. (to Apertium folk) Now that we have this, we should perhaps think about
marking standard / non-standard forms some way, or being able to compile
dictionaries as spellers (e.g. without <g> and <b/> multiword entries).