[libvoikko] Test cases for libvoikko/HFST needed

Francis Tyers ftyers at prompsit.com
Mon Jan 18 20:16:11 EET 2010


On Mon, 18 Jan 2010 at 19:50 +0200, Harri Pitkänen wrote:
> On Monday 18 January 2010, Sjur Moshagen wrote:
> > On 17 Jan 2010 at 19:53, Flammie Pirinen wrote:
> > > Harri Pitkänen wrote on 17 Jan 2010 at 19:19:
> > >> In order to make sure that the work needed for supporting multiple
> > >> languages is not prioritised too highly I would like to have actual
> > >> dictionaries available for at least two languages. Finnish is of course
> > >> already supported, so just one other language would be enough. What I
> > >> need is
> > >>
> > >> - Morphology licensed fully under a free license. Quality does not matter
> > >> much, but it should be actively maintained and either usable or at least
> > >> slowly becoming usable.
> > >
> > > I hope that at some point this year I will be able to create tools to
> > > compile traditional hunspell and maybe aspell or ispell dictionaries
> > > to HFST transducers, which could be ideal for testing. Until then, I
> > > think there should be some lexc/twolc/xfst-style morphologies
> > > available. The Sámi languages Francis mentioned are one good
> > > resource.
> > 
> > Strongly supported, especially since we can provide comparisons with
> > already released spellers using a closed-source speller engine. Just to be
> > clear: it is only the speller engine that is closed source - all
> > linguistic source code relating to the Sámi analysers is licensed under
> > the GPL.
> > 
> > As some of you are probably aware, the company behind the closed-source
> > speller engine went bankrupt last year, so having an open-source
> > alternative is rather important to us. We would be very happy to help out
> > and contribute whatever we can to make the Sámi Morph/HFST/LibVoikko 3
> > combo a successful one.
> 
> There are some things that would speed up the development where you could most 
> likely help:
> 
> - Improve HFST public headers so that building libvoikko against HFST becomes 
> possible without removing quality checks from our build system. It should be 
> possible to include HFST headers in a compilation unit using
>   g++ -Wall -Werror -pedantic
> 
> - Make sure that HFST can be built on Windows using MS Visual C++.
> 
> - Improve src/spellchecker/HfstSpeller.cpp to work with flag diacritics (Tommi 
> said he will try to fix this) and implement checking of correct 
> capitalisation.
> 
> - Write test cases for HfstSpeller.
> 
> - Provide Debian packages for HFST and Sámi morphology.

I am happy to help with the Debian side of things -- I already maintain
the Apertium packages for Debian. Is this needed immediately, or
would it be better to do it once the rest is completed?

Fran



