[libvoikko] Test cases for libvoikko/HFST needed
Francis Tyers
ftyers at prompsit.com
Sun Jan 17 19:35:18 EET 2010
On Sun, 17 Jan 2010 at 19:19 +0200, Harri Pitkänen wrote:
> Development of libvoikko 3.0 will start soon. Support for languages other than
> Finnish is one of the goals for this release but there are others as well,
> such as better multithreading and an extended hyphenator API.
>
> In order to make sure that the work needed for supporting multiple languages
> is not prioritised too highly, I would like to have actual dictionaries available
> for at least two languages. Finnish is of course already supported, so just
> one other language would be enough. What I need is
>
> - Morphology licensed fully under a free license. Quality does not matter
> much, but it should be actively maintained and either usable or at least
> slowly becoming usable.
>
> - Analyzer implementation for libvoikko. I guess the current HfstAnalyzer will
> work with small changes. A separate speller implementation can be provided (or
> HfstSpeller can be used).
>
> - A few automated test cases for ensuring that at least basic spell checking
> works correctly. This is because I may not understand the language at all.
> PyUnit tests are fine, something similar to
>
> def testSpell(self):
>     self.failUnless(self.voikko.spell(u"määrä"))
>     self.failIf(self.voikko.spell(u"määä"))
>
> in python/libvoikkotests.py within libvoikko sources is enough.
>
> Everything mentioned above, including all tools needed for building the
> morphology should be in such shape that it would be possible to build Debian
> packages for them and distribute them legally as free software. Unstable APIs
> are not a problem at this point unless the changes are happening daily.
I suggest North Sámi and Lule Sámi.
They can be downloaded from
https://victorio.uit.no/langtech/trunk/gt
They compile with HFST, are available under the GPL, and are actively
maintained.
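
A test in the style requested above might look roughly like the sketch
below. It is only an illustration: StubSpeller is a hypothetical stand-in
for a libvoikko object set up for North Sámi (the real initialisation is
not shown here), run_tests is a helper made up for this example, and
"giellla" is simply "giella" ("language") with a deliberate typo.

```python
import unittest


class StubSpeller:
    """Hypothetical stand-in for a libvoikko instance; not the real API."""

    def __init__(self, known_words):
        self._known = set(known_words)

    def spell(self, word):
        # Accept a word only if it appears in the known-word list.
        return word in self._known


class NorthSamiSpellTest(unittest.TestCase):
    def setUp(self):
        # Assumption: swap this for a real libvoikko object for North Sámi.
        self.voikko = StubSpeller(["giella"])

    def testSpell(self):
        self.assertTrue(self.voikko.spell("giella"))
        self.assertFalse(self.voikko.spell("giellla"))


def run_tests():
    # Load and run the single test case, returning the result object.
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NorthSamiSpellTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() runs the one test case. With a real analyzer in place
of the stub, the same two assertions would check that basic spelling works
without the maintainer needing to know the language.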
Fran