[libvoikko] Test cases for libvoikko/HFST needed
Harri Pitkänen
hatapitk at iki.fi
Sun Jan 17 19:19:51 EET 2010
Development of libvoikko 3.0 will start soon. Support for languages other than
Finnish is one of the goals for this release, but there are others as well,
such as better multithreading support and an extended hyphenator API.
In order to make sure that the work needed for supporting multiple languages
is not prioritized too highly, I would like to have actual dictionaries available
for at least two languages. Finnish is of course already supported, so just
one other language would be enough. What I need is:
- A morphology licensed fully under a free license. Quality does not matter
much, but it should be actively maintained and either usable or at least
slowly becoming usable.
- An analyzer implementation for libvoikko. I guess the current HfstAnalyzer will
work with small changes. A separate speller implementation can be provided (or
HfstSpeller can be used).
- A few automated test cases for ensuring that at least basic spell checking
works correctly. This is because I may not understand the language at all.
PyUnit tests are fine, something similar to
    def testSpell(self):
        self.failUnless(self.voikko.spell(u"määrä"))
        self.failIf(self.voikko.spell(u"määä"))
in python/libvoikkotests.py within libvoikko sources is enough.
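
To illustrate the kind of test meant above, here is a minimal sketch that
builds a PyUnit test case from a list of words that should be accepted and a
list that should be rejected. It is only an illustration under assumptions:
StubSpeller, make_spell_test and run_spell_test are hypothetical names and
not part of libvoikko, and the stub only stands in for a real speller object
with a spell(word) method. It uses assertTrue/assertFalse, the current names
for failUnless/failIf.

```python
import unittest


class StubSpeller:
    """Hypothetical stand-in for a real speller; NOT part of libvoikko."""

    def __init__(self, known_words):
        self._known = set(known_words)

    def spell(self, word):
        # Accept a word only if it is in the known word list.
        return word in self._known


def make_spell_test(speller, accepted, rejected):
    """Build a TestCase class checking both word lists against speller."""

    class SpellTest(unittest.TestCase):
        def testAcceptedWords(self):
            for word in accepted:
                with self.subTest(word=word):
                    self.assertTrue(speller.spell(word))

        def testRejectedWords(self):
            for word in rejected:
                with self.subTest(word=word):
                    self.assertFalse(speller.spell(word))

    return SpellTest


def run_spell_test(speller, accepted, rejected):
    """Run the generated test case and return the unittest.TestResult."""
    case = make_spell_test(speller, accepted, rejected)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With a real dictionary, the stub would be replaced by the speller object
under test, and the two word lists would come from someone who knows the
language.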
Everything mentioned above, including all tools needed for building the
morphology, should be in such shape that it would be possible to build Debian
packages for them and distribute them legally as free software. Unstable APIs
are not a problem at this point unless the changes are happening daily.
Harri