[libvoikko] Test cases for libvoikko/HFST needed

Francis Tyers ftyers at prompsit.com
Sun Jan 17 19:35:18 EET 2010

On Sun, 17 Jan 2010 at 19:19 +0200, Harri Pitkänen wrote:
> Development of libvoikko 3.0 will start soon. Support for languages other than 
> Finnish is one of the goals for this release but there are others as well, 
> such as better multithreading and an extended hyphenator API.
> In order to make sure that the work needed for supporting multiple languages 
> is not prioritised too highly, I would like to have actual dictionaries available 
> for at least two languages. Finnish is of course already supported, so just 
> one other language would be enough. What I need is
> - Morphology licensed fully under a free license. Quality does not matter 
> much, but it should be actively maintained and either usable or at least 
> slowly becoming usable.
> - Analyzer implementation for libvoikko. I guess the current HfstAnalyzer will 
> work with small changes. A separate speller implementation can be provided (or 
> HfstSpeller can be used).
> - A few automated test cases for ensuring that at least basic spell checking 
> works correctly. This is because I may not understand the language at all. 
> PyUnit tests are fine, something similar to
>         def testSpell(self):
>                 self.failUnless(self.voikko.spell(u"määrä"))
>                 self.failIf(self.voikko.spell(u"määä"))
> in python/libvoikkotests.py within libvoikko sources is enough.
> Everything mentioned above, including all tools needed for building the 
> morphology, should be in such shape that it would be possible to build Debian 
> packages for them and distribute them legally as free software. Unstable APIs 
> are not a problem at this point unless the changes are happening daily.

I suggest North Sámi and Lule Sámi. 
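
For illustration only, a test in the style quoted above might look roughly like the sketch below for North Sámi. StubSpeller is a hypothetical stand-in so that the example runs without libvoikko or an HFST dictionary installed; it is not the libvoikko API. The word "giella" ("language") and the misspelling "giellla" are assumed sample inputs, and a real test would call the actual voikko handle instead.

```python
import unittest


class StubSpeller:
    """Hypothetical stand-in for a libvoikko handle; not the real API."""

    def __init__(self, known_words):
        self._known = set(known_words)

    def spell(self, word):
        # Accept a word only if it appears in the toy word list.
        return word in self._known


class NorthSamiSpellTest(unittest.TestCase):
    def setUp(self):
        # Assumption: the real dictionary would accept "giella".
        self.voikko = StubSpeller(["giella"])

    def testSpell(self):
        # One known-good word and one obvious misspelling, as in the
        # Finnish example quoted above.
        self.assertTrue(self.voikko.spell("giella"))
        self.assertFalse(self.voikko.spell("giellla"))
```

Saved as a module, this can be run with python -m unittest. Replacing StubSpeller with the real speller would turn it into the kind of basic check requested.
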

They can be downloaded from


, compile with HFST, are available under the GPL, and are actively

