[libvoikko] Lttoolbox (Apertium) morphology backend

Francis Tyers ftyers at prompsit.com
Sun Feb 28 21:04:27 EET 2010


On Sun, 28 Feb 2010 at 21:40 +0200, Harri Pitkänen wrote:
> On Sunday 28 February 2010, Francis Tyers wrote:
> > > I don't know Icelandic at all and therefore can't tell whether some of
> > > the words are accepted or rejected incorrectly.
> > 
> > Nice, it looks good. Some of the capitalised words should be recognised
> > as correct, at least 'Bretlandi' and 'Norðmenn'.
> 
> I tried to fix the checking of capitalized words but started to run into 
> problems. It seems that the library API works in somewhat surprising (at least 
> to me) ways when you enter a word that starts with a capital letter and ends 
> with garbage.
> 
> The implementation is here
> http://voikko.svn.sourceforge.net/viewvc/voikko/trunk/libvoikko/src/morphology/LttoolboxAnalyzer.cpp?revision=3182&view=markup
> 
> and test cases here
> http://voikko.svn.sourceforge.net/viewvc/voikko/trunk/libvoikko/python/ApertiumIcelandicTest.py?revision=3183&view=markup
> 
> I was able to get all test cases except the one with TODO in its method name 
> implemented. How would you suggest fixing the code so that all tests would 
> pass? Of course, a patch would be most welcome :)

Hmm, strangely enough, when I try an unknown word I get similarly odd
output:

$ ./test mor.bin 
^Reykjanghfghesi$ -->
^Reykja<vblex><actv><inf>/Reykja<vblex><actv><pri><p3><pl>/Reykur<n><m><pl><gen><ind>$

It seems that in 'biltrans' mode the 'standard' sections are treated as
unconditional, i.e. it just returns the longest match in all cases, even
when the rest of the word is garbage.
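
To make the suspected behaviour concrete, here is a rough sketch in Python
using a toy lexicon. Everything in it is an assumption for illustration: the
function names are made up, the lexicon holds only the analyses quoted above,
and none of this is the lttoolbox API or its actual matching code. It shows
how an unconditional longest-prefix match accepts a word with a garbage tail,
and how a caller could reject the result when the match does not cover the
whole word.

```python
# Toy lexicon mapping a surface form to its analyses. The single entry
# reuses the analyses from the output quoted above; it is not real
# lttoolbox data.
LEXICON = {
    "Reykja": [
        "Reykja<vblex><actv><inf>",
        "Reykja<vblex><actv><pri><p3><pl>",
        "Reykur<n><m><pl><gen><ind>",
    ],
}


def longest_prefix_analyses(word, lexicon):
    """Return (matched_prefix, analyses) for the longest prefix of `word`
    found in the lexicon, ignoring whatever follows the prefix."""
    for end in range(len(word), 0, -1):
        prefix = word[:end]
        if prefix in lexicon:
            return prefix, lexicon[prefix]
    return "", []


def whole_word_analyses(word, lexicon):
    """Hypothetical guard: keep the analyses only if the longest match
    consumed the entire word."""
    prefix, analyses = longest_prefix_analyses(word, lexicon)
    return analyses if prefix == word else []


if __name__ == "__main__":
    # The garbage tail is silently dropped by the longest-prefix match.
    print(longest_prefix_analyses("Reykjanghfghesi", LEXICON))
    # The guard rejects the same word but still accepts the clean form.
    print(whole_word_analyses("Reykjanghfghesi", LEXICON))
    print(whole_word_analyses("Reykja", LEXICON))
```

Whether such a check belongs in LttoolboxAnalyzer.cpp or in the library
itself is still open; the sketch only illustrates the difference between the
two behaviours.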

I will think some more about this. 

Fran



