[libvoikko] [Apertium-stuff] Lttoolbox (Apertium) morphology backend

Francis Tyers ftyers at prompsit.com
Sun Feb 28 22:07:12 EET 2010


On Sun, 28 Feb 2010 at 21:47 +0100, Jacob Nordfalk wrote:
> 
> 
> 2010/2/28 Francis Tyers <ftyers at prompsit.com>
>         On Sun, 28 Feb 2010 at 21:18 +0100, Jacob Nordfalk wrote:
>         
>         >
>         >
>         > 2010/2/28 Francis Tyers <ftyers at prompsit.com>
>         >         On Sun, 28 Feb 2010 at 21:40 +0200, Harri Pitkänen wrote:
>         >         > On Sunday 28 February 2010, Francis Tyers wrote:
>         >         > > > I don't know Icelandic at all and therefore can't tell
>         >         > > > whether some of the words are accepted or rejected
>         >         > > > incorrectly.
>         >         > >
>         >         > > Nice, it looks good. Some of the capitalised words should
>         >         > > be recognised as correct, at least 'Bretlandi' and
>         >         > > 'Norðmenn'.
>         >         >
>         >
>         >         > I tried to fix the checking of capitalized words but started
>         >         > to run into problems. It seems that the library API works in
>         >         > somewhat surprising (at least to me) ways when you enter a
>         >         > word that starts with a capital letter and ends with garbage.
>         >         >
>         >         > The implementation is here
>         >         >
>         >
>         http://voikko.svn.sourceforge.net/viewvc/voikko/trunk/libvoikko/src/morphology/LttoolboxAnalyzer.cpp?revision=3182&view=markup
>         >         >
>         >         > and test cases here
>         >         >
>         >
>         http://voikko.svn.sourceforge.net/viewvc/voikko/trunk/libvoikko/python/ApertiumIcelandicTest.py?revision=3183&view=markup
>         >         >
>         >         > I was able to get all test cases except the one with TODO in
>         >         > the method name implemented. How would you suggest fixing
>         >         > the code so that all tests would pass? Of course a patch
>         >         > would be most welcome :)
>         >
>         >         Hmm, strangely enough, when I try an unknown word I get
>         >         similarly strange output:
>         >
>         >         $ ./test mor.bin
>         >         ^Reykjanghfghesi$ -->
>         >
>         ^Reykja<vblex><actv><inf>/Reykja<vblex><actv><pri><p3><pl>/Reykur<n><m><pl><gen><ind>$
>         >
>         >         It seems that in 'biltrans' mode, the 'standard' sections are
>         >         treated as unconditional, i.e. it just returns the longest
>         >         match in all cases.
>         >
>         >         I will think some more about this.
>         >
>         >
>         > Biltrans must actually work like this.
>         > I don't understand why you would use biltrans in an analyser.
>         
>         
>         Because biltrans takes a string, not a FILE*
>         
>         >
>         > In biltrans, partial matches are allowed. The symbols (and letters)
>         > after the match are called the queue.
>         > For example, the input symbol house<n><sg>
>         > matches the bidix entry house<n>   ->  domo<n>, and the queue is <sg>.
>         > The result is domo<n><sg>.
>         
> 
> 
> The above is the behaviour of biltransWithQueue() 
> 

Nah, that doesn't do it :/

The new method should probably just be a copy of the old one, except that
it checks whether all of the input has been consumed.
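
A rough sketch of the idea in Python, using a toy dictionary rather than
the real lttoolbox transducer. The names TOY_LEXICON, longest_prefix_match
and analyse_strict are made up for illustration and are not part of any
lttoolbox or libvoikko API. The first function mimics the "longest match
plus queue" behaviour described above; the second rejects the word unless
the queue is empty.

```python
# Toy stand-in for a compiled dictionary. This is an assumption for
# illustration only, NOT the lttoolbox data structure or API.
TOY_LEXICON = {
    "Reykja": ["Reykja<vblex><actv><inf>"],
    "house<n>": ["domo<n>"],
}


def longest_prefix_match(word, lexicon):
    """Return (analyses, queue) for the longest lexicon key that is a
    prefix of word. The queue is whatever input was left unconsumed."""
    for end in range(len(word), 0, -1):
        analyses = lexicon.get(word[:end])
        if analyses:
            return analyses, word[end:]
    # Nothing matched: no analyses, and the whole word is unconsumed.
    return [], word


def analyse_strict(word, lexicon):
    """Same lookup, but accept the result only if all input was consumed."""
    analyses, queue = longest_prefix_match(word, lexicon)
    return analyses if queue == "" else []
```

With this toy lexicon, longest_prefix_match("house<n><sg>", TOY_LEXICON)
leaves "<sg>" in the queue, while analyse_strict("Reykjanghfghesi",
TOY_LEXICON) returns an empty list because "nghfghesi" was not consumed.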

Fran




