[libvoikko] Two identical suggestions in the suggestion list bug
    Sjur Moshagen 
    sjurnm at mac.com
       
    Thu Sep  3 08:26:30 EEST 2015
    
    
  
Hello,
I have found a small bug in the interaction between libvoikko and hfst: the user ends up seeing two identical suggestions in the suggestion list. It is not a major issue, but it would be nice to get it fixed.
What happens is that hfst-ospell produces suggestions that differ only in capitalisation:
$ echo Adjitt | hfst-ospell -S build/spellers/tools/spellcheckers/fstbased/hfst/se.zhfst 
"Adjitt" is NOT in the lexicon:
Corrections for "Adjitt":
Addit    14.101562
Ádjit-    15.506594
Ádjit    15.506594
ádjit    15.506594
Now, since the input had an initial upper-case letter, libvoikko changes the case of the last suggestion to follow the input. The result is two identical suggestions:
$ echo Adjitt | voikkospell -s -d se -p build/spellers/tools/spellcheckers/fstbased/hfst/
W: Adjitt
S: Addit
S: Ádjit-
S: Ádjit
S: Ádjit
It seems reasonable to suggest only words with an initial upper-case letter when the input also has one, but if two lexically different suggestions become identical due to uppercasing, the last one should be discarded.
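A minimal sketch of the behaviour I have in mind, in Python for illustration only. This is not libvoikko's actual code; the function names `match_initial_case` and `dedupe_after_casing` are hypothetical, and the case adjustment is simplified to uppercasing the first character.

```python
def match_initial_case(word, suggestion):
    """Uppercase the first letter of the suggestion if the input word starts with upper case."""
    if word[:1].isupper():
        return suggestion[:1].upper() + suggestion[1:]
    return suggestion


def dedupe_after_casing(word, suggestions):
    """Adjust the case of each suggestion, then drop any that duplicate an earlier one.

    The order of the remaining suggestions is preserved, so the
    best-ranked copy of a duplicate is the one that survives.
    """
    seen = set()
    result = []
    for suggestion in suggestions:
        adjusted = match_initial_case(word, suggestion)
        if adjusted not in seen:
            seen.add(adjusted)
            result.append(adjusted)
    return result


# The suggestions from the hfst-ospell run above, in the same order.
print(dedupe_after_casing("Adjitt", ["Addit", "Ádjit-", "Ádjit", "ádjit"]))
```

With the input above, the fourth suggestion becomes identical to the third after uppercasing and is dropped, so only three suggestions remain.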
Sjur
    
    