[libvoikko] libvoikko 4.0, voikko-fi 2.0 and libreoffice-voikko 5.0

Sjur Moshagen sjurnm at mac.com
Thu Jan 7 22:48:18 EET 2016


On 7 Jan 2016, at 21:24, Harri Pitkänen <hatapitk at iki.fi> wrote:
> 
> So I think you are now talking about spell checkers?

Yes, or more generally proofing tools.

> I agree that developers 
> who are using libvoikko for spell checking (and perhaps grammar checking and 
> hyphenation) should do this. But a significant number of developers who build 
> libvoikko from source are only interested in morphological analysis. This may 
> not be obvious on this list as morphological analysis is still practically a 
> Finnish only feature and I have largely failed to get these developers to 
> subscribe to our mailing lists. But my feeling is that of those developing 
> custom software using libvoikko, about 50 % or so work on search, indexing or 
> more exotic applications.

That had indeed not occurred to me :) I have always known that morphological analysis was possible, but I have only used the proofing tools part of the tool set myself, so I never considered other usage scenarios :/

> How about the following:
> 
> - If --with-dictionary-path has been specified we show the value at the end of 
> configuration script:
> 
> Libvoikko was configured with the following options
>  * VFST support:                   yes
>  *   Experimental VFST features:   no
>  * HFST support:                   yes
>  * Malaga support:                 no
>  * Experimental VISLCG3 support:   no
>  * Experimental Lttoolbox support: no
>  * Morphology compilers:           yes
>  * Simple client programs:         yes
>  * Fallback dictionary path:       /the/path:/that/was/specified
> 
> - If --with-dictionary-path has not been specified we show the following:
> 
> Libvoikko was configured with the following options
>  * VFST support:                   yes
>  *   Experimental VFST features:   no
>  * HFST support:                   yes
>  * Malaga support:                 no
>  * Experimental VISLCG3 support:   no
>  * Experimental Lttoolbox support: no
>  * Morphology compilers:           yes
>  * Simple client programs:         yes
>  * Fallback dictionary path:       (none)
>  NOTE!  As of libvoikko 4.0 NO HARDCODED FALLBACK DICTIONARY PATH
>  NOTE!  IS USED unless one is specified using --with-dictionary-path.
>  NOTE!  If you intend to use libvoikko for writing aids in
>  NOTE!  end user applications we strongly suggest adding the following:
>  NOTE!  OS X: --with-dictionary-path=/usr/lib/voikko:/Library/Spelling/voikko
>  NOTE!  Linux: --with-dictionary-path=/usr/lib/voikko

That is good. It would be even better if the command line tools could be enhanced with an option to query this value, e.g. something like:

$ voikkospell --dict-path
/the/path:/that/was/specified

And combining --verbose with -l would list the dictionary files with their full paths:

$ voikkospell -l --verbose
smn-x-standard (/path/to/dict.file): Giellatekno/Divvun/UiT fst-based speller for Inari Sami

or something like that.

This way it is always possible to get a full list of the directories searched, and it will also help developers and linguists in cases of dictionary conflicts (several dictionaries for the same language, a surprisingly common source of errors :)
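
To make the conflict case concrete, here is a minimal Python sketch. It is not anything libvoikko or voikkospell provides. It assumes the colon-separated path format from the example above and a hypothetical list of (language tag, file path) pairs, such as a verbose listing might yield. The function names and the file paths are made up for illustration.

```python
from collections import defaultdict


def split_dictionary_path(value):
    # Split a colon-separated list of directories, skipping empty entries.
    return [part for part in value.split(":") if part]


def find_conflicts(entries):
    # Group hypothetical (language_tag, file_path) pairs by tag and keep
    # only the tags that are provided by more than one dictionary file.
    by_tag = defaultdict(list)
    for tag, path in entries:
        by_tag[tag].append(path)
    return {tag: paths for tag, paths in by_tag.items() if len(paths) > 1}


if __name__ == "__main__":
    print(split_dictionary_path("/usr/lib/voikko:/Library/Spelling/voikko"))
    # Illustrative entries only; these files are not real.
    listing = [
        ("smn-x-standard", "/usr/lib/voikko/a/dict.file"),
        ("smn-x-standard", "/Library/Spelling/voikko/b/dict.file"),
        ("fi-x-standard", "/usr/lib/voikko/c/dict.file"),
    ]
    print(find_conflicts(listing))
```

With full paths in the listing, a check like this would show at a glance which directories contribute competing dictionaries for one language.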

>> I suggest that some default locations are reinstated in libvoikko, with the
>> following minimum values:
>> 
>> Linux:
>> - ~/.voikko
>> <please suggest more, I have too little knowledge>
>> 
>> OSX:
>> - ~/Library/Spelling/voikko
>> - /Library/Spelling/voikko
>> - <whatever is suggested for Linux>
>> 
>> Windows:
>> - %APPDATA%\voikko
>> - …possibly more
> 
> Do note that the Windows registry stuff and Linux / OS X directories under 
> user's home directory are not controlled by this option. So they will work as 
> before regardless of what you set here.

Are you saying that:

~/.voikko
~/Library/Spelling/voikko
%APPDATA%\voikko

are always searched (on the relevant platforms, of course)? That I didn’t know - it helps a lot in supporting multilingual users.
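
For illustration only, the per-user locations named in this thread can be sketched in Python as below. Whether libvoikko searches exactly these directories, and in which order, is an assumption here. The function name and the platform labels are hypothetical.

```python
def user_dictionary_dirs(platform, home, appdata=None):
    # Return the per-user directories mentioned in this thread for one
    # platform. The order is an assumption, not documented behaviour.
    if platform == "windows":
        # %APPDATA%\voikko; skipped when APPDATA is not known.
        return [appdata + "\\voikko"] if appdata else []
    dirs = []
    if platform == "osx":
        dirs.append(home + "/Library/Spelling/voikko")
    # ~/.voikko is listed for Linux and, by extension, for OS X.
    dirs.append(home + "/.voikko")
    return dirs


if __name__ == "__main__":
    for name in ("linux", "osx"):
        print(name, user_dictionary_dirs(name, "/home/example"))
    print("windows", user_dictionary_dirs("windows", "", "C:\\Users\\example\\AppData\\Roaming"))
```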

> Only --disable-external-dicts will 
> disable all of these. These are less problematic than stuff under /usr or 
> /Library because nothing goes to these locations without user actively 
> approving it.

Mm.

> So the instructions I wrote in my sample configure message above would, if 
> followed, lead to the same configuration we had as default in 3.8. But it is 
> indeed worth considering if we should suggest adding /usr/share/voikko on 
> Linux and perhaps something more?

On Linux and Unix-like systems in general. I have no opinion on other locations.

>> Given something like this, and given that we can always assume these to be
>> searched, it will be much easier to provide a decent user experience for
>> our multilingual users.
> 
> I hope that the above will ensure that the user experience remains the same as 
> before but developer experience will improve.

Thanks for the answer and clarifications. I agree with you - altogether this looks like a good way forward.

Sjur


