[libvoikko] libvoikko 4.0, voikko-fi 2.0 and libreoffice-voikko 5.0

Harri Pitkänen hatapitk at iki.fi
Thu Jan 7 21:24:44 EET 2016


On Thursday 07 January 2016 16:18:21 Sjur Moshagen wrote:
> The main purpose of this is to allow users to install additional languages
> without installing multiple versions of whatever speller tool they have
> installed already. Installing multiple versions of e.g. LO-voikko is not
> working anyway, the new one will replace the old one.
> 
> Because of the above requirement, we need to define and stick to a common
> set of (platform specific) dictionary directories, which are always
> searched. End users should not have to fine-read the readme file to be able
> to install multiple languages or to find out that it is not possible in the
> first place. In this respect
> 
> https://github.com/voikko/corevoikko/issues/13
> 
> is not a solution, since we now depend on each developer to remember to set
> the search path, and do it consistently for all platforms, and identically
> by all developers. This will be error prone, and will most likely create a
> frustrating user experience for multilingual users.

So I think you are now talking about spell checkers? I agree that developers
who use libvoikko for spell checking (and perhaps grammar checking and
hyphenation) should set such a search path. But a significant number of
developers who build libvoikko from source are only interested in
morphological analysis. This may not be obvious on this list, as morphological
analysis is still practically a Finnish-only feature and I have largely failed
to get these developers to subscribe to our mailing lists. But my feeling is
that of those developing custom software with libvoikko, about 50 % or so work
on search, indexing or more exotic applications.

> While the old solution
> was not perfect, and made less and less sense (and was in some cases
> wrong), it would at least give a bare minimum of default locations you
> could trust. Now there is none.

Many developers have been confused by the fact that /usr/lib/voikko on many
Linux systems contains a Finnish dictionary that only partially supports
morphological analysis. But if I asked them to remove it, that would break
their desktop applications. So I still argue that no single trusted location
exists, because there are use cases that just cannot be served by any single
dictionary.

So while I'm not yet convinced that we should go back to the old behavior, I
agree that every developer should be guided to think about this and set this
option to suit their use case. How about the following:

- If --with-dictionary-path has been specified, we show the value at the end of
the configure script output:

Libvoikko was configured with the following options
  * VFST support:                   yes
  *   Experimental VFST features:   no
  * HFST support:                   yes
  * Malaga support:                 no
  * Experimental VISLCG3 support:   no
  * Experimental Lttoolbox support: no
  * Morphology compilers:           yes
  * Simple client programs:         yes
  * Fallback dictionary path:       /the/path:/that/was/specified

- If --with-dictionary-path has not been specified, we show the following:

Libvoikko was configured with the following options
  * VFST support:                   yes
  *   Experimental VFST features:   no
  * HFST support:                   yes
  * Malaga support:                 no
  * Experimental VISLCG3 support:   no
  * Experimental Lttoolbox support: no
  * Morphology compilers:           yes
  * Simple client programs:         yes
  * Fallback dictionary path:       (none)
  NOTE!  As of libvoikko 4.0 NO HARDCODED FALLBACK DICTIONARY PATH
  NOTE!  IS USED unless one is specified using --with-dictionary-path.
  NOTE!  If you intend to use libvoikko for writing aids in
  NOTE!  end user applications we strongly suggest adding the following:
  NOTE!  OS X: --with-dictionary-path=/usr/lib/voikko:/Library/Spelling/voikko
  NOTE!  Linux: --with-dictionary-path=/usr/lib/voikko
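
The logic behind the last lines of the two summaries could be sketched roughly
as below. This is only an illustration in Python, not the real configure
script (which is shell/autoconf). The function name fallback_summary and the
column width are assumptions made for the example; the message texts are
copied from the proposal above.

```python
from typing import List, Optional

# Warning text copied from the proposed configure output.
NOTE_LINES = [
    "As of libvoikko 4.0 NO HARDCODED FALLBACK DICTIONARY PATH",
    "IS USED unless one is specified using --with-dictionary-path.",
    "If you intend to use libvoikko for writing aids in",
    "end user applications we strongly suggest adding the following:",
    "OS X: --with-dictionary-path=/usr/lib/voikko:/Library/Spelling/voikko",
    "Linux: --with-dictionary-path=/usr/lib/voikko",
]


def fallback_summary(dictionary_path: Optional[str]) -> List[str]:
    """Return the final summary lines for a given --with-dictionary-path value.

    Hypothetical helper: None or "" means the option was not specified.
    """
    # Pad the label so the value lines up with the other summary rows.
    label = "  * Fallback dictionary path:".ljust(36)
    if dictionary_path:
        return [label + dictionary_path]
    # No path given: show "(none)" followed by the warning block.
    return [label + "(none)"] + ["  NOTE!  " + line for line in NOTE_LINES]


if __name__ == "__main__":
    print("\n".join(fallback_summary("/the/path:/that/was/specified")))
    print("\n".join(fallback_summary(None)))
```

The point of the sketch is that the warning is printed only when no path was
given, so developers who already made a choice are not nagged.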

> I suggest that some default locations are reinstated in libvoikko, with the
> following minimum values:
> 
> Linux:
> - ~/.voikko
> <please suggest more, I have too little knowledge>
> 
> OSX:
> - ~/Library/Spelling/voikko
> - /Library/Spelling/voikko
> - <whatever is suggested for Linux>
> 
> Windows:
> - %APPDATA%\voikko
> - …possibly more

Do note that the Windows registry stuff and the Linux / OS X directories under
the user's home directory are not controlled by this option. So they will work
as before regardless of what you set here. Only --disable-external-dicts will
disable all of these. These are less problematic than stuff under /usr or
/Library because nothing goes to these locations without the user actively
approving it.
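
To make the distinction concrete, here is a rough Python sketch of how the
per-user location and the configured fallback path could combine into one
search list. It is not libvoikko's actual lookup code. The function name
candidate_dirs, the use of ~/.voikko as the per-user directory, and the search
order are assumptions for illustration, and the Windows registry lookup is left
out. The sketch also assumes that --disable-external-dicts drops the fallback
path as well, which is not stated above.

```python
import posixpath
from typing import List


def candidate_dirs(home: str, fallback_path: str = "",
                   external_dicts: bool = True) -> List[str]:
    """Return dictionary directories to search, in order (hypothetical helper).

    home           -- the user's home directory
    fallback_path  -- colon-separated value of --with-dictionary-path, "" if unset
    external_dicts -- False models a build made with --disable-external-dicts
    """
    if not external_dicts:
        # Assumption: nothing outside an explicitly given path is searched.
        return []
    # The per-user directory is always included; the option does not control it.
    dirs = [posixpath.join(home, ".voikko")]
    # Fallback entries are added only if the builder specified them.
    dirs += [d for d in fallback_path.split(":") if d]
    return dirs


if __name__ == "__main__":
    print(candidate_dirs("/home/user"))
    print(candidate_dirs("/home/user", "/usr/lib/voikko:/Library/Spelling/voikko"))
```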

So the instructions I wrote in my sample configure message above would, if
followed, lead to the same configuration we had as the default in 3.8. But it
is indeed worth considering whether we should suggest adding /usr/share/voikko
on Linux and perhaps something more?

> Given something like this, and given that we can always assume these to be
> searched, it will be much easier to provide a decent user experience for
> our multilingual users.

I hope that the above will ensure that the user experience remains the same as
before while the developer experience improves.

Harri

