[libvoikko] HFST/ooovoikko slow startup

Harri Pitkänen hatapitk at iki.fi
Thu Nov 11 20:11:36 EET 2010

On Thursday 11 November 2010, Flammie Pirinen wrote:
> I've certainly noticed that OOo unloads and loads the dictionary during
> the use occasionally, since these delays are really noticeable. During
> my testing with Finnish HFST stuff I didn't see it either, since it is
> mostly unnoticeable in current version I've used (I attached
> voikko-hfst-ospell.patch for reference, I'll commit it after
> hfst-ospell library is released?)

Ok. I made some changes to ooovoikko to slightly improve the debugging output 
related to initialization. If you take the latest source from SVN and build it 


you get a debug version of the extension. After installing it and starting OOo 
from the console, you can see the debug output on the console. Initialization of 
libvoikko happens between these two lines:

VOIKKO_DEBUG: PropertyManager::initLibvoikkoWithVariant: voikkoInit
VOIKKO_DEBUG: PropertyManager::initLibvoikko: libvoikko initalized

It appears that there is no extra loading or unloading at startup, so this is 
not where the problem is. Reloading is done on purpose when certain settings 
are changed, and this can cause delays during use, but it should not happen very 
often and especially not here.

> On slightly related story, if you want to test this specific thing I've
> mentioned already, I suppose it's ok to demonstrate it already; it's
> the greenlandic in divvun's svn
> <https://victorio.uit.no/langtech/trunk/st/kal> with the other patch I
> attached. Requires HFST 2, the optimized stuff and foma.

Thanks, looks interesting.

> Ah, I see. That's of course understandable, the way it starts up
> usually is that the normal OOo progress bar loads just nicely and takes
> some long time and after you write some few character it really
> freezes. I think if this freezing was within progress bar loading time
> it would be no problem, but anyways, the ideal solution is to fix the
> root cause of the slowdowns in our code.

I still have no clue as to why loading takes longer in OOo than in 
voikkospell. Hopefully once I get the Greenlandic speller or similar running I 
will figure this out.

One thing I noticed from your Greenlandic patch is that the configuration file 
enables all four HFST backends. In OOo you would not need the HFST morphology 
backend at all. Disabling it by setting

  Morphology-Backend: null

could help decrease the load time. I think libvoikko needs to be changed so 
that backend initialization is delayed until it is actually needed. This would 
allow enabling these rarely used backends without unnecessarily increasing the 
startup time when only the speller is needed.

Fully delayed loading does have the problem that applications generally expect 
to be notified about errors during initialization. If we delay the actual 
initialization too far it could become a problem. Perhaps the solution would 
be to only do some simple sanity checks on startup (do we have all the 
required files, do they have valid signatures etc.) and do the actual loading 
(reading the transducers into memory) when they are used for the first time.

          voikkoInit  -->
                       HfstAnalyzer::constructor --> (check if we have a valid
                                                      morphological transducer)
                                   (OK) <--
                       HfstSpeller::constructor  --> (check if we have a valid
                                                      acceptor)
                                   (OK) <--
                       [same for other backends]
                  (OK) <--

          voikkoSpellCstr -->
                                                 --> (load the acceptor)
                                                 --> (check spelling)
                                   (result) <--
                 (result) <--

          voikkoSpellCstr -->
                                                 --> (check spelling)
                                   (result) <--
                 (result) <--
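
The flow above could be sketched roughly as follows. This is only an 
illustration and not libvoikko or HFST code: the class names LazySpeller and 
Acceptor, the "ACCEPTOR1" signature and the plain word-list file format are 
all hypothetical stand-ins for a real transducer and its header. The 
constructor does only the cheap sanity check, so errors are still reported at 
initialization, and the expensive load happens on the first spell() call.

```cpp
#include <cassert>
#include <fstream>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical file signature; a real transducer has its own binary header.
static const char* const kMagic = "ACCEPTOR1";

// Stand-in for a loaded acceptor: a signature token followed by a word list.
class Acceptor {
public:
    explicit Acceptor(const std::string& path) {
        std::ifstream in(path);
        std::string token;
        if (!(in >> token) || token != kMagic)
            throw std::runtime_error("cannot load acceptor: " + path);
        while (in >> token) words_.insert(token);  // the "expensive" part
    }
    bool accepts(const std::string& word) const { return words_.count(word) > 0; }
private:
    std::set<std::string> words_;
};

// Hypothetical speller backend: cheap constructor, load on first use.
class LazySpeller {
public:
    // Sanity check only: the file must exist and start with the signature.
    explicit LazySpeller(std::string path) : path_(std::move(path)), loads_(0) {
        std::ifstream probe(path_);
        std::string signature;
        if (!(probe >> signature) || signature != kMagic)
            throw std::runtime_error("missing or invalid acceptor file: " + path_);
    }

    bool spell(const std::string& word) {
        if (!acceptor_) {                          // first call: load the acceptor
            acceptor_.reset(new Acceptor(path_));
            ++loads_;
        }
        assert(acceptor_ != nullptr);
        return acceptor_->accepts(word);           // later calls: check spelling only
    }

    // Number of times the acceptor has been loaded (0 until first spell()).
    int loadCount() const { return loads_; }

private:
    std::string path_;
    std::unique_ptr<Acceptor> acceptor_;
    int loads_;
};
```

With this shape, constructing a LazySpeller for a missing or malformed file 
throws immediately, while a valid file is read in full only when spell() is 
called for the first time. One remaining weakness is that a file which passes 
the signature check can still fail to load later, so the first spell() call 
may throw as well.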

But again, let's see first if there is a way to make the loading so fast that 
none of this actually matters.

