[libvoikko] HFST/ooovoikko slow startup
Harri Pitkänen
hatapitk at iki.fi
Thu Nov 11 20:11:36 EET 2010
On Thursday 11 November 2010, Flammie Pirinen wrote:
> I've certainly noticed that OOo unloads and loads the dictionary during
> the use occasionally, since these delays are really noticeable. During
> my testing with Finnish HFST stuff I didn't see it either, since it is
> mostly unnoticeable in current version I've used (I attached
> voikko-hfst-ospell.patch for reference, I'll commit it after
> hfst-ospell library is released?)
Ok. I made some changes to ooovoikko to slightly improve the debugging output
related to initialization. If you take the latest source from SVN and build it
with
    make VOIKKO_DEBUG=FULL
you get a debug version of the extension. After installing it and starting OOo
from the console you can see the debug logging on the console. Initialization of
libvoikko happens between these two lines:
VOIKKO_DEBUG: PropertyManager::initLibvoikkoWithVariant: voikkoInit
VOIKKO_DEBUG: PropertyManager::initLibvoikko: libvoikko initalized
It appears that there is no extra loading or unloading at startup, so this is
not where the problem is. Reloading is done on purpose when certain settings
are changed, and this can cause delays during use, but it should not happen very
often and especially not here.
> On slightly related story, if you want to test this specific thing I've
> mentioned already, I suppose it's ok to demonstrate it already; it's
> the greenlandic in divvun's svn
> <https://victorio.uit.no/langtech/trunk/st/kal> with the other patch I
> attached. Requires HFST 2, the optimized stuff and foma.
Thanks, looks interesting.
> Ah, I see. That's of course understandable, the way it starts up
> usually is that the normal OOo progress bar loads just nicely and takes
> some long time and after you write some few character it really
> freezes. I think if this freezing was within progress bar loading time
> it would be no problem, but anyways, the ideal solution is to fix the
> root cause of the slowdowns in our code.
I still have no clue as to why loading takes longer in OOo than in
voikkospell. Hopefully once I get the Greenlandic speller or similar running I
will figure this out.
One thing I noticed from your Greenlandic patch is that the configuration file
enables all four HFST backends. In OOo you would not need the HFST morphology
backend at all, so disabling it by setting
    Morphology-Backend: null
could help decrease the load time. I think libvoikko needs to be changed so
that backend initialization is delayed until it is actually needed. This would
allow enabling these rarely used backends without unnecessarily increasing the
startup time when only the speller is needed.
Fully delayed loading does have the problem that applications generally expect
to be notified about errors during initialization. If we delay the actual
initialization too far it could become a problem. Perhaps the solution would
be to only do some simple sanity checks on startup (do we have all the
required files, do they have valid signatures etc.) and do the actual loading
(reading the transducers to memory) when they are used for the first time.
APPLICATION          LIBVOIKKO CORE                  LIBVOIKKO BACKEND

voikkoInit -->
                     HfstAnalyzer::constructor -->   (check if we have a valid
                                                      morphological transducer)
                                          (OK) <--
                     HfstSpeller::constructor -->    (check if we have a valid
                                                      acceptor)
                                          (OK) <--
                     [same for other backends]
           (OK) <--

voikkoSpellCstr -->
                     HfstSpeller::spell
                                               -->   (load the acceptor)
                                               -->   (check spelling)
                                      (result) <--
       (result) <--

voikkoSpellCstr -->
                     HfstSpeller::spell
                                               -->   (check spelling)
                                      (result) <--
       (result) <--
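
As a rough sketch of the flow above in C++: the constructor only does a cheap
sanity check (can the file be opened), and the data is read into memory on the
first spell() call. LazySpeller, Acceptor and the plain word-list file format
are made up for illustration; they are not the actual libvoikko or HFST
classes, and a real backend would validate the transducer header instead.

```cpp
#include <fstream>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a loaded acceptor. A real backend would hold an
// HFST transducer here; a set of words keeps the sketch self-contained.
struct Acceptor {
    std::unordered_set<std::string> words;
    bool accepts(const std::string& word) const {
        return words.count(word) != 0;
    }
};

// Hypothetical speller backend with delayed loading.
class LazySpeller {
public:
    // Startup: cheap sanity check only, so that the application still gets
    // an error from initialization if the file is missing or unreadable.
    explicit LazySpeller(const std::string& path) : path_(path) {
        std::ifstream in(path_);
        if (!in) {
            throw std::runtime_error("cannot open speller file: " + path_);
        }
    }

    // First call pays the loading cost, later calls only check spelling.
    bool spell(const std::string& word) {
        if (!acceptor_) {
            load();
        }
        return acceptor_->accepts(word);
    }

    bool isLoaded() const { return acceptor_ != nullptr; }

private:
    // The expensive part: read the whole file into memory.
    void load() {
        std::ifstream in(path_);
        if (!in) {
            throw std::runtime_error("cannot load speller file: " + path_);
        }
        auto loaded = std::make_unique<Acceptor>();
        std::string word;
        while (in >> word) {
            loaded->words.insert(word);
        }
        acceptor_ = std::move(loaded);
    }

    std::string path_;
    std::unique_ptr<Acceptor> acceptor_;
};
```

Note that this sketch is not thread safe, and that an error can still surface
from the first spell() call if the file disappears or turns out to be corrupt
after the startup check, so the caller would need some way to see that.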
But again, let's see first if there is a way to make the loading so fast that
none of this actually matters.
Harri