[libvoikko] HFST backend performance observations
Harri Pitkänen
hatapitk at iki.fi
Thu Sep 29 19:16:41 EEST 2011
To see if there is something that could be done to improve the speed of
spell checking I ran a list of 1000 words through voikkospell with Finnish
HFST backend and took a callgrind profile from the session. The call graph
is here:
http://www.puimula.org/htp/tmp/voikko-hfst-fi.pdf
You can see that the main program has two major branches: initialization
and actual spell checking (spelling suggestions were not generated in this
run). Initialization (voikkoInit) is called once and it has total
inclusive cost of 897 million instuctions (MI). Actual spell checking
happens using function handleWordSingleThread and it is naturally called
1000 times (once for each word). Inclusive total cost of these calls is
388 MI. Total instruction count of the whole run was 1287 MI (the
inclusive cost of the main program). So that's how you read the graph.
Now we can use the graph to look for things that seem to have higher than
expected cost and concentrate on fixing them. What immediately jumps out
is the cost of "operator new" near the bottom of the graph. Its inclusive
cost is 606 MI which is about half of the entire cost of the program.
Based on this we can say that voikkospell/HFST is spending significant
part of its running time doing memory management. Looking at the inward
arrows we see that allocations are done both during the initialization and
during spell checking.
- Initialization of course needs memory, but would it be possible to
allocate it in larger chunks? I have not read the HFST code very closely
but I would assume that many of the basic data structures could be
allocated in larger arrays instead of doing 2.5 million individual
allocations as it happens now. This might even save some memory.
- There are about one million allocations made during spell checking. That
is 1000 allocations for each word. This seems to be way too much since the
only things for which memory needs to be allocated during lookup should be
output strings and some sort of configuration history for backtracking (if
depth-first search is used). I have not read the code yet to see how the
lookup actually works. I have however written such mostly-allocation-free
lookup code once so I know it is doable:
http://www.puimula.org/htp/tmp/TransducerRunner.cpp
That file is part of a program I wrote for my master's thesis. As I
understand the only difference to HFST is that we need to handle flag
diacritics instead of the register operations used in my sample code but
that hopefully should not be very large difference.
I will now need to go offline for a week or so. But this starts to look
very promising...
Harri
More information about the Libvoikko
mailing list