[libvoikko] HFST backend performance observations
Sam Hardwick
sam.hardwick at gmail.com
Fri Sep 30 10:25:17 EEST 2011
I'm resending this message because it appears not to have gotten through
- apologies if it ends up being a duplicate.
On 09/29/2011 07:16 PM, Harri Pitkänen wrote:
> - Initialization of course needs memory, but would it be possible to
> allocate it in larger chunks? I have not read the HFST code very closely
> but I would assume that many of the basic data structures could be
> allocated in larger arrays instead of doing 2.5 million individual
> allocations as it happens now. This might even save some memory.
The memory allocations are not really due to initialization, but to the
way search states are handled. At any point the speller is in a state
described by a triple (error-state, lexicon-state, lexicon-flag-state).
These triples are generated and placed on a queue, and removed again
once they have been processed. This causes a steady stream of small
allocations and deallocations, which is likely to be a bottleneck and
could (I suppose) be remedied by writing our own memory handling for
this process.
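
As a rough illustration of what "our own memory handling" could mean
here, the sketch below keeps all state triples in one contiguous vector
and recycles freed slots through a free list, so the queue only carries
slot indices. This is not HFST code: SearchState, StatePool and
run_toy_search are hypothetical names, and the field types are
assumptions made only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for the speller's state triple. Field names and
// types are assumptions for illustration, not HFST's real definitions.
struct SearchState {
    std::uint32_t error_state;
    std::uint32_t lexicon_state;
    std::uint32_t flag_state;
};

// Pool of states stored in one contiguous vector. Freed slots go onto a
// free list and are reused before the vector is allowed to grow.
class StatePool {
public:
    explicit StatePool(std::size_t reserve_hint) { slots_.reserve(reserve_hint); }

    // Stores `s` and returns its slot index, reusing a freed slot if any.
    std::size_t acquire(const SearchState& s) {
        if (!free_.empty()) {
            std::size_t i = free_.back();
            free_.pop_back();
            slots_[i] = s;
            return i;
        }
        slots_.push_back(s);
        return slots_.size() - 1;
    }

    // Marks slot `i` as reusable; the underlying memory is kept.
    void release(std::size_t i) {
        assert(i < slots_.size());
        free_.push_back(i);
    }

    const SearchState& get(std::size_t i) const { return slots_[i]; }

    // Number of slots ever created (not the number currently in use).
    std::size_t slots_allocated() const { return slots_.size(); }

private:
    std::vector<SearchState> slots_;
    std::vector<std::size_t> free_;
};

// Toy driver: processes `n` states, each producing exactly one successor.
// A real search would branch; this only demonstrates slot reuse.
std::size_t run_toy_search(StatePool& pool, std::size_t n) {
    std::deque<std::size_t> queue;
    queue.push_back(pool.acquire(SearchState{0, 0, 0}));
    std::size_t processed = 0;
    while (!queue.empty() && processed < n) {
        std::size_t i = queue.front();
        queue.pop_front();
        SearchState next = pool.get(i);  // copy before the slot is recycled
        pool.release(i);
        ++processed;
        ++next.lexicon_state;            // pretend we followed one transition
        queue.push_back(pool.acquire(next));
    }
    // Return any states still queued to the pool.
    while (!queue.empty()) {
        pool.release(queue.front());
        queue.pop_front();
    }
    return processed;
}
```

In this toy run every processed state frees its slot before its
successor is created, so the pool never grows past a single slot however
many states pass through the queue. Note that std::deque still allocates
its own storage in blocks; a fuller version would probably replace the
queue as well.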
Sam Hardwick