[libvoikko] Status of North Sámi (SME) hfst speller

Harri Pitkänen hatapitk at iki.fi
Thu Sep 29 14:53:28 EEST 2011


> I would assume that the difference VmRSS-VmData should be about the same
> for all HFST spellers and VmData might be directly proportional to the
> size of the zhfst file. Finnish zhfst is 5.5 MB so if SME zhfst is 3.0 MB
> I could estimate that for HFST/SME the numbers would be
>
> VmData: 80000 kB
> VmRSS:  70000 kB

I downloaded the actual SME speller, and the measured numbers are
slightly better than that estimate:

VmRSS:     62960 kB
VmData:    66240 kB

Looks pretty good. If the error model could be improved and optimized a
bit, and if someone finds a way to make the HFST lookup slightly faster,
it seems (to me at least) that the major issues would be solved.
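
For anyone who wants to repeat the comparison, here is a minimal sketch. It
assumes the figures come from /proc/<pid>/status on Linux, which is where
fields named VmRSS and VmData appear. The helper names parse_vm_fields and
read_vm_fields are hypothetical and are not part of libvoikko or HFST. The
sample text just repeats the measured values above, and the estimate is the
one from the quoted message.

```python
def parse_vm_fields(status_text, fields=("VmRSS", "VmData")):
    """Return {field: value in kB} for the requested /proc status fields."""
    result = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in fields:
            # Lines look like "VmRSS:     62960 kB"; keep the number only.
            result[name] = int(rest.split()[0])
    return result


def read_vm_fields(pid="self"):
    """Read the fields for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_fields(f.read())


# Measured values from this message, in /proc status layout.
SAMPLE = "VmRSS:     62960 kB\nVmData:    66240 kB\n"
measured = parse_vm_fields(SAMPLE)

# Estimate from the quoted message.
estimated = {"VmData": 80000, "VmRSS": 70000}

for name in ("VmRSS", "VmData"):
    saved = estimated[name] - measured[name]
    print(f"{name}: {measured[name]} kB measured, {saved} kB below the estimate")
```

To check a running process that has the speller loaded, call
read_vm_fields(pid) with that process id instead of parsing SAMPLE.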

Harri



