[libvoikko] How to contribute to create Finnish VFST dictionary
Harri Pitkänen
hatapitk at iki.fi
Mon Sep 26 19:43:07 EEST 2016
Hi!
Taito Horiuchi wrote on 2016-09-23 12:28:
> I could not find the instruction how to contribute to create Finnish
> VFST dictionary.
> I could not even find where it is maintained.
The Finnish VFST dictionary is called voikko-fi, and it is part of the
corevoikko repository at
https://github.com/voikko/corevoikko
> Can somebody tell me:
>
> 1) How to create your own dictionary.
Do you intend to create your own Finnish dictionary by extending
voikko-fi? If so, you can start here:
https://github.com/voikko/corevoikko/tree/master/voikko-fi
Build the dictionary by running "make vvfst" and install it locally by
running (for example) "make vvfst-install DESTDIR=~/.voikko". Once you
have that working, you can start making your modifications. Most of the
vocabulary is in "vocabulary/joukahainen.xml", and the rest of the
morphology is built from files under the "vvfst" subdirectory.
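
As a rough sketch, the build and install steps above could be collected
into one shell function. The function name build_voikko_fi and its
optional argument are hypothetical; the repository URL, the make targets
and the ~/.voikko example come from this message. It assumes git, make
and the build tools that voikko-fi needs are already installed.

```shell
# Sketch only: nothing runs until the function is called.
# The function name and its argument are made up for illustration.
build_voikko_fi() (
    # The body is a subshell, so "cd" does not change the caller's directory.
    dest="${1:-$HOME/.voikko}"                        # install target, defaults to ~/.voikko
    git clone https://github.com/voikko/corevoikko && # fetch the repository
    cd corevoikko/voikko-fi &&                        # the Finnish dictionary sources
    make vvfst &&                                     # build the VFST dictionary
    make vvfst-install DESTDIR="$dest"                # install it locally
)
```

Calling "build_voikko_fi" with no argument installs under ~/.voikko;
pass another directory as the first argument to install elsewhere.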
Or do you want to create a dictionary for another language? In that case
you can pick any tool you like to create a finite-state morphology for
the language. The only requirement is that the transducer can be
exported to AT&T format. You can then convert it into a VFST dictionary
using the "voikkovfstc" tool.
> 2) How to contribute to create dictionary by adding new word.
Finnish dictionary (voikko-fi): The word list is maintained using the
web application "Joukahainen". The process of adding new words is
described at
http://joukahainen.puimula.org/docs/
For other languages: the tools and processes differ from language to
language. You can ask here if you are interested in a specific
language.
I hope I understood your questions correctly. If I missed something,
please let me know and I will happily provide more details.
Harri