[libvoikko] HFST support to libvoikko in Debian and Ubuntu?

Sjur Moshagen sjurnm at mac.com
Wed Feb 24 14:30:28 EET 2016


> On 24 Feb 2016, at 13:59, Timo Jyrinki <timo.jyrinki at gmail.com> wrote:
> 2016-02-19 13:49 GMT+02:00 Tino Didriksen <mail at tinodidriksen.com>:
>>> As for HFST support, I can start enabling it or people can submit git
>>> requests to it later at
>>> http://anonscm.debian.org/cgit/collab-maint/libvoikko.git/ - when
>>> everything needed and useful dictionaries are in Debian.
>> Definitely need zhfst support, as both Giellatekno and Apertium produce
>> zhfst spellers.
> I have the enablement done in libvoikko packaging and it would be
> ready for Debian experimental at this point. But there are a couple of
> factors that add to complexity.
> Firstly, I'd be interested in knowing if the end goal would be to have
> zhfst spellers enabled by default for eg Northern Sami in Debian and
> Ubuntu, similar to Voikko for Finnish installation? That would be a
> worthy goal of course, since it would improve the user experience for
> users of Sami languages.

This is definitely the goal for the Divvun group at UiT; it has been a long-term goal for quite a while :) At the moment at least three Sámi languages are ready (from a linguistic point of view, at least) to be included in distributions like Debian and Ubuntu: North, Lule and South Sámi. We will want to add more languages later.

> Secondly, I need to consider Ubuntu too when it comes to enabling new
> functionality in libvoikko. It's more complicated because Voikko is
> part of the 'main' repository (Canonical supported packages) due to
> being in the default Finnish installation. That means that all direct
> dependencies and build dependencies need to be in 'main' repository
> too. If I uploaded a new libvoikko to Debian now, it couldn't be
> built in Ubuntu. And I don't want to fork libvoikko in Ubuntu to build
> without HFST.

I agree with the non-forking line of thinking.

> Are there people on the HFST side who could join Ubuntu efforts a bit
> too? In essence, it's mostly nothing because everything would come
> directly from Debian, but initially it would mean:
> - Going through MainInclusionProcess
> (https://wiki.ubuntu.com/MainInclusionProcess) for both hfst and
> hfst-ospell. Old MIR bugs can be looked at when uncertain. Both are
> relatively mature projects, and supporting people using any language
> is one of the original goals of Ubuntu so I believe there should be no
> problems completing the process given all requirements are otherwise
> met.
> - MIRing would require a Launchpad team to be created, for example
> 'hfst-team' that can be subscribed to the (theoretical) Ubuntu bugs
> against hfst or hfst-ospell. I would gladly join the team and also
> move foma bug subscription over there from the current not so optimal
> team.

I don’t know about the HFST people, but Børre and I would like to join from the Divvun group, and Børre will help out. He is already on Ubuntu.

> Thirdly, just another "would be nice to know" detail, is there a plan
> when hfst-ospell would move to Debian unstable? Otherwise it won't get
> to be available in Ubuntu.
> If there are people who could help, please start by creating a team at
> https://launchpad.net/people/+newteam and getting at least a couple of
> people joined there (me included). That would be a good start for now,
> and the MIR bugs can be created later since they won't be handled
> before Ubuntu 16.10 development opens in April anyway, and the
> hfst-ospell also gets synced to Ubuntu only at that point.


> If there are no such people, it will need more consideration. One
> option would be to think about splitting HFST support into a separate
> libvoikko plugin that would not get installed for Finnish installation
> and that could live in eg libvoikko-hfst package.

I believe we want to avoid this. We want one libvoikko package with HFST support for all platforms for which libvoikko is available. The Giella and Apertium languages together are approaching 100, and all of them are potentially speller languages. Admittedly, many of them need a lot of work before they can be released, but the point is that we have releasable languages, and continuous work on several other languages - we want as many of these languages to be releasable as soon as possible.

> I don't want to be on the Ubuntu side of HFST completely alone, and
> one reason is simply because I can't consider myself a "team" that is
> subscribed to the bugs and I'd like to fix the foma subscription to be
> better too.

I understand. I hope my answer is encouraging.

