[libvoikko] different output via voikkogc and libreoffice-voikko

Francis Tyers ftyers at prompsit.com
Tue Oct 1 19:00:26 EEST 2013


On Tue, 1 Oct 2013 at 17:37 +0300, Harri Pitkänen wrote:
> On Tuesday 01 October 2013 13:54:46 Francis Tyers wrote:
> > As far as I'm concerned, it should be ok. There are the following
> > caveats:
> > 
> > * The code is pretty ropey at the moment. It works, but no-one apart
> > from me has tested it as working. There is some stuff hardcoded there
> > (in the V4DictionaryLoader) that we are likely to want to change.
> 
> No problem. You can always change this and other stuff that is within 
> VISLCG3_CPP_FILES without any risk of breaking other parts of the library.
> 
> > * There is a lot of debugging stuff in there that you might want to
> > remove during the merge.
> > 
> > * The grammar checking now depends on the latest SVN of hfst-ospell to
> > work. This is because previous versions did not have a lookup() method.
> 
> I think this is fine.
> 
> > There is one show stopper:
> > 
> > * Grammar checking for Finnish is currently broken. I should fix this.
> > 
> > So, I think the best thing would be for me to fix Finnish, and then do
> > the merge. Feel free to take a look at the code and give me suggestions,
> > my code has rarely been described as elegant.
> 
> Sounds good. The code looks fine too. I'll have another look when it is time 
> to merge it.

The Finnish is kind of fixed now. It doesn't segfault at least, and the
negative verb check rule works. \o/

http://bpaste.net/show/9PoJJfBci7xwGAySIwQu/

I have a problem though. A lot of the checks in FinnishRuleEngine/ rely
on options from voikko_options_t. The new abstraction doesn't give them
access to it; they only get a GcCache to work with. IMHO these variables:

	int ignore_dot;
	int ignore_numbers;
	int ignore_uppercase;
	int ignore_nonwords;
	int accept_first_uppercase;
	int accept_all_uppercase;
	int accept_extra_hyphens;
	int accept_missing_hyphens;
	int accept_titles_in_gc;
	int accept_unfinished_paragraphs_in_gc;
	int accept_bulleted_lists_in_gc;

should be put into the GrammarChecker class. I'm not quite sure whether
this will break other stuff, but the main thing is that I think it would
be good to keep them closer to the code that uses them.
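
To make the idea concrete, here is a rough sketch of what I mean, not
actual libvoikko code. The GcOptions struct, the options member and the
shouldFlagMissingTerminator() check are all made-up names for
illustration; only the option names come from the list above, and I
typed them as bool instead of int just for the example.

```cpp
#include <string>

// Hypothetical grouping of the grammar-checking options. Only three of
// the fields from voikko_options_t are shown; the rest would follow suit.
struct GcOptions {
    bool accept_titles_in_gc = false;
    bool accept_unfinished_paragraphs_in_gc = false;
    bool accept_bulleted_lists_in_gc = false;
};

// Stand-in for the real GrammarChecker class: it owns the options, so
// a rule engine that is handed the checker can read them.
class GrammarChecker {
public:
    GcOptions options;
};

// Illustrative check (not a real FinnishRuleEngine rule): flag a
// paragraph that does not end in terminating punctuation, unless the
// corresponding option says unfinished paragraphs are acceptable.
bool shouldFlagMissingTerminator(const GrammarChecker & checker,
                                 const std::string & paragraph) {
    if (paragraph.empty()) {
        return false; // nothing to check
    }
    if (checker.options.accept_unfinished_paragraphs_in_gc) {
        return false; // user asked us not to complain
    }
    const char last = paragraph.back();
    return last != '.' && last != '!' && last != '?';
}
```

With something like this, a check would only need a reference to the
GrammarChecker (or to its options) rather than to the whole
voikko_options_t.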

The other thing, which I left out of my last email, is that the
abstraction of the grammar::Analysis class is currently not working, so
in cache.cpp I have to instantiate HfstAnalysis or FinnishAnalysis
directly. I'll work on this.
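
One possible shape for the fix, as a sketch only: a small factory that
hides the concrete class behind the base type. The classes below are
simplified stand-ins, not the real declarations, and makeAnalysis(),
name() and the "fi" language check are assumptions made up for the
example.

```cpp
#include <memory>
#include <string>

// Simplified stand-in for the grammar::Analysis base class.
class Analysis {
public:
    virtual ~Analysis() = default;
    // Hypothetical method, only here so the example has something to call.
    virtual std::string name() const = 0;
};

class FinnishAnalysis : public Analysis {
public:
    std::string name() const override { return "FinnishAnalysis"; }
};

class HfstAnalysis : public Analysis {
public:
    std::string name() const override { return "HfstAnalysis"; }
};

// Hypothetical factory: the caller (e.g. the cache code) asks for an
// Analysis and never names the concrete class itself.
std::unique_ptr<Analysis> makeAnalysis(const std::string & language) {
    if (language == "fi") {
        return std::unique_ptr<Analysis>(new FinnishAnalysis());
    }
    // Assumption for this sketch: everything else goes through HFST.
    return std::unique_ptr<Analysis>(new HfstAnalysis());
}
```

The caller would then hold a std::unique_ptr<Analysis> and the choice
between the two backends would live in one place.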

Fran



