[libvoikko] Relation between spelling and grammar checkers

Francis Tyers ftyers at prompsit.com
Sun Sep 22 19:00:53 EEST 2013


On Sun, 22 Sep 2013 at 18:58 +0300, Harri Pitkänen wrote:
> On Sunday 22 September 2013 08:35:18 Francis Tyers wrote:
> > > 19. sep. 2013 kl. 21:55 skrev Harri Pitkänen <hatapitk at iki.fi>:
> > > > 2) We can also require that all grammar checkers must also provide a
> > > >    spell checker. This will simplify the logic: format 4 would always
> > > >    hide a format 3 dictionary if they have the same language tag.
> > > >    * The variant issue would still be present. We might have a spell
> > > >      checker in format 3 for some variant and only standard dictionary
> > > >      in format 4.
> > I'm in favour of (2) as well.
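
To make the hiding rule in (2) concrete, here is a minimal sketch of how I read it. This is not libvoikko code: the Dictionary record, the visible_dictionaries function and the "xx" tags are all made up for illustration, and I am assuming the comparison is made on the full language tag including any variant part. Under that assumption a format 3 variant speller stays visible next to a format 4 standard dictionary, which is the variant issue mentioned in the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dictionary:
    # Hypothetical record; real libvoikko dictionaries carry more fields.
    language_tag: str    # assumed to be the full tag, variant included
    format_version: int  # 3 = spell checker only, 4 = speller + grammar checker


def visible_dictionaries(found):
    """Drop every format 3 dictionary whose tag also exists in format 4."""
    v4_tags = {d.language_tag for d in found if d.format_version == 4}
    return [
        d for d in found
        if not (d.format_version == 3 and d.language_tag in v4_tags)
    ]


found = [
    Dictionary("xx", 3),            # hidden by the format 4 entry below
    Dictionary("xx", 4),
    Dictionary("xx-x-variant", 3),  # no format 4 counterpart, stays visible
]
visible = visible_dictionaries(found)
```

If the comparison were made on the bare language code instead, the variant speller would be hidden as well, so the choice of key matters.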
> 
> It seems like all votes were for (2). This would allow splitting the current 
> (really messy) DictionaryLoader cleanly into format version specific, 
> independent loaders:
> 
>  DictionaryLoader (the base class containing some common helper methods)
>   -> V2DictionaryLoader (current implementation for Finnish)
>   -> V3DictionaryLoader (zhfst spellers)
>   -> V4DictionaryLoader (new spelling and grammar checkers)
> 
> Unless you have already started to implement the loader (there are no changes 
> towards that in Git) I could do the refactoring early next week. Would that be 
> OK? Then it should be much easier for you to implement your loader.

I haven't started yet, no, and yes that would be a great help, thanks :)
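
Just to check that I understand the proposed split, here is a rough sketch of the shape. It is in Python only for brevity (libvoikko itself is C++). The class names are the ones from your outline; everything else (normalise_tag, load, the LOADERS table, load_dictionary and the returned dict) is hypothetical and only there to show one independent loader per format version behind a common base class.

```python
from abc import ABC, abstractmethod


class DictionaryLoader(ABC):
    """Base class containing common helper methods."""

    format_version = None
    kind = None

    @staticmethod
    def normalise_tag(tag):
        # Hypothetical shared helper: trim and lower-case a language tag.
        return tag.strip().lower()

    @abstractmethod
    def load(self, path, tag):
        """Return a description of the dictionary stored at path."""

    def _describe(self, path, tag):
        # Shared by all subclasses so each loader stays small.
        return {
            "format": self.format_version,
            "kind": self.kind,
            "tag": self.normalise_tag(tag),
            "path": path,
        }


class V2DictionaryLoader(DictionaryLoader):
    format_version = 2
    kind = "current implementation for Finnish"

    def load(self, path, tag):
        return self._describe(path, tag)


class V3DictionaryLoader(DictionaryLoader):
    format_version = 3
    kind = "zhfst speller"

    def load(self, path, tag):
        return self._describe(path, tag)


class V4DictionaryLoader(DictionaryLoader):
    format_version = 4
    kind = "spelling and grammar checker"

    def load(self, path, tag):
        return self._describe(path, tag)


# Hypothetical dispatch table: one loader class per format version.
LOADERS = {
    cls.format_version: cls
    for cls in (V2DictionaryLoader, V3DictionaryLoader, V4DictionaryLoader)
}


def load_dictionary(format_version, path, tag):
    try:
        loader = LOADERS[format_version]()
    except KeyError:
        raise ValueError(f"unsupported dictionary format: {format_version}")
    return loader.load(path, tag)
```

With that layout the format 4 loader can be written without touching the other two, which I take to be the point of the refactoring.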

> > The question then is, should the spell
> > checker be the morphological analyser used as input to the grammar
> > checker, or should it be another file? So far we have:
> > 
> > 1) Descriptive morphological analyser
> > 2) Disambiguation file
> > 3) Grammar checker rule file
> > 4) Suggestion file
> > 5) Some kind of index/manifest
> > 
> > Should we add a 'normative acceptor' to that?
> 
> I have no real opinion on this. Might be reasonable unless the total size of 
> the package grows too large.

I think the transducers are around 6-7 MB each, so it probably wouldn't be
too much, but let's see what Sjur says...

F.



