[libvoikko] Relation between spelling and grammar checkers
sjurnm at mac.com
Sun Sep 22 22:20:36 EEST 2013
On 22 Sep 2013, at 19:00, Francis Tyers <ftyers at prompsit.com> wrote:
>>> The question then is, should the spell
>>> checker be the morphological analyser used as input to the grammar
>>> checker, or should it be another file? So far we have:
>>> 1) Descriptive morphological analyser
>>> 2) Disambiguation file
>>> 3) Grammar checker rule file
>>> 4) Suggestion file
>>> 5) Some kind of index/manifest
>>> Should we add 'normative acceptor' to that?
>> I have no real opinion on this. Might be reasonable unless the total size of
>> the package grows too large.
> I think the transducers are around 6-7M each, so it probably wouldn't be
> too much, but let's see what Sjur says…
Even though a spell checker FST included in a grammar checker package can be a bit sloppier than a regular spell checker FST (leaving some error detection cases to the CG rules), it still needs to be normative in a broad sense. This is in contrast to the morphological analyser used as input to the grammar checker: it needs to be *descriptive* in the broadest possible sense, covering all sorts of out-of-norm dialectal forms and common misspellings, so that one can arrive at a correct morphological (and thus syntactic) analysis even when a word is misspelled.
The exact interaction between the two (speller and GC) still needs to be worked out, but for now I would say we need both FSTs.
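To make the distinction concrete, here is a rough toy sketch in Python. It is not libvoikko or HFST code: the two FSTs are faked with a plain set and a dict, and every word form, tag, and function name below is invented for illustration. It only shows why the two resources answer different questions about the same token.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for the normative acceptor (speller FST):
# it accepts only forms that follow the written norm.
NORMATIVE: Set[str] = {"walk", "walks", "walked"}

# Hypothetical stand-in for the descriptive analyser (GC input FST):
# it also analyses out-of-norm forms, marking them with an invented error tag.
DESCRIPTIVE: Dict[str, List[str]] = {
    "walk": ["walk+V+Inf"],
    "walks": ["walk+V+Prs+Sg3"],
    "walked": ["walk+V+Pst"],
    "walkd": ["walk+V+Pst+Err"],  # common misspelling, still analysable
}


def spell_ok(token: str) -> bool:
    """Speller question: is this form acceptable under the norm?"""
    return token in NORMATIVE


def analyse(token: str) -> List[str]:
    """Analyser question: what could this form be, norm or not?"""
    return DESCRIPTIVE.get(token, [])


def check(tokens: List[str]) -> List[Tuple[str, bool, List[str]]]:
    """Pair each token with the speller verdict and its analyses."""
    return [(tok, spell_ok(tok), analyse(tok)) for tok in tokens]
```

With this toy data, `check(["walkd"])` flags the token as a spelling error but still hands the grammar checker a past-tense verb reading to work with, which is the reason for keeping both files in the package.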