[libvoikko] hfst-ospell API
Sam Hardwick
sam.hardwick at gmail.com
Mon Aug 9 18:46:40 EEST 2010
This email is strictly about the spelling library for HFST's
optimized-lookup transducer format, so if you're not interested in it,
read no further. By the way, I thought I wrote about this to the list in
July, but later noticed that it isn't in the archive - is mail only
accepted from subscribers?
The library is not quite ready to be released yet, but if someone would
like to give input on the API in particular, that would be very welcome.
The current version is in HFST's SVN repository on SourceForge (trunk/hfst-ospell).
There's a README:
Preliminary hfst-ospell library and toy commandline tester
(for static compilation at this time)
Usage:
#include "ospell.h"
and compile your project with
ospell.cc hfst-ol.cc
The library lives in a namespace called hfst_ol. Pass (weighted!) Transducer
pointers to the Speller constructor, e.g.:
/* open the two transducer files (distinct names for the FILE handles
   and the Transducer pointers) */
FILE * error_source = fopen(error_filename, "r");
FILE * lexicon_source = fopen(lexicon_filename, "r");
hfst_ol::Transducer * error;
hfst_ol::Transducer * lexicon;
try {
    error = new hfst_ol::Transducer(error_source);
    lexicon = new hfst_ol::Transducer(lexicon_source);
} catch (hfst_ol::TransducerParsingException& e) {
    /* problem with transducer file, usually completely
       different type of file - there's no magic number
       in the header to check for this */
}
hfst_ol::Speller * speller;
try {
    /* the constructor takes the Transducer pointers themselves */
    speller = new hfst_ol::Speller(error, lexicon);
} catch (hfst_ol::AlphabetTranslationException& e) {
    /* problem with translating between the two alphabets */
}
And use the functions
// returns true if line is found in lexicon
bool hfst_ol::Speller::check(char * line);
// CorrectionQueue is a priority queue, sorted by weight
hfst_ol::CorrectionQueue hfst_ol::Speller::correct(char * line);
to communicate with it. See main.cc for a concrete usage example. It
provides a demo utility with the following help message:
--------------------------------------------------------------------------------
Usage: hfst-ospell [OPTIONS] ERRORSOURCE LEXICON
Run a composition of ERRORSOURCE and LEXICON on standard input and
print corrected output
  -h, --help      Print this help message
  -V, --version   Print version information
  -v, --verbose   Be verbose
  -q, --quiet     Don't be verbose (default)
  -s, --silent    Same as quiet
Report bugs to hfst-bugs at ling.helsinki.fi
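
Since correct() returns a priority queue sorted by weight, a caller will
typically want to pop the first few suggestions off it. The following is
only a sketch of that pattern and does not use the real library types:
Correction, HeavierThan, StandInQueue and best_corrections are hypothetical
names, the queue is a plain std::priority_queue of (suggestion, weight)
pairs, and it assumes that a lower weight means a better suggestion. The
actual CorrectionQueue in ospell.h may differ on all of these points.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one entry of hfst_ol::CorrectionQueue:
// a suggested string and its weight.
typedef std::pair<std::string, float> Correction;

// Comparator that puts the entry with the lowest weight on top of the
// queue (assumption: lower weight means a better suggestion).
struct HeavierThan {
    bool operator()(const Correction& a, const Correction& b) const {
        return a.second > b.second;
    }
};

// Hypothetical stand-in for the queue type itself.
typedef std::priority_queue<Correction,
                            std::vector<Correction>,
                            HeavierThan> StandInQueue;

// Pops at most max_suggestions entries, lowest weight first.
// The queue is taken by value, so the caller's copy is left untouched.
std::vector<std::string> best_corrections(StandInQueue queue,
                                          std::size_t max_suggestions) {
    std::vector<std::string> result;
    while (!queue.empty() && result.size() < max_suggestions) {
        result.push_back(queue.top().first);
        queue.pop();
    }
    return result;
}
```

With the real library, the queue would come from speller->correct(line)
rather than being filled by hand, and check(line) would normally be called
first so that corrections are only computed for words not in the lexicon.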
Sam Hardwick