[libvoikko] hfst-ospell API

Sam Hardwick sam.hardwick at gmail.com
Mon Aug 9 18:46:40 EEST 2010


This email is strictly about the spelling library for HFST's
optimized-lookup transducer format, so if you're not interested in it,
read no further. By the way, I thought I wrote about this to the list in
July, but later noticed that it isn't in the archive - is mail only
accepted from subscribers?

The library is not quite ready to be released yet, but if someone would
like to give input on the API in particular, that would be very welcome.
The current version is on hfst's svn on sourceforge ( trunk/hfst-ospell ).

There's a README:


Preliminary hfst-ospell library and toy commandline tester

(for static compilation at this time)

Usage:

	#include "ospell.h"

and compile your project with

	ospell.cc hfst-ol.cc

The library lives in a namespace called hfst_ol. Pass (weighted!) Transducer
pointers to the Speller constructor, e.g.:

	FILE * error_source = fopen(error_filename, "r");
	FILE * lexicon_source = fopen(lexicon_filename, "r");
	hfst_ol::Transducer * error;
	hfst_ol::Transducer * lexicon;
	try {
		error = new hfst_ol::Transducer(error_source);
		lexicon = new hfst_ol::Transducer(lexicon_source);
	} catch (hfst_ol::TransducerParsingException& e) {
		/* problem with transducer file, usually completely
		different type of file - there's no magic number
		in the header to check for this */
	}
	hfst_ol::Speller * speller;
	try {
		speller = new hfst_ol::Speller(error, lexicon);
	} catch (hfst_ol::AlphabetTranslationException& e) {
		/* problem with translating between the two alphabets */
	}


And use the functions

	// returns true if line is found in lexicon
	bool hfst_ol::Speller::check(char * line);

	// CorrectionQueue is a priority queue, sorted by weight
	hfst_ol::CorrectionQueue hfst_ol::Speller::correct(char * line);

to communicate with it. See main.cc for a concrete usage example. It
provides a demo utility with the following help message:

--------------------------------------------------------------------------------

Usage: hfst-ospell [OPTIONS] ERRORSOURCE LEXICON
Run a composition of ERRORSOURCE and LEXICON on standard input and
print corrected output

  -h, --help                  Print this help message
  -V, --version               Print version information
  -v, --verbose               Be verbose
  -q, --quiet                 Don't be verbose (default)
  -s, --silent                Same as quiet


Report bugs to hfst-bugs at ling.helsinki.fi
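
To illustrate how a caller might consume the result of correct(), here is a
minimal sketch. It does not use the real hfst_ol types: the Suggestion pair,
the HeavierFirst comparator, the local CorrectionQueue typedef, and the
best_suggestions() and example_queue() helpers are all hypothetical stand-ins.
They assume only what the README states, namely that the queue is a priority
queue sorted by weight, plus the further assumption that the lowest weight is
the best suggestion. The words and weights are made up.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a correction: (suggested string, weight).
typedef std::pair<std::string, float> Suggestion;

// Orders the queue so that top() is the suggestion with the LOWEST weight
// (assumption: a lower weight means a better correction).
struct HeavierFirst {
    bool operator()(const Suggestion& a, const Suggestion& b) const {
        return a.second > b.second;
    }
};

// Hypothetical stand-in for hfst_ol::CorrectionQueue.
typedef std::priority_queue<Suggestion, std::vector<Suggestion>, HeavierFirst>
    CorrectionQueue;

// Pop at most `limit` suggestions, best first. The queue is taken by value,
// so the caller's copy is left untouched.
std::vector<std::string> best_suggestions(CorrectionQueue queue,
                                          std::size_t limit)
{
    std::vector<std::string> result;
    while (!queue.empty() && result.size() < limit) {
        result.push_back(queue.top().first);
        queue.pop();
    }
    return result;
}

// Made-up data standing in for what speller->correct(line) might return.
CorrectionQueue example_queue()
{
    CorrectionQueue queue;
    queue.push(Suggestion("cat", 1.0f));
    queue.push(Suggestion("cart", 2.5f));
    queue.push(Suggestion("coat", 2.0f));
    return queue;
}
```

With the real library, one would presumably call check() first and only ask
for corrections when it returns false, then drain the returned queue in the
same way.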


Sam Hardwick


