This template class builds an n-gram database. The first template argument (string_tmpl) specifies the type of a key (string), the second template argument (value_tmpl) specifies the type of a value associated with a key, and the third template argument (ngram_generator_tmpl) customizes generation of feature sets (n-grams) from keys.
This class is inherited by writer_base, which adds the functionality of managing a master string table (list of strings).
| string_tmpl | The type of a string. | |
| value_tmpl | The value type. This is required to be an integer type. | |
| ngram_generator_tmpl | The type of an n-gram generator. |
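To make the role of the third template argument concrete, the following is a minimal sketch of a callable that turns a key into character n-grams. The names `simple_ngram_generator` and `ngrams_of` are hypothetical, and the call shape (a key plus an output iterator) is an assumption made for illustration; the generator shipped with the library may differ in both interface and padding behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for an n-gram generator; not the library's own class.
struct simple_ngram_generator {
    std::size_t n;  // n-gram length in characters

    explicit simple_ngram_generator(std::size_t n_) : n(n_) { assert(n_ > 0); }

    // Writes every contiguous substring of length n to `out`.
    // A key shorter than n yields the key itself as its only feature.
    template <class string_type, class output_iterator>
    void operator()(const string_type& key, output_iterator out) const {
        if (key.size() < n) {
            *out++ = key;
            return;
        }
        for (std::size_t i = 0; i + n <= key.size(); ++i) {
            *out++ = key.substr(i, n);
        }
    }
};

// Convenience wrapper (also hypothetical) that collects the n-grams in a vector.
inline std::vector<std::string> ngrams_of(const std::string& key, std::size_t n) {
    std::vector<std::string> result;
    simple_ngram_generator gen(n);
    gen(key, std::back_inserter(result));
    return result;
}
```

Under these assumptions, `ngrams_of("abcd", 2)` produces the three features `ab`, `bc`, and `cd`.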
Public Types | |
| typedef string_tmpl | string_type |
| The type representing a string. | |
| typedef value_tmpl | value_type |
| The type of values associated with key strings. | |
| typedef ngram_generator_tmpl | ngram_generator_type |
| The function type for generating n-grams from a key string. | |
| typedef string_type::value_type | char_type |
| The type representing a character. | |
Public Member Functions | |
| ngramdb_writer_base (const ngram_generator_type &gen) | |
| Constructs an object. | |
| virtual | ~ngramdb_writer_base () |
| Destructs an object. | |
| void | clear () |
| Clears the database. | |
| bool | empty () |
| Checks whether the database is empty. | |
| int | max_size () const |
| Returns the maximum length of keys in the n-gram database. | |
| bool | fail () const |
| Checks whether an error has occurred. | |
| std::string | error () const |
| Returns an error message. | |
| bool | insert (const string_type &key, const value_type &value) |
| Inserts a string into the n-gram database. | |
| bool | store (const std::string &base) |
| Stores the n-gram database to files. | |
Protected Types | |
| typedef std::vector< string_type > | ngrams_type |
| The type of an array of n-grams. | |
| typedef std::vector< value_type > | values_type |
| The vector type of values associated with an n-gram. | |
| typedef std::map< string_type, values_type > | hashdb_type |
| The type implementing an index (associations from n-grams to values). | |
| typedef std::vector< hashdb_type > | indices_type |
| The vector of indices for different n-gram sizes. | |
Protected Attributes | |
| indices_type | m_indices |
| The vector of indices. | |
| const ngram_generator_type & | m_gen |
| The n-gram generator. | |
| std::stringstream | m_error |
| The error message. | |
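The protected types above nest into a vector of maps from n-gram to value list. The sketch below shows one way such a structure could be filled. It assumes that slot `k - 1` of the vector holds the index for keys that produce `k` n-grams; that layout and the helper `insert_ngrams` are illustrative assumptions rather than the documented behaviour of `insert()`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Concrete instances of the typedefs listed above (string keys, int values).
typedef std::string                          string_type;
typedef int                                  value_type;
typedef std::vector<string_type>             ngrams_type;
typedef std::vector<value_type>              values_type;
typedef std::map<string_type, values_type>   hashdb_type;
typedef std::vector<hashdb_type>             indices_type;

// Hypothetical helper: associates `value` with every n-gram of one key.
// Assumption: indices[k - 1] is the index for keys that yield k n-grams.
inline bool insert_ngrams(indices_type& indices,
                          const ngrams_type& ngrams,
                          value_type value) {
    if (ngrams.empty()) {
        return false;  // nothing to index
    }
    if (indices.size() < ngrams.size()) {
        indices.resize(ngrams.size());  // grow to cover this feature-set size
    }
    assert(ngrams.size() <= indices.size());
    hashdb_type& index = indices[ngrams.size() - 1];
    for (std::size_t i = 0; i < ngrams.size(); ++i) {
        index[ngrams[i]].push_back(value);  // creates the entry on first use
    }
    return true;
}
```

Grouping entries by feature-set size in this way lets a reader of the database restrict a lookup to keys whose n-gram count falls in a given range.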
| simstring::ngramdb_writer_base< string_tmpl, value_tmpl, ngram_generator_tmpl >::ngramdb_writer_base | ( | const ngram_generator_type & | gen | ) | [inline] |
Constructs an object.
| gen | The n-gram generator. |
| bool simstring::ngramdb_writer_base< string_tmpl, value_tmpl, ngram_generator_tmpl >::empty | ( | ) | [inline] |
Checks whether the database is empty.
Returns true if the database is empty, false otherwise.
| int simstring::ngramdb_writer_base< string_tmpl, value_tmpl, ngram_generator_tmpl >::max_size | ( | ) | const [inline] |
Returns the maximum length of keys in the n-gram database.
| bool simstring::ngramdb_writer_base< string_tmpl, value_tmpl, ngram_generator_tmpl >::fail | ( | ) | const [inline] |
Checks whether an error has occurred.
Returns true if an error has occurred, false otherwise.
| std::string simstring::ngramdb_writer_base< string_tmpl, value_tmpl, ngram_generator_tmpl >::error | ( | ) | const [inline] |
Returns an error message.
| bool simstring::ngramdb_writer_base< string_tmpl, value_tmpl, ngram_generator_tmpl >::insert | ( | const string_type & | key, | |
| const value_type & | value | |||
| ) | [inline] |
Inserts a string into the n-gram database.
| key | The key string. | |
| value | The value associated with the string. |
| bool simstring::ngramdb_writer_base< string_tmpl, value_tmpl, ngram_generator_tmpl >::store | ( | const std::string & | base | ) | [inline] |
Stores the n-gram database to files.
| base | The prefix of file names. |
Returns true if the database is successfully stored, false otherwise.
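The members `fail()`, `error()`, and the `std::stringstream` attribute `m_error` suggest a pattern in which a failing operation appends a message to the stream and callers query it afterwards. The sketch below illustrates that pattern in isolation. The class `error_state`, its failure condition, and the message text are hypothetical, and the assumption that `fail()` tests for a non-empty message is not confirmed by this page.

```cpp
#include <sstream>
#include <string>

// Hypothetical miniature of the error-reporting members; not the library's code.
class error_state {
public:
    // Assumption: an error has occurred once any message has been recorded.
    bool fail() const { return !m_error.str().empty(); }

    // Returns the accumulated message (empty if nothing went wrong).
    std::string error() const { return m_error.str(); }

    // Stand-in for an operation such as store(): records a message on failure.
    bool store(const std::string& base) {
        if (base.empty()) {
            m_error << "Empty file-name prefix";
            return false;
        }
        return true;  // a real implementation would write files here
    }

private:
    std::stringstream m_error;
};
```

A caller would typically check the return value of `store()` and, on `false`, report the text returned by `error()`.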