Symbol table operations

This page describes functions that operate on the symbol tables attached to FSTs.

Each FST arc has an input (ilabel) and output (olabel) label. Symbol tables can be used to map between these labels and actual strings (which may be bytes, Unicode codepoints, phones, words, etc.). See the symbol table documentation for more information.

fst::MergeSymbols

The function fst::MergeSymbols takes two mutable FST arguments and an enum specifying how the tables are to be merged:

  • MERGE_INPUT_SYMBOLS: merges the input tables of the input FSTs.
  • MERGE_OUTPUT_SYMBOLS: merges the output tables of the input FSTs.
  • MERGE_INPUT_AND_OUTPUT_SYMBOLS: merges both input and output tables of the input FSTs (e.g., for intersection or union).
  • MERGE_LEFT_OUTPUT_AND_RIGHT_INPUT_SYMBOLS: merges the left-hand side's output symbols with the right-hand side's input symbols (e.g., for composition).
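
As a rough illustration of the four modes, the sketch below maps each enum value to the pair of tables it brings together, reading the pairing off the enum names. This is a toy model in Python, not the library's code: the class name MergeMode and the helper table_pairs are hypothetical, and "input"/"output" are plain strings standing in for the FSTs' actual tables.

```python
from enum import Enum
from typing import List, Tuple


class MergeMode(Enum):
    """Hypothetical Python mirror of the four merge modes listed above."""
    MERGE_INPUT_SYMBOLS = 1
    MERGE_OUTPUT_SYMBOLS = 2
    MERGE_INPUT_AND_OUTPUT_SYMBOLS = 3
    MERGE_LEFT_OUTPUT_AND_RIGHT_INPUT_SYMBOLS = 4


def table_pairs(mode: MergeMode) -> List[Tuple[str, str]]:
    """Returns (left FST side, right FST side) pairs whose tables get merged."""
    if mode is MergeMode.MERGE_INPUT_SYMBOLS:
        return [("input", "input")]
    if mode is MergeMode.MERGE_OUTPUT_SYMBOLS:
        return [("output", "output")]
    if mode is MergeMode.MERGE_INPUT_AND_OUTPUT_SYMBOLS:
        return [("input", "input"), ("output", "output")]
    # Composition-style pairing: the left FST's output side meets the
    # right FST's input side.
    return [("output", "input")]
```

For example, table_pairs(MergeMode.MERGE_LEFT_OUTPUT_AND_RIGHT_INPUT_SYMBOLS) yields the single pair ("output", "input"), which is the pairing composition needs.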

Asked to merge two tables (which themselves may be null), the algorithm proceeds as follows:

  • If the flag --fst_compat_symbols=false is set, do no work.
  • If one or both tables are null, do no work.
  • If the tables' ("labeled") checksums match, do no work.
  • Otherwise, merge the two tables by adding symbols from the first to the second table, and the second table to the first.
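
The early exits in the list above can be sketched as a small decision function. This is a simplified model under stated assumptions: a table is a plain dict from symbol to label, None plays the role of a null table, the compat_symbols parameter stands in for the --fst_compat_symbols flag, and labeled_checksum is a stand-in hash over (label, symbol) pairs rather than the checksum the library actually computes.

```python
import hashlib
from typing import Dict, Optional

# Toy table: symbol -> label. An assumption of this sketch, not the real type.
SymbolTable = Dict[str, int]


def labeled_checksum(table: SymbolTable) -> str:
    """Hashes (label, symbol) pairs in sorted order (stand-in checksum)."""
    digest = hashlib.sha256()
    for label, symbol in sorted((lab, sym) for sym, lab in table.items()):
        digest.update(f"{label}\t{symbol}\n".encode("utf-8"))
    return digest.hexdigest()


def needs_merge(table1: Optional[SymbolTable],
                table2: Optional[SymbolTable],
                compat_symbols: bool = True) -> bool:
    """Returns True only when none of the "do no work" cases applies."""
    if not compat_symbols:  # models --fst_compat_symbols=false
        return False
    if table1 is None or table2 is None:  # one or both tables are null
        return False
    if labeled_checksum(table1) == labeled_checksum(table2):
        return False  # same symbols with the same labels
    return True
```

Two tables with identical contents hash identically regardless of insertion order, so needs_merge returns False for them. Any difference in a symbol or its label sends the pair on to the actual merge.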

Only in the last case is there the possibility of a labeling conflict (i.e., the two tables map separate labels to the same symbol, or separate symbols to the same label). In the case of conflict, the second FST may require relabeling. The fst::MergeSymbols function does this automatically so long as the flag --fst_relabel_symbol_conflicts is set to true (the default). However, if relabeling is required to resolve a conflict but this flag is set to false, fst::MergeSymbols logs a warning and returns false to indicate failure.
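
To make the conflict case concrete, here is a minimal sketch of a merge that relabels the second table when needed. It is an illustration under assumptions, not the library's algorithm. Tables are plain dicts from symbol to label, each assumed to be one-to-one. The first table's labels are kept as they are. The relabel_conflicts parameter stands in for --fst_relabel_symbol_conflicts. The function name merge_tables and its return shape are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Toy table: symbol -> label. An assumption of this sketch, not the real type.
SymbolTable = Dict[str, int]


def merge_tables(
    table1: SymbolTable,
    table2: SymbolTable,
    relabel_conflicts: bool = True,
) -> Tuple[bool, Optional[SymbolTable], Dict[int, int]]:
    """Returns (ok, merged table, old -> new label map for the second FST)."""
    merged = dict(table1)
    used = set(merged.values())
    # Fresh labels start above every label in either table, so they can
    # never collide with a label the second table still uses.
    next_label = max(list(table1.values()) + list(table2.values()), default=-1) + 1
    relabel: Dict[int, int] = {}
    for symbol, label in sorted(table2.items(), key=lambda kv: kv[1]):
        if symbol in merged:
            new_label = merged[symbol]  # keep the first table's label
        elif label not in used:
            new_label = label  # the second table's label is free
        else:
            new_label = next_label  # label taken by a different symbol
            next_label += 1
        merged[symbol] = new_label
        used.add(new_label)
        if new_label != label:
            relabel[label] = new_label
    if relabel and not relabel_conflicts:
        # Relabeling would be required but is disallowed: report failure.
        return False, None, {}
    return True, merged, relabel


ok, merged, relabel = merge_tables(
    {"<eps>": 0, "a": 1, "b": 2},
    {"<eps>": 0, "b": 1, "c": 2},
)
```

In this example the second table maps "b" to 1 and "c" to 2, both of which clash with the first table. The merged table is {"<eps>": 0, "a": 1, "b": 2, "c": 3}, and the second FST's arcs would need relabeling by {1: 2, 2: 3}. With relabel_conflicts=False, the same call returns ok as False.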

Symbol table merging in Pynini

The above function is used extensively in Pynini to ensure symbol table compatibility for core rational operations like composition, intersection, and union. This is done automatically, except for the following special cases:

  • If the flag --fst_compat_symbols=false is set, then symbol tables are simply assumed to be compatible.
  • If the flag --fst_relabel_symbol_conflicts=false is set, then symbol tables are merged unless there is a conflict, in which case the higher-level operation will fail and raise FstOpError.

Topic revision: r1 - 2017-06-29 - KyleGorman
 