NGRAMCOUNT(1)                    User Commands                   NGRAMCOUNT(1)

NAME

       ngramcount - manual page for ngramcount 1.3.14

DESCRIPTION

       Count n-grams from input file.

              Usage: ngramcount [--options] [in.far [out.fst]]

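       The usage line can be turned into a concrete call. The following Python
       sketch only assembles and prints a command line; it does not run the
       tool. It assumes the --flag=value spelling for options, and the helper
       name build_ngramcount_argv is hypothetical, not part of the ngramcount
       distribution. The file names in.far and out.fst are the placeholders
       from the usage line.

```python
import shlex


def build_ngramcount_argv(in_far, out_fst, order=3, method="counts"):
    """Assemble an argv list shaped like: ngramcount [--options] [in.far [out.fst]].

    Hypothetical helper for illustration; the defaults mirror the documented
    defaults of --order and --method.
    """
    return [
        "ngramcount",
        f"--order={order}",
        f"--method={method}",
        in_far,
        out_fst,
    ]


# Build a command that would count n-grams up to order 5.
argv = build_ngramcount_argv("in.far", "out.fst", order=5)
print(shlex.join(argv))

# To actually run it (assumes ngramcount is on PATH and in.far exists):
#   import subprocess
#   subprocess.run(argv, check=True)
```

       Keeping the command as a list rather than a single string avoids shell
       quoting problems if a file name contains spaces.
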
       PROGRAM FLAGS:

       --add_to_symbol_unigram_count: type = double, default = 0
              Adds this amount to the unigram count of each word in the symbol
              table

       --alpha: type = double, default = 1
              Weight for first FST

       --backoff_label: type = int64_t, default = 0
              Backoff label

       --beta: type = double, default = 1
              Weight for second (and subsequent) FST(s)

       --check_consistency: type = bool, default = false
              Check model consistency

       --context_pattern: type = std::string, default = ""
              Pattern of contexts to count

       --epsilon_as_backoff: type = bool, default = false
              Treat epsilon in the input Fsts as backoff

       --method: type = std::string, default = "counts"
              One of: "counts", "histograms", "count_of_counts",
              "count_of_histograms"

       --norm_eps: type = double, default = 0.001
              Normalization check epsilon

       --normalize: type = bool, default = false
              Normalize resulting model

       --order: type = int64_t, default = 3
              Set maximal order of ngrams to be counted

       --output_fst: type = bool, default = true
              Output counts as fst (otherwise strings)

       --require_symbols: type = bool, default = true
              Require symbol tables? (default: yes)

       --round_to_int: type = bool, default = false
              Round all counts to integers

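       As an illustration of what --order and the "counts" and
       "count_of_counts" values of --method refer to, the following Python
       sketch counts n-grams over plain whitespace-tokenized strings. It is a
       simplified illustration under stated assumptions, not ngramcount's
       implementation: it ignores FAR/FST input, symbol tables, backoff labels
       and sentence-boundary symbols, and it assumes that count-of-counts
       means, per n-gram length, the number of distinct n-grams seen exactly
       k times. The function names are hypothetical.

```python
from collections import Counter


def count_ngrams(sentences, order=3):
    """Count every n-gram of length 1..order in whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for n in range(1, order + 1):
            # Slide a window of width n across the token sequence.
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def count_of_counts(counts):
    """Tally, per n-gram length, how many distinct n-grams occur exactly k times."""
    tally = Counter()
    for ngram, k in counts.items():
        tally[(len(ngram), k)] += 1
    return tally


if __name__ == "__main__":
    counts = count_ngrams(["a b a b", "a b c"], order=2)
    for ngram, k in sorted(counts.items()):
        print(" ".join(ngram), k)
    for (length, k), how_many in sorted(count_of_counts(counts).items()):
        print(f"length={length} count={k} ngrams={how_many}")
```

       Raising the order argument adds longer n-grams to the table without
       changing the counts of the shorter ones.
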
       LIBRARY FLAGS:

       Flags from: compile-strings.cc

       --far_field_separator: type = std::string, default = "\t"
              Set of characters used as a separator between printed fields

       Flags from: flags.cc

       --help: type = bool, default = false
              show usage information

       --helpshort: type = bool, default = false
              show brief usage information

       --tmpdir: type = std::string, default = "/tmp"
              temporary directory

       --v: type = int32_t, default = 0
              verbosity level

       Flags from: fst.cc

       --fst_align: type = bool, default = false
              Write FST data aligned where appropriate

       --fst_default_cache_gc: type = bool, default = true
              Enable garbage collection of cache

       --fst_default_cache_gc_limit: type = int64_t, default = 1048576
              Cache byte size that triggers garbage collection

       --fst_read_mode: type = std::string, default = "read"
              Default file reading mode for mappable files

       --fst_verify_properties: type = bool, default = false
              Verify FST properties queried by TestProperties

       --save_relabel_ipairs: type = std::string, default = ""
              Save input relabel pairs to file

       --save_relabel_opairs: type = std::string, default = ""
              Save output relabel pairs to file

       Flags from: ngram-output.cc

       --end_symbol: type = std::string, default = "</s>"
              Class label for sentence end

       --start_symbol: type = std::string, default = "<s>"
              Class label for sentence start

       Flags from: symbol-table.cc

       --fst_compat_symbols: type = bool, default = true
              Require symbol tables to match when appropriate

       --fst_field_separator: type = std::string, default = "\t "
              Set of characters used as a separator between printed fields

       Flags from: util.cc

       --fst_error_fatal: type = bool, default = true
              FST errors are fatal if true; otherwise returns objects flagged
              as bad: e.g., FSTs: kError property set, FST weights: not a
              Member()

       --ngram_error_fatal: type = bool, default = true
              NGram errors are fatal if true; otherwise returns objects flagged
              as bad: e.g., NGramModel::Error() is true

       Flags from: weight.cc

       --fst_weight_parentheses: type = std::string, default = ""
              Characters enclosing the first weight of a printed composite
              weight (e.g., pair weight, tuple weight and derived classes) to
              ensure proper I/O of nested composite weights; must have size 0
              (none) or 2 (open and close parenthesis)

       --fst_weight_separator: type = std::string, default = ","
              Character separator between printed composite weights; must be a
              single character



ngramcount 1.3.14                February 2022                   NGRAMCOUNT(1)