NGRAMCOUNT(1)                    User Commands                    NGRAMCOUNT(1)

NAME
ngramcount - manual page for ngramcount 1.3.14

DESCRIPTION
Count n-grams from input file.

Usage: ngramcount [--options] [in.far [out.fst]]

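The usage line above can be driven from a script. The following is a minimal sketch, assuming ngramcount is installed and on PATH and that flags are passed in the --name=value form. The helper names build_ngramcount_argv and run_ngramcount, and the file names, are hypothetical and not part of the tool.

```python
import subprocess


def build_ngramcount_argv(in_far, out_fst, order=3, method="counts"):
    """Assemble an argument list matching: ngramcount [--options] [in.far [out.fst]]."""
    return [
        "ngramcount",
        f"--order={order}",    # maximal n-gram order (documented default: 3)
        f"--method={method}",  # one of the values listed under --method
        in_far,                # input FAR archive
        out_fst,               # output FST holding the counts
    ]


def run_ngramcount(in_far, out_fst, **options):
    """Run ngramcount; raises CalledProcessError on a non-zero exit status."""
    argv = build_ngramcount_argv(in_far, out_fst, **options)
    subprocess.run(argv, check=True)
    return argv
```

Building the argument list separately from running it keeps the command inspectable (for logging or dry runs) without requiring the binary to be present.
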
PROGRAM FLAGS:

--add_to_symbol_unigram_count: type = double, default = 0
    Adds this amount to the unigram count of each word in the symbol table
--alpha: type = double, default = 1
    Weight for first FST
--backoff_label: type = int64_t, default = 0
    Backoff label
--beta: type = double, default = 1
    Weight for second (and subsequent) FST(s)
--check_consistency: type = bool, default = false
    Check model consistency
--context_pattern: type = std::string, default = ""
    Pattern of contexts to count
--epsilon_as_backoff: type = bool, default = false
    Treat epsilon in the input Fsts as backoff
--method: type = std::string, default = "counts"
    One of: "counts", "histograms", "count_of_counts", "count_of_histograms"
--norm_eps: type = double, default = 0.001
    Normalization check epsilon
--normalize: type = bool, default = false
    Normalize resulting model
--order: type = int64_t, default = 3
    Set maximal order of ngrams to be counted
--output_fst: type = bool, default = true
    Output counts as fst (otherwise strings)
--require_symbols: type = bool, default = true
    Require symbol tables? (default: yes)
--round_to_int: type = bool, default = false
    Round all counts to integers

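To make the meaning of --order and the "count_of_counts" method concrete, here is a conceptual sketch in plain Python. It is an illustration only: ngramcount itself operates on FSTs read from a FAR archive, and its exact handling of sentence boundaries and of count-of-counts binning may differ from this simplified reading. The function names are hypothetical.

```python
from collections import Counter


def count_ngrams(sentences, order=3, start="<s>", end="</s>"):
    """Count every n-gram of length 1..order in whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: one start and one end symbol are added per sentence.
        tokens = [start] + sentence.split() + [end]
        for n in range(1, order + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def count_of_counts(counts):
    """Map each count value r to the number of distinct n-grams seen r times."""
    return Counter(counts.values())


counts = count_ngrams(["a b a", "a b"], order=2)
histogram = count_of_counts(counts)
```

Raising order adds longer n-grams to the table without changing the counts of the shorter ones.
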
LIBRARY FLAGS:

Flags from: compile-strings.cc

--far_field_separator: type = std::string, default = "\t"
    Set of characters used as a separator between printed fields

Flags from: flags.cc

--help: type = bool, default = false
    show usage information
--helpshort: type = bool, default = false
    show brief usage information
--tmpdir: type = std::string, default = "/tmp"
    temporary directory
--v: type = int32_t, default = 0
    verbosity level

Flags from: fst.cc

--fst_align: type = bool, default = false
    Write FST data aligned where appropriate
--fst_default_cache_gc: type = bool, default = true
    Enable garbage collection of cache
--fst_default_cache_gc_limit: type = int64_t, default = 1048576
    Cache byte size that triggers garbage collection
--fst_read_mode: type = std::string, default = "read"
    Default file reading mode for mappable files
--fst_verify_properties: type = bool, default = false
    Verify FST properties queried by TestProperties
--save_relabel_ipairs: type = std::string, default = ""
    Save input relabel pairs to file
--save_relabel_opairs: type = std::string, default = ""
    Save output relabel pairs to file

Flags from: ngram-output.cc

--end_symbol: type = std::string, default = "</s>"
    Class label for sentence end
--start_symbol: type = std::string, default = "<s>"
    Class label for sentence start

Flags from: symbol-table.cc

--fst_compat_symbols: type = bool, default = true
    Require symbol tables to match when appropriate
--fst_field_separator: type = std::string, default = "\t "
    Set of characters used as a separator between printed fields

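A separator described as a "set of characters" suggests that any single character in the string ends a field, rather than the whole string acting as one delimiter. The sketch below illustrates that reading, assuming a separator set of tab and space and assuming that empty fields are dropped. Both are assumptions for illustration, and split_fields is a hypothetical helper, not a library function.

```python
def split_fields(line, separators="\t "):
    """Split a line on any character in `separators`, dropping empty fields."""
    fields, current = [], []
    for ch in line:
        if ch in separators:
            # A separator closes the current field, if one has been started.
            if current:
                fields.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        fields.append("".join(current))
    return fields
```
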
Flags from: util.cc

--fst_error_fatal: type = bool, default = true
    FST errors are fatal; otherwise return objects flagged as bad:
    e.g., FSTs: kError property set, FST weights: not a Member()
--ngram_error_fatal: type = bool, default = true
    NGram errors are fatal if true; otherwise returns objects flagged
    as bad: e.g., NGramModel::Error() is true

Flags from: weight.cc

--fst_weight_parentheses: type = std::string, default = ""
    Characters enclosing the first weight of a printed composite
    weight (e.g., pair weight, tuple weight and derived classes) to
    ensure proper I/O of nested composite weights; must have size 0
    (none) or 2 (open and close parenthesis)
--fst_weight_separator: type = std::string, default = ","
    Character separator between printed composite weights; must be a
    single character

ngramcount 1.3.14                 February 2022                 NGRAMCOUNT(1)