PUBLIC-INBOX-TUNING(7)     public-inbox user manual     PUBLIC-INBOX-TUNING(7)

NAME

       public-inbox-tuning - tuning public-inbox

DESCRIPTION

       public-inbox intends to support a wide variety of hardware.  While we
       strive to provide the best out-of-the-box performance possible, tuning
       knobs are an unfortunate necessity in some cases.

       1.  New inboxes: public-inbox-init -V2

       2.  Optional Inline::C use

       3.  Performance on rotational hard disk drives

       4.  Btrfs (and possibly other copy-on-write filesystems)

       5.  Performance on solid state drives

       6.  Read-only daemons

       7.  Other OS tuning knobs

       8.  Scalability to many inboxes

   New inboxes: public-inbox-init -V2
       If you're starting a new inbox (and not mirroring an existing one), the
       "-V2" format requires DBD::SQLite, but is orders of magnitude more
       scalable than the original "-V1" format.

   Optional Inline::C use
       Our optional use of Inline::C speeds up subprocess spawning from large
       daemon processes.

       To enable Inline::C, either set the "PERL_INLINE_DIRECTORY" environment
       variable to point to a writable directory, or create
       "~/.cache/public-inbox/inline-c" for any user(s) running public-inbox
       processes.
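
       The directory setup described above can be sketched as follows.  This
       is a minimal sketch, assuming it is run as the user that runs the
       public-inbox processes and that HOME is set; the "inline_dir" variable
       name is only illustrative.

```shell
# Sketch only: make sure an Inline::C build directory exists and is writable.
# Uses PERL_INLINE_DIRECTORY when set, else the default path named above.
inline_dir="${PERL_INLINE_DIRECTORY:-$HOME/.cache/public-inbox/inline-c}"
mkdir -p "$inline_dir"
if [ -w "$inline_dir" ]; then
	echo "Inline::C directory ready: $inline_dir"
else
	echo "not writable: $inline_dir" >&2
fi
```

       Repeat this for each user running public-inbox processes.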

       If libgit2 development files are installed and Inline::C is enabled
       (described above), per-inbox "git cat-file --batch" processes are
       replaced with a single perl(1) process running
       "PublicInbox::Gcf2::loop" in read-only daemons.  libgit2 use will be
       available in public-inbox 1.7.0+.

       More (optional) Inline::C use will be introduced in the future to lower
       memory use and improve scalability.

       Note: Inline::C is required for lei(1), but not for public-inbox-*
       commands.

   Performance on rotational hard disk drives
       Random I/O performance is poor on rotational HDDs.  Xapian indexing
       performance degrades significantly as DBs grow larger than available
       RAM.  Attempts to parallelize random I/O on HDDs lead to pathological
       slowdowns as inboxes grow.

       While "-V2" introduced Xapian shards as a parallelization mechanism for
       SSDs, enabling "publicInbox.indexSequentialShard" repurposes sharding
       as a mechanism to reduce the kernel page cache footprint when indexing
       on HDDs.

       Initializing a mirror with a high "--jobs" count to create more shards
       (in "-V2" inboxes) will keep each shard smaller and reduce its kernel
       page cache footprint.  Keep in mind excessive sharding imposes a
       performance penalty for read-only queries.

       Users with large amounts of RAM are advised to set a large value for
       "publicInbox.indexBatchSize" as documented in public-inbox-index(1).
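
       The two indexing knobs above are ordinary git-config style settings.
       A minimal sketch, assuming you review the result and merge it into
       your real public-inbox config by hand: it only writes to a scratch
       file.  The "500m" value is purely illustrative, not a recommendation;
       see public-inbox-index(1) for the accepted forms.

```shell
# Sketch only: write the HDD-oriented indexing knobs to a scratch file.
# Assumptions: "500m" is an illustrative value, and the fragment is meant
# to be merged by hand into the real public-inbox config afterwards.
cfg=$(mktemp)
cat >>"$cfg" <<'EOF'
[publicinbox]
	indexSequentialShard = true
	indexBatchSize = 500m
EOF
cat "$cfg"
```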

       "dm-crypt" users on Linux 4.0+ are advised to try the
       "--perf-same_cpu_crypt" and "--perf-submit_from_crypt_cpus" switches
       of cryptsetup(8) to reduce I/O contention from kernel workqueue
       threads.

   Btrfs (and possibly other copy-on-write filesystems)
       btrfs(5) performance degrades from fragmentation when using large
       databases and random writes.  The Xapian + SQLite indices used by
       public-inbox are no exception to that.

       public-inbox 1.6.0+ disables copy-on-write (CoW) on Xapian and SQLite
       indices on btrfs to achieve acceptable performance (even on SSD).
       Disabling copy-on-write also disables checksumming, thus "raid1" (or
       higher) configurations may become corrupt after unsafe shutdowns.

       Fortunately, these SQLite and Xapian indices are designed to be
       recoverable from git if missing.

       Disabling CoW does not prevent all fragmentation.  Large values of
       "publicInbox.indexBatchSize" also limit fragmentation during the
       initial index.

       Avoid snapshotting subvolumes containing Xapian and/or SQLite indices.
       Snapshots use CoW despite our efforts to disable it, resulting in
       fragmentation.

       filefrag(8) can be used to monitor fragmentation, and "btrfs filesystem
       defragment -fr $INBOX_DIR" may be necessary.

       Large filesystems benefit significantly from the "space_cache=v2" mount
       option documented in btrfs(5).

       Older, non-CoW filesystems generally work well out-of-the-box for
       our Xapian and SQLite indices.

   Performance on solid state drives
       While SSD read performance is generally good, SSD write performance
       degrades as the drive ages and/or gets full.  Issuing "TRIM" commands
       via fstrim(8) or similar is required to sustain write performance.

       Users of the Flash-Friendly File System F2FS
       <https://en.wikipedia.org/wiki/F2FS> may benefit from optimizations
       found in SQLite 3.21.0+.  Benchmarks are greatly appreciated.

   Read-only daemons
       public-inbox-httpd(1), public-inbox-imapd(1), and public-inbox-nntpd(1)
       are all designed for C10K (or higher) levels of concurrency from a
       single process.  SMP systems may use "--worker-processes=NUM" as
       documented in public-inbox-daemon(8) for parallelism.

       The open file descriptor limit ("RLIMIT_NOFILE", "ulimit -n" in sh(1),
       "LimitNOFILE=" in systemd.exec(5)) may need to be raised to accommodate
       many concurrent clients.
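
       Before raising the limit, it helps to see what the daemon would
       inherit.  A minimal sketch, assuming a shell whose ulimit builtin
       accepts "-S", "-H" and "-n" (bash and dash do); the "want" threshold
       is a hypothetical figure, not a recommendation.

```shell
# Sketch only: report the open file descriptor limits of the current shell.
# "want" is a hypothetical target; size it to your expected client count.
want=65536
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "RLIMIT_NOFILE: soft=$soft hard=$hard"
if [ "$soft" != unlimited ] && [ "$soft" -lt "$want" ]; then
	echo "soft limit is below $want; raise it before starting the daemon"
fi
```

       The soft limit can only be raised up to the hard limit without extra
       privileges.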

       Transport Layer Security (IMAPS, NNTPS, or via STARTTLS) significantly
       increases memory use of client sockets; be sure to account for that in
       capacity planning.

   Other OS tuning knobs
       Linux users: the "vm.max_map_count" sysctl may need to be increased
       if handling thousands of inboxes (with public-inbox-extindex(1)) to
       avoid out-of-memory errors from git.
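
       On Linux, the sysctl above is exposed as /proc/sys/vm/max_map_count.
       A minimal sketch for inspecting the current value follows; the value
       shown in the commented-out sysctl(8) call is illustrative only, and
       changing it requires root.

```shell
# Sketch only (Linux): read the current limit from procfs.
f=/proc/sys/vm/max_map_count
if [ -r "$f" ]; then
	cur=$(cat "$f")
	echo "vm.max_map_count is currently $cur"
else
	echo "$f not readable; this is probably not Linux" >&2
fi
# To raise it (as root); 262144 is an illustrative value only:
#   sysctl -w vm.max_map_count=262144
```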

       Other OSes may have similar tuning knobs (patches appreciated).

   Scalability to many inboxes
       public-inbox-extindex(1) allows any number of public-inboxes to share
       the same Xapian indices.

       git 2.33+ startup time is orders-of-magnitude faster and uses less
       memory when dealing with thousands of alternates required for thousands
       of inboxes with public-inbox-extindex(1).

       Frequent packing (via git-gc(1)) both improves performance and reduces
       the need to increase "vm.max_map_count".

CONTACT

       Feedback encouraged via plain-text mail to
       <mailto:meta@public-inbox.org>

       Information for *BSDs and non-traditional filesystems especially
       welcome.

       Our archives are hosted at <https://public-inbox.org/meta/>,
       <http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>,
       and other places.

COPYRIGHT

       Copyright all contributors <mailto:meta@public-inbox.org>

       License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>

public-inbox.git                  1993-10-02            PUBLIC-INBOX-TUNING(7)