CARBON-C-RELAY(1)          General Commands Manual         CARBON-C-RELAY(1)


NAME
carbon-c-relay - graphite relay, aggregator and rewriter

https://travis-ci.org/grobian/carbon-c-relay

SYNOPSIS
carbon-c-relay -f config-file [ options ... ]
DESCRIPTION
carbon-c-relay accepts, cleanses, matches, rewrites, forwards and
aggregates graphite metrics by listening for incoming connections and
relaying the messages to other servers defined in its configuration.
The core functionality is to route messages via flexible rules to the
desired destinations.

carbon-c-relay is a simple program that reads its routing information
from a file. The command line arguments allow setting the location of
this file, as well as the number of dispatchers (worker threads) to
use for reading the data from incoming connections and passing it on
to the right destination(s). The route file supports two main
constructs: clusters and matches. The former defines groups of hosts
that metric data can be sent to, the latter defines which metrics
should be sent to which cluster. Aggregation rules are treated as
matches. Rewrites are actions that directly affect the metric at the
point in which they appear in the configuration.

For every metric received by the relay, cleansing is performed. The
following changes are made before any match, aggregate or rewrite
rule sees the metric:

○ double dot elimination (necessary for correctly functioning
  consistent hash routing)

○ trailing/leading dot elimination

○ whitespace normalisation (this mostly affects output of the relay
  to other targets: metric, value and timestamp will be separated by
  a single space only, ever)

○ irregular char replacement with underscores (_); currently
  irregular is defined as not being in [0-9a-zA-Z-_:#], but this can
  be overridden on the command line. Note that tags (when present and
  allowed) are not processed this way.
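
As an illustration, the cleansing steps above would transform an
incoming line like the following (the metric name is made up):

```
# received from a client
.foo..bar.cpu%usage   42    1234567890

# as seen by match/aggregate/rewrite rules after cleansing
foo.bar.cpu_usage 42 1234567890
```

The leading and double dots are eliminated, the % (not in the default
allowed set) becomes an underscore, and the whitespace is normalised
to single spaces.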

OPTIONS
These options control the behaviour of carbon-c-relay.

○ -v: Print version string and exit.

○ -d: Enable debug mode. This prints statistics to stdout and prints
  extra messages about some situations encountered by the relay that
  would normally be too verbose to be enabled. When combined with -t
  (test mode) this also prints stub routes and consistent-hash ring
  contents.

○ -s: Enable submission mode. In this mode, internal statistics are
  not generated. Instead, queue pressure and metric drops are
  reported on stdout. This mode is useful for a submission relay
  whose job is just to forward to (a set of) main relays. Statistics
  about the submission relays are not needed in that case, and could
  easily cause an undesired flood of metrics, e.g. when used locally
  on each and every host.

○ -t: Test mode. This mode doesn't do any routing at all; instead it
  reads input from stdin and prints what actions would be taken given
  the loaded configuration. This mode is very useful for testing
  relay routes for regular expression syntax etc. It also gives
  insight into how routing is applied in complex configurations, for
  it shows rewrites and aggregates taking place as well. When -t is
  repeated, the relay will only test the configuration for validity
  and exit immediately afterwards. Any standard output is suppressed
  in this mode, making it ideal for start-scripts to test a (new)
  configuration.
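
For example, a configuration could be exercised in test mode like
this (relay.conf is a hypothetical configuration file):

```
$ echo "sys.cpu.usage 42 1234567890" | carbon-c-relay -t -f relay.conf
$ carbon-c-relay -t -t -f relay.conf    # only validate relay.conf, then exit
```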

○ -f config-file: Read configuration from config-file. A
  configuration consists of clusters and routes. See CONFIGURATION
  SYNTAX for more information on the options and syntax of this file.

○ -l log-file: Use log-file for writing messages. Without this
  option, the relay writes both to stdout and stderr. When logging to
  file, all messages are prefixed with MSG when they were sent to
  stdout, and ERR when they were sent to stderr.

○ -p port: Listen for connections on port port. The port number is
  used for TCP, UDP and UNIX sockets alike. In the latter case, the
  socket file contains the port number. The port defaults to 2003,
  which is also used by the original carbon-cache.py. Note that this
  only applies to the defaults; when listen directives are in the
  config, this setting is ignored.

○ -w workers: Use workers number of threads. The default number of
  workers is equal to the number of detected CPU cores. It makes
  sense to reduce this number on many-core machines, or when the
  traffic is low.

○ -b batchsize: Set the number of metrics that are sent to remote
  servers at once to batchsize. When the relay sends metrics to
  servers, it will retrieve batchsize metrics from the pending queue
  of metrics waiting for that server and send those one by one. The
  size of the batch has minimal impact on sending performance, but it
  controls the amount of lock-contention on the queue. The default is
  2500.

○ -q queuesize: Each server from the configuration that the relay
  sends metrics to has a queue associated with it. This queue allows
  disruptions and bursts to be handled. The size of this queue is set
  to queuesize, which allows that amount of metrics to be stored in
  the queue before it overflows and the relay starts dropping
  metrics. The larger the queue, the more metrics can be absorbed,
  but also the more memory the relay will use. The default queue size
  is 25000.

○ -L stalls: Set the maximum number of stalls to stalls before the
  relay starts dropping metrics for a server. When a queue fills up,
  the relay uses a mechanism called stalling to signal the client
  (writing to the relay) of this event. In particular when the client
  sends a large amount of metrics in a very short time (burst),
  stalling can help to avoid dropping metrics, since the client just
  needs to slow down for a bit, which in many cases is possible (e.g.
  when catting a file with nc(1)). However, this behaviour can also
  obstruct, artificially stalling writers which cannot stop that
  easily. For this, stalls can be set from 0 to 15, where each stall
  can take around 1 second on the client. The default value is 4,
  which is aimed at the occasional disruption scenario and a maximum
  effort not to lose metrics while moderately slowing down clients.

○ -C CAcertpath: Read CA certs (for use with TLS/SSL connections)
  from the given path or file. When not given, the default locations
  are used. Strict verification of the peer is performed, so when
  using self-signed certificates, be sure to include the CA cert in
  the default location, or provide the path to the cert using this
  option.

○ -T timeout: Specify the IO timeout in milliseconds used for server
  connections. The default is 600 milliseconds, but this may need
  increasing when WAN links are used for target servers. A relatively
  low connection timeout allows the relay to quickly establish that a
  server is unreachable, so failover strategies can kick in before
  the queue runs high.

○ -c chars: Define the characters that are allowed in metrics, in
  addition to [A-Za-z0-9], as chars. Any character not in this list
  is replaced by the relay with _ (underscore). The default list of
  allowed characters is -_:#.

○ -m length: Limit metric names to at most length bytes. Any lines
  containing metric names larger than this will be discarded.

○ -M length: Limit the input to lines of at most length bytes. Any
  lines in excess of this will be discarded. Note that -m needs to be
  smaller than this value.

○ -H hostname: Override the hostname determined by a call to
  gethostname(3) with hostname. The hostname is used mainly in the
  statistics metrics carbon.relays.<hostname>.<...> sent by the
  relay.

○ -B backlog: Set the TCP connection listen backlog to backlog
  connections. The default value is 32, but on servers which receive
  many concurrent connections, this setting likely needs to be
  increased to avoid connection refused errors on the clients.

○ -U bufsize: Set the socket send/receive buffer sizes in bytes, for
  both TCP and UDP scenarios. When unset, the OS default is used. The
  maximum is also determined by the OS. The sizes are set using
  setsockopt with the flags SO_RCVBUF and SO_SNDBUF. Setting this
  size may be necessary for large volume scenarios, for which -B
  might also apply. Checking the Recv-Q and the receive errors values
  from netstat gives a good hint about buffer usage.

○ -E: Disable disconnecting idle incoming connections. By default the
  relay disconnects idle client connections after 10 minutes. It does
  this to prevent resources clogging up when a faulty or malicious
  client keeps on opening connections without closing them. It
  typically prevents running out of file descriptors. For some
  scenarios, however, it is not desirable for idle connections to be
  disconnected, hence passing this flag will disable this behaviour.

○ -D: Daemonise into the background after startup. This option
  requires the -l and -P flags to be set as well.

○ -P pidfile: Write the pid of the relay process to a file called
  pidfile. This is in particular useful when daemonised in
  combination with init managers.

○ -O threshold: The minimum number of rules to find before trying to
  optimise the ruleset. The default is 50; to disable the optimiser
  use -1, to always run the optimiser use 0. The optimiser tries to
  group rules to avoid spending excessive time on matching
  expressions.

CONFIGURATION SYNTAX
The config file supports the following syntax, where comments start
with a # character and can appear at any position on a line,
suppressing input until the end of that line:
```
cluster name <forward | any_of | failover [useall] | carbon_ch |
    fnv1a_ch | jump_fnv1a_ch [replication count] [dynamic]>
    <host[:port][=instance] [proto udp | tcp] [type linemode]
    [transport plain | gzip | lz4 | snappy [ssl]]> ... ;

cluster name file [ip] </path/to/file> ... ;

match <* | expression ...> [validate expression else log | drop]
    send to <cluster ... | blackhole> [stop] ;

rewrite expression into replacement ;

aggregate expression ... every interval seconds
    expire after expiration seconds
    [timestamp at start | middle | end of bucket]
    compute sum | count | max | min | average | median |
        percentile<%> | variance | stddev write to metric
    [compute ...] [send to <cluster ...>] [stop] ;

send statistics to <cluster ...> [stop] ;

statistics [submit every interval seconds]
    [reset counters after interval] [prefix with prefix]
    [send to <cluster ...>] [stop] ;

listen type linemode [transport plain | gzip | lz4 | snappy [ssl pemcert]]
    <<interface[:port] | port> proto udp | tcp> ...
    </path/to/file proto unix> ... ;

include </path/to/file/or/glob> ;
```

CLUSTERS
Multiple clusters can be defined, and they need not be referenced by
a match rule. All clusters point to one or more hosts, except the
file cluster, which writes to files in the local filesystem. host may
be an IPv4 or IPv6 address, or a hostname. Since host is followed by
an optional : and port, for IPv6 addresses not to be interpreted
wrongly, either a port must be given, or the IPv6 address must be
surrounded by brackets, e.g. [::1]. Optional transport and proto
clauses can be used to wrap the connection in a compression or
encryption layer, or to specify the use of UDP or TCP to connect to
the remote server. When omitted, the connection defaults to an
unwrapped TCP connection. type can only be linemode at the moment.

DNS hostnames are resolved to a single address, according to the
preference rules in RFC 3484 https://www.ietf.org/rfc/rfc3484.txt.
The any_of, failover and forward clusters have an explicit useall
flag that enables expansion for hostnames resolving to multiple
addresses. With this flag, each address returned becomes a cluster
destination.

There are two groups of cluster types: simple forwarding clusters and
consistent hashing clusters.

○ forward and file clusters

  The forward and file clusters simply send everything they receive
  to the defined members (host addresses or files). When a cluster
  has multiple members, all incoming metrics are sent to all members,
  basically duplicating the input metric stream over all members.

○ any_of cluster

  The any_of cluster is a small variant of the forward cluster:
  instead of sending the input metrics to all defined members, it
  sends each incoming metric to only one of the defined members. The
  purpose of this is a load-balanced scenario where any of the
  members can receive any metric. As any_of suggests, when any of the
  members becomes unreachable, the remaining available members
  immediately receive the full input stream of metrics. Concretely
  this means that when 4 members are used, each receives
  approximately 25% of the input metrics. When one member becomes
  unavailable (e.g. a network interruption, or a restart of the
  service), the remaining 3 members each receive about 33% of the
  input. When designing cluster capacity, one should take into
  account that in the most extreme case, the final remaining member
  will receive all input traffic.

  An any_of cluster can be particularly useful when the cluster
  points to other relays or caches. When used with other relays, it
  effectively load-balances, and adapts immediately to unavailability
  of targets. When used with caches, the behaviour of the any_of
  router to send the same metrics consistently to the same
  destination helps caches to have a high hitrate on their internal
  buffers for the same metrics (if they use them), but still allows
  for a rolling restart of the caches when they are e.g. on the same
  machine.

○ failover cluster

  The failover cluster is like the any_of cluster, but sticks to the
  order in which servers are defined. This is to implement a pure
  failover scenario between servers. All metrics are sent to at most
  1 member, so no hashing or balancing takes place. A failover
  cluster with two members will only send metrics to the second
  member if the first becomes unavailable.

○ carbon_ch cluster

  The carbon_ch cluster sends the metrics to the member that is
  responsible according to the consistent hash algorithm, as used in
  the original carbon python relay, or to multiple members if
  replication is set to more than 1. When dynamic is set, failure of
  any of the servers does not result in metrics being dropped for
  that server; instead the undeliverable metrics are sent to another
  server in the cluster so that the metrics do not get lost. This is
  most useful when replication is 1.

  The calculation of the hashring, which defines the way in which
  metrics are distributed, is based on the server host (or IP
  address) and the optional instance of the member. This means that
  with carbon_ch, two targets on different ports but on the same host
  will map to the same hashkey, so no distribution of metrics takes
  place. The instance is used to remedy that situation. An instance
  is appended to the member after the port, separated by an equals
  sign, e.g. 127.0.0.1:2006=a for instance a.

  Consistent hashes are consistent in the sense that removal of a
  member from the cluster should not result in a complete re-mapping
  of all metrics to members; instead only the metrics from the
  removed member are redistributed over the remaining members,
  approximately each getting its fair share. The other way around,
  when a member is added, each member should see a subset of its
  metrics now being addressed to the new member. This is an important
  advantage over a normal hash, where each removal or addition of
  members (also via e.g. a change in their IP address or hostname)
  would cause a full re-mapping of all metrics over all available
  members.

○ fnv1a_ch cluster

  The fnv1a_ch cluster is identical in behaviour to carbon_ch, but it
  uses a different hash technique (FNV1a) which is faster and, more
  importantly, defined to get around the aforementioned limitation of
  carbon_ch by using both host and port of the members. This is
  useful when multiple targets live on the same host, separated only
  by port.

  Since the instance property is no longer necessary with fnv1a_ch,
  this cluster type uses it to completely override the string that
  the hashkey is calculated from. This allows for many things,
  including masquerading old IP addresses, but it basically can be
  used to make the hash key location agnostic of the (physical)
  location of that key. For example, usage like
  10.0.0.1:2003=4d79d13554fa1301476c1f9fe968b0ac allows changing the
  port and/or ip address of the server that receives data for the
  instance key. Obviously, this way migration of data can be dealt
  with much more conveniently. Note that since the instance name is
  used as the full hash input, instances such as a, b, etc. will
  likely result in poor hash distribution, since their hashes have
  very little input. Consider using longer and mostly differing
  instance names, such as random hashes, for better hash distribution
  behaviour.

○ jump_fnv1a_ch cluster

  The jump_fnv1a_ch cluster is also a consistent hash cluster like
  the previous two, but it does not take the member host, port or
  instance into account at all. Whether this is useful to you depends
  on your scenario. The jump hash has almost perfect balancing over
  the members defined in the cluster, at the expense of not being
  able to remove any member but the last in the order defined in the
  cluster. What this means is that this hash is fine to use with
  ever-growing clusters where older nodes are never removed.

  If you have a cluster where removal of old nodes takes place often,
  the jump hash is not suitable for you. Jump hash works with servers
  in an ordered list without gaps. To influence the ordering, the
  instance given to the server is used as sorting key. Without
  instances, the order will be as given in the file. It is good
  practice to fix the order of the servers with instances, such that
  it is explicit which are the right nodes for the jump hash.
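
For instance, a jump_fnv1a_ch cluster could pin the member order with
sortable instance names like this (addresses are illustrative):

```
cluster jump jump_fnv1a_ch
    10.0.0.1:2003=00
    10.0.0.2:2003=01
    10.0.0.3:2003=02
    ;
```

New members would then be appended with a higher sort key (03, 04,
...) so the existing order, and hence the existing mapping, is
preserved.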

MATCHES
Match rules are the way to direct incoming metrics to one or more
clusters. Match rules are processed top to bottom as they are defined
in the file. It is possible to define multiple matches in the same
rule. Each match rule can send data to one or more clusters. Since
match rules "fall through" unless the stop keyword is added,
carefully crafted match expressions can be used to target multiple
clusters or aggregations. This ability allows replicating metrics, as
well as sending certain metrics to alternative clusters, with careful
ordering and usage of the stop keyword. The special cluster blackhole
discards any metrics sent to it. This can be useful for weeding out
unwanted metrics in certain cases. Because throwing metrics away is
pointless if other matches would accept the same data, a match whose
destination is the blackhole cluster has an implicit stop. The
validate clause adds a check on the data (what comes after the
metric) in the form of a regular expression. When this expression
matches, the match rule executes as if no validate clause were
present. However, if it fails, the match rule is aborted and no
metrics are sent to destinations; this is the drop behaviour. When
log is used, the metric is logged to stderr. Care should be taken
with the latter to avoid log flooding. When a validate clause is
present, destinations need not be present; this allows applying a
global validation rule. Note that the cleansing rules are applied
before validation is done, thus the data will not have duplicate
spaces. The route using clause is used to perform a temporary
modification to the key used as input to the consistent hashing
routines. The primary purpose is to route traffic so that appropriate
data is sent to the needed aggregation instances.
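
A minimal sketch of the fall-through behaviour, assuming clusters
named metrics-a and metrics-b exist:

```
# every metric goes to metrics-a ...
match * send to metrics-a ;

# ... and metrics starting with sys. additionally go to metrics-b,
# after which processing stops
match ^sys\. send to metrics-b stop ;
```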

REWRITES
Rewrite rules take a regular expression as input to match incoming
metrics, and transform them into the desired new metric name. In the
replacement, backreferences are allowed to match capture groups
defined in the input regular expression. A match of server\.(x|y|z)\.
allows using e.g. role.\1. in the substitution. A few caveats apply
to the current implementation of rewrite rules. First, their location
in the config file determines when the rewrite is performed. The
rewrite is done in-place; as such, a match rule before the rewrite
would match the original name, while a match rule after the rewrite
no longer matches the original name. Care should be taken with the
ordering, as multiple rewrite rules in succession can take place,
e.g. a gets replaced by b and b gets replaced by c in a succeeding
rewrite rule. The second caveat with the current implementation is
that the rewritten metric names are not cleansed, like newly incoming
metrics are. Thus, double dots and potentially dangerous characters
can appear if the replacement string is crafted to produce them. It
is the responsibility of the writer to make sure the metrics are
clean. If this is an issue for routing, one can consider having a
rewrite-only instance that forwards all metrics to another instance
that does the routing. Obviously the second instance will cleanse the
metrics as they come in. The backreference notation allows
lowercasing and uppercasing the replacement string with the use of
the underscore (_) and caret (^) symbols following directly after the
backslash. For example, role.\_1. as substitution will lowercase the
contents of \1. The dot (.) can be used in a similar fashion, or
after the underscore or caret, to replace dots with underscores in
the substitution. This can be handy for some situations where metrics
are sent to graphite.
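
Putting the above together, a rewrite rule could look like this (the
metric namespace is made up for illustration):

```
# server.WEB01.cpu... becomes role.web01.cpu...,
# with the captured server name lowercased by \_1
rewrite ^server\.([^.]+)\. into role.\_1. ;
```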

AGGREGATIONS
The aggregations defined take one or more input metrics expressed by
one or more regular expressions, similar to the match rules. Incoming
metrics are aggregated over a period of time defined by the interval
in seconds. Since events may arrive a bit later in time, the
expiration time in seconds defines when the aggregations should be
considered final, as no new entries are allowed to be added any more.
On top of an aggregation, multiple computations can be performed.
They can be of the same or different aggregation types, but should
each write to a unique new metric. The metric names can include
backreferences like in rewrite expressions, allowing for powerful
single aggregation rules that yield many aggregations. When no send
to clause is given, produced metrics are sent to the relay as if they
were submitted from the outside, hence match and aggregation rules
apply to them. Care should be taken to avoid loops this way. For this
reason, the use of the send to clause is encouraged, to direct the
output traffic where possible. As for match rules, it is possible to
define multiple cluster targets. Also as for match rules, the stop
keyword applies to control the flow of metrics in the matching
process.
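
As a sketch, an aggregation rule combining the clauses above could
look like this (the metric names and the graphite cluster are made up
for illustration):

```
aggregate ^sys\.([^.]+)\.cpu\.usage$
    every 60 seconds
    expire after 75 seconds
    compute sum write to sys.\1.cpu.usage.sum
    compute average write to sys.\1.cpu.usage.avg
    send to graphite
    stop
    ;
```

The backreference \1 makes one rule produce a sum and an average per
captured host, and the send to plus stop keep the produced metrics
from looping back through the rules.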

STATISTICS
The send statistics to construct is deprecated and will be removed in
the next release. Use the special statistics construct instead.

The statistics construct can control a couple of things about the
(internal) statistics produced by the relay. The send to target can
be used to avoid router loops by sending the statistics to certain
destination cluster(s). By default the metrics are prefixed with
carbon.relays.<hostname>, where hostname is determined on startup and
can be overridden using the -H argument. This prefix can be set using
the prefix with clause, similar to a rewrite rule target. The input
match in this case is the pre-set regular expression
^(([^.]+)(\..*)?)$ on the hostname. As such, one can see that the
default prefix is set by carbon.relays.\.1. Note that this uses the
replace-dot-with-underscore replacement feature from rewrite rules.
Given the input expression, the following match groups are available:
\1 the entire hostname, \2 the short hostname and \3 the domainname
(with leading dot). It may make sense to replace the default by
something like carbon.relays.\_2 for certain scenarios, to always use
the lowercased short hostname, which following the expression doesn't
contain a dot. By default, the metrics are submitted every 60
seconds; this can be changed using the submit every <interval>
seconds clause.
To obtain a set of values more compatible with carbon-cache.py, use
the reset counters after interval clause to make values
non-cumulative, that is, they will report the change compared to the
previous value.
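
Combining these clauses, a statistics construct might look like this
(the internal cluster name is illustrative):

```
statistics
    submit every 60 seconds
    reset counters after interval
    prefix with carbon.relays.\_2
    send to internal
    stop
    ;
```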

LISTENERS
The ports and protocols the relay should listen on for incoming
connections can be specified using the listen directive. Currently,
all listeners need to be of the linemode type. An optional
compression or encryption wrapping can be specified for the port and
optional interface given by ip address, or for a unix socket given by
file. When the interface is not specified, the any interface on all
available ip protocols is assumed. If no listen directive is present,
the relay will use the default listeners for port 2003 on tcp and
udp, plus the unix socket /tmp/.s.carbon-c-relay.2003. This typically
expands to 5 listeners on an IPv6-enabled system. The default matches
the behaviour of versions prior to v3.2.
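
A sketch of what those default listeners would look like when written
out explicitly as a listen directive:

```
listen
    type linemode
        2003 proto tcp
        2003 proto udp
        /tmp/.s.carbon-c-relay.2003 proto unix
    ;
```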

INCLUDES
In case the configuration becomes very long, or is managed better in
separate files, the include directive can be used to read another
file. The given file is read in place and added to the router
configuration at the time of inclusion. The end result is one big
route configuration. Multiple include statements can be used
throughout the configuration file. The positioning influences the
order of rules as normal. Beware that recursive inclusion (include
from an included file) is supported, and currently no safeguards
exist for an inclusion loop. For what it's worth, this feature is
likely best used with simple configuration files (e.g. not having
include in them).
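
For example, per-team rules could be kept in a directory and pulled
in with a glob (the paths are illustrative):

```
include /etc/carbon-c-relay.d/*.conf ;
```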

EXAMPLES
carbon-c-relay evolved over time, growing features on demand as the
tool proved to be stable and fitting the job well. Below follow some
annotated examples of constructs that can be used with the relay.

As many clusters can be defined as necessary. They receive data from
match rules, and their type defines which members of the cluster
finally get the metric data. The simplest cluster form is a forward
cluster:

cluster send-through forward 10.1.0.1 ;

Any metric sent to the send-through cluster would simply be forwarded
to the server at IPv4 address 10.1.0.1. If we define multiple
servers, all of those servers would get the same metric, thus:

cluster send-through forward 10.1.0.1 10.2.0.1 ;

The above results in a duplication of metrics sent to both machines.
This can be useful, but most of the time it is not. The any_of
cluster type is like forward, but it sends each incoming metric to
any one of the members. The same example with such a cluster would
be:

cluster send-to-any-one any_of 10.1.0.1:2010 10.1.0.1:2011;

This would implement a multipath scenario, where two servers are used
and the load between them is spread, but should any of them fail, all
metrics are sent to the remaining one. This typically works well for
upstream relays, or for balancing carbon-cache processes running on
the same machine. Should any member become unavailable, for instance
due to a rolling restart, the other members receive the traffic. If
it is necessary to have true fail-over, where the secondary server is
only used if the first is down, the following would implement that:

cluster try-first-then-second failover 10.1.0.1:2010 10.1.0.1:2011;

These types are different from the two consistent hash cluster types:

cluster graphite carbon_ch 127.0.0.1:2006=a 127.0.0.1:2007=b 127.0.0.1:2008=c ;

If a member in this example fails, all metrics that would go to that
member are kept in the queue, waiting for the member to return. This
is useful for clusters of carbon-cache machines where it is desirable
that the same metric always ends up on the same server. The carbon_ch
cluster type is compatible with the carbon-relay consistent hash, and
can be used for existing clusters populated by carbon-relay. For new
clusters, however, it is better to use the fnv1a_ch cluster type, for
it is faster and allows balancing over the same address but different
ports without an instance number, in contrast to carbon_ch.

Because we can use multiple clusters, we can also replicate without
the use of the forward cluster type, in a more intelligent way:

```
cluster dc-old carbon_ch replication 2 10.1.0.1 10.1.0.2 10.1.0.3 ;
cluster dc-new1 fnv1a_ch replication 2 10.2.0.1 10.2.0.2 10.2.0.3 ;
cluster dc-new2 fnv1a_ch replication 2 10.3.0.1 10.3.0.2 10.3.0.3 ;

match * send to dc-old ;
match * send to dc-new1 dc-new2 stop ;
```

In this example all incoming metrics are first sent to dc-old, then
dc-new1 and finally to dc-new2. Note that the cluster type of dc-old
is different. Each incoming metric will be sent to 2 members of all
three clusters, thus replicating to 6 destinations in total. For each
cluster the destination members are computed independently. Failure
of clusters or members does not affect the others, since all have
individual queues. The above example could also be written using
three match rules, one for each dc, or one match rule for all three
dcs. The difference is mainly in performance, i.e. the number of
times the incoming metric has to be matched against an expression.
The stop rule in the dc-new match rule is not strictly necessary in
this example, because no more match rules follow. However, if the
match would target a specific subset, e.g. ^sys\., and more clusters
were defined, this could be necessary, as for instance in the
following abbreviated example:

```
cluster dc1-sys ... ;
cluster dc2-sys ... ;

cluster dc1-misc ... ;
cluster dc2-misc ... ;

match ^sys. send to dc1-sys;
match ^sys. send to dc2-sys stop;

match * send to dc1-misc;
match * send to dc2-misc stop;
```

As can be seen, without the stop in dc2-sys' match rule, all metrics
starting with sys. would also be sent to dc1-misc and dc2-misc. It
can be that this is desired, of course, but in this example there is
a dedicated cluster for the sys metrics.
588
589 Suppose there would be some unwanted metric that unfortunately is gen‐
590 erated, let´s assume some bad/old software. We don´t want to store this
591 metric. The blackhole cluster is suitable for that, when it is harder
592 to actually whitelist all wanted metrics. Consider the following:
593
594 match some_legacy1$ some_legacy2$ send to blackhole stop;
595
596 This would throw away all metrics that end with some_legacy, that would
597 otherwise be hard to filter out. Since the order matters, it can be
598 used in a construct like this:
599
```
cluster old ... ;
cluster new ... ;

match * send to old;

match unwanted send to blackhole stop;

match * send to new;
```

In this example the old cluster would receive the metric that's
unwanted for the new cluster. So, the order in which the rules occur
does matter for the execution.

Validation can be used to ensure the data for metrics is as expected. A
global validation for integer (no floating point) values could be:

    match * validate ^[0-9]+\ [0-9]+$ else drop ;

(Note the escape of the space with a backslash \. You might be able to
use \s or [:space:] instead; this depends on your configured regex
implementation.)

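The validate expression is applied to the data part of the metric line,
i.e. the value and the timestamp. As a sketch of what the expression
above accepts and rejects, the same pattern can be exercised with
Python's re module (the sample inputs are made up for illustration):

```python
import re

# The same expression as in the validate clause above: an integer
# value followed by a single space and an integer timestamp.
pattern = re.compile(r"^[0-9]+ [0-9]+$")

print(bool(pattern.match("12345 1700000000")))  # True: integer value
print(bool(pattern.match("12.5 1700000000")))   # False: float value dropped
```

With the else drop clause, any line for which this match fails is
silently discarded.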
The validation clause can exist on every match rule, so in principle,
the following is valid:

    match ^foo validate ^[0-9]+\ [0-9]+$ else drop
        send to integer-cluster ;
    match ^foo validate ^[0-9.e+-]+\ [0-9.e+-]+$ else drop
        send to float-cluster stop;

Note that the behaviour differs between the previous two examples. When
no send to clusters are specified, a validation error makes the match
behave as if the stop keyword were present, while a passing validation
lets processing continue with the next rule. When destination clusters
are present, the match respects the stop keyword as usual: if it is
specified, processing always stops. However, if validation fails, the
rule does not send anything to the destination clusters; the metric is
dropped or logged, but never sent.

The relay is capable of rewriting incoming metrics on the fly. This
process is based on regular expressions with capture groups that allow
substituting parts of the metric in a replacement string. Rewrite rules
allow cleaning up metrics from applications, or providing a migration
path. In its simplest form a rewrite rule looks like this:

    rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
        into server.\_1.\2.\3.\3\4 ;

In this example a metric like server.DC.role.name123 would be
transformed into server.dc.role.name.name123 (the \_1 reference inserts
the first capture group lowercased). As with match rules, the order of
rewrite rules matters. Hence, to build on top of the old/new cluster
example from earlier, the following would store the original metric
name in the old cluster, and the new metric name in the new cluster:

```
match * send to old;

rewrite ... ;

match * send to new;
```

Note that after the rewrite, the original metric name is no longer
available, as the rewrite happens in-place.

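The effect of the rewrite rule above can be approximated in Python. The
\_1 (lowercase) reference has no direct equivalent in a plain re.sub
replacement string, so a replacement function is used instead; this is
only a sketch of the transformation, not how the relay implements it:

```python
import re

def rewrite(metric: str) -> str:
    # Mirrors: rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
    #              into server.\_1.\2.\3.\3\4
    pat = re.compile(r"^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)")
    return pat.sub(
        lambda m: "server.%s.%s.%s.%s%s" % (
            m.group(1).lower(),   # \_1: first group, lowercased
            m.group(2), m.group(3), m.group(3), m.group(4)),
        metric)

print(rewrite("server.DC.role.name123"))  # server.dc.role.name.name123
```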
Aggregations are probably the most complex part of carbon-c-relay. Two
ways of specifying aggregates are supported. The first, static rules,
are handled by an optimiser which tries to fold thousands of rules into
groups to make the matching more efficient. The second, dynamic rules,
are very powerful compact definitions with possibly thousands of
internal instantiations. A typical static aggregation looks like:

    aggregate
            ^sys\.dc1\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
            ^sys\.dc2\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
        every 10 seconds
        expire after 35 seconds
        timestamp at end of bucket
        compute sum write to
            mysql.somecluster.total_replication_delay
        compute average write to
            mysql.somecluster.average_replication_delay
        compute max write to
            mysql.somecluster.max_replication_delay
        compute count write to
            mysql.somecluster.replication_delay_metric_count
        ;

In this example, four aggregations are produced from the incoming
matching metrics. We could have written the two matches as one, but did
not for demonstration purposes. Obviously they can refer to different
metrics, if that makes sense. The every 10 seconds clause specifies at
what interval the aggregator can expect new metrics to arrive. This
interval is used to produce the aggregations, thus every 10 seconds 4
new metrics are generated from the data received so far. Because data
may be in transit for some reason, or generation stalled, the expire
after clause specifies how long the data should be kept before
considering a data bucket (which is aggregated) to be complete. In the
example, 35 was used, which means the first aggregates are produced
after 35 seconds. It also means that metrics can arrive 35 seconds late
and still be taken into account. The exact time at which the aggregate
metrics are produced is random, between 0 and interval (10 in this
case) seconds after the expiry time. This is done to prevent thundering
herds of metrics for large aggregation sets. The timestamp that is used
for the aggregations can be specified to be the start, middle or end of
the bucket. The original carbon-aggregator.py uses start, while
carbon-c-relay's default has always been end. The compute clauses
demonstrate that a single aggregation rule can produce multiple
aggregates, as is often the case. Internally, this comes for free,
since all possible aggregates are always calculated, whether or not
they are used. The produced new metrics are resubmitted to the relay,
hence matches defined earlier in the configuration can match output of
the aggregator. It is important to avoid loops that can be generated
this way. In general, it is good practice to split aggregations out to
their own carbon-c-relay instance, such that it is easy to forward the
produced metrics to another relay instance.

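One way to structure such a split is a front relay that diverts the
aggregation input to a dedicated aggregating instance, which in turn
sends its produced metrics on to storage. The hostnames and addresses
below are placeholders; this is only a sketch of the idea:

```
# front relay: no aggregate rules here, so no loops are possible
cluster storage fnv1a_ch
    10.0.0.1:2003 10.0.0.2:2003 ;
cluster aggregator forward
    aggregator.example.com:2003 ;

match ^sys\. send to aggregator stop ;
match * send to storage ;
```

The aggregating instance then carries only the aggregate rules and a
single match * send to storage, keeping the loop-prone part isolated.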
The previous example could also be written as follows, using a dynamic
rule:

    aggregate
            ^sys\.dc[0-9].(somehost-[0-9]+)\.([^.]+)\.mysql\.replication_delay
        every 10 seconds
        expire after 35 seconds
        compute sum write to
            mysql.host.\1.replication_delay
        compute sum write to
            mysql.host.all.replication_delay
        compute sum write to
            mysql.cluster.\2.replication_delay
        compute sum write to
            mysql.cluster.all.replication_delay
        ;

Here a single match results in four aggregations, each of a different
scope. In this example aggregations based on hostname and cluster are
made, as well as the more general all targets, which in this example
both have identical values. Note that with this single aggregation
rule, per-cluster, per-host and total aggregations are all produced.
Obviously, the input metrics define which host and cluster aggregates
are produced.

With use of the send to clause, aggregations can be made more intuitive
and less error-prone. Consider the example below:

```
cluster graphite fnv1a_ch ip1 ip2 ip3;

aggregate ^sys.somemetric
    every 60 seconds
    expire after 75 seconds
    compute sum write to
        sys.somemetric
    send to graphite
    stop
    ;

match * send to graphite;
```

It sends all incoming metrics to the graphite cluster, except the
sys.somemetric ones, which it replaces with a sum of all the incoming
ones. Without a stop in the aggregate, this causes a loop; and without
the send to, the metric can't keep its original name, for the output
would then go directly to the cluster.

When configuring clusters you might want to check how the metrics will
be routed and hashed. That's what the -t flag is for. Take the
following configuration:

```
cluster graphite_swarm_odd fnv1a_ch replication 1
    host01.dom:2003=31F7A65E315586AC198BD798B6629CE4903D089947
    host03.dom:2003=9124E29E0C92EB63B3834C1403BD2632AA7508B740
    host05.dom:2003=B653412CD96B13C797658D2C48D952AEC3EB667313
    ;

cluster graphite_swarm_even fnv1a_ch replication 1
    host02.dom:2003=31F7A65E315586AC198BD798B6629CE4903D089947
    host04.dom:2003=9124E29E0C92EB63B3834C1403BD2632AA7508B740
    host06.dom:2003=B653412CD96B13C797658D2C48D952AEC3EB667313
    ;

match * send to graphite_swarm_odd graphite_swarm_even stop ;
```

Running the command `echo "my.super.metric" | carbon-c-relay -f
config.conf -t` will result in:

```
[...]
match * -> my.super.metric
    fnv1a_ch(graphite_swarm_odd)
        host03.dom:2003
    fnv1a_ch(graphite_swarm_even)
        host04.dom:2003
    stop
```

You now know that your metric my.super.metric will be hashed and arrive
on the host03 and host04 machines. Adding the -d flag will increase the
amount of information by showing you the hashring.

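The fnv1a_ch cluster type places destinations on the hash ring using
the FNV-1a hash function. As background, a minimal 32-bit FNV-1a can be
sketched in a few lines of Python; note this shows only the bare hash,
not the relay's actual ring-placement logic:

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte into the hash, multiply by the prime."""
    h = 0x811c9dc5  # FNV offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xffffffff  # FNV prime, kept to 32 bits
    return h

print(hex(fnv1a_32(b"a")))  # 0xe40c292c, a published FNV-1a test vector
```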
STATISTICS

When carbon-c-relay is run without the -d or -s arguments, statistics
will be produced. By default they are sent to the relay itself in the
form of carbon.relays.<hostname>.*. See the statistics construct to
override this prefix, the sending interval and the values produced.
While many metrics have a similar name to what carbon-cache.py would
produce, their values are likely different. By default, most values are
running counters which only increase over time. The
nonNegativeDerivative() function from graphite is useful with these.

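For instance, to graph the per-interval throughput from the running
metricsSent counter, a graphite target such as the following can be
used (relayhost is a placeholder for your relay's hostname):

```
nonNegativeDerivative(carbon.relays.relayhost.metricsSent)
```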
The following metrics are produced under the carbon.relays.<hostname>
namespace:

○ metricsReceived

The number of metrics that were received by the relay. Received here
means that they were seen and processed by any of the dispatchers.

○ metricsSent

The number of metrics that were sent from the relay. This is a total
count for all servers combined. When incoming metrics are duplicated by
the cluster configuration, this counter will include all those
duplications. In other words, it is the number of metrics that were
successfully sent to other systems. Note that metrics that are
processed (received) but still in the sending queue (queued) are not
included in this counter.

○ metricsDiscarded

The number of input lines that were not considered to be a valid
metric. Such lines can be empty, contain only whitespace, or exceed the
limits given for max input length and/or max metric length (see the -m
and -M options).

○ metricsQueued

The total number of metrics that are currently in the queues for all
the server targets. This metric is not cumulative, for it is a sample
of the queue size, which can (and should) go up and down. Therefore you
should not use the derivative function for this metric.

○ metricsDropped

The total number of metrics that had to be dropped due to server queues
overflowing. A queue typically overflows when the server it tries to
send its metrics to is not reachable, or too slow in ingesting the
amount of metrics queued. This can be network or resource related, and
also greatly depends on the rate of metrics being sent to the
particular server.

○ metricsBlackholed

The number of metrics that did not match any rule, or matched a rule
with blackhole as target. Depending on your configuration, a high value
might be an indication of a misconfiguration somewhere. These metrics
were received by the relay, but never sent anywhere, thus they
disappeared.

○ metricStalls

The number of times the relay had to stall a client to indicate that
the downstream server cannot handle the stream of metrics. A stall is
only performed when the queue is full and the server is actually
receptive of metrics, but just too slow at the moment. Stalls typically
happen during micro-bursts, where the client is unaware that it should
stop sending more data, while it would be able to.

○ connections

The number of connect requests handled. This is an ever increasing
number just counting how many connections were accepted.

○ disconnects

The number of disconnected clients. A disconnect either happens because
the client goes away, or due to an idle timeout in the relay. The
difference between this metric and connections is the number of
connections actively held by the relay. In normal situations this
number remains within reasonable bounds. Many connections but few
disconnections typically indicate a possible connection leak in the
client. The relay's idle-connection disconnect guards against resource
drain in such scenarios.

○ dispatch_wallTime_us

The number of microseconds spent by the dispatchers to do their work.
In particular on multi-core systems this value can be confusing;
however, it indicates how long the dispatchers were doing work handling
clients. It includes everything they do, from reading data from a
socket, cleaning up the input metric, to adding the metric to the
appropriate queues. The larger the configuration, and the more complex
in terms of matches, the more time the dispatchers will spend on the
cpu. But time they do /not/ spend on the cpu is also included in this
number: it is the pure wallclock time the dispatcher was serving a
client.

○ dispatch_sleepTime_us

The number of microseconds spent by the dispatchers sleeping while
waiting for work. When this value gets small (or even zero) the
dispatcher has so much work that it doesn't sleep any more, and likely
can't process the work in a timely fashion any more. This value plus
the wallTime from above roughly sums to the total uptime of the
dispatcher. Therefore, expressing the wallTime as a percentage of this
sum gives a busyness percentage, rising all the way up to 100% as
sleepTime approaches 0.

○ server_wallTime_us

The number of microseconds spent by the servers to send the metrics
from their queues. This value includes connection creation, reading
from the queue, and sending metrics over the network.

○ dispatcherX

For each individual dispatcher, the metrics received and blackholed
plus the wall clock time. The values are as described above.

○ destinations.X

For all known destinations, the number of dropped, queued and sent
metrics plus the wall clock time spent. The values are as described
above.

○ aggregators.metricsReceived

The number of metrics that matched an aggregator rule and were accepted
by the aggregator. When a metric matches multiple aggregators, this
value will reflect that. A metric is not counted when it is considered
syntactically invalid, e.g. when no value was found.

○ aggregators.metricsDropped

The number of metrics that were sent to an aggregator, but did not fit
timewise, either because the metric was too far in the past or too far
in the future. The expire after clause in aggregate statements controls
how long in the past metric values are accepted.

○ aggregators.metricsSent

The number of metrics that were sent from the aggregators. These
metrics were produced and are the actual results of aggregations.
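As an example of combining these counters, the dispatcher busyness
percentage described under dispatch_wallTime_us and
dispatch_sleepTime_us can be expressed as a graphite target. This is a
sketch: relayhost is a placeholder hostname, and derivatives are taken
because both values are running counters:

```
scale(divideSeries(
    nonNegativeDerivative(carbon.relays.relayhost.dispatch_wallTime_us),
    sumSeries(
        nonNegativeDerivative(carbon.relays.relayhost.dispatch_wallTime_us),
        nonNegativeDerivative(carbon.relays.relayhost.dispatch_sleepTime_us))),
  100)
```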
BUGS

Please report them at: https://github.com/grobian/carbon-c-relay/issues

AUTHOR

Fabian Groffen <grobian@gentoo.org>

SEE ALSO

All other utilities from the graphite stack.

This project aims to be a fast replacement of the original Carbon relay
http://graphite.readthedocs.org/en/1.0/carbon-daemons.html#carbon-relay-py.
carbon-c-relay aims to deliver performance and configurability. Carbon
is single threaded, and sending metrics to multiple consistent-hash
clusters requires chaining of relays. This project provides a
multithreaded relay which can address multiple targets and clusters for
each and every metric based on pattern matches.

There are a couple more replacement projects out there, such as
carbon-relay-ng https://github.com/graphite-ng/carbon-relay-ng and
graphite-relay https://github.com/markchadwick/graphite-relay.

Compared to carbon-relay-ng, this project does provide carbon's
consistent-hash routing. graphite-relay, which does do consistent
hashing, doesn't do metric-based matches to direct the traffic, which
this project does as well. To date, carbon-c-relay can do aggregations,
failover targets and more.

ACKNOWLEDGEMENT

This program was originally developed for Booking.com, which approved
that the code was published and released as Open Source on GitHub, for
which the author would like to express his gratitude. Development has
continued since with the help of many contributors suggesting features,
reporting bugs, adding patches and more to make carbon-c-relay into
what it is today.

November 2021                                          CARBON-C-RELAY(1)