CARBON-C-RELAY(1)                                            CARBON-C-RELAY(1)


NAME
       carbon-c-relay - graphite relay, aggregator and rewriter

SYNOPSIS
       carbon-c-relay -f config-file [ options ... ]

DESCRIPTION
       carbon-c-relay accepts, cleanses, matches, rewrites, forwards
       and aggregates graphite metrics by listening for incoming
       connections and relaying the messages to other servers defined
       in its configuration.  The core functionality is to route
       messages via flexible rules to the desired destinations.

       carbon-c-relay is a simple program that reads its routing
       information from a file.  The command line arguments allow one
       to set the location of this file, as well as the number of
       dispatchers (worker threads) to use for reading the data from
       incoming connections and passing it on to the right
       destination(s).  The route file supports two main constructs:
       clusters and matches.  The former define groups of hosts that
       metrics can be sent to, the latter define which metrics should
       be sent to which cluster.  Aggregation rules are treated as
       matches.

       For every metric received by the relay, cleansing is performed.
       The following changes are made before any match, aggregate or
       rewrite rule sees the metric (an illustration follows the
       list):

       ·   double dot elimination (necessary for correctly functioning
           consistent hash routing)

       ·   trailing/leading dot elimination

       ·   whitespace normalisation (this mostly affects output of the
           relay to other targets: metric, value and timestamp will be
           separated by a single space only, ever)

       ·   irregular char replacement with underscores (_); currently
           irregular is defined as not being in [0-9a-zA-Z-_:#], but
           this can be overridden on the command line.  Note that tags
           (when present and allowed) are not processed this way.

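       For example, assuming the default set of allowed characters, an
       incoming line with a duplicate dot, a stray slash and irregular
       spacing, such as

           foo..bar/baz.cpu   42    1500000000

       would be forwarded after cleansing as:

           foo.bar_baz.cpu 42 1500000000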

OPTIONS
       These options control the behaviour of carbon-c-relay.

       ·   -v: Print version string and exit.

       ·   -d: Enable debug mode.  This prints statistics to stdout
           and prints extra messages about some situations encountered
           by the relay that normally would be too verbose to be
           enabled.  When combined with -t (test mode) this also
           prints stub routes and consistent-hash ring contents.

       ·   -s: Enable submission mode.  In this mode, internal
           statistics are not generated.  Instead, queue pressure and
           metric drops are reported on stdout.  This mode is useful
           for a submission relay whose job is just to forward to (a
           set of) main relays.  Statistics about the submission
           relays are not needed in that case, and could easily cause
           an undesired flood of metrics, e.g. when used on each and
           every host locally.

       ·   -t: Test mode.  This mode doesn't do any routing at all;
           instead it reads input from stdin and prints what actions
           would be taken given the loaded configuration.  This mode
           is very useful for testing relay routes for regular
           expression syntax etc.  It also gives insight into how
           routing is applied in complex configurations, for it shows
           rewrites and aggregates taking place as well.  When -t is
           repeated, the relay will only test the configuration for
           validity and exit immediately afterwards.  Any standard
           output is suppressed in this mode, making it ideal for
           start-scripts to test a (new) configuration; a sample
           invocation is sketched after this list.

       ·   -f config-file: Read configuration from config-file.  A
           configuration consists of clusters and routes.  See
           CONFIGURATION SYNTAX for more information on the options
           and syntax of this file.

       ·   -l log-file: Use log-file for writing messages.  Without
           this option, the relay writes both to stdout and to stderr.
           When logging to file, all messages are prefixed with MSG
           when they were sent to stdout, and with ERR when they were
           sent to stderr.

       ·   -p port: Listen for connections on port port.  The port
           number is used for TCP, UDP and UNIX sockets alike.  In the
           latter case, the socket file contains the port number.  The
           port defaults to 2003, which is also used by the original
           carbon-cache.py.  Note that this only applies to the
           defaults; when listen directives are present in the config,
           this setting is ignored.

       ·   -w workers: Use workers number of threads.  The default
           number of workers is equal to the number of detected CPU
           cores.  It makes sense to reduce this number on many-core
           machines, or when the traffic is low.

       ·   -b batchsize: Set the number of metrics that are sent to
           remote servers at once to batchsize.  When the relay sends
           metrics to servers, it will retrieve batchsize metrics from
           the pending queue of metrics waiting for that server and
           send those one by one.  The size of the batch has minimal
           impact on sending performance, but it controls the amount
           of lock-contention on the queue.  The default is 2500.

       ·   -q queuesize: Each server from the configuration that the
           relay sends metrics to has a queue associated with it.
           This queue allows disruptions and bursts to be handled.
           The size of this queue is set to queuesize, which allows
           that number of metrics to be stored in the queue before it
           overflows and the relay starts dropping metrics.  The
           larger the queue, the more metrics can be absorbed, but
           also the more memory the relay will use.  The default queue
           size is 25000.

       ·   -L stalls: Set the maximum number of stalls to stalls
           before the relay starts dropping metrics for a server.
           When a queue fills up, the relay uses a mechanism called
           stalling to signal the client (writing to the relay) of
           this event.  In particular when the client sends a large
           number of metrics in a very short time (burst), stalling
           can help to avoid dropping metrics, since the client just
           needs to slow down for a bit, which in many cases is
           possible (e.g. when catting a file with nc(1)).  However,
           this behaviour can also obstruct, artificially stalling
           writers which cannot stop that easily.  For this, stalls
           can be set from 0 to 15, where each stall can take around 1
           second on the client.  The default value is 4, which is
           aimed at the occasional disruption scenario and a maximum
           effort not to lose metrics while moderately slowing down
           clients.

       ·   -C CAcertpath: Read CA certs (for use with TLS/SSL
           connections) from the given path or file.  When not given,
           the default locations are used.  Strict verification of the
           peer is performed, so when using self-signed certificates,
           be sure to include the CA cert in the default location, or
           provide the path to the cert using this option.

       ·   -T timeout: Specify the IO timeout in milliseconds used for
           server connections.  The default is 600 milliseconds, but
           it may need increasing when WAN links are used for target
           servers.  A relatively low connection timeout allows the
           relay to quickly establish that a server is unreachable,
           and as such lets failover strategies kick in before the
           queue runs high.

       ·   -c chars: Define the characters that are allowed in
           metrics, in addition to [A-Za-z0-9], to be chars.  Any
           character not in this list is replaced by the relay with _
           (underscore).  The default list of allowed characters is
           -_:#.

       ·   -m length: Limit metric names to at most length bytes.  Any
           lines containing metric names longer than this will be
           discarded.

       ·   -M length: Limit the input to lines of at most length
           bytes.  Any excess lines will be discarded.  Note that -m
           needs to be smaller than this value.

       ·   -H hostname: Override the hostname determined by a call to
           gethostname(3) with hostname.  The hostname is used mainly
           in the statistics metrics carbon.relays.<hostname>.<...>
           sent by the relay.

       ·   -B backlog: Set the TCP connection listen backlog to
           backlog connections.  The default value is 32, but on
           servers which receive many concurrent connections, this
           setting likely needs to be increased to avoid connection
           refused errors on the clients.

       ·   -U bufsize: Set the socket send/receive buffer sizes in
           bytes, for both TCP and UDP scenarios.  When unset, the OS
           default is used.  The maximum is also determined by the OS.
           The sizes are set using setsockopt with the flags SO_RCVBUF
           and SO_SNDBUF.  Setting this size may be necessary for
           large volume scenarios, for which -B might also apply.
           Checking the Recv-Q and the receive errors values from
           netstat gives a good hint about buffer usage.

       ·   -E: Disable disconnecting idle incoming connections.  By
           default the relay disconnects idle client connections after
           10 minutes.  It does this to prevent resources from
           clogging up when a faulty or malicious client keeps on
           opening connections without closing them.  It typically
           prevents running out of file descriptors.  For some
           scenarios, however, it is not desirable for idle
           connections to be disconnected, hence passing this flag
           will disable this behaviour.

       ·   -D: Daemonise into the background after startup.  This
           option requires the -l and -P flags to be set as well; an
           example invocation is given after this list.

       ·   -P pidfile: Write the pid of the relay process to a file
           called pidfile.  This is in particular useful when
           daemonised in combination with init managers.

       ·   -O threshold: The minimum number of rules to find before
           trying to optimise the ruleset.  The default is 50; to
           disable the optimiser use -1, and to always run it use 0.
           The optimiser tries to group rules to avoid spending
           excessive time on matching expressions.
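
       For example, to see how a given metric would be routed, a
       sample line can be fed to test mode (the file name relay.conf
       is only an example):

           $ echo "foo.bar 1 $(date +%s)" | carbon-c-relay -t -f relay.conf

       To merely validate a configuration from a start-script, -t can
       be repeated; all output is suppressed, so only the exit status
       needs checking (assuming a non-zero status on failure):

           $ carbon-c-relay -t -t -f relay.conf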
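       Similarly, a daemonised instance could be started as follows,
       where all paths are merely illustrative:

           $ carbon-c-relay -D -f /etc/carbon-c-relay.conf \
                 -l /var/log/carbon-c-relay.log \
                 -P /var/run/carbon-c-relay.pid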

CONFIGURATION SYNTAX
       The config file supports the following syntax, where comments
       start with a # character and can appear at any position on a
       line, suppressing input until the end of that line:

          cluster <name>
              < <forward | any_of | failover> [useall] |
                <carbon_ch | fnv1a_ch | jump_fnv1a_ch> [replication <count>] [dynamic] >
                  <host[:port][=instance] [proto <udp | tcp>]
                   [type linemode]
                   [transport <plain | gzip | lz4 | snappy>
                              [ssl]]> ...
              ;

          cluster <name>
              file [ip]
                  </path/to/file> ...
              ;

          match
                  <* | expression ...>
              [validate <expression> else <log | drop>]
              send to <cluster ... | blackhole>
              [stop]
              ;

          rewrite <expression>
              into <replacement>
              ;

          aggregate
                  <expression> ...
              every <interval> seconds
              expire after <expiration> seconds
              [timestamp at <start | middle | end> of bucket]
              compute <sum | count | max | min | average |
                       median | percentile<%> | variance | stddev> write to
                  <metric>
              [compute ...]
              [send to <cluster ...>]
              [stop]
              ;

          send statistics to <cluster ...>
              [stop]
              ;
          statistics
              [submit every <interval> seconds]
              [reset counters after interval]
              [prefix with <prefix>]
              [send to <cluster ...>]
              [stop]
              ;

          listen
              type linemode [transport <plain | gzip | lz4 | snappy> [ssl <pemcert>]]
                  <<interface[:port] | port> proto <udp | tcp>> ...
                  </path/to/file proto unix> ...
              ;

          include </path/to/file/or/glob>
              ;

   CLUSTERS
       Multiple clusters can be defined, and they need not be
       referenced by a match rule.  All clusters point to one or more
       hosts, except the file cluster, which writes to files in the
       local filesystem.  host may be an IPv4 or IPv6 address, or a
       hostname.  Since host is followed by an optional : and port,
       for IPv6 addresses not to be interpreted wrongly, either a port
       must be given, or the IPv6 address must be surrounded by
       brackets, e.g. [::1].  Optional transport and proto clauses can
       be used to wrap the connection in a compression or encryption
       layer, or to specify the use of UDP or TCP to connect to the
       remote server.  When omitted, the connection defaults to an
       unwrapped TCP connection.  type can only be linemode at the
       moment.

       The forward and file clusters simply send everything they
       receive to all defined members (host addresses or files).  The
       any_of cluster is a small variant of the forward cluster, but
       instead of sending to all defined members, it sends each
       incoming metric to one of the defined members.  This is not
       very useful in itself, but since any of the members can receive
       each metric, it means that when one of the members is
       unreachable, the other members will receive all of the metrics.
       This can be useful when the cluster points to other relays.
       The any_of router tries to send the same metrics consistently
       to the same destination.  The failover cluster is like the
       any_of cluster, but sticks to the order in which servers are
       defined.  This is to implement a pure failover scenario between
       servers.  The carbon_ch cluster sends the metrics to the member
       that is responsible according to the consistent hash algorithm
       (as used in the original carbon), or to multiple members if
       replication is set to more than 1.  When dynamic is set,
       failure of any of the servers does not result in metrics being
       dropped for that server; instead the undeliverable metrics are
       sent to any other server in the cluster in order for the
       metrics not to get lost.  This is most useful when replication
       is 1.  The fnv1a_ch cluster is identical in behaviour to
       carbon_ch, but it uses a different hash technique (FNV1a) which
       is faster, but more importantly is defined to get around a
       limitation of carbon_ch by using both host and port from the
       members.  This is useful when multiple targets live on the same
       host, just separated by port.  The instance that the original
       carbon uses to get around this can be set by appending it after
       the port, separated by an equals sign, e.g. 127.0.0.1:2006=a
       for instance a.  When using the fnv1a_ch cluster, this instance
       overrides the hash key in use.  This allows for many things,
       including masquerading old IP addresses, but mostly it makes
       the hash key location agnostic of the (physical) location of
       that key.  For example, usage like
       10.0.0.1:2003=4d79d13554fa1301476c1f9fe968b0ac would allow one
       to change the port and/or ip address of the server that
       receives data for the instance key.  Obviously, this way
       migration of data can be dealt with much more conveniently.
       The jump_fnv1a_ch cluster is also a consistent hash cluster
       like the previous two, but it does not take the server
       information into account at all.  Whether this is useful to you
       depends on your scenario.  The jump hash has a much better
       balancing over the servers defined in the cluster, at the
       expense of not being able to remove any server but the last in
       order.  What this means is that this hash is fine to use with
       ever-growing clusters where older nodes are also replaced at
       some point.  If you have a cluster where removal of old nodes
       takes place often, the jump hash is not suitable for you.  Jump
       hash works with servers in an ordered list without gaps.  To
       influence the ordering, the instance given to the server will
       be used as sorting key.  Without it, the order will be as given
       in the file.  It is good practice to fix the order of the
       servers with instances, such that it is explicit what the right
       nodes for the jump hash are; a sketch follows below.
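       As a sketch, a jump_fnv1a_ch cluster whose order is pinned down
       with instances could look like this (addresses and instance
       names are only examples):

           cluster jump
               jump_fnv1a_ch
                   10.0.0.1:2003=a
                   10.0.0.2:2003=b
                   10.0.0.3:2003=c
               ;
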
       DNS hostnames are resolved to a single address, according to
       the preference rules in RFC 3484
       https://www.ietf.org/rfc/rfc3484.txt.  The any_of, failover and
       forward clusters have an explicit useall flag that enables
       expansion for hostnames resolving to multiple addresses.  Each
       address returned becomes a cluster destination.

   MATCHES
       Match rules are the way to direct incoming metrics to one or
       more clusters.  Match rules are processed top to bottom as they
       are defined in the file.  It is possible to define multiple
       matching expressions in the same rule.  Each match rule can
       send data to one or more clusters.  Since match rules "fall
       through" unless the stop keyword is added, carefully crafted
       match expressions can be used to target multiple clusters or
       aggregations.  This ability makes it possible to replicate
       metrics, as well as to send certain metrics to alternative
       clusters, with careful ordering and usage of the stop keyword.
       The special cluster blackhole discards any metrics sent to it.
       This can be useful for weeding out unwanted metrics in certain
       cases.  Because throwing metrics away is pointless if other
       matches would accept the same data, a match with the blackhole
       cluster as destination has an implicit stop.  The validate
       clause adds a check on the data (what comes after the metric
       name) in the form of a regular expression.  When this
       expression matches, the match rule executes as if no validate
       clause were present.  However, if it fails, the match rule is
       aborted and no metrics will be sent to destinations; this is
       the drop behaviour.  When log is used, the metric is logged to
       stderr instead.  Care should be taken with the latter to avoid
       log flooding.  When a validate clause is present, destinations
       need not be present; this allows for applying a global
       validation rule.  Note that the cleansing rules are applied
       before validation is done, thus the data will not have
       duplicate spaces.  The route using clause is used to perform a
       temporary modification to the key used as input for the
       consistent hashing routines.  The primary purpose is to route
       traffic such that appropriate data is sent to the needed
       aggregation instances.

   REWRITES
       Rewrite rules take a regular expression as input to match
       incoming metrics, and transform them into the desired new
       metric name.  In the replacement, backreferences are allowed to
       refer to capture groups defined in the input regular
       expression.  A match of server\.(x|y|z)\. allows one to use
       e.g. role.\1. in the substitution.  A few caveats apply to the
       current implementation of rewrite rules.  First, their location
       in the config file determines when the rewrite is performed.
       The rewrite is done in-place; as such, a match rule before the
       rewrite matches the original name, while a match rule after the
       rewrite no longer matches the original name.  Care should be
       taken with the ordering, as multiple rewrite rules in
       succession can take effect, e.g. a gets replaced by b, and b
       gets replaced by c in a succeeding rewrite rule.  The second
       caveat with the current implementation is that rewritten metric
       names are not cleansed, like newly incoming metrics are.  Thus,
       double dots and potentially dangerous characters can appear if
       the replacement string is crafted to produce them.  It is the
       responsibility of the writer to make sure the metrics are
       clean.  If this is an issue for routing, one can consider
       having a rewrite-only instance that forwards all metrics to
       another instance that does the routing.  Obviously the second
       instance will cleanse the metrics as they come in.  The
       backreference notation allows lowercasing and uppercasing the
       replacement string with the use of the underscore (_) and caret
       (^) symbols following directly after the backslash.  For
       example, role.\_1. as substitution will lowercase the contents
       of \1.  The dot (.) can be used in a similar fashion, or after
       the underscore or caret, to replace dots with underscores in
       the substitution.  This can be handy for some situations where
       metrics are sent to graphite.

   AGGREGATIONS
       The aggregations defined take one or more input metrics
       expressed by one or more regular expressions, similar to the
       match rules.  Incoming metrics are aggregated over a period of
       time defined by the interval in seconds.  Since events may
       arrive a bit later in time, the expiration time in seconds
       defines when the aggregations should be considered final, as no
       new entries are allowed to be added any more.  On top of an
       aggregation, multiple aggregates can be computed.  They can be
       of the same or different aggregation types, but should each
       write to a unique new metric.  The metric names can include
       backreferences like in rewrite expressions, allowing for
       powerful single aggregation rules that yield many aggregations.
       When no send to clause is given, produced metrics are sent to
       the relay as if they were submitted from the outside, hence
       match and aggregation rules apply to them.  Care should be
       taken that loops are avoided this way.  For this reason, use of
       the send to clause is encouraged, to direct the output traffic
       where possible.  As with match rules, it is possible to define
       multiple cluster targets.  Also like match rules, the stop
       keyword applies, to control the flow of metrics in the matching
       process.

   STATISTICS
       The send statistics to construct is deprecated and will be
       removed in the next release.  Use the special statistics
       construct instead.

       The statistics construct can control a couple of things about
       the (internal) statistics produced by the relay.  The send to
       target can be used to avoid router loops by sending the
       statistics directly to certain destination cluster(s).  By
       default the metrics are prefixed with carbon.relays.<hostname>,
       where hostname is determined on startup and can be overridden
       using the -H argument.  This prefix can be set using the prefix
       with clause, similar to a rewrite rule target.  The input match
       in this case is the pre-set regular expression
       ^(([^.]+)(\..*)?)$ on the hostname.  As such, one can see that
       the default prefix is set by carbon.relays.\.1.  Note that this
       uses the replace-dot-with-underscore replacement feature from
       rewrite rules.  Given the input expression, the following match
       groups are available: \1 the entire hostname, \2 the short
       hostname and \3 the domainname (with leading dot).  It may make
       sense to replace the default with something like
       carbon.relays.\_2 for certain scenarios, to always use the
       lowercased short hostname, which following the expression
       doesn't contain a dot.  By default, the metrics are submitted
       every 60 seconds; this can be changed using the submit every
       <interval> seconds clause.
       To obtain a set of values more compatible with carbon-cache.py,
       use the reset counters after interval clause to make values
       non-cumulative, that is, they will report the change compared
       to the previous value.
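
       As a sketch, assuming a cluster named graphite is defined, a
       statistics construct combining the above clauses could look
       like:

           statistics
               submit every 60 seconds
               reset counters after interval
               prefix with carbon.relays.\_2
               send to graphite
               stop
               ;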
   LISTENERS
       The ports and protocols the relay should listen on for incoming
       connections can be specified using the listen directive.
       Currently, all listeners need to be of the linemode type.  An
       optional compression or encryption wrapping can be specified
       for the port and optional interface given by ip address, or for
       the unix socket given by file.  When an interface is not
       specified, the any interface on all available ip protocols is
       assumed.  If no listen directive is present, the relay will use
       the default listeners for port 2003 on tcp and udp, plus the
       unix socket /tmp/.s.carbon-c-relay.2003.  This typically
       expands to 5 listeners on an IPv6-enabled system.  The default
       matches the behaviour of versions prior to v3.2.
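
       A sketch of a listen directive that spells out the defaults
       described above:

           listen
               type linemode
                   2003 proto udp
                   2003 proto tcp
                   /tmp/.s.carbon-c-relay.2003 proto unix
               ;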
   INCLUDES
       In case the configuration becomes very long, or is managed
       better in separate files, the include directive can be used to
       read another file.  The given file will be read in place and
       added to the router configuration at the time of inclusion.
       The end result is one big route configuration.  Multiple
       include statements can be used throughout the configuration
       file.  The positioning will influence the order of rules as
       normal.  Beware that recursive inclusion (include from an
       included file) is supported, and that currently no safeguards
       exist against an inclusion loop.  For what it's worth, this
       feature is likely best used with simple configuration files
       (e.g. not having include in them).
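
       Since a glob is accepted, a whole directory of configuration
       snippets can be pulled in at once; the path is only an example:

           include /etc/carbon-c-relay.d/*.conf;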

EXAMPLES
       carbon-c-relay evolved over time, growing features on demand as
       the tool proved to be stable and fitting the job well.  Below
       follow some annotated examples of constructs that can be used
       with the relay.

       Clusters can be defined as often as necessary.  They receive
       data from match rules, and their type defines which members of
       the cluster finally get the metric data.  The simplest cluster
       form is a forward cluster:

           cluster send-through
               forward
                   10.1.0.1
               ;

       Any metric sent to the send-through cluster would simply be
       forwarded to the server at IPv4 address 10.1.0.1.  If we define
       multiple servers, all of those servers would get the same
       metric, thus:

           cluster send-through
               forward
                   10.1.0.1
                   10.2.0.1
               ;

       The above results in a duplication of metrics sent to both
       machines.  This can be useful, but most of the time it is not.
       The any_of cluster type is like forward, but it sends each
       incoming metric to any one of the members.  The same example
       with such a cluster would be:

           cluster send-to-any-one
               any_of 10.1.0.1:2010 10.1.0.1:2011;

       This would implement a multipath scenario, where two servers
       are used and the load between them is spread, but should any of
       them fail, all metrics are sent to the remaining one.  This
       typically works well for upstream relays, or for balancing
       carbon-cache processes running on the same machine.  Should any
       member become unavailable, for instance due to a rolling
       restart, the other members receive the traffic.  If it is
       necessary to have true fail-over, where the secondary server is
       only used if the first is down, the following would implement
       that:

           cluster try-first-then-second
               failover 10.1.0.1:2010 10.1.0.1:2011;

       These types are different from the two consistent hash cluster
       types:

           cluster graphite
               carbon_ch
                   127.0.0.1:2006=a
                   127.0.0.1:2007=b
                   127.0.0.1:2008=c
               ;

       If a member in this example fails, all metrics that would go to
       that member are kept in the queue, waiting for the member to
       return.  This is useful for clusters of carbon-cache machines
       where it is desirable that the same metric always ends up on
       the same server.  The carbon_ch cluster type is compatible with
       carbon-relay consistent hashing, and can be used for existing
       clusters populated by carbon-relay.  For new clusters, however,
       it is better to use the fnv1a_ch cluster type, for it is
       faster, and allows balancing over the same address but
       different ports without an instance number, in contrast to
       carbon_ch.

       Because we can use multiple clusters, we can also replicate
       without the use of the forward cluster type, in a more
       intelligent way:

           cluster dc-old
               carbon_ch replication 2
                   10.1.0.1
                   10.1.0.2
                   10.1.0.3
               ;
           cluster dc-new1
               fnv1a_ch replication 2
                   10.2.0.1
                   10.2.0.2
                   10.2.0.3
               ;
           cluster dc-new2
               fnv1a_ch replication 2
                   10.3.0.1
                   10.3.0.2
                   10.3.0.3
               ;

           match *
               send to dc-old
               ;
           match *
               send to
                   dc-new1
                   dc-new2
               stop
               ;

       In this example all incoming metrics are first sent to dc-old,
       then dc-new1 and finally to dc-new2.  Note that the cluster
       type of dc-old is different.  Each incoming metric will be sent
       to 2 members of all three clusters, thus replicating to in
       total 6 destinations.  For each cluster the destination members
       are computed independently.  Failure of clusters or members
       does not affect the others, since all have individual queues.
       The above example could also be written using three match
       rules, one for each dc, or one match rule for all three dcs.
       The difference is mainly in performance: the number of times
       the incoming metric has to be matched against an expression.
       The stop rule in the dc-new match rule is not strictly
       necessary in this example, because no more match rules follow.
       However, if the match would target a specific subset, e.g.
       ^sys\., and more clusters would be defined, this could be
       necessary, as for instance in the following abbreviated
       example:

           cluster dc1-sys ... ;
           cluster dc2-sys ... ;

           cluster dc1-misc ... ;
           cluster dc2-misc ... ;

           match ^sys\. send to dc1-sys;
           match ^sys\. send to dc2-sys stop;

           match * send to dc1-misc;
           match * send to dc2-misc stop;

       As can be seen, without the stop in dc2-sys' match rule, all
       metrics starting with sys. would also be sent to dc1-misc and
       dc2-misc.  It can be that this is desired, of course, but in
       this example there is a dedicated cluster for the sys metrics.

       Suppose there is some unwanted metric that unfortunately is
       generated, by let's assume some bad/old software, and we don't
       want to store this metric.  The blackhole cluster is suitable
       for that, when it is harder to actually whitelist all wanted
       metrics.  Consider the following:

           match
                   some_legacy1$
                   some_legacy2$
               send to blackhole
               stop;

       This would throw away all metrics that end in some_legacy1 or
       some_legacy2, which would otherwise be hard to filter out.
       Since the order matters, it can be used in a construct like
       this:

           cluster old ... ;
           cluster new ... ;

           match * send to old;

           match unwanted send to blackhole stop;

           match * send to new;

       In this example the old cluster would receive the metric that's
       unwanted for the new cluster.  So, the order in which the rules
       occur does matter for the execution.

       Validation can be used to ensure the data for metrics is as
       expected.  A global validation for just integer (non-floating
       point) values could be:

           match *
               validate ^[0-9]+\ [0-9]+$ else drop
               ;

       (Note the escape with backslash \ of the space; you might be
       able to use \s or [:space:] instead, this depends on your libc
       implementation.)

       The validate clause can exist on every match rule, so in
       principle, the following is valid:

           match ^foo
               validate ^[0-9]+\ [0-9]+$ else drop
               send to integer-cluster
               ;
           match ^foo
               validate ^[0-9.e+-]+\ [0-9.e+-]+$ else drop
               send to float-cluster
               stop;

       Note that the behaviour is different in the previous two
       examples.  When no send to clusters are specified, a validation
       error makes the match behave like the stop keyword is present.
       Likewise, when validation passes, processing continues with the
       next rule.  When destination clusters are present, the match
       respects the stop keyword as normal: when it is specified,
       processing will always stop at that point.  However, if
       validation fails, the rule does not send anything to the
       destination clusters; the metric will be dropped or logged, but
       never sent.

       The relay is capable of rewriting incoming metrics on the fly.
       This process is done based on regular expressions with capture
       groups that allow substituting parts in a replacement string.
       Rewrite rules allow cleaning up metrics from applications, or
       providing a migration path.  In its simplest form a rewrite
       rule looks like this:

           rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
               into server.\_1.\2.\3.\3\4
               ;

       In this example a metric like server.DC.role.name123 would be
       transformed into server.dc.role.name.name123.  For rewrite
       rules the same holds as for matches: their order matters.
       Hence, to build on top of the old/new cluster example done
       earlier, the following would store the original metric name in
       the old cluster, and the new metric name in the new cluster:

           match * send to old;

           rewrite ... ;

           match * send to new;

       Note that after the rewrite, the original metric name is no
       longer available, as the rewrite happens in-place.

       Aggregations are probably the most complex part of
       carbon-c-relay.  Two ways of specifying aggregates are
       supported by carbon-c-relay.  The first, static rules, are
       handled by an optimiser which tries to fold thousands of rules
       into groups to make the matching more efficient.  The second,
       dynamic rules, are very powerful compact definitions with
       possibly thousands of internal instantiations.  A typical
       static aggregation looks like:

           aggregate
                   ^sys\.dc1\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
                   ^sys\.dc2\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
               every 10 seconds
               expire after 35 seconds
               timestamp at end of bucket
               compute sum write to
                   mysql.somecluster.total_replication_delay
               compute average write to
                   mysql.somecluster.average_replication_delay
               compute max write to
                   mysql.somecluster.max_replication_delay
               compute count write to
                   mysql.somecluster.replication_delay_metric_count
               ;

       In this example, four aggregations are produced from the
       incoming matching metrics.  In this example we could have
       written the two matches as one, but for demonstration purposes
       we did not.  Obviously they can refer to different metrics, if
       that makes sense.  The every 10 seconds clause specifies the
       interval in which the aggregator can expect new metrics to
       arrive.  This interval is used to produce the aggregations,
       thus every 10 seconds 4 new metrics are generated from the data
       received so far.  Because data may be in transit for some
       reason, or generation stalled, the expire after clause
       specifies how long the data should be kept before considering a
       data bucket (which is aggregated) to be complete.  In the
       example, 35 was used, which means after 35 seconds the first
       aggregates are produced.  It also means that metrics can arrive
       35 seconds late, and still be taken into account.  The exact
       time at which the aggregate metrics are produced is random
       between 0 and interval (10 in this case) seconds after the
       expiry time.  This is done to prevent thundering herds of
       metrics for large aggregation sets.  The timestamp that is used
       for the aggregations can be specified to be the start, middle
       or end of the bucket.  The original carbon-aggregator.py uses
       start, while carbon-c-relay's default has always been end.  The
       compute clauses demonstrate that a single aggregation rule can
       produce multiple aggregates, as is often the case.  Internally,
       this comes for free, since all possible aggregates are always
       calculated, whether or not they are used.  The produced new
       metrics are resubmitted to the relay, hence matches defined
       before in the configuration can match output of the aggregator.
       It is important to avoid loops that can be generated this way.
       In general, splitting aggregations off to their own
       carbon-c-relay instance, such that it is easy to forward the
       produced metrics to another relay instance, is good practice.

       The previous example could also be written as follows, to be
       dynamic:

           aggregate
                   ^sys\.dc[0-9]\.(somehost-[0-9]+)\.([^.]+)\.mysql\.replication_delay
               every 10 seconds
               expire after 35 seconds
               compute sum write to
                   mysql.host.\1.replication_delay
               compute sum write to
                   mysql.host.all.replication_delay
               compute sum write to
                   mysql.cluster.\2.replication_delay
               compute sum write to
                   mysql.cluster.all.replication_delay
               ;

       Here a single match results in four aggregations, each of a
       different scope.  In this example aggregations based on
       hostname and cluster are being made, as well as the more
       general all targets, which in this example both have identical
       values.  Note that with this single aggregation rule,
       per-cluster, per-host and total aggregations are produced.
       Obviously, the input metrics define which host and cluster
       aggregates are produced.

       With use of the send to clause, aggregations can be made more
       intuitive and less error-prone.  Consider the example below:

           cluster graphite fnv1a_ch ip1 ip2 ip3;

           aggregate ^sys\.somemetric
               every 60 seconds
               expire after 75 seconds
               compute sum write to
                   sys.somemetric
               send to graphite
               stop
               ;

           match * send to graphite;

       It sends all incoming metrics to the graphite cluster, except
       the sys.somemetric ones, which it replaces with a sum of all
       the incoming ones.  Without a stop in the aggregate, this would
       cause a loop, and without the send to, the metric could not
       keep its original name, for the output would then go directly
       to the cluster.

STATISTICS
       When carbon-c-relay is run without the -d or -s arguments,
       statistics will be produced.  By default they are sent to the
       relay itself in the form of carbon.relays.<hostname>.*.  See
       the statistics construct to override this prefix, the sending
       interval and the values produced.  While many metrics have a
       name similar to what carbon-cache.py would produce, their
       values are likely different.  By default, most values are
       running counters which only increase over time.  The use of the
       nonNegativeDerivative() function from graphite is useful with
       these.
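
       For example, to graph the rate of received metrics (the
       hostname myhost is just a placeholder), a graphite target could
       look like:

           nonNegativeDerivative(carbon.relays.myhost.metricsReceived)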

       The following metrics are produced under the
       carbon.relays.<hostname> namespace:

       ·   metricsReceived

           The number of metrics that were received by the relay.
           Received here means that they were seen and processed by
           any of the dispatchers.

       ·   metricsSent

           The number of metrics that were sent from the relay.  This
           is a total count for all servers combined.  When incoming
           metrics are duplicated by the cluster configuration, this
           counter will include all those duplications.  In other
           words, this is the number of metrics that were successfully
           sent to other systems.  Note that metrics that are
           processed (received) but still in the sending queue
           (queued) are not included in this counter.

       ·   metricsDiscarded

           The number of input lines that were not considered to be a
           valid metric.  Such lines can be empty, only contain
           whitespace, or hit the limits given for maximum input
           length and/or maximum metric length (see the -m and -M
           options).

       ·   metricsQueued

           The total number of metrics that are currently in the
           queues for all the server targets.  This metric is not
           cumulative, for it is a sample of the queue size, which can
           (and should) go up and down.  Therefore you should not use
           the derivative function for this metric.

       ·   metricsDropped

           The total number of metrics that had to be dropped due to
           server queues overflowing.  A queue typically overflows
           when the server it tries to send its metrics to is not
           reachable, or too slow in ingesting the amount of metrics
           queued.  This can be network or resource related, and it
           also greatly depends on the rate of metrics being sent to
           the particular server.

       ·   metricsBlackholed

           The number of metrics that did not match any rule, or
           matched a rule with blackhole as target.  Depending on your
           configuration, a high value might be an indication of a
           misconfiguration somewhere.  These metrics were received by
           the relay, but never sent anywhere, thus they disappeared.

       ·   metricStalls

           The number of times the relay had to stall a client to
           indicate that the downstream server cannot handle the
           stream of metrics.  A stall is only performed when the
           queue is full and the server is actually receptive to
           metrics, but just too slow at the moment.  Stalls typically
           happen during micro-bursts, where the client typically is
           unaware that it should stop sending more data, while it is
           able to.

       ·   connections

           The number of connect requests handled.  This is an ever
           increasing number just counting how many connections were
           accepted.

       ·   disconnects

           The number of disconnected clients.  A disconnect either
           happens because the client goes away, or due to an idle
           timeout in the relay.  The difference between this metric
           and connections is the number of connections actively held
           by the relay.  In normal situations this amount remains
           within reasonable bounds.  Many connections but few
           disconnects typically indicate a possible connection leak
           in the client.  The disconnecting of idle connections in
           the relay is there to guard against resource drain in such
           scenarios.

       ·   dispatch_wallTime_us

           The number of microseconds spent by the dispatchers to do
           their work.  In particular on multi-core systems, this
           value can be confusing; however, it indicates how long the
           dispatchers were doing work handling clients.  It includes
           everything they do, from reading data from a socket,
           cleansing the input metric, to adding the metric to the
           appropriate queues.  The larger the configuration, and the
           more complex it is in terms of matches, the more time the
           dispatchers will spend on the cpu.  But time they do /not/
           spend on the cpu is also included in this number.  It is
           the pure wallclock time the dispatcher was serving a
           client.

       ·   dispatch_sleepTime_us

           The number of microseconds spent by the dispatchers
           sleeping while waiting for work.  When this value gets
           small (or even zero) the dispatcher has so much work that
           it no longer sleeps, and likely can no longer process the
           work in a timely fashion.  This value plus the wallTime
           from above roughly sums up to the total uptime taken by
           this dispatcher.  Therefore, expressing the wallTime as a
           percentage of this sum gives the busyness percentage,
           draining all the way up to 100% if sleepTime goes to 0; see
           the example expression after this list.

       ·   server_wallTime_us

           The number of microseconds spent by the servers to send the
           metrics from their queues.  This value includes connection
           creation, reading from the queue, and sending metrics over
           the network.

       ·   dispatcherX

           For each individual dispatcher, the metrics received and
           blackholed, plus the wall clock time.  The values are as
           described above.

       ·   destinations.X

           For all known destinations, the number of dropped, queued
           and sent metrics, plus the wall clock time spent.  The
           values are as described above.

       ·   aggregators.metricsReceived

           The number of metrics that matched an aggregator rule and
           were accepted by the aggregator.  When a metric matches
           multiple aggregators, this value will reflect that.  A
           metric is not counted when it is considered syntactically
           invalid, e.g. when no value was found.

       ·   aggregators.metricsDropped

           The number of metrics that were sent to an aggregator, but
           did not fit timewise, because the metric was too far in the
           past or the future.  The expire after clause in aggregate
           statements controls how long in the past metric values are
           accepted.

       ·   aggregators.metricsSent

           The number of metrics that were sent from the aggregators.
           These metrics were produced and are the actual results of
           aggregations.
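       As a sketch of the busyness calculation described under
       dispatch_sleepTime_us above (myhost is again a placeholder), a
       graphite target could look like:

           asPercent(carbon.relays.myhost.dispatch_wallTime_us,
               sumSeries(carbon.relays.myhost.dispatch_*Time_us))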

BUGS
       Please report them at:
       https://github.com/grobian/carbon-c-relay/issues

AUTHOR
       Fabian Groffen <grobian@gentoo.org>

SEE ALSO
       All other utilities from the graphite stack.

       This project aims to be a fast replacement of the original
       Carbon relay
       http://graphite.readthedocs.org/en/1.0/carbon-daemons.html#carbon-relay-py.
       carbon-c-relay aims to deliver performance and configurability.
       Carbon is single threaded, and sending metrics to multiple
       consistent-hash clusters requires chaining of relays.  This
       project provides a multithreaded relay which can address
       multiple targets and clusters for each and every metric based
       on pattern matches.

       There are a couple more replacement projects out there, such as
       carbon-relay-ng https://github.com/graphite-ng/carbon-relay-ng
       and graphite-relay
       https://github.com/markchadwick/graphite-relay.

       Compared to carbon-relay-ng, this project does provide carbon's
       consistent-hash routing.  graphite-relay, which does provide
       this, however doesn't do metric-based matches to direct the
       traffic, which this project does as well.  To date,
       carbon-c-relay can do aggregations, failover targets and more.

ACKNOWLEDGEMENT
       This program was originally developed for Booking.com, which
       approved that the code was published and released as Open
       Source on GitHub, for which the author would like to express
       his gratitude.  Development has continued since with the help
       of many contributors suggesting features, reporting bugs,
       adding patches and more to make carbon-c-relay into what it is
       today.


October 2019                                                 CARBON-C-RELAY(1)