CARBON-C-RELAY(1)            General Commands Manual           CARBON-C-RELAY(1)

NAME
carbon-c-relay - graphite relay, aggregator and rewriter

https://travis-ci.org/grobian/carbon-c-relay

SYNOPSIS
carbon-c-relay -f config-file [ options ... ]

DESCRIPTION
carbon-c-relay accepts, cleanses, matches, rewrites, forwards and
aggregates graphite metrics by listening for incoming connections and
relaying the messages to other servers defined in its configuration.
The core functionality is to route messages via flexible rules to the
desired destinations.

carbon-c-relay is a simple program that reads its routing information
from a file. The command line arguments allow setting the location of
this file, as well as the number of dispatchers (worker threads) to use
for reading data from incoming connections and passing it on to the
right destination(s). The route file supports two main constructs:
clusters and matches. The former defines groups of hosts that metrics
can be sent to; the latter defines which metrics should be sent to
which cluster. Aggregation rules are seen as matches.

For every metric received by the relay, cleansing is performed. The
following changes are applied before any match, aggregate or rewrite
rule sees the metric:

○ double dot elimination (necessary for correctly functioning
consistent hash routing)

○ trailing/leading dot elimination

○ whitespace normalisation (this mostly affects output of the relay
to other targets: metric, value and timestamp will be separated by
a single space only, ever)

○ irregular char replacement with underscores (_); currently
irregular is defined as not being in [0-9a-zA-Z-_:#], but this can be
overridden on the command line. Note that tags (when present and
allowed) are not processed this way.
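
As an illustration (input line hypothetical), the following incoming
metric would be cleansed as shown before any rule is evaluated:

    foo..bar.baz%qux.   42    1571000000

becomes

    foo.bar.baz_qux 42 1571000000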

OPTIONS
These options control the behaviour of carbon-c-relay.

○ -v: Print version string and exit.

○ -d: Enable debug mode; this prints statistics to stdout and prints
extra messages about some situations encountered by the relay that
normally would be too verbose to be enabled. When combined with -t
(test mode) this also prints stub routes and consistent-hash ring
contents.

○ -s: Enable submission mode. In this mode, internal statistics are
not generated. Instead, queue pressure and metric drops are reported
on stdout. This mode is useful for a submission relay whose job is
just to forward to (a set of) main relays. Statistics about the
submission relays are not needed in this case, and could easily cause
an undesired flood of metrics, e.g. when run locally on each and every
host.

○ -t: Test mode. This mode doesn't do any routing at all; instead it
reads input from stdin and prints what actions would be taken given
the loaded configuration. This mode is very useful for testing relay
routes for regular expression syntax etc. It also gives insight into
how routing is applied in complex configurations, for it shows
rewrites and aggregates taking place as well. When -t is repeated, the
relay will only test the configuration for validity and exit
immediately afterwards. Any standard output is suppressed in this
mode, making it ideal for start-scripts to test a (new) configuration.

○ -f config-file: Read configuration from config-file. A configuration
consists of clusters and routes. See CONFIGURATION SYNTAX for more
information on the options and syntax of this file.

○ -l log-file: Use log-file for writing messages. Without this option,
the relay writes both to stdout and stderr. When logging to file, all
messages are prefixed with MSG when they were sent to stdout, and ERR
when they were sent to stderr.

○ -p port: Listen for connections on port port. The port number is
used for TCP, UDP and UNIX sockets alike. In the latter case, the
socket file contains the port number. The port defaults to 2003, which
is also used by the original carbon-cache.py. Note that this only
applies to the defaults; when listen directives are present in the
config, this setting is ignored.

○ -w workers: Use workers number of threads. The default number of
workers is equal to the number of detected CPU cores. It makes sense
to reduce this number on many-core machines, or when the traffic is
low.

○ -b batchsize: Set the number of metrics that are sent to remote
servers at once to batchsize. When the relay sends metrics to servers,
it will retrieve batchsize metrics from the pending queue of metrics
waiting for that server and send those one by one. The size of the
batch has minimal impact on sending performance, but it controls the
amount of lock-contention on the queue. The default is 2500.

○ -q queuesize: Each server in the configuration that the relay sends
metrics to has a queue associated with it. This queue allows
disruptions and bursts to be handled. The size of this queue is set to
queuesize, which allows that many metrics to be stored in the queue
before it overflows and the relay starts dropping metrics. The larger
the queue, the more metrics can be absorbed, but also the more memory
the relay will use. The default queue size is 25000.

○ -L stalls: Set the maximum number of stalls to stalls before the
relay starts dropping metrics for a server. When a queue fills up, the
relay uses a mechanism called stalling to signal the client (writing
to the relay) of this event. In particular when the client sends a
large number of metrics in a very short time (burst), stalling can
help to avoid dropping metrics, since the client just needs to slow
down for a bit, which in many cases is possible (e.g. when catting a
file with nc(1)). However, this behaviour can also obstruct,
artificially stalling writers which cannot stop that easily. For this,
stalls can be set from 0 to 15, where each stall can take around 1
second on the client. The default value is 4, which aims at the
occasional disruption scenario and makes a maximum effort not to lose
metrics while moderately slowing down clients.

○ -C CAcertpath: Read CA certs (for use with TLS/SSL connections) from
the given path or file. When not given, the default locations are
used. Strict verification of the peer is performed, so when using
self-signed certificates, be sure to include the CA cert in the
default location, or provide the path to the cert using this option.

○ -T timeout: Specifies the IO timeout in milliseconds used for server
connections. The default is 600 milliseconds, but it may need
increasing when WAN links are used for target servers. A relatively
low connection timeout allows the relay to quickly establish that a
server is unreachable, so failover strategies can kick in before the
queue runs high.

○ -c chars: Sets the list of characters that, next to [A-Za-z0-9], are
allowed in metrics to chars. Any character not in this list is
replaced by the relay with _ (underscore). The default list of allowed
characters is -_:#.

○ -m length: Limits metric names to at most length bytes. Any lines
containing metric names larger than this will be discarded.

○ -M length: Limits the input to lines of at most length bytes. Any
lines in excess of this will be discarded. Note that -m needs to be
smaller than this value.

○ -H hostname: Override the hostname determined by a call to
gethostname(3) with hostname. The hostname is used mainly in the
statistics metrics carbon.relays.<hostname>.<...> sent by the relay.

○ -B backlog: Sets the TCP connection listen backlog to backlog
connections. The default value is 32 but on servers which receive many
concurrent connections, this setting likely needs to be increased to
avoid connection refused errors on the clients.

○ -U bufsize: Sets the socket send/receive buffer sizes in bytes, for
both TCP and UDP scenarios. When unset, the OS default is used. The
maximum is also determined by the OS. The sizes are set using
setsockopt with the flags SO_RCVBUF and SO_SNDBUF. Setting this size
may be necessary for large volume scenarios, for which -B might also
apply. Checking the Recv-Q and the receive errors values from netstat
gives a good hint about buffer usage.

○ -E: Disable disconnecting idle incoming connections. By default the
relay disconnects idle client connections after 10 minutes. It does
this to prevent resources clogging up when a faulty or malicious
client keeps on opening connections without closing them. It typically
prevents running out of file descriptors. For some scenarios, however,
it is not desirable for idle connections to be disconnected, hence
passing this flag will disable this behaviour.

○ -D: Daemonise into the background after startup. This option
requires the -l and -P flags to be set as well.

○ -P pidfile: Write the pid of the relay process to a file called
pidfile. This is particularly useful when daemonised in combination
with init managers.

○ -O threshold: The minimum number of rules to find before trying to
optimise the ruleset. The default is 50; to disable the optimiser use
-1, and to always run the optimiser use 0. The optimiser tries to
group rules to avoid spending excessive time on matching expressions.
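
Putting several of these options together, a hypothetical invocation
(paths and values purely illustrative) could look like:

    carbon-c-relay -f /etc/carbon-c-relay.conf -w 8 -q 50000 \
        -l /var/log/carbon-c-relay.log -P /var/run/carbon-c-relay.pid -D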

CONFIGURATION SYNTAX
The config file supports the following syntax, where comments start
with a # character and can appear at any position on a line,
suppressing input until the end of that line:

    cluster name <forward | any_of | failover [useall] |
            carbon_ch | fnv1a_ch | jump_fnv1a_ch [replication count] [dynamic]>
        <host[:port][=instance] [proto <udp | tcp>] [type linemode]
            [transport <plain | gzip | lz4 | snappy> [ssl]]> ... ;

    cluster name file [ip] </path/to/file> ... ;

    match <* | expression ...>
        [validate expression else <log | drop>]
        send to <cluster ... | blackhole>
        [stop] ;

    rewrite expression into replacement ;

    aggregate expression ...
        every interval seconds
        expire after expiration seconds
        [timestamp at <start | middle | end> of bucket]
        compute <sum | count | max | min | average | median |
            percentile<%> | variance | stddev> write to metric
        [compute ...]
        [send to <cluster ...>]
        [stop] ;

    send statistics to <cluster ...> [stop] ;

    statistics
        [submit every interval seconds]
        [reset counters after interval]
        [prefix with prefix]
        [send to <cluster ...>]
        [stop] ;

    listen type linemode [transport <plain | gzip | lz4 | snappy> [ssl pemcert]]
        <<interface[:port] | port> proto <udp | tcp>> ...
        </path/to/file proto unix> ... ;

    include </path/to/file/or/glob> ;

CLUSTERS
Multiple clusters can be defined, and need not be referenced by a
match rule. All clusters point to one or more hosts, except the file
cluster, which writes to files in the local filesystem. host may be an
IPv4 or IPv6 address, or a hostname. Since host is followed by an
optional : and port, for IPv6 addresses not to be interpreted wrongly,
either a port must be given, or the IPv6 address surrounded by
brackets, e.g. [::1]. Optional transport and proto clauses can be used
to wrap the connection in a compression or encryption layer, or to
specify the use of UDP or TCP to connect to the remote server. When
omitted, the connection defaults to an unwrapped TCP connection. type
can only be linemode at the moment.

The forward and file clusters simply send everything they receive to
all defined members (host addresses or files). The any_of cluster is a
small variant of the forward cluster: instead of sending to all
defined members, it sends each incoming metric to one of the defined
members. This is not very useful in itself, but since any of the
members can receive each metric, it means that when one of the members
is unreachable, the other members will receive all of the metrics.
This can be useful when the cluster points to other relays. The any_of
router tries to send the same metrics consistently to the same
destination. The failover cluster is like the any_of cluster, but
sticks to the order in which servers are defined. This is to implement
a pure failover scenario between servers. The carbon_ch cluster sends
the metrics to the member that is responsible according to the
consistent hash algorithm (as used in the original carbon), or to
multiple members if replication is set to more than 1. When dynamic is
set, failure of any of the servers does not result in metrics being
dropped for that server; instead the undeliverable metrics are sent to
any other server in the cluster in order for the metrics not to get
lost. This is most useful when replication is 1. The fnv1a_ch cluster
is identical in behaviour to carbon_ch, but it uses a different hash
technique (FNV1a) which is faster and, more importantly, gets around a
limitation of carbon_ch by using both host and port of the members.
This is useful when multiple targets live on the same host, separated
only by port. The instance that the original carbon uses to get around
this can be set by appending it after the port, separated by an equals
sign, e.g. 127.0.0.1:2006=a for instance a. When using the fnv1a_ch
cluster, this instance overrides the hash key in use. This allows for
many things, including masquerading old IP addresses, but mostly
serves to make the hash key location agnostic of the (physical)
location of that key. For example, usage like
10.0.0.1:2003=4d79d13554fa1301476c1f9fe968b0ac would allow changing
the port and/or IP address of the server that receives data for that
instance key. Obviously, this way migration of data can be dealt with
much more conveniently. The jump_fnv1a_ch cluster is also a consistent
hash cluster like the previous two, but it does not take the server
information into account at all. Whether this is useful to you depends
on your scenario. The jump hash has a much better balancing over the
servers defined in the cluster, at the expense of not being able to
remove any server but the last in order. What this means is that this
hash is fine to use with ever-growing clusters where older nodes are
also replaced at some point. If you have a cluster where removal of
old nodes takes place often, the jump hash is not suitable for you.
Jump hash works with servers in an ordered list without gaps. To
influence the ordering, the instance given to the server will be used
as the sorting key. Without instances, the order will be as given in
the file. It is good practice to fix the order of the servers with
instances, such that it is explicit what the right nodes for the jump
hash are.

DNS hostnames are resolved to a single address, according to the
preference rules in RFC 3484 https://www.ietf.org/rfc/rfc3484.txt.
The any_of, failover and forward clusters have an explicit useall flag
that enables expansion for hostnames resolving to multiple addresses.
Each address returned becomes a cluster destination.
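
For example, a jump_fnv1a_ch cluster whose member order is pinned with
instance keys (hostnames hypothetical) could be defined as:

    cluster jump-store jump_fnv1a_ch
        store1.example.com:2003=01
        store2.example.com:2003=02
        store3.example.com:2003=03
        ;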

MATCHES
Match rules are the way to direct incoming metrics to one or more
clusters. Match rules are processed top to bottom as they are defined
in the file. It is possible to define multiple matches in the same
rule. Each match rule can send data to one or more clusters. Since
match rules "fall through" unless the stop keyword is added, carefully
crafted match expressions can be used to target multiple clusters or
aggregations. This ability allows replicating metrics, as well as
sending certain metrics to alternative clusters, with careful ordering
and usage of the stop keyword. The special cluster blackhole discards
any metrics sent to it. This can be useful for weeding out unwanted
metrics in certain cases. Because throwing metrics away is pointless
if other matches would accept the same data, a match whose destination
is the blackhole cluster has an implicit stop. The validate clause
adds a check on the data (what comes after the metric name) in the
form of a regular expression. When this expression matches, the match
rule will execute as if no validate clause was present. However, if it
fails, the match rule is aborted and no metrics will be sent to
destinations; this is the drop behaviour. When log is used, the metric
is logged to stderr. Care should be taken with the latter to avoid log
flooding. When a validate clause is present, destinations need not be
present; this allows applying a global validation rule. Note that the
cleansing rules are applied before validation is done, thus the data
will not have duplicate spaces. The route using clause is used to
perform a temporary modification to the key used as input to the
consistent hashing routines. The primary purpose is to route traffic
so that appropriate data is sent to the needed aggregation instances.
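
For example (cluster name hypothetical), a single rule can match two
expressions and stop further processing for anything it matched:

    match ^sys\. ^ops\. send to machine-cluster stop ;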

REWRITES
Rewrite rules take a regular expression as input to match incoming
metrics, and transform them into the desired new metric name. In the
replacement, backreferences are allowed to match capture groups
defined in the input regular expression. A match of server\.(x|y|z)\.
allows using e.g. role.\1. in the substitution. A few caveats apply to
the current implementation of rewrite rules. First, their location in
the config file determines when the rewrite is performed. The rewrite
is done in-place; as such, a match rule before the rewrite matches the
original name, while a match rule after the rewrite no longer matches
the original name. Care should be taken with the ordering, as multiple
rewrite rules in succession can take effect, e.g. a gets replaced by b
and b gets replaced by c in a succeeding rewrite rule. The second
caveat with the current implementation is that the rewritten metric
names are not cleansed, like newly incoming metrics are. Thus, double
dots and potentially dangerous characters can appear if the
replacement string is crafted to produce them. It is the
responsibility of the writer to make sure the metrics are clean. If
this is an issue for routing, one can consider having a rewrite-only
instance that forwards all metrics to another instance that does the
routing. Obviously the second instance will cleanse the metrics as
they come in. The backreference notation allows lowercasing and
uppercasing the replacement string with the use of the underscore (_)
and caret (^) symbols following directly after the backslash. For
example, role.\_1. as substitution will lowercase the contents of \1.
The dot (.) can be used in a similar fashion, or following the
underscore or caret, to replace dots with underscores in the
substitution. This can be handy for some situations where metrics are
sent to graphite.
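
As an illustration (metric names hypothetical), the first rule below
lowercases a datacenter name, and the second collapses a dotted fqdn
into a single node using the dot modifier:

    rewrite ^dc\.([A-Z]+)\.(.+) into dc.\_1.\2 ;
    rewrite ^hosts\.(.+)\.load into load.\.1 ;

The first turns dc.AMS.cpu into dc.ams.cpu; the second turns
hosts.web1.example.com.load into load.web1_example_com.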

AGGREGATIONS
The aggregations defined take one or more input metrics expressed by
one or more regular expressions, similar to the match rules. Incoming
metrics are aggregated over a period of time defined by the interval
in seconds. Since events may arrive a bit later in time, the
expiration time in seconds defines when the aggregations should be
considered final, as no new entries are allowed to be added any more.
On top of an aggregation, multiple computations can be performed. They
can be of the same or different aggregation types, but each should
write to a unique new metric. The metric names can include
backreferences like in rewrite expressions, allowing for powerful
single aggregation rules that yield many aggregations. When no send to
clause is given, produced metrics are sent to the relay as if they
were submitted from the outside, hence match and aggregation rules
apply to them. Care should be taken that loops are avoided this way.
For this reason, the use of the send to clause is encouraged, to
direct the output traffic where possible. Like match rules, it is
possible to define multiple cluster targets. Also like match rules,
the stop keyword applies to control the flow of metrics in the
matching process.
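
A minimal rule pulling these clauses together (names hypothetical):

    aggregate ^load\.(.+)
        every 60 seconds
        expire after 75 seconds
        compute average write to load.avg.\1
        send to graphite
        stop ;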

STATISTICS
The send statistics to construct is deprecated and will be removed in
the next release. Use the special statistics construct instead.

The statistics construct can control a couple of things about the
(internal) statistics produced by the relay. The send to target can be
used to avoid routing loops by sending the statistics to a specific
destination cluster (or clusters). By default the metrics are prefixed
with carbon.relays.<hostname>, where hostname is determined on startup
and can be overridden using the -H argument. This prefix can be set
using the prefix with clause, similar to a rewrite rule target. The
input match in this case is the pre-set regular expression
^(([^.]+)(\..*)?)$ on the hostname. As such, one can see that the
default prefix is set by carbon.relays.\.1. Note that this uses the
replace-dot-with-underscore replacement feature from rewrite rules.
Given the input expression, the following match groups are available:
\1 the entire hostname, \2 the short hostname and \3 the domainname
(with leading dot). It may make sense to replace the default with
something like carbon.relays.\_2 for certain scenarios, to always use
the lowercased short hostname, which, following the expression, does
not contain a dot. By default, the metrics are submitted every 60
seconds; this can be changed using the submit every <interval> seconds
clause. To obtain a set of values more compatible with
carbon-cache.py, use the reset counters after interval clause to make
values non-cumulative, that is, they will report the change compared
to the previous value.
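
Pulling these clauses together, a hedged example (cluster name
hypothetical) could be:

    statistics
        submit every 60 seconds
        reset counters after interval
        prefix with carbon.relays.\_2
        send to graphite
        stop ;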

LISTENERS
The ports and protocols the relay should listen on for incoming
connections can be specified using the listen directive. Currently,
all listeners need to be of linemode type. An optional compression or
encryption wrapping can be specified for the port and optional
interface given by IP address, or for a unix socket given by file.
When the interface is not specified, the any interface on all
available IP protocols is assumed. If no listen directive is present,
the relay will use the default listeners for port 2003 on tcp and udp,
plus the unix socket /tmp/.s.carbon-c-relay.2003. This typically
expands to 5 listeners on an IPv6-enabled system. The default matches
the behaviour of versions prior to v3.2.
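
As a sketch, the default listeners described above correspond to an
explicit configuration along these lines:

    listen type linemode
        2003 proto tcp
        2003 proto udp
        /tmp/.s.carbon-c-relay.2003 proto unix ;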

INCLUDES
In case the configuration becomes very long, or is managed better in
separate files, the include directive can be used to read another
file. The given file will be read in place and added to the router
configuration at the time of inclusion. The end result is one big
route configuration. Multiple include statements can be used
throughout the configuration file. The positioning will influence the
order of rules as normal. Beware that recursive inclusion (include
from an included file) is supported, and currently no safeguards exist
against an inclusion loop. For what it is worth, this feature is
likely best used with simple configuration files (e.g. not having
include in them).
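
For example, to pull in drop-in fragments from a directory (path
hypothetical):

    include /etc/carbon-c-relay.d/*.conf ;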

EXAMPLES
carbon-c-relay evolved over time, growing features on demand as the
tool proved to be stable and fitting the job well. Below follow some
annotated examples of constructs that can be used with the relay.

As many clusters as necessary can be defined. They receive data from
match rules, and their type defines which members of the cluster
finally get the metric data. The simplest cluster form is a forward
cluster:

    cluster send-through forward 10.1.0.1 ;

Any metric sent to the send-through cluster would simply be forwarded
to the server at IPv4 address 10.1.0.1. If we define multiple servers,
all of those servers would get the same metric, thus:

    cluster send-through forward 10.1.0.1 10.2.0.1 ;

The above results in a duplication of metrics sent to both machines.
This can be useful, but most of the time it is not. The any_of cluster
type is like forward, but it sends each incoming metric to any one of
the members. The same example with such a cluster would be:

    cluster send-to-any-one any_of 10.1.0.1:2010 10.1.0.1:2011;

This would implement a multipath scenario, where two servers are used
and the load between them is spread, but should any of them fail, all
metrics are sent to the remaining one. This typically works well for
upstream relays, or for balancing carbon-cache processes running on
the same machine. Should any member become unavailable, for instance
due to a rolling restart, the other members receive the traffic. If it
is necessary to have true fail-over, where the secondary server is
only used if the first is down, the following would implement that:

    cluster try-first-then-second failover 10.1.0.1:2010 10.1.0.1:2011;

These types are different from the two consistent hash cluster types:

    cluster graphite carbon_ch 127.0.0.1:2006=a 127.0.0.1:2007=b
        127.0.0.1:2008=c ;

If a member in this example fails, all metrics that would go to that
member are kept in the queue, waiting for the member to return. This
is useful for clusters of carbon-cache machines where it is desirable
that the same metric always ends up on the same server. The carbon_ch
cluster type is compatible with carbon-relay consistent hashing, and
can be used for existing clusters populated by carbon-relay. For new
clusters, however, it is better to use the fnv1a_ch cluster type, for
it is faster and allows balancing over the same address but different
ports without an instance number, in contrast to carbon_ch.

Because we can use multiple clusters, we can also replicate without
the use of the forward cluster type, in a more intelligent way:

    cluster dc-old carbon_ch replication 2 10.1.0.1 10.1.0.2 10.1.0.3 ;
    cluster dc-new1 fnv1a_ch replication 2 10.2.0.1 10.2.0.2 10.2.0.3 ;
    cluster dc-new2 fnv1a_ch replication 2 10.3.0.1 10.3.0.2 10.3.0.3 ;

    match * send to dc-old ;
    match * send to dc-new1 dc-new2 stop ;

In this example all incoming metrics are first sent to dc-old, then to
dc-new1 and finally to dc-new2. Note that the cluster type of dc-old
is different. Each incoming metric will be sent to 2 members of all
three clusters, thus replicating to 6 destinations in total. For each
cluster the destination members are computed independently. Failure of
clusters or members does not affect the others, since all have
individual queues. The above example could also be written using three
match rules, one for each dc, or one match rule for all three dcs. The
difference is mainly in performance, i.e. the number of times the
incoming metric has to be matched against an expression. The stop rule
in the dc-new match rule is not strictly necessary in this example,
because no more match rules follow. However, if the match would target
a specific subset, e.g. ^sys\., and more clusters would be defined,
this could be necessary, as for instance in the following abbreviated
example:

    cluster dc1-sys ... ;
    cluster dc2-sys ... ;

    cluster dc1-misc ... ;
    cluster dc2-misc ... ;

    match ^sys\. send to dc1-sys;
    match ^sys\. send to dc2-sys stop;

    match * send to dc1-misc;
    match * send to dc2-misc stop;

As can be seen, without the stop in dc2-sys' match rule, all metrics
starting with sys. would also be sent to dc1-misc and dc2-misc. This
may of course be desired, but in this example there is a dedicated
cluster for the sys metrics.

Suppose there is some unwanted metric that unfortunately gets
generated, say by some bad/old software. We don't want to store this
metric. The blackhole cluster is suitable for that, when it is harder
to actually whitelist all wanted metrics. Consider the following:

    match some_legacy1$ some_legacy2$ send to blackhole stop;

This would throw away all metrics that end in some_legacy1 or
some_legacy2, which would otherwise be hard to filter out. Since the
order matters, it can be used in a construct like this:

    cluster old ... ;
    cluster new ... ;

    match * send to old;

    match unwanted send to blackhole stop;

    match * send to new;

In this example the old cluster would receive the metric that is
unwanted for the new cluster. So, the order in which the rules occur
does matter for the execution.

Validation can be used to ensure the data for metrics is as expected.
A global validation for whole-number (no floating point) values could
be:

    match * validate ^[0-9]+\ [0-9]+$ else drop ;

(Note the escape of the space with backslash \; you might be able to
use \s or [:space:] instead, depending on your libc implementation.)

The validate clause can exist on every match rule, so in principle the
following is valid:

    match ^foo
        validate ^[0-9]+\ [0-9]+$ else drop
        send to integer-cluster ;
    match ^foo
        validate ^[0-9.e+-]+\ [0-9.e+-]+$ else drop
        send to float-cluster
        stop;

Note that the behaviour is different in the previous two examples.
When no send to clusters are specified, a validation error makes the
match behave as if the stop keyword were present. Likewise, when
validation passes, processing continues with the next rule. When
destination clusters are present, the match respects the stop keyword
as normal. When stop is specified, processing will always stop at that
rule. However, if validation fails, the rule does not send anything to
the destination clusters; the metric will be dropped or logged, but
never sent.

The relay is capable of rewriting incoming metrics on the fly. This
process is done based on regular expressions with capture groups that
allow substituting parts in a replacement string. Rewrite rules allow
cleaning up metrics from applications, or providing a migration path.
In its simplest form a rewrite rule looks like this:

    rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
        into server.\_1.\2.\3.\3\4 ;

In this example a metric like server.DC.role.name123 would be
transformed into server.dc.role.name.name123. The same holds for
rewrite rules as for matches: their order matters. Hence, to build on
top of the old/new cluster example done earlier, the following would
store the original metric name in the old cluster, and the new metric
name in the new cluster:

    match * send to old;

    rewrite ... ;

    match * send to new;

Note that after the rewrite, the original metric name is no longer
available, as the rewrite happens in-place.

Aggregations are probably the most complex part of carbon-c-relay. Two
ways of specifying aggregates are supported by carbon-c-relay. The
first, static rules, are handled by an optimiser which tries to fold
thousands of rules into groups to make the matching more efficient.
The second, dynamic rules, are very powerful compact definitions with
possibly thousands of internal instantiations. A typical static
aggregation looks like:

    aggregate
        ^sys\.dc1\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
        ^sys\.dc2\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
        every 10 seconds
        expire after 35 seconds
        timestamp at end of bucket
        compute sum write to
            mysql.somecluster.total_replication_delay
        compute average write to
            mysql.somecluster.average_replication_delay
        compute max write to
            mysql.somecluster.max_replication_delay
        compute count write to
            mysql.somecluster.replication_delay_metric_count
        ;

In this example, four aggregations are produced from the incoming
matching metrics. In this example we could have written the two
matches as one, but for demonstration purposes we did not. Obviously
they can refer to different metrics, if that makes sense. The every 10
seconds clause specifies in what interval the aggregator can expect
new metrics to arrive. This interval is used to produce the
aggregations; thus, every 10 seconds, 4 new metrics are generated from
the data received so far. Because data may be in transit for some
reason, or generation stalled, the expire after clause specifies how
long the data should be kept before considering a data bucket (which
is aggregated) to be complete. In the example, 35 was used, which
means after 35 seconds the first aggregates are produced. It also
means that metrics can arrive 35 seconds late, and still be taken into
account. The exact time at which the aggregate metrics are produced is
random between 0 and interval (10 in this case) seconds after the
expiry time. This is done to prevent thundering herds of metrics for
large aggregation sets. The timestamp that is used for the
aggregations can be specified to be the start, middle or end of the
bucket. Original carbon-aggregator.py uses start, while
carbon-c-relay's default has always been end. The compute clauses
demonstrate that a single aggregation rule can produce multiple
aggregates, as often is the case. Internally, this comes for free,
since all possible aggregates are always calculated, whether or not
they are used. The produced new metrics are resubmitted to the relay,
hence matches defined earlier in the configuration can match output of
the aggregator. It is important to avoid loops that can be generated
this way. In general, splitting aggregations off to their own
carbon-c-relay instance, such that it is easy to forward the produced
metrics to another relay instance, is good practice.

The previous example could also be written as follows to be dynamic:

    aggregate
        ^sys\.dc[0-9].(somehost-[0-9]+)\.([^.]+)\.mysql\.replication_delay
        every 10 seconds
        expire after 35 seconds
        compute sum write to
            mysql.host.\1.replication_delay
        compute sum write to
            mysql.host.all.replication_delay
        compute sum write to
            mysql.cluster.\2.replication_delay
        compute sum write to
            mysql.cluster.all.replication_delay
        ;

Here a single match results in four aggregations, each of a different
scope. In this example, aggregations based on hostname and cluster are
being made, as well as the more general all targets, which in this
example both have identical values. Note that with this single
aggregation rule, per-cluster, per-host and total aggregations are all
produced. Obviously, the input metrics define which hosts and clusters
are produced.

With use of the send to clause, aggregations can be made more
intuitive and less error-prone. Consider the example below:

    cluster graphite fnv1a_ch ip1 ip2 ip3;

    aggregate ^sys.somemetric
        every 60 seconds
        expire after 75 seconds
        compute sum write to sys.somemetric
        send to graphite
        stop ;

    match * send to graphite;

It sends all incoming metrics to the graphite cluster, except the
sys.somemetric ones, which it replaces with a sum of all the incoming
ones. Without a stop in the aggregate, this would cause a loop, and
without the send to, the metric could not keep its original name, for
the output would then go directly to the cluster.

STATISTICS
When carbon-c-relay is run without the -d or -s arguments, statistics
will be produced. By default they are sent to the relay itself in the
form of carbon.relays.<hostname>.*. See the statistics construct to
override this prefix, sending interval and values produced. While many
metrics have a similar name to what carbon-cache.py would produce,
their values are likely different. By default, most values are running
counters which only increase over time. The use of the
nonNegativeDerivative() function from graphite is useful with these.
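
For example, a graphite-web target to graph the rate of sent metrics
(hostname placeholder to be filled in) might be:

    nonNegativeDerivative(carbon.relays.<hostname>.metricsSent)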

The following metrics are produced under the carbon.relays.<hostname>
namespace:

○ metricsReceived

The number of metrics that were received by the relay. Received here
means that they were seen and processed by any of the dispatchers.

○ metricsSent

The number of metrics that were sent from the relay. This is a total
count for all servers combined. When incoming metrics are duplicated
by the cluster configuration, this counter will include all those
duplications. In other words, the amount of metrics that were
successfully sent to other systems. Note that metrics that are
processed (received) but still in the sending queue (queued) are not
included in this counter.

○ metricsDiscarded

The number of input lines that were not considered to be a valid
metric. Such lines can be empty, contain only whitespace, or hit the
limits given for max input length and/or max metric length (see the -m
and -M options).

○ metricsQueued

The total number of metrics that are currently in the queues for all
the server targets. This metric is not cumulative, for it is a sample
of the queue size, which can (and should) go up and down. Therefore
you should not use the derivative function for this metric.

○ metricsDropped

The total number of metrics that had to be dropped due to server
queues overflowing. A queue typically overflows when the server it
tries to send its metrics to is not reachable, or too slow in
ingesting the amount of metrics queued. This can be network or
resource related, and also greatly depends on the rate of metrics
being sent to the particular server.

○ metricsBlackholed

The number of metrics that did not match any rule, or matched a rule
with blackhole as target. Depending on your configuration, a high
value might be an indication of a misconfiguration somewhere. These
metrics were received by the relay, but never sent anywhere, thus they
disappeared.

○ metricStalls

The number of times the relay had to stall a client to indicate that
the downstream server cannot handle the stream of metrics. A stall is
only performed when the queue is full and the server is actually
receptive of metrics, but just too slow at the moment. Stalls
typically happen during micro-bursts, where the client typically is
unaware that it should stop sending more data, while it is able to.

○ connections

The number of connect requests handled. This is an ever-increasing
number just counting how many connections were accepted.

○ disconnects

The number of disconnected clients. A disconnect either happens
because the client goes away, or due to an idle timeout in the relay.
The difference between this metric and connections is the number of
connections actively held by the relay. In normal situations this
number remains within reasonable bounds. Many connections but few
disconnects typically indicate a possible connection leak in the
client. The disconnecting of idle connections by the relay is there to
guard against resource drain in such scenarios.

○ dispatch_wallTime_us

The number of microseconds spent by the dispatchers to do their work.
In particular on multi-core systems this value can be confusing;
however, it indicates how long the dispatchers were doing work
handling clients. It includes everything they do, from reading data
from a socket, cleaning up the input metric, to adding the metric to
the appropriate queues. The larger the configuration, and the more
complex in terms of matches, the more time the dispatchers will spend
on the cpu. But time they do /not/ spend on the cpu is also included
in this number. It is the pure wallclock time the dispatcher was
serving a client.

○ dispatch_sleepTime_us

The number of microseconds spent by the dispatchers sleeping, waiting
for work. When this value gets small (or even zero), the dispatcher
has so much work that it no longer sleeps, and likely can no longer
process the work in a timely fashion. This value plus the wallTime
from above roughly sums up to the total uptime taken by this
dispatcher. Therefore, expressing the wallTime as a percentage of this
sum gives the busyness percentage, climbing all the way up to 100% if
sleepTime goes to 0.

○ server_wallTime_us

The number of microseconds spent by the servers to send the metrics
from their queues. This value includes connection creation, reading
from the queue, and sending metrics over the network.

○ dispatcherX

For each individual dispatcher, the metrics received and blackholed
plus the wall clock time. The values are as described above.

○ destinations.X

For all known destinations, the number of dropped, queued and sent
metrics plus the wall clock time spent. The values are as described
above.

○ aggregators.metricsReceived

The number of metrics that matched an aggregator rule and were
accepted by the aggregator. When a metric matches multiple
aggregators, this value will reflect that. A metric is not counted
when it is considered syntactically invalid, e.g. when no value was
found.

○ aggregators.metricsDropped

The number of metrics that were sent to an aggregator, but did not fit
timewise. This is because the metric was either too far in the past or
too far in the future. The expire after clause in aggregate statements
controls how long in the past metric values are accepted.

○ aggregators.metricsSent

The number of metrics that were sent from the aggregators. These
metrics were produced and are the actual results of aggregations.

BUGS
Please report them at: https://github.com/grobian/carbon-c-relay/issues

AUTHOR
Fabian Groffen <grobian@gentoo.org>

SEE ALSO
All other utilities from the graphite stack.

This project aims to be a fast replacement of the original Carbon
relay
http://graphite.readthedocs.org/en/1.0/carbon-daemons.html#carbon-relay-py.
carbon-c-relay aims to deliver performance and configurability. Carbon
is single threaded, and sending metrics to multiple consistent-hash
clusters requires chaining of relays. This project provides a
multithreaded relay which can address multiple targets and clusters
for each and every metric based on pattern matches.

There are a couple more replacement projects out there, such as
carbon-relay-ng https://github.com/graphite-ng/carbon-relay-ng and
graphite-relay https://github.com/markchadwick/graphite-relay.

Compared to carbon-relay-ng, this project does provide carbon's
consistent-hash routing. graphite-relay, which does provide that,
does not do metric-based matches to direct the traffic, which this
project does as well. To date, carbon-c-relay can do aggregations,
failover targets and more.

ACKNOWLEDGEMENT
This program was originally developed for Booking.com, which approved
that the code was published and released as Open Source on GitHub, for
which the author would like to express his gratitude. Development has
continued since with the help of many contributors suggesting
features, reporting bugs, adding patches and more, to make
carbon-c-relay into what it is today.


October 2019                                           CARBON-C-RELAY(1)