CARBON-C-RELAY(1)                                            CARBON-C-RELAY(1)


NAME
       carbon-c-relay - graphite relay, aggregator and rewriter

SYNOPSIS
       carbon-c-relay -f config-file [ options ... ]

DESCRIPTION
       carbon-c-relay accepts, cleanses, matches, rewrites, forwards
       and aggregates graphite metrics by listening for incoming
       connections and relaying the messages to other servers defined
       in its configuration.  The core functionality is to route
       messages via flexible rules to the desired destinations.

       carbon-c-relay is a simple program that reads its routing
       information from a file.  The command line arguments allow one
       to set the location of this file, as well as the number of
       dispatchers (worker threads) to use for reading the data from
       incoming connections and passing it on to the right
       destination(s).  The route file supports two main constructs:
       clusters and matches.  The former define groups of hosts that
       metrics can be sent to, the latter define which metrics should
       be sent to which cluster.  Aggregation rules are treated as
       matches.

       For every metric received by the relay, cleansing is performed.
       The following changes are made before any match, aggregate or
       rewrite rule sees the metric (an illustration follows the
       list):

       ·   double dot elimination (necessary for correctly functioning
           consistent hash routing)

       ·   trailing/leading dot elimination

       ·   whitespace normalisation (this mostly affects output of the
           relay to other targets: metric, value and timestamp will be
           separated by a single space only, ever)

       ·   irregular char replacement with underscores (_); currently
           irregular is defined as not being in [0-9a-zA-Z-_:#], but
           this can be overridden on the command line.  Note that tags
           (when present and allowed) are not processed this way.

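       For example, assuming the default set of allowed characters, an
       incoming line with a duplicate dot, a stray slash and irregular
       spacing, such as

           foo..bar/baz.cpu   42    1500000000

       would be forwarded after cleansing as:

           foo.bar_baz.cpu 42 1500000000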

OPTIONS
       These options control the behaviour of carbon-c-relay.

       ·   -v: Print version string and exit.

       ·   -d: Enable debug mode.  This prints statistics to stdout
           and prints extra messages about some situations encountered
           by the relay that normally would be too verbose to be
           enabled.  When combined with -t (test mode) this also
           prints stub routes and consistent-hash ring contents.

       ·   -s: Enable submission mode.  In this mode, internal
           statistics are not generated.  Instead, queue pressure and
           metric drops are reported on stdout.  This mode is useful
           for a submission relay whose job is just to forward to (a
           set of) main relays.  Statistics about the submission
           relays are not needed in that case, and could easily cause
           an undesired flood of metrics, e.g. when used on each and
           every host locally.

       ·   -t: Test mode.  This mode doesn't do any routing at all;
           instead it reads input from stdin and prints what actions
           would be taken given the loaded configuration.  This mode
           is very useful for testing relay routes for regular
           expression syntax etc.  It also gives insight into how
           routing is applied in complex configurations, for it shows
           rewrites and aggregates taking place as well.  When -t is
           repeated, the relay will only test the configuration for
           validity and exit immediately afterwards.  Any standard
           output is suppressed in this mode, making it ideal for
           start-scripts to test a (new) configuration; a sample
           invocation is sketched after this list.

       ·   -f config-file: Read configuration from config-file.  A
           configuration consists of clusters and routes.  See
           CONFIGURATION SYNTAX for more information on the options
           and syntax of this file.

       ·   -l log-file: Use log-file for writing messages.  Without
           this option, the relay writes both to stdout and to stderr.
           When logging to file, all messages are prefixed with MSG
           when they were sent to stdout, and with ERR when they were
           sent to stderr.

       ·   -p port: Listen for connections on port port.  The port
           number is used for TCP, UDP and UNIX sockets alike.  In the
           latter case, the socket file contains the port number.  The
           port defaults to 2003, which is also used by the original
           carbon-cache.py.  Note that this only applies to the
           defaults; when listen directives are present in the config,
           this setting is ignored.

       ·   -w workers: Use workers number of threads.  The default
           number of workers is equal to the number of detected CPU
           cores.  It makes sense to reduce this number on many-core
           machines, or when the traffic is low.

       ·   -b batchsize: Set the number of metrics that are sent to
           remote servers at once to batchsize.  When the relay sends
           metrics to servers, it will retrieve batchsize metrics from
           the pending queue of metrics waiting for that server and
           send those one by one.  The size of the batch has minimal
           impact on sending performance, but it controls the amount
           of lock-contention on the queue.  The default is 2500.

       ·   -q queuesize: Each server from the configuration that the
           relay sends metrics to has a queue associated with it.
           This queue allows disruptions and bursts to be handled.
           The size of this queue is set to queuesize, which allows
           that number of metrics to be stored in the queue before it
           overflows and the relay starts dropping metrics.  The
           larger the queue, the more metrics can be absorbed, but
           also the more memory the relay will use.  The default queue
           size is 25000.

       ·   -L stalls: Set the maximum number of stalls to stalls
           before the relay starts dropping metrics for a server.
           When a queue fills up, the relay uses a mechanism called
           stalling to signal the client (writing to the relay) of
           this event.  In particular when the client sends a large
           number of metrics in a very short time (burst), stalling
           can help to avoid dropping metrics, since the client just
           needs to slow down for a bit, which in many cases is
           possible (e.g. when catting a file with nc(1)).  However,
           this behaviour can also obstruct, artificially stalling
           writers which cannot stop that easily.  For this, stalls
           can be set from 0 to 15, where each stall can take around 1
           second on the client.  The default value is 4, which is
           aimed at the occasional disruption scenario and a maximum
           effort not to lose metrics while moderately slowing down
           clients.

       ·   -C CAcertpath: Read CA certs (for use with TLS/SSL
           connections) from the given path or file.  When not given,
           the default locations are used.  Strict verification of the
           peer is performed, so when using self-signed certificates,
           be sure to include the CA cert in the default location, or
           provide the path to the cert using this option.

       ·   -T timeout: Specify the IO timeout in milliseconds used for
           server connections.  The default is 600 milliseconds, but
           it may need increasing when WAN links are used for target
           servers.  A relatively low connection timeout allows the
           relay to quickly establish that a server is unreachable,
           and as such lets failover strategies kick in before the
           queue runs high.

       ·   -c chars: Define the characters that are allowed in
           metrics, in addition to [A-Za-z0-9], to be chars.  Any
           character not in this list is replaced by the relay with _
           (underscore).  The default list of allowed characters is
           -_:#.

       ·   -m length: Limit metric names to at most length bytes.  Any
           lines containing metric names longer than this will be
           discarded.

       ·   -M length: Limit the input to lines of at most length
           bytes.  Any excess lines will be discarded.  Note that -m
           needs to be smaller than this value.

       ·   -H hostname: Override the hostname determined by a call to
           gethostname(3) with hostname.  The hostname is used mainly
           in the statistics metrics carbon.relays.<hostname>.<...>
           sent by the relay.

       ·   -B backlog: Set the TCP connection listen backlog to
           backlog connections.  The default value is 32, but on
           servers which receive many concurrent connections, this
           setting likely needs to be increased to avoid connection
           refused errors on the clients.

       ·   -U bufsize: Set the socket send/receive buffer sizes in
           bytes, for both TCP and UDP scenarios.  When unset, the OS
           default is used.  The maximum is also determined by the OS.
           The sizes are set using setsockopt with the flags SO_RCVBUF
           and SO_SNDBUF.  Setting this size may be necessary for
           large volume scenarios, for which -B might also apply.
           Checking the Recv-Q and the receive errors values from
           netstat gives a good hint about buffer usage.

       ·   -E: Disable disconnecting idle incoming connections.  By
           default the relay disconnects idle client connections after
           10 minutes.  It does this to prevent resources from
           clogging up when a faulty or malicious client keeps on
           opening connections without closing them.  It typically
           prevents running out of file descriptors.  For some
           scenarios, however, it is not desirable for idle
           connections to be disconnected, hence passing this flag
           will disable this behaviour.

       ·   -D: Daemonise into the background after startup.  This
           option requires the -l and -P flags to be set as well; an
           example invocation is given after this list.

       ·   -P pidfile: Write the pid of the relay process to a file
           called pidfile.  This is in particular useful when
           daemonised in combination with init managers.

       ·   -O threshold: The minimum number of rules to find before
           trying to optimise the ruleset.  The default is 50; to
           disable the optimiser use -1, and to always run it use 0.
           The optimiser tries to group rules to avoid spending
           excessive time on matching expressions.
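
       For example, to see how a given metric would be routed, a
       sample line can be fed to test mode (the file name relay.conf
       is only an example):

           $ echo "foo.bar 1 $(date +%s)" | carbon-c-relay -t -f relay.conf

       To merely validate a configuration from a start-script, -t can
       be repeated; all output is suppressed, so only the exit status
       needs checking (assuming a non-zero status on failure):

           $ carbon-c-relay -t -t -f relay.conf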
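       Similarly, a daemonised instance could be started as follows,
       where all paths are merely illustrative:

           $ carbon-c-relay -D -f /etc/carbon-c-relay.conf \
                 -l /var/log/carbon-c-relay.log \
                 -P /var/run/carbon-c-relay.pid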

CONFIGURATION SYNTAX
       The config file supports the following syntax, where comments
       start with a # character and can appear at any position on a
       line, suppressing input until the end of that line:

          cluster <name>
              < <forward | any_of | failover> [useall] |
                <carbon_ch | fnv1a_ch | jump_fnv1a_ch> [replication <count>] [dynamic] >
                  <host[:port][=instance] [proto <udp | tcp>]
                   [type linemode]
                   [transport <plain | gzip | lz4 | snappy>
                              [ssl]]> ...
              ;

          cluster <name>
              file [ip]
                  </path/to/file> ...
              ;

          match
                  <* | expression ...>
              [validate <expression> else <log | drop>]
              send to <cluster ... | blackhole>
              [stop]
              ;

          rewrite <expression>
              into <replacement>
              ;

          aggregate
                  <expression> ...
              every <interval> seconds
              expire after <expiration> seconds
              [timestamp at <start | middle | end> of bucket]
              compute <sum | count | max | min | average |
                       median | percentile<%> | variance | stddev> write to
                  <metric>
              [compute ...]
              [send to <cluster ...>]
              [stop]
              ;

          send statistics to <cluster ...>
              [stop]
              ;
          statistics
              [submit every <interval> seconds]
              [reset counters after interval]
              [prefix with <prefix>]
              [send to <cluster ...>]
              [stop]
              ;

          listen
              type linemode [transport <plain | gzip | lz4 | snappy> [ssl <pemcert>]]
                  <<interface[:port] | port> proto <udp | tcp>> ...
                  </path/to/file proto unix> ...
              ;

          include </path/to/file/or/glob>
              ;

   CLUSTERS
       Multiple clusters can be defined, and they need not be
       referenced by a match rule.  All clusters point to one or more
       hosts, except the file cluster, which writes to files in the
       local filesystem.  host may be an IPv4 or IPv6 address, or a
       hostname.  Since host is followed by an optional : and port,
       for IPv6 addresses not to be interpreted wrongly, either a port
       must be given, or the IPv6 address must be surrounded by
       brackets, e.g. [::1].  Optional transport and proto clauses can
       be used to wrap the connection in a compression or encryption
       layer, or to specify the use of UDP or TCP to connect to the
       remote server.  When omitted, the connection defaults to an
       unwrapped TCP connection.  type can only be linemode at the
       moment.

       The forward and file clusters simply send everything they
       receive to all defined members (host addresses or files).  The
       any_of cluster is a small variant of the forward cluster, but
       instead of sending to all defined members, it sends each
       incoming metric to one of the defined members.  This is not
       very useful in itself, but since any of the members can receive
       each metric, it means that when one of the members is
       unreachable, the other members will receive all of the metrics.
       This can be useful when the cluster points to other relays.
       The any_of router tries to send the same metrics consistently
       to the same destination.  The failover cluster is like the
       any_of cluster, but sticks to the order in which servers are
       defined.  This is to implement a pure failover scenario between
       servers.  The carbon_ch cluster sends the metrics to the member
       that is responsible according to the consistent hash algorithm
       (as used in the original carbon), or to multiple members if
       replication is set to more than 1.  When dynamic is set,
       failure of any of the servers does not result in metrics being
       dropped for that server; instead the undeliverable metrics are
       sent to any other server in the cluster in order for the
       metrics not to get lost.  This is most useful when replication
       is 1.  The fnv1a_ch cluster is identical in behaviour to
       carbon_ch, but it uses a different hash technique (FNV1a) which
       is faster, but more importantly is defined to get around a
       limitation of carbon_ch by using both host and port from the
       members.  This is useful when multiple targets live on the same
       host, just separated by port.  The instance that the original
       carbon uses to get around this can be set by appending it after
       the port, separated by an equals sign, e.g. 127.0.0.1:2006=a
       for instance a.  When using the fnv1a_ch cluster, this instance
       overrides the hash key in use.  This allows for many things,
       including masquerading old IP addresses, but mostly it makes
       the hash key location agnostic of the (physical) location of
       that key.  For example, usage like
       10.0.0.1:2003=4d79d13554fa1301476c1f9fe968b0ac would allow one
       to change the port and/or ip address of the server that
       receives data for the instance key.  Obviously, this way
       migration of data can be dealt with much more conveniently.
       The jump_fnv1a_ch cluster is also a consistent hash cluster
       like the previous two, but it does not take the server
       information into account at all.  Whether this is useful to you
       depends on your scenario.  The jump hash has a much better
       balancing over the servers defined in the cluster, at the
       expense of not being able to remove any server but the last in
       order.  What this means is that this hash is fine to use with
       ever-growing clusters where older nodes are also replaced at
       some point.  If you have a cluster where removal of old nodes
       takes place often, the jump hash is not suitable for you.  Jump
       hash works with servers in an ordered list without gaps.  To
       influence the ordering, the instance given to the server will
       be used as sorting key.  Without it, the order will be as given
       in the file.  It is good practice to fix the order of the
       servers with instances, such that it is explicit what the right
       nodes for the jump hash are; a sketch follows below.
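       As a sketch, a jump_fnv1a_ch cluster whose order is pinned down
       with instances could look like this (addresses and instance
       names are only examples):

           cluster jump
               jump_fnv1a_ch
                   10.0.0.1:2003=a
                   10.0.0.2:2003=b
                   10.0.0.3:2003=c
               ;
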
       DNS hostnames are resolved to a single address, according to
       the preference rules in RFC 3484
       https://www.ietf.org/rfc/rfc3484.txt.  The any_of, failover and
       forward clusters have an explicit useall flag that enables
       expansion for hostnames resolving to multiple addresses.  Each
       address returned becomes a cluster destination.

   MATCHES
       Match rules are the way to direct incoming metrics to one or
       more clusters.  Match rules are processed top to bottom as they
       are defined in the file.  It is possible to define multiple
       matching expressions in the same rule.  Each match rule can
       send data to one or more clusters.  Since match rules "fall
       through" unless the stop keyword is added, carefully crafted
       match expressions can be used to target multiple clusters or
       aggregations.  This ability makes it possible to replicate
       metrics, as well as to send certain metrics to alternative
       clusters, with careful ordering and usage of the stop keyword.
       The special cluster blackhole discards any metrics sent to it.
       This can be useful for weeding out unwanted metrics in certain
       cases.  Because throwing metrics away is pointless if other
       matches would accept the same data, a match with the blackhole
       cluster as destination has an implicit stop.  The validate
       clause adds a check on the data (what comes after the metric
       name) in the form of a regular expression.  When this
       expression matches, the match rule executes as if no validate
       clause were present.  However, if it fails, the match rule is
       aborted and no metrics will be sent to destinations; this is
       the drop behaviour.  When log is used, the metric is logged to
       stderr instead.  Care should be taken with the latter to avoid
       log flooding.  When a validate clause is present, destinations
       need not be present; this allows for applying a global
       validation rule.  Note that the cleansing rules are applied
       before validation is done, thus the data will not have
       duplicate spaces.  The route using clause is used to perform a
       temporary modification to the key used as input for the
       consistent hashing routines.  The primary purpose is to route
       traffic such that appropriate data is sent to the needed
       aggregation instances.

   REWRITES
       Rewrite rules take a regular expression as input to match
       incoming metrics, and transform them into the desired new
       metric name.  In the replacement, backreferences are allowed to
       refer to capture groups defined in the input regular
       expression.  A match of server\.(x|y|z)\. allows one to use
       e.g. role.\1. in the substitution.  A few caveats apply to the
       current implementation of rewrite rules.  First, their location
       in the config file determines when the rewrite is performed.
       The rewrite is done in-place; as such, a match rule before the
       rewrite matches the original name, while a match rule after the
       rewrite no longer matches the original name.  Care should be
       taken with the ordering, as multiple rewrite rules in
       succession can take effect, e.g. a gets replaced by b, and b
       gets replaced by c in a succeeding rewrite rule.  The second
       caveat with the current implementation is that rewritten metric
       names are not cleansed, like newly incoming metrics are.  Thus,
       double dots and potentially dangerous characters can appear if
       the replacement string is crafted to produce them.  It is the
       responsibility of the writer to make sure the metrics are
       clean.  If this is an issue for routing, one can consider
       having a rewrite-only instance that forwards all metrics to
       another instance that does the routing.  Obviously the second
       instance will cleanse the metrics as they come in.  The
       backreference notation allows lowercasing and uppercasing the
       replacement string with the use of the underscore (_) and caret
       (^) symbols following directly after the backslash.  For
       example, role.\_1. as substitution will lowercase the contents
       of \1.  The dot (.) can be used in a similar fashion, or after
       the underscore or caret, to replace dots with underscores in
       the substitution.  This can be handy for some situations where
       metrics are sent to graphite.

   AGGREGATIONS
       The aggregations defined take one or more input metrics
       expressed by one or more regular expressions, similar to the
       match rules.  Incoming metrics are aggregated over a period of
       time defined by the interval in seconds.  Since events may
       arrive a bit later in time, the expiration time in seconds
       defines when the aggregations should be considered final, as no
       new entries are allowed to be added any more.  On top of an
       aggregation, multiple aggregates can be computed.  They can be
       of the same or different aggregation types, but should each
       write to a unique new metric.  The metric names can include
       backreferences like in rewrite expressions, allowing for
       powerful single aggregation rules that yield many aggregations.
       When no send to clause is given, produced metrics are sent to
       the relay as if they were submitted from the outside, hence
       match and aggregation rules apply to them.  Care should be
       taken that loops are avoided this way.  For this reason, use of
       the send to clause is encouraged, to direct the output traffic
       where possible.  As with match rules, it is possible to define
       multiple cluster targets.  Also like match rules, the stop
       keyword applies, to control the flow of metrics in the matching
       process.

   STATISTICS
       The send statistics to construct is deprecated and will be
       removed in the next release.  Use the special statistics
       construct instead.

       The statistics construct can control a couple of things about
       the (internal) statistics produced by the relay.  The send to
       target can be used to avoid router loops by sending the
       statistics directly to certain destination cluster(s).  By
       default the metrics are prefixed with carbon.relays.<hostname>,
       where hostname is determined on startup and can be overridden
       using the -H argument.  This prefix can be set using the prefix
       with clause, similar to a rewrite rule target.  The input match
       in this case is the pre-set regular expression
       ^(([^.]+)(\..*)?)$ on the hostname.  As such, one can see that
       the default prefix is set by carbon.relays.\.1.  Note that this
       uses the replace-dot-with-underscore replacement feature from
       rewrite rules.  Given the input expression, the following match
       groups are available: \1 the entire hostname, \2 the short
       hostname and \3 the domainname (with leading dot).  It may make
       sense to replace the default with something like
       carbon.relays.\_2 for certain scenarios, to always use the
       lowercased short hostname, which following the expression
       doesn't contain a dot.  By default, the metrics are submitted
       every 60 seconds; this can be changed using the submit every
       <interval> seconds clause.
       To obtain a set of values more compatible with carbon-cache.py,
       use the reset counters after interval clause to make values
       non-cumulative, that is, they will report the change compared
       to the previous value.
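
       As a sketch, assuming a cluster named graphite is defined, a
       statistics construct combining the above clauses could look
       like:

           statistics
               submit every 60 seconds
               reset counters after interval
               prefix with carbon.relays.\_2
               send to graphite
               stop
               ;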
   LISTENERS
       The ports and protocols the relay should listen on for incoming
       connections can be specified using the listen directive.
       Currently, all listeners need to be of the linemode type.  An
       optional compression or encryption wrapping can be specified
       for the port and optional interface given by ip address, or for
       the unix socket given by file.  When an interface is not
       specified, the any interface on all available ip protocols is
       assumed.  If no listen directive is present, the relay will use
       the default listeners for port 2003 on tcp and udp, plus the
       unix socket /tmp/.s.carbon-c-relay.2003.  This typically
       expands to 5 listeners on an IPv6-enabled system.  The default
       matches the behaviour of versions prior to v3.2.
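
       A sketch of a listen directive that spells out the defaults
       described above:

           listen
               type linemode
                   2003 proto udp
                   2003 proto tcp
                   /tmp/.s.carbon-c-relay.2003 proto unix
               ;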
   INCLUDES
       In case the configuration becomes very long, or is managed
       better in separate files, the include directive can be used to
       read another file.  The given file will be read in place and
       added to the router configuration at the time of inclusion.
       The end result is one big route configuration.  Multiple
       include statements can be used throughout the configuration
       file.  The positioning will influence the order of rules as
       normal.  Beware that recursive inclusion (include from an
       included file) is supported, and that currently no safeguards
       exist against an inclusion loop.  For what it's worth, this
       feature is likely best used with simple configuration files
       (e.g. not having include in them).
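
       Since a glob is accepted, a whole directory of configuration
       snippets can be pulled in at once; the path is only an example:

           include /etc/carbon-c-relay.d/*.conf;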

EXAMPLES
       carbon-c-relay evolved over time, growing features on demand as
       the tool proved to be stable and fitting the job well.  Below
       follow some annotated examples of constructs that can be used
       with the relay.

       Clusters can be defined as often as necessary.  They receive
       data from match rules, and their type defines which members of
       the cluster finally get the metric data.  The simplest cluster
       form is a forward cluster:

           cluster send-through
               forward
                   10.1.0.1
               ;

       Any metric sent to the send-through cluster would simply be
       forwarded to the server at IPv4 address 10.1.0.1.  If we define
       multiple servers, all of those servers would get the same
       metric, thus:

           cluster send-through
               forward
                   10.1.0.1
                   10.2.0.1
               ;

       The above results in a duplication of metrics sent to both
       machines.  This can be useful, but most of the time it is not.
       The any_of cluster type is like forward, but it sends each
       incoming metric to any one of the members.  The same example
       with such a cluster would be:

           cluster send-to-any-one
               any_of 10.1.0.1:2010 10.1.0.1:2011;

       This would implement a multipath scenario, where two servers
       are used and the load between them is spread, but should any of
       them fail, all metrics are sent to the remaining one.  This
       typically works well for upstream relays, or for balancing
       carbon-cache processes running on the same machine.  Should any
       member become unavailable, for instance due to a rolling
       restart, the other members receive the traffic.  If it is
       necessary to have true fail-over, where the secondary server is
       only used if the first is down, the following would implement
       that:

           cluster try-first-then-second
               failover 10.1.0.1:2010 10.1.0.1:2011;

       These types are different from the two consistent hash cluster
       types:

           cluster graphite
               carbon_ch
                   127.0.0.1:2006=a
                   127.0.0.1:2007=b
                   127.0.0.1:2008=c
               ;

       If a member in this example fails, all metrics that would go to
       that member are kept in the queue, waiting for the member to
       return.  This is useful for clusters of carbon-cache machines
       where it is desirable that the same metric always ends up on
       the same server.  The carbon_ch cluster type is compatible with
       carbon-relay consistent hashing, and can be used for existing
       clusters populated by carbon-relay.  For new clusters, however,
       it is better to use the fnv1a_ch cluster type, for it is
       faster, and allows balancing over the same address but
       different ports without an instance number, in contrast to
       carbon_ch.

       Because we can use multiple clusters, we can also replicate
       without the use of the forward cluster type, in a more
       intelligent way:

           cluster dc-old
               carbon_ch replication 2
                   10.1.0.1
                   10.1.0.2
                   10.1.0.3
               ;
           cluster dc-new1
               fnv1a_ch replication 2
                   10.2.0.1
                   10.2.0.2
                   10.2.0.3
               ;
           cluster dc-new2
               fnv1a_ch replication 2
                   10.3.0.1
                   10.3.0.2
                   10.3.0.3
               ;

           match *
               send to dc-old
               ;
           match *
               send to
                   dc-new1
                   dc-new2
               stop
               ;

       In this example all incoming metrics are first sent to dc-old,
       then dc-new1 and finally to dc-new2.  Note that the cluster
       type of dc-old is different.  Each incoming metric will be sent
       to 2 members of all three clusters, thus replicating to in
       total 6 destinations.  For each cluster the destination members
       are computed independently.  Failure of clusters or members
       does not affect the others, since all have individual queues.
       The above example could also be written using three match
       rules, one for each dc, or one match rule for all three dcs.
       The difference is mainly in performance: the number of times
       the incoming metric has to be matched against an expression.
       The stop rule in the dc-new match rule is not strictly
       necessary in this example, because no more match rules follow.
       However, if the match would target a specific subset, e.g.
       ^sys\., and more clusters would be defined, this could be
       necessary, as for instance in the following abbreviated
       example:

           cluster dc1-sys ... ;
           cluster dc2-sys ... ;

           cluster dc1-misc ... ;
           cluster dc2-misc ... ;

           match ^sys\. send to dc1-sys;
           match ^sys\. send to dc2-sys stop;

           match * send to dc1-misc;
           match * send to dc2-misc stop;

       As can be seen, without the stop in dc2-sys' match rule, all
       metrics starting with sys. would also be sent to dc1-misc and
       dc2-misc.  It can be that this is desired, of course, but in
       this example there is a dedicated cluster for the sys metrics.

       Suppose there is some unwanted metric that unfortunately is
       generated, by let's assume some bad/old software, and we don't
       want to store this metric.  The blackhole cluster is suitable
       for that, when it is harder to actually whitelist all wanted
       metrics.  Consider the following:

           match
                   some_legacy1$
                   some_legacy2$
               send to blackhole
               stop;

       This would throw away all metrics that end in some_legacy1 or
       some_legacy2, which would otherwise be hard to filter out.
       Since the order matters, it can be used in a construct like
       this:

           cluster old ... ;
           cluster new ... ;

           match * send to old;

           match unwanted send to blackhole stop;

           match * send to new;

       In this example the old cluster would receive the metric that's
       unwanted for the new cluster.  So, the order in which the rules
       occur does matter for the execution.

       Validation can be used to ensure the data for metrics is as
       expected.  A global validation for just integer (non-floating
       point) values could be:

           match *
               validate ^[0-9]+\ [0-9]+$ else drop
               ;

       (Note the escape with backslash \ of the space; you might be
       able to use \s or [:space:] instead, this depends on your libc
       implementation.)

       The validate clause can exist on every match rule, so in
       principle, the following is valid:

           match ^foo
               validate ^[0-9]+\ [0-9]+$ else drop
               send to integer-cluster
               ;
           match ^foo
               validate ^[0-9.e+-]+\ [0-9.e+-]+$ else drop
               send to float-cluster
               stop;

       Note that the behaviour is different in the previous two
       examples.  When no send to clusters are specified, a validation
       error makes the match behave like the stop keyword is present.
       Likewise, when validation passes, processing continues with the
       next rule.  When destination clusters are present, the match
       respects the stop keyword as normal: when it is specified,
       processing will always stop at that point.  However, if
       validation fails, the rule does not send anything to the
       destination clusters; the metric will be dropped or logged, but
       never sent.

       The relay is capable of rewriting incoming metrics on the fly.
       This process is done based on regular expressions with capture
       groups that allow substituting parts in a replacement string.
       Rewrite rules allow cleaning up metrics from applications, or
       providing a migration path.  In its simplest form a rewrite
       rule looks like this:

           rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
               into server.\_1.\2.\3.\3\4
               ;

       In this example a metric like server.DC.role.name123 would be
       transformed into server.dc.role.name.name123.  For rewrite
       rules the same holds as for matches: their order matters.
       Hence, to build on top of the old/new cluster example done
       earlier, the following would store the original metric name in
       the old cluster, and the new metric name in the new cluster:

           match * send to old;

           rewrite ... ;

           match * send to new;

       Note that after the rewrite, the original metric name is no
       longer available, as the rewrite happens in-place.

       Aggregations are probably the most complex part of
       carbon-c-relay.  Two ways of specifying aggregates are
       supported by carbon-c-relay.  The first, static rules, are
       handled by an optimiser which tries to fold thousands of rules
       into groups to make the matching more efficient.  The second,
       dynamic rules, are very powerful compact definitions with
       possibly thousands of internal instantiations.  A typical
       static aggregation looks like:

           aggregate
                   ^sys\.dc1\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
                   ^sys\.dc2\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
               every 10 seconds
               expire after 35 seconds
               timestamp at end of bucket
               compute sum write to
                   mysql.somecluster.total_replication_delay
               compute average write to
                   mysql.somecluster.average_replication_delay
               compute max write to
                   mysql.somecluster.max_replication_delay
               compute count write to
                   mysql.somecluster.replication_delay_metric_count
               ;

       In this example, four aggregations are produced from the
       incoming matching metrics.  In this example we could have
       written the two matches as one, but for demonstration purposes
       we did not.  Obviously they can refer to different metrics, if
       that makes sense.  The every 10 seconds clause specifies the
       interval in which the aggregator can expect new metrics to
       arrive.  This interval is used to produce the aggregations,
       thus every 10 seconds 4 new metrics are generated from the data
       received so far.  Because data may be in transit for some
       reason, or generation stalled, the expire after clause
       specifies how long the data should be kept before considering a
       data bucket (which is aggregated) to be complete.  In the
       example, 35 was used, which means after 35 seconds the first
       aggregates are produced.  It also means that metrics can arrive
       35 seconds late, and still be taken into account.  The exact
       time at which the aggregate metrics are produced is random
       between 0 and interval (10 in this case) seconds after the
       expiry time.  This is done to prevent thundering herds of
       metrics for large aggregation sets.  The timestamp that is used
       for the aggregations can be specified to be the start, middle
       or end of the bucket.  The original carbon-aggregator.py uses
       start, while carbon-c-relay's default has always been end.  The
       compute clauses demonstrate that a single aggregation rule can
       produce multiple aggregates, as is often the case.  Internally,
       this comes for free, since all possible aggregates are always
       calculated, whether or not they are used.  The produced new
       metrics are resubmitted to the relay, hence matches defined
       before in the configuration can match output of the aggregator.
       It is important to avoid loops that can be generated this way.
       In general, splitting aggregations off to their own
       carbon-c-relay instance, such that it is easy to forward the
       produced metrics to another relay instance, is good practice.

       The previous example could also be written as follows, to be
       dynamic:

           aggregate
                   ^sys\.dc[0-9]\.(somehost-[0-9]+)\.([^.]+)\.mysql\.replication_delay
               every 10 seconds
               expire after 35 seconds
               compute sum write to
                   mysql.host.\1.replication_delay
               compute sum write to
                   mysql.host.all.replication_delay
               compute sum write to
                   mysql.cluster.\2.replication_delay
               compute sum write to
                   mysql.cluster.all.replication_delay
               ;

       Here a single match results in four aggregations, each of a
       different scope.  In this example aggregations based on
       hostname and cluster are being made, as well as the more
       general all targets, which in this example both have identical
       values.  Note that with this single aggregation rule,
       per-cluster, per-host and total aggregations are produced.
       Obviously, the input metrics define which host and cluster
       aggregates are produced.

       With use of the send to clause, aggregations can be made more
       intuitive and less error-prone.  Consider the example below:

           cluster graphite fnv1a_ch ip1 ip2 ip3;

           aggregate ^sys\.somemetric
               every 60 seconds
               expire after 75 seconds
               compute sum write to
                   sys.somemetric
               send to graphite
               stop
               ;

           match * send to graphite;

       It sends all incoming metrics to the graphite cluster, except
       the sys.somemetric ones, which it replaces with a sum of all
       the incoming ones.  Without a stop in the aggregate, this would
       cause a loop, and without the send to, the metric could not
       keep its original name, for the output would then go directly
       to the cluster.

STATISTICS
       When carbon-c-relay is run without the -d or -s arguments,
       statistics will be produced.  By default they are sent to the
       relay itself in the form of carbon.relays.<hostname>.*.  See
       the statistics construct to override this prefix, the sending
       interval and the values produced.  While many metrics have a
       name similar to what carbon-cache.py would produce, their
       values are likely different.  By default, most values are
       running counters which only increase over time.  The use of the
       nonNegativeDerivative() function from graphite is useful with
       these.
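
       For example, to graph the rate of received metrics (the
       hostname myhost is just a placeholder), a graphite target could
       look like:

           nonNegativeDerivative(carbon.relays.myhost.metricsReceived)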

       The following metrics are produced under the
       carbon.relays.<hostname> namespace:

       ·   metricsReceived

           The number of metrics that were received by the relay.
           Received here means that they were seen and processed by
           any of the dispatchers.

       ·   metricsSent

           The number of metrics that were sent from the relay.  This
           is a total count for all servers combined.  When incoming
           metrics are duplicated by the cluster configuration, this
           counter will include all those duplications.  In other
           words, this is the number of metrics that were successfully
           sent to other systems.  Note that metrics that are
           processed (received) but still in the sending queue
           (queued) are not included in this counter.

       ·   metricsDiscarded

           The number of input lines that were not considered to be a
           valid metric.  Such lines can be empty, only contain
           whitespace, or hit the limits given for maximum input
           length and/or maximum metric length (see the -m and -M
           options).

       ·   metricsQueued

           The total number of metrics that are currently in the
           queues for all the server targets.  This metric is not
           cumulative, for it is a sample of the queue size, which can
           (and should) go up and down.  Therefore you should not use
           the derivative function for this metric.

       ·   metricsDropped

           The total number of metrics that had to be dropped due to
           server queues overflowing.  A queue typically overflows
           when the server it tries to send its metrics to is not
           reachable, or too slow in ingesting the amount of metrics
           queued.  This can be network or resource related, and it
           also greatly depends on the rate of metrics being sent to
           the particular server.

       ·   metricsBlackholed

           The number of metrics that did not match any rule, or
           matched a rule with blackhole as target.  Depending on your
           configuration, a high value might be an indication of a
           misconfiguration somewhere.  These metrics were received by
           the relay, but never sent anywhere, thus they disappeared.

       ·   metricStalls

           The number of times the relay had to stall a client to
           indicate that the downstream server cannot handle the
           stream of metrics.  A stall is only performed when the
           queue is full and the server is actually receptive to
           metrics, but just too slow at the moment.  Stalls typically
           happen during micro-bursts, where the client typically is
           unaware that it should stop sending more data, while it is
           able to.

       ·   connections

           The number of connect requests handled.  This is an ever
           increasing number just counting how many connections were
           accepted.

       ·   disconnects

           The number of disconnected clients.  A disconnect either
           happens because the client goes away, or due to an idle
           timeout in the relay.  The difference between this metric
           and connections is the number of connections actively held
           by the relay.  In normal situations this amount remains
           within reasonable bounds.  Many connections but few
           disconnects typically indicate a possible connection leak
           in the client.  The disconnecting of idle connections in
           the relay is there to guard against resource drain in such
           scenarios.

       ·   dispatch_wallTime_us

           The number of microseconds spent by the dispatchers to do
           their work.  In particular on multi-core systems, this
           value can be confusing; however, it indicates how long the
           dispatchers were doing work handling clients.  It includes
           everything they do, from reading data from a socket,
           cleansing the input metric, to adding the metric to the
           appropriate queues.  The larger the configuration, and the
           more complex it is in terms of matches, the more time the
           dispatchers will spend on the cpu.  But time they do /not/
           spend on the cpu is also included in this number.  It is
           the pure wallclock time the dispatcher was serving a
           client.

       ·   dispatch_sleepTime_us

           The number of microseconds spent by the dispatchers
           sleeping while waiting for work.  When this value gets
           small (or even zero) the dispatcher has so much work that
           it no longer sleeps, and likely can no longer process the
           work in a timely fashion.  This value plus the wallTime
           from above roughly sums up to the total uptime taken by
           this dispatcher.  Therefore, expressing the wallTime as a
           percentage of this sum gives the busyness percentage,
           draining all the way up to 100% if sleepTime goes to 0; see
           the example expression after this list.

       ·   server_wallTime_us

           The number of microseconds spent by the servers to send the
           metrics from their queues.  This value includes connection
           creation, reading from the queue, and sending metrics over
           the network.

       ·   dispatcherX

           For each individual dispatcher, the metrics received and
           blackholed, plus the wall clock time.  The values are as
           described above.

       ·   destinations.X

           For all known destinations, the number of dropped, queued
           and sent metrics, plus the wall clock time spent.  The
           values are as described above.

       ·   aggregators.metricsReceived

           The number of metrics that matched an aggregator rule and
           were accepted by the aggregator.  When a metric matches
           multiple aggregators, this value will reflect that.  A
           metric is not counted when it is considered syntactically
           invalid, e.g. when no value was found.

       ·   aggregators.metricsDropped

           The number of metrics that were sent to an aggregator, but
           did not fit timewise, because the metric was too far in the
           past or the future.  The expire after clause in aggregate
           statements controls how long in the past metric values are
           accepted.

       ·   aggregators.metricsSent

           The number of metrics that were sent from the aggregators.
           These metrics were produced and are the actual results of
           aggregations.
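       As a sketch of the busyness calculation described under
       dispatch_sleepTime_us above (myhost is again a placeholder), a
       graphite target could look like:

           asPercent(carbon.relays.myhost.dispatch_wallTime_us,
               sumSeries(carbon.relays.myhost.dispatch_*Time_us))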

BUGS
       Please report them at:
       https://github.com/grobian/carbon-c-relay/issues

AUTHOR
       Fabian Groffen <grobian@gentoo.org>

SEE ALSO
       All other utilities from the graphite stack.

       This project aims to be a fast replacement of the original
       Carbon relay
       http://graphite.readthedocs.org/en/1.0/carbon-daemons.html#carbon-relay-py.
       carbon-c-relay aims to deliver performance and configurability.
       Carbon is single threaded, and sending metrics to multiple
       consistent-hash clusters requires chaining of relays.  This
       project provides a multithreaded relay which can address
       multiple targets and clusters for each and every metric based
       on pattern matches.

       There are a couple more replacement projects out there, such as
       carbon-relay-ng https://github.com/graphite-ng/carbon-relay-ng
       and graphite-relay
       https://github.com/markchadwick/graphite-relay.

       Compared to carbon-relay-ng, this project does provide carbon's
       consistent-hash routing.  graphite-relay, which does provide
       this, however doesn't do metric-based matches to direct the
       traffic, which this project does as well.  To date,
       carbon-c-relay can do aggregations, failover targets and more.

ACKNOWLEDGEMENT
       This program was originally developed for Booking.com, which
       approved that the code was published and released as Open
       Source on GitHub, for which the author would like to express
       his gratitude.  Development has continued since with the help
       of many contributors suggesting features, reporting bugs,
       adding patches and more to make carbon-c-relay into what it is
       today.


October 2019                                                 CARBON-C-RELAY(1)