CARBON-C-RELAY(1)                                            CARBON-C-RELAY(1)

NAME

       carbon-c-relay -- graphite relay, aggregator and rewriter

SYNOPSIS

       carbon-c-relay -f config-file [ options ... ]

DESCRIPTION

       carbon-c-relay accepts, cleanses, matches, rewrites, forwards and
       aggregates graphite metrics by listening for incoming connections
       and relaying the messages to other servers defined in its
       configuration.  The core functionality is to route messages via
       flexible rules to the desired destinations.

       carbon-c-relay is a simple program that reads its routing
       information from a file.  The command line arguments allow setting
       the location of this file, as well as the number of dispatchers
       (worker threads) to use for reading the data from incoming
       connections and passing it on to the right destination(s).  The
       route file supports two main constructs: clusters and matches.
       The former define groups of hosts that metrics can be sent to; the
       latter define which metrics should be sent to which cluster.
       Aggregation rules are seen as matches.

       For every metric received by the relay, cleansing is performed.
       The following changes are made before any match, aggregate or
       rewrite rule sees the metric (an example follows the list):

       ·   double dot elimination (necessary for correctly functioning
           consistent hash routing)

       ·   trailing/leading dot elimination

       ·   whitespace normalisation (this mostly affects output of the
           relay to other targets: metric, value and timestamp will be
           separated by a single space only, ever)

       ·   irregular character replacement with underscores (_);
           currently irregular is defined as not being in
           [0-9a-zA-Z-_:#], but this can be overridden on the command
           line.  Note that tags (when present and allowed) are not
           processed this way.
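
       For example, a hypothetical incoming line

         cpu(0)..idle.  42  1234567890

       would, with the default character set, be cleansed to

         cpu_0_.idle 42 1234567890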

OPTIONS

       These options control the behaviour of carbon-c-relay.  An example
       invocation follows the list of options below.

       ·   -v: Print version string and exit.

       ·   -d: Enable debug mode.  This prints statistics to stdout and
           prints extra messages about some situations encountered by the
           relay that normally would be too verbose to be enabled.  When
           combined with -t (test mode) this also prints stub routes and
           consistent-hash ring contents.

       ·   -s: Enable submission mode.  In this mode, internal statistics
           are not generated.  Instead, queue pressure and metric drops
           are reported on stdout.  This mode is useful for a submission
           relay whose job is just to forward to (a set of) main relays.
           Statistics about the submission relays are not needed in that
           case, and could easily cause an undesired flood of metrics,
           e.g. when used locally on each and every host.

       ·   -t: Test mode.  This mode doesn't do any routing at all;
           instead it reads input from stdin and prints what actions
           would be taken given the loaded configuration.  This mode is
           very useful for testing relay routes for regular expression
           syntax etc.  It also gives insight into how routing is applied
           in complex configurations, for it shows rewrites and
           aggregates taking place as well.  When -t is repeated, the
           relay will only test the configuration for validity and exit
           immediately afterwards.  Any standard output is suppressed in
           this mode, making it ideal for start scripts to test a (new)
           configuration.

       ·   -D: Daemonise into the background after startup.  This option
           requires the -l and -P flags to be set as well.

       ·   -f config-file: Read configuration from config-file.  A
           configuration consists of clusters and routes.  See
           CONFIGURATION SYNTAX for more information on the options and
           syntax of this file.

       ·   -l log-file: Use log-file for writing messages.  Without this
           option, the relay writes both to stdout and stderr.  When
           logging to file, all messages are prefixed with MSG when they
           were sent to stdout, and ERR when they were sent to stderr.

       ·   -p port: Listen for connections on port port.  The port
           number is used for TCP, UDP and UNIX sockets alike.  In the
           latter case, the socket file contains the port number.  The
           port defaults to 2003, which is also used by the original
           carbon-cache.py.  Note that this only applies to the defaults;
           when listen directives are in the config, this setting is
           ignored.

       ·   -w workers: Use workers number of threads.  The default
           number of workers is equal to the number of detected CPU
           cores.  It makes sense to reduce this number on many-core
           machines, or when the traffic is low.

       ·   -b batchsize: Set the number of metrics that are sent to
           remote servers at once to batchsize.  When the relay sends
           metrics to servers, it will retrieve batchsize metrics from
           the pending queue of metrics waiting for that server and send
           those one by one.  The size of the batch has minimal impact on
           sending performance, but it controls the amount of lock
           contention on the queue.  The default is 2500.

       ·   -q queuesize: Each server from the configuration that the
           relay sends metrics to has a queue associated with it.  This
           queue allows disruptions and bursts to be handled.  The size
           of this queue will be set to queuesize, which allows that many
           metrics to be stored in the queue before it overflows and the
           relay starts dropping metrics.  The larger the queue, the more
           metrics can be absorbed, but also the more memory the relay
           will use.  The default queue size is 25000.

       ·   -L stalls: Set the maximum number of stalls to stalls before
           the relay starts dropping metrics for a server.  When a queue
           fills up, the relay uses a mechanism called stalling to signal
           this event to the client (writing to the relay).  In
           particular when the client sends a large number of metrics in
           a very short time (burst), stalling can help to avoid dropping
           metrics, since the client just needs to slow down for a bit,
           which in many cases is possible (e.g. when catting a file with
           nc(1)).  However, this behaviour can also obstruct,
           artificially stalling writers which cannot stop that easily.
           For this the stalls can be set from 0 to 15, where each stall
           can take around 1 second on the client.  The default value is
           4, which is aimed at the occasional disruption scenario and a
           best effort to not lose metrics while moderately slowing down
           clients.

       ·   -B backlog: Set the TCP connection listen backlog to backlog
           connections.  The default value is 32, but on servers which
           receive many concurrent connections, this setting likely needs
           to be increased to avoid connection refused errors on the
           clients.

       ·   -U bufsize: Set the socket send/receive buffer sizes in
           bytes, for both TCP and UDP scenarios.  When unset, the OS
           default is used.  The maximum is also determined by the OS.
           The sizes are set using setsockopt with the flags SO_RCVBUF
           and SO_SNDBUF.  Setting this size may be necessary for large
           volume scenarios, for which -B might also apply.  Checking the
           Recv-Q and the receive errors values from netstat gives a good
           hint about buffer usage.

       ·   -T timeout: Specify the IO timeout in milliseconds used for
           server connections.  The default is 600 milliseconds, but this
           may need increasing when WAN links are used for target
           servers.  A relatively low value for the connection timeout
           allows the relay to quickly establish that a server is
           unreachable, and as such lets failover strategies kick in
           before the queue runs high.

       ·   -E: Disable disconnecting idle incoming connections.  By
           default the relay disconnects idle client connections after 10
           minutes.  It does this to prevent resources clogging up when a
           faulty or malicious client keeps on opening connections
           without closing them.  It typically prevents running out of
           file descriptors.  For some scenarios, however, it is not
           desirable for idle connections to be disconnected, hence
           passing this flag will disable this behaviour.

       ·   -c chars: Define the characters that are allowed in metrics,
           in addition to [A-Za-z0-9], to be chars.  Any character not in
           this list is replaced by the relay with _ (underscore).  The
           default list of allowed characters is -_:#.

       ·   -H hostname: Override the hostname determined by a call to
           gethostname(3) with hostname.  The hostname is used mainly in
           the statistics metrics carbon.relays.<hostname>.<...> sent by
           the relay.

       ·   -P pidfile: Write the pid of the relay process to a file
           called pidfile.  This is in particular useful when daemonised,
           in combination with init managers.

       ·   -O threshold: The minimum number of rules to find before
           trying to optimise the ruleset.  The default is 50; to disable
           the optimiser use -1, to always run the optimiser use 0.  The
           optimiser tries to group rules to avoid spending excessive
           time on matching expressions.
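
       As an illustration, a relay daemonised with a log and pid file
       (the file locations are examples only) could be started as:

         carbon-c-relay -f /etc/carbon-c-relay.conf -D \
             -l /var/log/carbon-c-relay.log -P /var/run/carbon-c-relay.pid

       and a single metric can be traced through a configuration using
       test mode:

         echo "sys.cpu0.idle 42 1234567890" | \
             carbon-c-relay -f /etc/carbon-c-relay.conf -t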

CONFIGURATION SYNTAX

       The config file supports the following syntax, where comments
       start with a # character and can appear at any position on a line,
       suppressing input until the end of that line:

        cluster <name>
            < <forward | any_of | failover> [useall] |
              <carbon_ch | fnv1a_ch | jump_fnv1a_ch> [replication <count>] >
                <host[:port][=instance] [proto <udp | tcp>]
                                        [type linemode]
                                        [transport <gzip | lz4 | snappy> [ssl]]> ...
            ;

        cluster <name>
            file [ip]
                </path/to/file> ...
            ;

        match
                <* | expression ...>
            [validate <expression> else <log | drop>]
            send to <cluster ... | blackhole>
            [stop]
            ;

        rewrite <expression>
            into <replacement>
            ;

        aggregate
                <expression> ...
            every <interval> seconds
            expire after <expiration> seconds
            [timestamp at <start | middle | end> of bucket]
            compute <sum | count | max | min | average |
                     median | percentile<%> | variance | stddev> write to
                <metric>
            [compute ...]
            [send to <cluster ...>]
            [stop]
            ;

        send statistics to <cluster ...>
            [stop]
            ;

        statistics
            [submit every <interval> seconds]
            [reset counters after interval]
            [prefix with <prefix>]
            [send to <cluster ...>]
            [stop]
            ;

        listen
            type linemode [transport <gzip | lz4> [ssl <pemcert>]]
                <<interface[:port] | port> proto <udp | tcp>> ...
                </path/to/file proto unix> ...
            ;

        include </path/to/file/or/glob>
            ;

   CLUSTERS
       Multiple clusters can be defined, and they need not be referenced
       by a match rule.  All clusters point to one or more hosts, except
       the file cluster, which writes to files in the local filesystem.
       host may be an IPv4 or IPv6 address, or a hostname.  Since host is
       followed by an optional : and port, for IPv6 addresses not to be
       interpreted wrongly, either a port must be given, or the IPv6
       address must be surrounded by brackets, e.g. [::1].  Optional
       transport and proto clauses can be used to wrap the connection in
       a compression or encryption layer, or to specify the use of UDP
       or TCP to connect to the remote server.  When omitted, the
       connection defaults to an unwrapped TCP connection.  type can
       only be linemode at the moment.

       The forward and file clusters simply send everything they receive
       to all defined members (host addresses or files).  The any_of
       cluster is a small variant of the forward cluster: instead of
       sending to all defined members, it sends each incoming metric to
       one of the defined members.  This is not very useful in itself,
       but since any of the members can receive each metric, it means
       that when one of the members is unreachable, the other members
       will receive all of the metrics.  This can be useful when the
       cluster points to other relays.  The any_of router tries to send
       the same metrics consistently to the same destination.  The
       failover cluster is like the any_of cluster, but sticks to the
       order in which servers are defined.  This is to implement a pure
       failover scenario between servers.  The carbon_ch cluster sends
       the metrics to the member that is responsible according to the
       consistent hash algorithm (as used in the original carbon), or to
       multiple members if replication is set to more than 1.  The
       fnv1a_ch cluster is identical in behaviour to carbon_ch, but it
       uses a different hash technique (FNV1a) which is faster, and more
       importantly gets around a limitation of carbon_ch by using both
       host and port of the members.  This is useful when multiple
       targets live on the same host, separated only by port.  The
       instance that the original carbon uses to get around this can be
       set by appending it after the port, separated by an equals sign,
       e.g. 127.0.0.1:2006=a for instance a.  When using the fnv1a_ch
       cluster, this instance overrides the hash key in use.  This
       allows for many things, including masquerading old IP addresses,
       but mostly it makes the hash key location agnostic of the
       (physical) location of that key.  For example, usage like
       10.0.0.1:2003=4d79d13554fa1301476c1f9fe968b0ac would allow
       changing the port and/or ip address of the server that receives
       data for the instance key.  Obviously, this way migration of data
       can be dealt with much more conveniently.  The jump_fnv1a_ch
       cluster is also a consistent hash cluster like the previous two,
       but it does not take the server information into account at all.
       Whether this is useful to you depends on your scenario.  The jump
       hash has a much better balancing over the servers defined in the
       cluster, at the expense of not being able to remove any server
       but the last in order.  What this means is that this hash is fine
       to use with ever-growing clusters where older nodes are also
       replaced at some point.  If you have a cluster where removal of
       old nodes takes place often, the jump hash is not suitable for
       you.  Jump hash works with servers in an ordered list without
       gaps.  To influence the ordering, the instance given to the
       server will be used as sorting key.  Without it, the order will
       be as given in the file.  It is good practice to fix the order of
       the servers with instances, such that it is explicit what the
       right nodes for the jump hash are.

       DNS hostnames are resolved to a single address, according to the
       preference rules in RFC 3484 https://www.ietf.org/rfc/rfc3484.txt.
       The any_of, failover and forward clusters have an explicit useall
       flag that enables expansion for hostnames resolving to multiple
       addresses.  Each address returned becomes a cluster destination.
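
       For instance, assuming the hypothetical name relays.example.com
       resolves to three addresses, the following cluster spreads
       metrics over all of them:

         cluster spread
             any_of useall relays.example.com:2003
             ;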

   MATCHES
       Match rules are the way to direct incoming metrics to one or more
       clusters.  Match rules are processed top to bottom as they are
       defined in the file.  It is possible to define multiple
       expressions in the same rule.  Each match rule can send data to
       one or more clusters.  Since match rules "fall through" unless
       the stop keyword is added, carefully crafted match expressions
       can be used to target multiple clusters or aggregations.  This
       ability allows replicating metrics, as well as sending certain
       metrics to alternative clusters, with careful ordering and usage
       of the stop keyword.  The special cluster blackhole discards any
       metrics sent to it.  This can be useful for weeding out unwanted
       metrics in certain cases.  Because throwing metrics away is
       pointless if other matches would accept the same data, a match
       with the blackhole cluster as destination has an implicit stop.
       The validate clause adds a check on the data (what comes after
       the metric name) in the form of a regular expression.  When this
       expression matches, the match rule executes as if no validate
       clause were present.  However, if it fails, the match rule is
       aborted and no metrics will be sent to the destinations; this is
       the drop behaviour.  When log is used, the offending metric is
       logged to stderr.  Care should be taken with the latter to avoid
       log flooding.  When a validate clause is present, destinations
       need not be present, which allows applying a global validation
       rule.  Note that the cleansing rules are applied before
       validation is done, thus the data will not have duplicate spaces.
       The route using clause is used to perform a temporary
       modification to the key used as input to the consistent hashing
       routines.  The primary purpose is to route traffic so that
       appropriate data is sent to the needed aggregation instances.
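
       As a minimal sketch of the fall-through behaviour (the cluster
       name graphite and the patterns are invented for illustration),
       the first rule below discards unwanted metrics through its
       implicit stop, while everything else continues to the second
       rule:

         match ^nonsense\.
             send to blackhole
             ;

         match *
             send to graphite
             stop
             ;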

   REWRITES
       Rewrite rules take a regular expression as input to match
       incoming metrics, and transform them into the desired new metric
       name.  In the replacement, backreferences are allowed to refer to
       capture groups defined in the input regular expression.  A match
       of server\.(x|y|z)\. allows using e.g. role.\1. in the
       substitution.  A few caveats apply to the current implementation
       of rewrite rules.  First, their location in the config file
       determines when the rewrite is performed.  The rewrite is done
       in-place; as such, a match rule before the rewrite matches the
       original name, while a match rule after the rewrite no longer
       matches the original name.  Care should be taken with the
       ordering, as multiple rewrite rules in succession can take
       effect, e.g. a gets replaced by b and b gets replaced by c in a
       succeeding rewrite rule.  The second caveat with the current
       implementation is that rewritten metric names are not cleansed,
       like newly incoming metrics are.  Thus, double dots and
       potentially dangerous characters can appear if the replacement
       string is crafted to produce them.  It is the responsibility of
       the writer to make sure the metrics are clean.  If this is an
       issue for routing, one can consider having a rewrite-only
       instance that forwards all metrics to another instance that does
       the routing.  Obviously the second instance will cleanse the
       metrics as they come in.  The backreference notation allows
       lowercasing and uppercasing the replacement string with the use
       of the underscore (_) and caret (^) symbols following directly
       after the backslash.  For example, role.\_1. as substitution will
       lowercase the contents of \1.  The dot (.) can be used in a
       similar fashion, or after the underscore or caret, to replace
       dots with underscores in the substitution.  This can be handy for
       some situations where metrics are sent to graphite.
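
       As a sketch of the dot replacement feature (the names are
       invented for illustration), the following rule joins the first
       two name components into a single underscore-separated component,
       so dc1.web01.cpu.idle becomes host.dc1_web01.cpu.idle:

         rewrite ^([^.]+\.[^.]+)\.(.*)$
             into host.\.1.\2
             ;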

   AGGREGATIONS
       The aggregations defined take one or more input metrics expressed
       by one or more regular expressions, similar to the match rules.
       Incoming metrics are aggregated over a period of time defined by
       the interval in seconds.  Since events may arrive a bit later in
       time, the expiration time in seconds defines when the
       aggregations should be considered final, as no new entries are
       allowed to be added any more.  On top of an aggregation, multiple
       aggregates can be computed.  They can be of the same or different
       aggregation types, but should each write to a unique new metric.
       The metric names can include backreferences like in rewrite
       expressions, allowing for powerful single aggregation rules that
       yield many aggregations.  When no send to clause is given,
       produced metrics are sent to the relay as if they were submitted
       from the outside, hence match and aggregation rules apply to
       them.  Care should be taken that loops are avoided this way.  For
       this reason, the use of the send to clause is encouraged, to
       direct the output traffic where possible.  Like for match rules,
       it is possible to define multiple cluster targets.  Also like
       match rules, the stop keyword applies to control the flow of
       metrics in the matching process.
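
       A minimal sketch (the metric names and the cluster graphite are
       assumed examples) that averages per-server CPU metrics into one
       aggregate every minute:

         aggregate ^sys\.web[0-9]+\.cpu
             every 60 seconds
             expire after 75 seconds
             compute average write to
                 sys.web.all.cpu
             send to graphite
             stop
             ;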

   STATISTICS
       The send statistics to construct is deprecated and will be
       removed in the next release.  Use the special statistics
       construct instead.

       The statistics construct can control a couple of things about the
       (internal) statistics produced by the relay.  The send to target
       can be used to avoid router loops by sending the statistics
       directly to certain destination cluster(s).  By default the
       metrics are prefixed with carbon.relays.<hostname>, where
       hostname is determined on startup and can be overridden using the
       -H argument.  This prefix can be set using the prefix with
       clause, similar to a rewrite rule target.  The input match in
       this case is the pre-set regular expression ^(([^.]+)(\..*)?)$ on
       the hostname.  As such, one can see that the default prefix is
       set by carbon.relays.\.1.  Note that this uses the
       replace-dot-with-underscore replacement feature from rewrite
       rules.  Given the input expression, the following match groups
       are available: \1 the entire hostname, \2 the short hostname and
       \3 the domainname (with leading dot).  It may make sense to
       replace the default by something like carbon.relays.\_2 for
       certain scenarios, to always use the lowercased short hostname,
       which following the expression doesn't contain a dot.  By
       default, the metrics are submitted every 60 seconds; this can be
       changed using the submit every <interval> seconds clause.  To
       obtain a set of values more compatible with carbon-cache.py, use
       the reset counters after interval clause to make values
       non-cumulative, that is, they will report the change compared to
       the previous value.
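
       Putting these clauses together (the cluster name graphite is an
       assumed example), a statistics construct following the
       suggestions above could look like:

         statistics
             submit every 60 seconds
             reset counters after interval
             prefix with carbon.relays.\_2
             send to graphite
             stop
             ;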

   LISTENERS
       The ports and protocols the relay should listen on for incoming
       connections can be specified using the listen directive.
       Currently, all listeners need to be of the linemode type.  An
       optional compression or encryption wrapping can be specified for
       the port and optional interface given by ip address, or for a
       unix socket given by file.  When no interface is specified, the
       any interface on all available ip protocols is assumed.  If no
       listen directive is present, the relay will use the default
       listeners for port 2003 on tcp and udp, plus the unix socket
       /tmp/.s.carbon-c-relay.2003.  This typically expands to 5
       listeners on an IPv6-enabled system.  The default matches the
       behaviour of versions prior to v3.2.
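
       As a sketch, an explicit listen directive that mirrors the
       built-in defaults described above could be written as:

         listen
             type linemode
                 2003 proto tcp
                 2003 proto udp
                 /tmp/.s.carbon-c-relay.2003 proto unix
             ;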

   INCLUDES
       In case the configuration becomes very long, or is managed better
       in separate files, the include directive can be used to read
       another file.  The given file will be read in place and added to
       the router configuration at the time of inclusion.  The end
       result is one big route configuration.  Multiple include
       statements can be used throughout the configuration file.  The
       positioning will influence the order of rules as normal.  Beware
       that recursive inclusion (include from an included file) is
       supported, and currently no safeguards exist for an inclusion
       loop.  For what it is worth, this feature is likely best used
       with simple configuration files (e.g. not having include in
       them).
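
       For example (the path is hypothetical), all files matching a glob
       can be pulled in at a given point in the configuration:

         include /etc/carbon-c-relay.d/*.conf;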

EXAMPLES

       carbon-c-relay evolved over time, growing features on demand as
       the tool proved to be stable and fitting the job well.  Below
       follow some annotated examples of constructs that can be used
       with the relay.

       As many clusters as necessary can be defined.  They receive data
       from match rules, and their type defines which members of the
       cluster finally get the metric data.  The simplest cluster form
       is a forward cluster:

         cluster send-through
             forward
                 10.1.0.1
             ;

       Any metric sent to the send-through cluster would simply be
       forwarded to the server at IPv4 address 10.1.0.1.  If we define
       multiple servers, all of those servers get the same metric, thus:

         cluster send-through
             forward
                 10.1.0.1
                 10.2.0.1
             ;

       The above results in a duplication of metrics sent to both
       machines.  This can be useful, but most of the time it is not.
       The any_of cluster type is like forward, but it sends each
       incoming metric to any one of the members.  The same example with
       such a cluster would be:

         cluster send-to-any-one
             any_of 10.1.0.1:2010 10.1.0.1:2011;

       This would implement a multipath scenario, where two servers are
       used and the load between them is spread, but should any of them
       fail, all metrics are sent to the remaining one.  This typically
       works well for upstream relays, or for balancing carbon-cache
       processes running on the same machine.  Should any member become
       unavailable, for instance due to a rolling restart, the other
       members receive the traffic.  If it is necessary to have true
       fail-over, where the secondary server is only used if the first
       is down, the following would implement that:

         cluster try-first-then-second
             failover 10.1.0.1:2010 10.1.0.1:2011;

       These types are different from the consistent hash cluster types:

         cluster graphite
             carbon_ch
                 127.0.0.1:2006=a
                 127.0.0.1:2007=b
                 127.0.0.1:2008=c
             ;

       If a member in this example fails, all metrics that would go to
       that member are kept in the queue, waiting for the member to
       return.  This is useful for clusters of carbon-cache machines
       where it is desirable that the same metric always ends up on the
       same server.  The carbon_ch cluster type is compatible with the
       carbon-relay consistent hash, and can be used for existing
       clusters populated by carbon-relay.  For new clusters, however,
       it is better to use the fnv1a_ch cluster type, for it is faster
       and allows balancing over the same address but different ports
       without an instance number, in contrast to carbon_ch.

       Because we can use multiple clusters, we can also replicate
       without the use of the forward cluster type, in a more
       intelligent way:

         cluster dc-old
             carbon_ch replication 2
                 10.1.0.1
                 10.1.0.2
                 10.1.0.3
             ;

         cluster dc-new1
             fnv1a_ch replication 2
                 10.2.0.1
                 10.2.0.2
                 10.2.0.3
             ;

         cluster dc-new2
             fnv1a_ch replication 2
                 10.3.0.1
                 10.3.0.2
                 10.3.0.3
             ;

         match *
             send to dc-old
             ;

         match *
             send to
                 dc-new1
                 dc-new2
             stop
             ;

       In this example all incoming metrics are first sent to dc-old,
       then dc-new1 and finally dc-new2.  Note that the cluster type of
       dc-old is different.  Each incoming metric will be sent to 2
       members of all three clusters, thus replicating to in total 6
       destinations.  For each cluster the destination members are
       computed independently.  Failure of clusters or members does not
       affect the others, since all have individual queues.  The above
       example could also be written using three match rules, one for
       each dc, or one match rule for all three dcs.  The difference is
       mainly in performance: the number of times the incoming metric
       has to be matched against an expression.  The stop in the dc-new
       match rule is not strictly necessary in this example, because no
       more match rules follow.  However, if the match would target a
       specific subset, e.g. ^sys\., and more clusters would be defined,
       this could be necessary, as for instance in the following
       abbreviated example:

         cluster dc1-sys ... ;
         cluster dc2-sys ... ;

         cluster dc1-misc ... ;
         cluster dc2-misc ... ;

         match ^sys\. send to dc1-sys;
         match ^sys\. send to dc2-sys stop;

         match * send to dc1-misc;
         match * send to dc2-misc stop;

       As can be seen, without the stop in dc2-sys' match rule, all
       metrics starting with sys. would also be sent to dc1-misc and
       dc2-misc.  It can be that this is desired, of course, but in this
       example there is a dedicated cluster for the sys metrics.

       Suppose there is some unwanted metric that unfortunately gets
       generated, say by some bad or old software.  We don't want to
       store this metric.  The blackhole cluster is suitable for that,
       when it is harder to actually whitelist all wanted metrics.
       Consider the following:

         match
                 some_legacy1$
                 some_legacy2$
             send to blackhole
             stop;

       This would throw away all metrics that end in some_legacy1 or
       some_legacy2, which would otherwise be hard to filter out.  Since
       the order matters, it can be used in a construct like this:

         cluster old ... ;
         cluster new ... ;

         match * send to old;

         match unwanted send to blackhole stop;

         match * send to new;

       In this example the old cluster would receive the metric that is
       unwanted for the new cluster.  So, the order in which the rules
       occur does matter for the execution.

       Validation can be used to ensure the data for metrics is as
       expected.  A global validation for just integer (no floating
       point) values could be:

         match *
             validate ^[0-9]+\ [0-9]+$ else drop
             ;

       (Note the escape with backslash \ of the space; you might be able
       to use \s or [:space:] instead, depending on your libc
       implementation.)

       The validate clause can exist on every match rule, so in
       principle, the following is valid:

         match ^foo
             validate ^[0-9]+\ [0-9]+$ else drop
             send to integer-cluster
             ;

         match ^foo
             validate ^[0-9.e+-]+\ [0-9.e+-]+$ else drop
             send to float-cluster
             stop;

       Note that the behaviour differs between the previous two
       examples.  When no send to clusters are specified, a validation
       error makes the match behave as if the stop keyword were present.
       Likewise, when validation passes, processing continues with the
       next rule.  When destination clusters are present, the match
       respects the stop keyword as normal: when stop is specified,
       processing stops there.  However, if validation fails, the rule
       does not send anything to the destination clusters; the metric
       will be dropped or logged, but never sent.

       The relay is capable of rewriting incoming metrics on the fly.
       This process is done based on regular expressions with capture
       groups that allow substituting parts in a replacement string.
       Rewrite rules allow cleaning up metrics from applications, or
       providing a migration path.  In its simplest form a rewrite rule
       looks like this:

         rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
             into server.\_1.\2.\3.\3\4
             ;

       In this example a metric like server.DC.role.name123 would be
       transformed into server.dc.role.name.name123.  For rewrite rules
       the same holds as for matches: their order matters.  Hence, to
       build on the old/new cluster example done earlier, the following
       would store the original metric name in the old cluster, and the
       new metric name in the new cluster:

         match * send to old;

         rewrite ... ;

         match * send to new;

       Note that after the rewrite, the original metric name is no
       longer available, as the rewrite happens in-place.

       Aggregations are probably the most complex part of
       carbon-c-relay.  Two ways of specifying aggregates are supported
       by carbon-c-relay.  The first, static rules, are handled by an
       optimiser which tries to fold thousands of rules into groups to
       make the matching more efficient.  The second, dynamic rules, are
       very powerful compact definitions with possibly thousands of
       internal instantiations.  A typical static aggregation looks
       like:

         aggregate
                 ^sys\.dc1\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
                 ^sys\.dc2\.somehost-[0-9]+\.somecluster\.mysql\.replication_delay
             every 10 seconds
             expire after 35 seconds
             timestamp at end of bucket
             compute sum write to
                 mysql.somecluster.total_replication_delay
             compute average write to
                 mysql.somecluster.average_replication_delay
             compute max write to
                 mysql.somecluster.max_replication_delay
             compute count write to
                 mysql.somecluster.replication_delay_metric_count
             ;

       In this example, four aggregations are produced from the incoming
       matching metrics.  In this example we could have written the two
       matches as one, but for demonstration purposes we did not.
       Obviously they can refer to different metrics, if that makes
       sense.  The every 10 seconds clause specifies the interval in
       which the aggregator can expect new metrics to arrive.  This
       interval is used to produce the aggregations; thus every 10
       seconds 4 new metrics are generated from the data received so
       far.  Because data may be in transit for some reason, or
       generation stalled, the expire after clause specifies how long
       the data should be kept before considering a data bucket (which
       is aggregated) to be complete.  In the example, 35 was used,
       which means after 35 seconds the first aggregates are produced.
       It also means that metrics can arrive 35 seconds late, and still
       be taken into account.  The exact time at which the aggregate
       metrics are produced is random between 0 and interval (10 in this
       case) seconds after the expiry time.  This is done to prevent
       thundering herds of metrics for large aggregation sets.  The
       timestamp that is used for the aggregations can be specified to
       be the start, middle or end of the bucket.  The original
       carbon-aggregator.py uses start, while carbon-c-relay's default
       has always been end.  The compute clauses demonstrate that a
       single aggregation rule can produce multiple aggregates, as is
       often the case.  Internally, this comes for free, since all
       possible aggregates are always calculated, whether or not they
       are used.  The produced new metrics are resubmitted to the relay,
       hence matches defined earlier in the configuration can match
       output of the aggregator.  It is important to avoid loops that
       can be generated this way.  In general, it is good practice to
       split aggregations off to their own carbon-c-relay instance, such
       that it is easy to forward the produced metrics to another relay
       instance.

       The previous example could also be written as follows to be
       dynamic:

         aggregate
                 ^sys\.dc[0-9]\.(somehost-[0-9]+)\.([^.]+)\.mysql\.replication_delay
             every 10 seconds
             expire after 35 seconds
             compute sum write to
                 mysql.host.\1.replication_delay
             compute sum write to
                 mysql.host.all.replication_delay
             compute sum write to
                 mysql.cluster.\2.replication_delay
             compute sum write to
                 mysql.cluster.all.replication_delay
             ;

       Here a single match results in four aggregations, each of a
       different scope.  In this example aggregations based on hostname
       and cluster are being made, as well as the more general all
       targets, which in this example have identical values.  Note that
       with this single aggregation rule, per-cluster, per-host and
       total aggregations are all produced.  Obviously, the input
       metrics define which hosts and clusters are produced.

       With use of the send to clause, aggregations can be made more
       intuitive and less error-prone.  Consider the example below:

         cluster graphite fnv1a_ch ip1 ip2 ip3;

         aggregate ^sys\.somemetric
             every 60 seconds
             expire after 75 seconds
             compute sum write to
                 sys.somemetric
             send to graphite
             stop
             ;

         match * send to graphite;

       This sends all incoming metrics to the graphite cluster, except
       the sys.somemetric ones, which it replaces with a sum of all the
       incoming ones.  Without a stop in the aggregate this would cause
       a loop, and without the send to the metric could not keep its
       original name, for the output would otherwise be resubmitted to
       the relay instead of going directly to the cluster.

STATISTICS

       When carbon-c-relay is run without the -d or -s arguments,
       statistics will be produced.  By default they are sent to the
       relay itself in the form of carbon.relays.<hostname>.*.  See the
       statistics construct to override this prefix, the sending
       interval and the values produced.  While many metrics have a name
       similar to what carbon-cache.py would produce, their values are
       likely different.  By default, most values are running counters
       which only increase over time.  The use of the
       nonNegativeDerivative() function from graphite is useful with
       these, for example as sketched below.
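
       For example (the hostname relayhost is an assumed placeholder),
       the rate at which the relay sends metrics could be graphed with
       the graphite target:

         nonNegativeDerivative(carbon.relays.relayhost.metricsSent)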

       The following metrics are produced under the
       carbon.relays.<hostname> namespace:

       ·   metricsReceived

           The number of metrics that were received by the relay.
           Received here means that they were seen and processed by any
           of the dispatchers.

       ·   metricsSent

           The number of metrics that were sent from the relay.  This is
           a total count for all servers combined.  When incoming
           metrics are duplicated by the cluster configuration, this
           counter will include all those duplications.  In other words,
           it is the number of metrics that were successfully sent to
           other systems.  Note that metrics that are processed
           (received) but still in the sending queue (queued) are not
           included in this counter.

       ·   metricsQueued

           The total number of metrics that are currently in the queues
           for all the server targets.  This metric is not cumulative,
           for it is a sample of the queue size, which can (and should)
           go up and down.  Therefore you should not use the derivative
           function for this metric.

       ·   metricsDropped

           The total number of metrics that had to be dropped due to
           server queues overflowing.  A queue typically overflows when
           the server it tries to send its metrics to is not reachable,
           or too slow in ingesting the amount of metrics queued.  This
           can be network or resource related, and also greatly depends
           on the rate of metrics being sent to the particular server.

       ·   metricsBlackholed

           The number of metrics that did not match any rule, or matched
           a rule with blackhole as target.  Depending on your
           configuration, a high value might be an indication of a
           misconfiguration somewhere.  These metrics were received by
           the relay, but never sent anywhere, thus they disappeared.

       ·   metricStalls

           The number of times the relay had to stall a client to
           indicate that the downstream server cannot handle the stream
           of metrics.  A stall is only performed when the queue is full
           and the server is actually receptive to metrics, but just too
           slow at the moment.  Stalls typically happen during
           micro-bursts, where the client typically is unaware that it
           should stop sending more data, while it is able to do so.

       ·   connections

           The number of connect requests handled.  This is an ever
           increasing number just counting how many connections were
           accepted.

       ·   disconnects

           The number of disconnected clients.  A disconnect either
           happens because the client goes away, or due to an idle
           timeout in the relay.  The difference between this metric and
           connections is the number of connections actively held by the
           relay.  In normal situations this number remains within
           reasonable bounds.  Many connections but few disconnects
           typically indicate a possible connection leak in the client.
           The idle-connection disconnect in the relay is there to guard
           against resource drain in such scenarios.

       ·   dispatch_wallTime_us

           The number of microseconds spent by the dispatchers to do
           their work.  In particular on multi-core systems, this value
           can be confusing; however, it indicates how long the
           dispatchers were doing work handling clients.  It includes
           everything they do, from reading data from a socket, cleaning
           up the input metric, to adding the metric to the appropriate
           queues.  The larger the configuration, and the more complex
           in terms of matches, the more time the dispatchers will spend
           on the cpu.  But also time they do /not/ spend on the cpu is
           included in this number.  It is the pure wallclock time the
           dispatcher was serving a client.

       ·   dispatch_sleepTime_us

           The number of microseconds spent by the dispatchers sleeping,
           waiting for work.  When this value gets small (or even zero)
           the dispatchers have so much work that they no longer sleep,
           and likely can't process the work in a timely fashion any
           more.  This value plus the wallTime from above roughly sums
           up to the total uptime taken by this dispatcher.  Therefore,
           expressing the wallTime as a percentage of this sum gives the
           busyness percentage, running all the way up to 100% if
           sleepTime goes to 0 (see the sketch after this list).

       ·   server_wallTime_us

           The number of microseconds spent by the servers to send the
           metrics from their queues.  This value includes connection
           creation, reading from the queue, and sending metrics over
           the network.

       ·   dispatcherX

           For each individual dispatcher, the metrics received and
           blackholed plus the wall clock time.  The values are as
           described above.

       ·   destinations.X

           For all known destinations, the number of dropped, queued and
           sent metrics plus the wall clock time spent.  The values are
           as described above.

       ·   aggregators.metricsReceived

           The number of metrics that matched an aggregator rule and
           were accepted by the aggregator.  When a metric matches
           multiple aggregators, this value will reflect that.  A metric
           is not counted when it is considered syntactically invalid,
           e.g. no value was found.

       ·   aggregators.metricsDropped

           The number of metrics that were sent to an aggregator, but
           did not fit timewise.  This is because the metric was too far
           in the past or in the future.  The expire after clause in
           aggregate statements controls how long in the past metric
           values are accepted.

       ·   aggregators.metricsSent

           The number of metrics that were sent from the aggregators.
           These metrics were produced and are the actual results of
           aggregations.

BUGS

       Please report them at: https://github.com/grobian/carbon-c-relay/issues

AUTHOR

       Fabian Groffen <grobian@gentoo.org>

SEE ALSO

       All other utilities from the graphite stack.

       This project aims to be a fast replacement of the original Carbon
       relay
       http://graphite.readthedocs.org/en/1.0/carbon-daemons.html#carbon-relay-py.
       carbon-c-relay aims to deliver performance and configurability.
       Carbon is single threaded, and sending metrics to multiple
       consistent-hash clusters requires chaining of relays.  This
       project provides a multithreaded relay which can address multiple
       targets and clusters for each and every metric based on pattern
       matches.

       There are a couple more replacement projects out there, such as
       carbon-relay-ng https://github.com/graphite-ng/carbon-relay-ng
       and graphite-relay
       https://github.com/markchadwick/graphite-relay .

       Compared to carbon-relay-ng, this project does provide carbon's
       consistent-hash routing.  graphite-relay, which does do this,
       doesn't do metric-based matches to direct the traffic, which this
       project does as well.  To date, carbon-c-relay can do
       aggregations, failover targets and more.

ACKNOWLEDGEMENTS

       This program was originally developed for Booking.com, which
       approved that the code be published and released as Open Source
       on GitHub, for which the author would like to express his
       gratitude.  Development has continued since with the help of many
       contributors suggesting features, reporting bugs, adding patches
       and more to make carbon-c-relay into what it is today.

                                 February 2019               CARBON-C-RELAY(1)