CAKE(8)                              Linux                             CAKE(8)

NAME
CAKE - Common Applications Kept Enhanced (CAKE)

SYNOPSIS
tc qdisc ... cake
    [ bandwidth RATE | unlimited* | autorate-ingress ]
    [ rtt TIME | datacentre | lan | metro | regional | internet* | oceanic
      | satellite | interplanetary ]
    [ besteffort | diffserv8 | diffserv4 | diffserv3* ]
    [ flowblind | srchost | dsthost | hosts | flows | dual-srchost
      | dual-dsthost | triple-isolate* ]
    [ nat | nonat* ]
    [ wash | nowash* ]
    [ split-gso* | no-split-gso ]
    [ ack-filter | ack-filter-aggressive | no-ack-filter* ]
    [ memlimit LIMIT ]
    [ fwmark MASK ]
    [ ptm | atm | noatm* ]
    [ overhead N | conservative | raw* ]
    [ mpu N ]
    [ ingress | egress* ]
    (* marks defaults)

DESCRIPTION
CAKE (Common Applications Kept Enhanced) is a shaping-capable queue
discipline which uses both AQM and FQ. It combines COBALT, which is an
AQM algorithm combining Codel and BLUE, a shaper which operates in
deficit mode, and a variant of DRR++ for flow isolation. 8-way
set-associative hashing is used to virtually eliminate hash collisions.
Priority queuing is available through a simplified diffserv
implementation. Overhead compensation for various encapsulation schemes
is tightly integrated.

All settings are optional; the default settings are chosen to be
sensible in most common deployments. Most people will only need to set
the bandwidth parameter to get useful results, but reading the Overhead
Compensation and Round Trip Time sections is strongly encouraged.

SHAPER PARAMETERS
CAKE uses a deficit-mode shaper, which does not exhibit the initial
burst typical of token-bucket shapers. It will automatically burst
precisely as much as required to maintain the configured throughput.
As such, it is very straightforward to configure.

unlimited (default)
    No limit on the bandwidth.

bandwidth RATE
    Set the shaper bandwidth. See tc(8) or examples below for details
    of the RATE value.

autorate-ingress
    Automatic capacity estimation based on traffic arriving at this
    qdisc. This is most likely to be useful with cellular links, which
    tend to change quality randomly. A bandwidth parameter can be used
    in conjunction to specify an initial estimate. The shaper will
    periodically be set to a bandwidth slightly below the estimated
    rate. This estimator cannot estimate the bandwidth of links
    downstream of itself.
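
As a minimal sketch of the three shaper settings above (the interface
names eth0 and wwan0 and the rates shown are placeholders, not values
taken from this page):

```shell
# Fixed-rate shaping at 20 Mbit/s on a hypothetical uplink interface
tc qdisc replace dev eth0 root cake bandwidth 20Mbit

# No shaping at all (the default when no bandwidth is given)
tc qdisc replace dev eth0 root cake unlimited

# Capacity estimation from arriving traffic, seeded with an initial estimate
tc qdisc replace dev wwan0 root cake autorate-ingress bandwidth 10Mbit
```

These commands need root privileges and replace any existing root qdisc
on the named interface.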

OVERHEAD COMPENSATION PARAMETERS
The size of each packet on the wire may differ from that seen by Linux.
The following parameters allow CAKE to compensate for this difference
by internally considering each packet to be bigger than Linux reports.
To assist users who are not expert network engineers, keywords have
been provided to represent a number of common link technologies.

Manual Overhead Specification
overhead BYTES
    Adds BYTES to the size of each packet. BYTES may be negative;
    values between -64 and 256 (inclusive) are accepted.

mpu BYTES
    Rounds each packet (including overhead) up to a minimum length
    BYTES. BYTES may not be negative; values between 0 and 256
    (inclusive) are accepted.

atm
    Compensates for ATM cell framing, which is normally found on ADSL
    links. This is performed after the overhead parameter above. ATM
    uses fixed 53-byte cells, each of which can carry 48 bytes of
    payload.

ptm
    Compensates for PTM encoding, which is normally found on VDSL2
    links and uses a 64b/65b encoding scheme. It is more efficient,
    however, to simply derate the specified shaper bandwidth by a
    factor of 64/65 (about 0.984). See ITU G.992.3 Annex N and IEEE
    802.3 Section 61.3 for details.

noatm
    Disables ATM and PTM compensation.
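
The arithmetic implied by these parameters can be sketched in shell as
follows. This is an illustration, not CAKE's actual code: the function
name cake_wire_size is hypothetical, the ordering (overhead, then mpu,
then framing) is an assumption beyond the statement that atm is applied
after overhead, and the ptm line assumes one extra byte per 64 bytes as
a reading of "64b/65b".

```shell
# usage: cake_wire_size LEN OVERHEAD MPU FRAMING   (FRAMING: atm | ptm | noatm)
# Prints the size, in bytes, that the shaper would account for a packet.
cake_wire_size() {
    len=$(( $1 + $2 ))                  # overhead BYTES (may be negative)
    if [ "$len" -lt "$3" ]; then        # mpu: round up to the minimum size
        len=$3
    fi
    case $4 in
        atm) len=$(( (len + 47) / 48 * 53 )) ;;   # whole 53-byte cells, 48 payload bytes each
        ptm) len=$(( len + (len + 63) / 64 )) ;;  # assumed: 1 extra byte per 64 bytes
    esac
    echo "$len"
}

cake_wire_size 1500 32 0 atm     # 1532 bytes need 32 cells: prints 1696
cake_wire_size 40 32 0 atm       # 72 bytes need 2 cells: prints 106
cake_wire_size 1500 22 0 ptm     # 1522 + 24: prints 1546
cake_wire_size 28 18 64 noatm    # 46 rounded up to the mpu: prints 64
```

The second call shows why ATM compensation matters most for small
packets: a 40-byte packet occupies 106 bytes on the wire.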


Failsafe Overhead Keywords
These two keywords are provided for quick-and-dirty setup. Use them if
you can't be bothered to read the rest of this section.

raw (default)
    Turns off all overhead compensation in CAKE. The packet size
    reported by Linux will be used directly.

    Other overhead keywords may be added after "raw". The effect of
    this is to make the overhead compensation operate relative to the
    reported packet size, not the underlying IP packet size.

conservative
    Compensates for more overhead than is likely to occur on any
    widely-deployed link technology.
    Equivalent to overhead 48 atm.


ADSL Overhead Keywords
Most ADSL modems have a way to check which framing scheme is in use.
Often this is also specified in the settings document provided by the
ISP. The keywords in this section are intended to correspond with
these sources of information. All of them implicitly set the atm flag.

pppoa-vcmux
    Equivalent to overhead 10 atm

pppoa-llc
    Equivalent to overhead 14 atm

pppoe-vcmux
    Equivalent to overhead 32 atm

pppoe-llcsnap
    Equivalent to overhead 40 atm

bridged-vcmux
    Equivalent to overhead 24 atm

bridged-llcsnap
    Equivalent to overhead 32 atm

ipoa-vcmux
    Equivalent to overhead 8 atm

ipoa-llcsnap
    Equivalent to overhead 16 atm

See also the Ethernet Overhead Keywords section below.


VDSL2 Overhead Keywords
ATM was dropped from VDSL2 in favour of PTM, which is a much more
straightforward framing scheme. Some ISPs retained PPPoE for
compatibility with their existing back-end systems.

pppoe-ptm
    Equivalent to overhead 30 ptm

    PPPoE: 2B PPP + 6B PPPoE +
    ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame
      Check Sequence +
    PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC
      (PTM-FCS)

bridged-ptm
    Equivalent to overhead 22 ptm

    ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame
      Check Sequence +
    PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC
      (PTM-FCS)

See also the Ethernet Overhead Keywords section below.
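
The two overhead figures can be checked by adding up the per-layer byte
counts itemised above. The variable names in this short sketch are
illustrative only.

```shell
# Per-packet byte counts as itemised for the VDSL2 keywords
ppp=2                        # PPP header
pppoe=6                      # PPPoE header
eth=$(( 6 + 6 + 2 + 4 ))     # dest MAC + src MAC + ethertype + FCS
ptm=$(( 1 + 1 + 2 ))         # S + Ck + TC-CRC

pppoe_ptm=$(( ppp + pppoe + eth + ptm ))
bridged_ptm=$(( eth + ptm ))

echo "pppoe-ptm overhead:   $pppoe_ptm"     # 30
echo "bridged-ptm overhead: $bridged_ptm"   # 22
```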


DOCSIS Cable Overhead Keyword
DOCSIS is the universal standard for providing Internet service over
cable-TV infrastructure.

In this case, the actual on-wire overhead is less important than the
packet size the head-end equipment uses for shaping and metering. This
is specified to be an Ethernet frame including the CRC (aka FCS).

docsis
    Equivalent to overhead 18 mpu 64 noatm


Ethernet Overhead Keywords
ethernet
    Accounts for Ethernet's preamble, inter-frame gap, and Frame Check
    Sequence. Use this keyword when the bottleneck being shaped for is
    an actual Ethernet cable.
    Equivalent to overhead 38 mpu 84 noatm

ether-vlan
    Adds 4 bytes to the overhead compensation, accounting for an IEEE
    802.1Q VLAN header appended to the Ethernet frame header. NB: Some
    ISPs use one or even two of these within PPPoE; this keyword may be
    repeated as necessary to express this.
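
A few hedged examples of how the overhead keywords combine on the
command line. The interface name eth0 and all rates are placeholders;
pick the keyword that matches what the modem or ISP reports.

```shell
# ADSL line using PPPoE with LLC/SNAP encapsulation (implies atm)
tc qdisc replace dev eth0 root cake bandwidth 900Kbit pppoe-llcsnap

# VDSL2 line using PPPoE with one VLAN tag
tc qdisc replace dev eth0 root cake bandwidth 18Mbit pppoe-ptm ether-vlan

# The pppoe-ptm compensation spelled out manually
tc qdisc replace dev eth0 root cake bandwidth 18Mbit overhead 30 ptm

# Cable modem
tc qdisc replace dev eth0 root cake bandwidth 20Mbit docsis
```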


ROUND TRIP TIME PARAMETERS
Active Queue Management (AQM) consists of embedding congestion signals
in the packet flow, which receivers use to instruct senders to slow
down when the queue is persistently occupied. CAKE uses ECN signalling
when available, and packet drops otherwise, according to a combination
of the Codel and BLUE AQM algorithms called COBALT.

Very short latencies require a very rapid AQM response to adequately
control latency. However, such a rapid response tends to impair
throughput when the actual RTT is relatively long. CAKE allows
specifying the RTT it assumes for tuning various parameters. Actual RTTs
within an order of magnitude of this will generally work well for both
throughput and latency management.

At the 'lan' setting and below, the time constants are similar in
magnitude to the jitter in the Linux kernel itself, so congestion might
be signalled prematurely. The flows will then become sparse and total
throughput reduced, leaving little or no back-pressure for the fairness
logic to work against. Use the "metro" setting for local LANs unless
you have a custom kernel.

rtt TIME
    Manually specify an RTT.

datacentre
    For extremely high-performance 10GigE+ networks only. Equivalent
    to rtt 100us.

lan
    For pure Ethernet (not Wi-Fi) networks, at home or in the office.
    Don't use this when shaping for an Internet access link. Equivalent
    to rtt 1ms.

metro
    For traffic mostly within a single city. Equivalent to rtt 10ms.

regional
    For traffic mostly within a European-sized country. Equivalent to
    rtt 30ms.

internet (default)
    This is suitable for most Internet traffic. Equivalent to rtt
    100ms.

oceanic
    For Internet traffic with generally above-average latency, such as
    that suffered by Australasian residents. Equivalent to rtt 300ms.

satellite
    For traffic via geostationary satellites. Equivalent to rtt
    1000ms.

interplanetary
    So named because Jupiter is about 1 light-hour from Earth. Use
    this to (almost) completely disable AQM actions. Equivalent to rtt
    3600s.
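
For illustration, the RTT can be given either as a keyword or as an
explicit time. The interface name and rates below are placeholders.

```shell
# Traffic mostly within one city, using the keyword
tc qdisc replace dev eth0 root cake bandwidth 50Mbit metro

# The same setting written as an explicit RTT
tc qdisc replace dev eth0 root cake bandwidth 50Mbit rtt 10ms

# A link that runs over a geostationary satellite
tc qdisc replace dev eth0 root cake bandwidth 2Mbit satellite
```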


FLOW ISOLATION PARAMETERS
With flow isolation enabled, CAKE places packets from different flows
into different queues, each of which carries its own AQM state.
Packets from each queue are then delivered fairly, according to a DRR++
algorithm which minimises latency for "sparse" flows. CAKE uses a
set-associative hashing algorithm to minimise flow collisions.

These keywords specify whether fairness based on source address,
destination address, individual flows, or any combination of those is
desired.

flowblind
    Disables flow isolation; all traffic passes through a single queue
    for each tin.

srchost
    Flows are defined only by source address. Could be useful on the
    egress path of an ISP backhaul.

dsthost
    Flows are defined only by destination address. Could be useful on
    the ingress path of an ISP backhaul.

hosts
    Flows are defined by source-destination host pairs. This is host
    isolation, rather than flow isolation.

flows
    Flows are defined by the entire 5-tuple of source address,
    destination address, transport protocol, source port and
    destination port. This is the type of flow isolation performed by
    SFQ and fq_codel.

dual-srchost
    Flows are defined by the 5-tuple, and fairness is applied first
    over source addresses, then over individual flows. Good for use on
    egress traffic from a LAN to the internet, where it'll prevent any
    one LAN host from monopolising the uplink, regardless of the number
    of flows they use.

dual-dsthost
    Flows are defined by the 5-tuple, and fairness is applied first
    over destination addresses, then over individual flows. Good for
    use on ingress traffic to a LAN from the internet, where it'll
    prevent any one LAN host from monopolising the downlink, regardless
    of the number of flows they use.

triple-isolate (default)
    Flows are defined by the 5-tuple, and fairness is applied over
    source *and* destination addresses intelligently (i.e. not merely
    by host-pairs), and also over individual flows. Use this if you're
    not certain whether to use dual-srchost or dual-dsthost; it'll do
    both jobs at once, preventing any one host on *either* side of the
    link from monopolising it with a large number of flows.

nat
    Instructs CAKE to perform a NAT lookup before applying
    flow-isolation rules, to determine the true addresses and port
    numbers of the packet, to improve fairness between hosts "inside"
    the NAT. This has no practical effect in "flowblind" or "flows"
    modes, or if NAT is performed on a different host.

nonat (default)
    CAKE will not perform a NAT lookup. Flow isolation will be
    performed using the addresses and port numbers directly visible to
    the interface CAKE is attached to.
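
A sketch of how the dual modes might be paired on a home router that
also performs NAT. The interface names are assumptions: eth0 stands for
the Internet-facing interface, and ifb0 for a hypothetical device that
already receives the redirected inbound traffic (that redirection is
not shown here).

```shell
# Uplink: share fairly between internal hosts first, then between flows
tc qdisc replace dev eth0 root cake bandwidth 20Mbit dual-srchost nat

# Downlink: share fairly between destination hosts first, then between flows
tc qdisc replace dev ifb0 root cake bandwidth 90Mbit dual-dsthost nat
```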


PRIORITY QUEUE PARAMETERS
CAKE can divide traffic into "tins" based on the Diffserv field. Each
tin has its own independent set of flow-isolation queues, and is
serviced based on a WRR algorithm. To avoid perverse Diffserv marking
incentives, tin weights have a "priority sharing" value when bandwidth
used by that tin is below a threshold, and a lower "bandwidth sharing"
value when above. Bandwidth is compared against the threshold using
the same algorithm as the deficit-mode shaper.

Detailed customisation of tin parameters is not provided. The
following presets perform all necessary tuning, relative to the current
shaper bandwidth and RTT settings.

besteffort
    Disables priority queuing by placing all traffic in one tin.

precedence
    Enables legacy interpretation of TOS "Precedence" field. Use of
    this preset on the modern Internet is firmly discouraged.

diffserv4
    Provides a general-purpose Diffserv implementation with four tins:
        Bulk (CS1), 6.25% threshold, generally low priority.
        Best Effort (general), 100% threshold.
        Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50% threshold.
        Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold.

diffserv3 (default)
    Provides a simple, general-purpose Diffserv implementation with
    three tins:
        Bulk (CS1), 6.25% threshold, generally low priority.
        Best Effort (general), 100% threshold.
        Voice (CS7, CS6, EF, VA, TOS4), 25% threshold, reduced Codel
        interval.
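
The percentage thresholds translate into absolute rates once a shaper
bandwidth is set. The sketch below is plain arithmetic on the figures
listed above; the function name cake_thresholds is hypothetical and is
not a tc command.

```shell
# usage: cake_thresholds RATE_KBIT
# Prints the per-tin thresholds, in Kbit, for the listed percentages.
cake_thresholds() {
    rate=$1
    echo "bulk        $(( rate / 16 ))"    # 6.25%
    echo "best-effort $rate"               # 100%
    echo "video       $(( rate / 2 ))"     # 50% (diffserv4 only)
    echo "voice       $(( rate / 4 ))"     # 25%
}

cake_thresholds 100000    # a 100Mbit shaper
```

For a 100Mbit shaper this gives 6250 Kbit for Bulk and 25000 Kbit for
Voice.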


fwmark MASK
    This option turns on fwmark-based overriding of CAKE's tin
    selection. If set, the option specifies a bitmask that will be
    applied to the fwmark associated with each packet. If the result of
    this masking is non-zero, the result will be right-shifted by the
    number of least-significant unset bits in the mask value, and the
    result will be used as the tin number for that packet. This can be
    used to set policies in a firewall script that will override CAKE's
    built-in tin selection.
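
The mask-and-shift step can be illustrated with shell arithmetic. This
is a sketch of the description above, not CAKE's code; the function
name fwmark_tin is hypothetical, and a printed 0 stands for "masked
result is zero, so the built-in tin selection applies".

```shell
# usage: fwmark_tin FWMARK MASK   (decimal or 0x-prefixed hex)
fwmark_tin() {
    mark=$(( $1 ))
    mask=$(( $2 ))
    val=$(( mark & mask ))        # apply the bitmask to the fwmark
    # Shift right once for every least-significant unset bit in the mask.
    while [ "$mask" -ne 0 ] && [ $(( mask & 1 )) -eq 0 ]; do
        mask=$(( mask >> 1 ))
        val=$(( val >> 1 ))
    done
    echo "$val"
}

fwmark_tin 0x35 0xF0    # (0x35 & 0xF0) = 0x30, shifted right 4 bits: prints 3
fwmark_tin 0x05 0xF0    # masked result is zero: prints 0
fwmark_tin 0x2 0x3      # mask has no trailing unset bits: prints 2
```

With a mask of 0xF0, a firewall rule would therefore place the desired
tin number in bits 4 to 7 of the fwmark.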


OTHER PARAMETERS
memlimit LIMIT
    Limit the memory consumed by CAKE to LIMIT bytes. Note that this
    does not translate directly to queue size (so do not size this
    based on bandwidth delay product considerations, but rather on
    worst case acceptable memory consumption), as there is some
    overhead in the data structures containing the packets, especially
    for small packets.

    By default, the limit is calculated based on the bandwidth and RTT
    settings.


wash
    Traffic entering your diffserv domain is frequently mis-marked in
    transit from the perspective of your network, and traffic exiting
    yours may be mis-marked from the perspective of the transiting
    provider.

    Apply the wash option to clear all extra diffserv (but not ECN)
    bits, after priority queuing has taken place.

    If you are shaping inbound, and cannot trust the diffserv markings
    (as is the case for Comcast Cable, among others), it is best to use
    a single queue "besteffort" mode with wash.
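
Two hedged examples of these options. The interface names and values
are placeholders; ifb0 stands for a hypothetical device that already
carries the redirected inbound traffic.

```shell
# Inbound shaping with untrusted markings: one tin, markings cleared
tc qdisc replace dev ifb0 root cake bandwidth 90Mbit besteffort wash

# Cap CAKE's memory use at 32 MiB, given in bytes
tc qdisc replace dev eth0 root cake bandwidth 20Mbit memlimit 33554432
```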


split-gso
    This option controls whether CAKE will split General Segmentation
    Offload (GSO) super-packets into their on-the-wire components and
    dequeue them individually.

    Super-packets are created by the networking stack to improve
    efficiency. However, because they are larger they take longer to
    dequeue, which translates to higher latency for competing flows,
    especially at lower bandwidths. CAKE defaults to splitting GSO
    packets to achieve the lowest possible latency. At link speeds
    higher than 10 Gbps, setting the no-split-gso parameter can
    increase the maximum achievable throughput by retaining the full
    GSO packets.


OVERRIDING CLASSIFICATION WITH TC FILTERS
CAKE supports overriding of its internal classification of packets
through the tc filter mechanism. Packets can be assigned to different
priority tins by setting the priority field on the skb, and the flow
hashing can be overridden by setting the classid parameter.


Tin override

To assign a priority tin, the major number of the priority field needs
to match the qdisc handle of the cake instance; if it does, the minor
number will be interpreted as the tin index. For example, to classify
all ICMP packets as 'bulk', the following filter can be used:

    # tc qdisc replace dev eth0 handle 1: root cake diffserv3
    # tc filter add dev eth0 parent 1: protocol ip prio 1 \
        u32 match icmp type 0 0 action skbedit priority 1:1


Flow hash override

To override flow hashing, the classid can be set. CAKE will interpret
the major number of the classid as the host hash used in host isolation
mode, and the minor number as the flow hash used for flow-based
queueing. One or both of those can be set, and will be used if the
relevant flow isolation parameter is set (i.e., the major number will
be ignored if CAKE is not configured in hosts mode, and the minor
number will be ignored if CAKE is not configured in flows mode).

This example will assign all ICMP packets to the first queue:

    # tc qdisc replace dev eth0 handle 1: root cake
    # tc filter add dev eth0 parent 1: protocol ip prio 1 \
        u32 match icmp type 0 0 classid 0:1

If only one of the host and flow overrides is set, CAKE will compute
the other hash from the packet as normal. Note, however, that the host
isolation mode works by assigning a host ID to the flow queue; so if
overriding both host and flow, the same flow cannot have more than one
host assigned. In addition, it is not possible to assign different
source and destination host IDs through the override mechanism; if a
host ID is assigned, it will be used as both source and destination
host.

EXAMPLES
    # tc qdisc delete root dev eth0
    # tc qdisc add root dev eth0 cake bandwidth 100Mbit ethernet
    # tc -s qdisc show dev eth0
    qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate
      rtt 100.0ms noatm overhead 38 mpu 84
     Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
     backlog 0b 0p requeues 0
     memory used: 0b of 5000000b
     capacity estimate: 100Mbit
     min/max network layer size:        65535 /       0
     min/max overhead-adjusted size:    65535 /       0
     average network hdr offset:            0

                     Bulk  Best Effort        Voice
    thresh       6250Kbit      100Mbit       25Mbit
    target          5.0ms        5.0ms        5.0ms
    interval      100.0ms      100.0ms      100.0ms
    pk_delay          0us          0us          0us
    av_delay          0us          0us          0us
    sp_delay          0us          0us          0us
    pkts                0            0            0
    bytes               0            0            0
    way_inds            0            0            0
    way_miss            0            0            0
    way_cols            0            0            0
    drops               0            0            0
    marks               0            0            0
    ack_drop            0            0            0
    sp_flows            0            0            0
    bk_flows            0            0            0
    un_flows            0            0            0
    max_len             0            0            0
    quantum           300         1514          762

After some use:

    # tc -s qdisc show dev eth0
    qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate
      rtt 100.0ms noatm overhead 38 mpu 84
     Sent 44709231 bytes 31931 pkt (dropped 45, overlimits 93782 requeues 0)
     backlog 33308b 22p requeues 0
     memory used: 292352b of 5000000b
     capacity estimate: 100Mbit
     min/max network layer size:           28 /    1500
     min/max overhead-adjusted size:       84 /    1538
     average network hdr offset:           14

                     Bulk  Best Effort        Voice
    thresh       6250Kbit      100Mbit       25Mbit
    target          5.0ms        5.0ms        5.0ms
    interval      100.0ms      100.0ms      100.0ms
    pk_delay        8.7ms        6.9ms        5.0ms
    av_delay        4.9ms        5.3ms        3.8ms
    sp_delay        727us        1.4ms        511us
    pkts             2590        21271         8137
    bytes         3081804     30302659     11426206
    way_inds            0           46            0
    way_miss            3           17            4
    way_cols            0            0            0
    drops              20           15           10
    marks               0            0            0
    ack_drop            0            0            0
    sp_flows            2            4            1
    bk_flows            1            2            1
    un_flows            0            0            0
    max_len          1514         1514         1514
    quantum           300         1514          762
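
To pull a single statistic out of output like the above, a small awk
filter is enough. The following is a sketch only: the function name
cake_row_sum is hypothetical, and it assumes the per-tin rows keep the
"label value value value" layout shown in this section.

```shell
# usage: tc -s qdisc show dev eth0 | cake_row_sum drops
# Sums the per-tin values of one statistics row read from standard input.
cake_row_sum() {
    awk -v row="$1" '$1 == row { s = 0; for (i = 2; i <= NF; i++) s += $i; print s }'
}

# Demonstration on an excerpt of the sample output above
cake_row_sum drops <<'EOF'
                 Bulk  Best Effort        Voice
pkts             2590        21271         8137
drops              20           15           10
marks               0            0            0
EOF
```

On the excerpt this prints 45, which agrees with the "dropped 45"
figure on the summary line of the sample output.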


SEE ALSO
tc(8), tc-codel(8), tc-fq_codel(8), tc-htb(8)


AUTHORS
Cake's principal author is Jonathan Morton, with contributions from
Tony Ambardar, Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen,
Sebastian Moeller, Ryan Mounce, Dean Scarff, Nils Andreas Svee, and
Dave Täht.

This manual page was written by Loganaden Velvindron. Please report
corrections to the Linux Networking mailing list
<netdev@vger.kernel.org>.


iproute2                          19 July 2018                         CAKE(8)