CAKE(8)                               Linux                               CAKE(8)

NAME
CAKE - Common Applications Kept Enhanced (CAKE)
7
SYNOPSIS
tc qdisc ... cake
    [ bandwidth RATE | unlimited* | autorate-ingress ]
    [ rtt TIME | datacentre | lan | metro | regional | internet* | oceanic
      | satellite | interplanetary ]
    [ besteffort | diffserv8 | diffserv4 | diffserv3* ]
    [ flowblind | srchost | dsthost | hosts | flows | dual-srchost
      | dual-dsthost | triple-isolate* ]
    [ nat | nonat* ]
    [ wash | nowash* ]
    [ split-gso* | no-split-gso ]
    [ ack-filter | ack-filter-aggressive | no-ack-filter* ]
    [ memlimit LIMIT ]
    [ fwmark MASK ]
    [ ptm | atm | noatm* ]
    [ overhead N | conservative | raw* ]
    [ mpu N ]
    [ ingress | egress* ]
    (* marks defaults)
27
28
29
DESCRIPTION
CAKE (Common Applications Kept Enhanced) is a shaping-capable queue
discipline which uses both AQM and FQ. It combines COBALT (an AQM
algorithm that merges Codel and BLUE), a shaper which operates in
deficit mode, and a variant of DRR++ for flow isolation. 8-way
set-associative hashing is used to virtually eliminate hash collisions.
Priority queuing is available through a simplified diffserv
implementation. Overhead compensation for various encapsulation schemes
is tightly integrated.
39
40 All settings are optional; the default settings are chosen to be sensi‐
41 ble in most common deployments. Most people will only need to set the
42 bandwidth parameter to get useful results, but reading the Overhead
43 Compensation and Round Trip Time sections is strongly encouraged.
44
45
SHAPER PARAMETERS
47 CAKE uses a deficit-mode shaper, which does not exhibit the initial
48 burst typical of token-bucket shapers. It will automatically burst
49 precisely as much as required to maintain the configured throughput.
50 As such, it is very straightforward to configure.
51
52 unlimited (default)
53 No limit on the bandwidth.
54
55 bandwidth RATE
56 Set the shaper bandwidth. See tc(8) or examples below for details
57 of the RATE value.
58
59 autorate-ingress
60 Automatic capacity estimation based on traffic arriving at this
61 qdisc. This is most likely to be useful with cellular links, which
62 tend to change quality randomly. A bandwidth parameter can be used in
63 conjunction to specify an initial estimate. The shaper will periodi‐
64 cally be set to a bandwidth slightly below the estimated rate. This
65 estimator cannot estimate the bandwidth of links downstream of itself.
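As a minimal sketch of the three shaper modes above, the commands below assume an egress interface named eth0 and an IFB device named ifb0 that already receives the redirected ingress traffic. Both device names and all rates are placeholders, not recommendations.

```shell
# Fixed-rate shaper on an assumed 20 Mbit/s uplink (eth0 is a placeholder).
tc qdisc replace dev eth0 root cake bandwidth 20Mbit

# No limit on the bandwidth (the default).
tc qdisc replace dev eth0 root cake unlimited

# Capacity estimation from arriving traffic, with 30Mbit as the initial
# estimate. ifb0 is assumed to carry the redirected ingress traffic.
tc qdisc replace dev ifb0 root cake bandwidth 30Mbit autorate-ingress
```

These commands require root privileges and replace any qdisc already attached at the root of the named device.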
66
67
OVERHEAD COMPENSATION PARAMETERS
69 The size of each packet on the wire may differ from that seen by Linux.
70 The following parameters allow CAKE to compensate for this difference
71 by internally considering each packet to be bigger than Linux informs
72 it. To assist users who are not expert network engineers, keywords
73 have been provided to represent a number of common link technologies.
74
75
76 Manual Overhead Specification
77 overhead BYTES
78 Adds BYTES to the size of each packet. BYTES may be negative;
79 values between -64 and 256 (inclusive) are accepted.
80
81 mpu BYTES
82 Rounds each packet (including overhead) up to a minimum length
83 BYTES. BYTES may not be negative; values between 0 and 256 (inclusive)
84 are accepted.
85
86 atm
87 Compensates for ATM cell framing, which is normally found on ADSL
88 links. This is performed after the overhead parameter above. ATM uses
89 fixed 53-byte cells, each of which can carry 48 bytes payload.
90
91 ptm
Compensates for PTM encoding, which is normally found on VDSL2
links and uses a 64b/65b encoding scheme. It is even more efficient to
simply derate the specified shaper bandwidth by a factor of 64/65
(approximately 0.984) instead of using this keyword. See ITU G.992.3
Annex N and IEEE 802.3 Section 61.3 for details.
96
97 noatm
98 Disables ATM and PTM compensation.
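The size accounting described above can be sketched in POSIX shell arithmetic. The function names are hypothetical helpers for illustration and are not part of tc or CAKE. The order of operations (add the overhead, apply mpu, then apply ATM framing) follows the descriptions above.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the accounting described above.

# adjusted_size LEN OVERHEAD MPU
# Add OVERHEAD to the packet length, then round up to the minimum length MPU.
adjusted_size() {
    size=$(( $1 + $2 ))
    if [ "$size" -lt "$3" ]; then
        size=$3
    fi
    echo "$size"
}

# atm_wire_size SIZE
# ATM carries 48 payload bytes in each 53-byte cell, so round up to whole cells.
atm_wire_size() {
    cells=$(( ($1 + 47) / 48 ))
    echo $(( cells * 53 ))
}

# ptm_derate RATE
# Derate a shaper bandwidth by 64/65 (integer arithmetic, rounds down).
ptm_derate() {
    echo $(( $1 * 64 / 65 ))
}

adjusted_size 28 38 84                        # overhead 38 mpu 84: prints 84
adjusted_size 1500 38 84                      # overhead 38 mpu 84: prints 1538
atm_wire_size "$(adjusted_size 1500 32 0)"    # overhead 32 atm: prints 1696
ptm_derate 100000                             # 100000 kbit/s: prints 98461
```

The first two results correspond to the "min/max overhead-adjusted size: 84 / 1538" line in the example output near the end of this page.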
99
100
101 Failsafe Overhead Keywords
102 These two keywords are provided for quick-and-dirty setup. Use them if
103 you can't be bothered to read the rest of this section.
104
105 raw (default)
106 Turns off all overhead compensation in CAKE. The packet size re‐
107 ported by Linux will be used directly.
108
109 Other overhead keywords may be added after "raw". The effect of
110 this is to make the overhead compensation operate relative to the re‐
111 ported packet size, not the underlying IP packet size.
112
113 conservative
114 Compensates for more overhead than is likely to occur on any
115 widely-deployed link technology.
116 Equivalent to overhead 48 atm.
117
118
119 ADSL Overhead Keywords
120 Most ADSL modems have a way to check which framing scheme is in use.
121 Often this is also specified in the settings document provided by the
122 ISP. The keywords in this section are intended to correspond with
123 these sources of information. All of them implicitly set the atm flag.
124
125 pppoa-vcmux
126 Equivalent to overhead 10 atm
127
128 pppoa-llc
129 Equivalent to overhead 14 atm
130
131 pppoe-vcmux
132 Equivalent to overhead 32 atm
133
134 pppoe-llcsnap
135 Equivalent to overhead 40 atm
136
137 bridged-vcmux
138 Equivalent to overhead 24 atm
139
140 bridged-llcsnap
141 Equivalent to overhead 32 atm
142
143 ipoa-vcmux
144 Equivalent to overhead 8 atm
145
146 ipoa-llcsnap
147 Equivalent to overhead 16 atm
148
See also the Ethernet Overhead Keywords section below.
150
151
152 VDSL2 Overhead Keywords
153 ATM was dropped from VDSL2 in favour of PTM, which is a much more
154 straightforward framing scheme. Some ISPs retained PPPoE for compati‐
155 bility with their existing back-end systems.
156
pppoe-ptm
Equivalent to overhead 30 ptm

PPPoE: 2B PPP + 6B PPPoE +
ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence +
PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS)

bridged-ptm
Equivalent to overhead 22 ptm

ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence +
PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS)
172
See also the Ethernet Overhead Keywords section below.
174
175
176 DOCSIS Cable Overhead Keyword
177 DOCSIS is the universal standard for providing Internet service over
178 cable-TV infrastructure.
179
180 In this case, the actual on-wire overhead is less important than the
181 packet size the head-end equipment uses for shaping and metering. This
182 is specified to be an Ethernet frame including the CRC (aka FCS).
183
184 docsis
185 Equivalent to overhead 18 mpu 64 noatm
186
187
188 Ethernet Overhead Keywords
189 ethernet
190 Accounts for Ethernet's preamble, inter-frame gap, and Frame Check
191 Sequence. Use this keyword when the bottleneck being shaped for is an
192 actual Ethernet cable.
193 Equivalent to overhead 38 mpu 84 noatm
194
195 ether-vlan
196 Adds 4 bytes to the overhead compensation, accounting for an IEEE
197 802.1Q VLAN header appended to the Ethernet frame header. NB: Some
198 ISPs use one or even two of these within PPPoE; this keyword may be re‐
199 peated as necessary to express this.
200
201
ROUND TRIP TIME PARAMETERS
203 Active Queue Management (AQM) consists of embedding congestion signals
204 in the packet flow, which receivers use to instruct senders to slow
205 down when the queue is persistently occupied. CAKE uses ECN signalling
206 when available, and packet drops otherwise, according to a combination
207 of the Codel and BLUE AQM algorithms called COBALT.
208
209 Very short latencies require a very rapid AQM response to adequately
210 control latency. However, such a rapid response tends to impair
211 throughput when the actual RTT is relatively long. CAKE allows speci‐
212 fying the RTT it assumes for tuning various parameters. Actual RTTs
213 within an order of magnitude of this will generally work well for both
214 throughput and latency management.
215
At the "lan" setting and below, the time constants are similar in
magnitude to the jitter in the Linux kernel itself, so congestion might
be signalled prematurely. The flows will then become sparse and total
throughput reduced, leaving little or no back-pressure for the fairness
logic to work against. Use the "metro" setting for local LANs unless
you have a custom kernel.
222
223 rtt TIME
224 Manually specify an RTT.
225
226 datacentre
227 For extremely high-performance 10GigE+ networks only. Equivalent
228 to rtt 100us.
229
230 lan
231 For pure Ethernet (not Wi-Fi) networks, at home or in the office.
232 Don't use this when shaping for an Internet access link. Equivalent to
233 rtt 1ms.
234
235 metro
236 For traffic mostly within a single city. Equivalent to rtt 10ms.
237
238 regional
239 For traffic mostly within a European-sized country. Equivalent to
240 rtt 30ms.
241
242 internet (default)
243 This is suitable for most Internet traffic. Equivalent to rtt
244 100ms.
245
246 oceanic
247 For Internet traffic with generally above-average latency, such as
248 that suffered by Australasian residents. Equivalent to rtt 300ms.
249
250 satellite
251 For traffic via geostationary satellites. Equivalent to rtt
252 1000ms.
253
254 interplanetary
255 So named because Jupiter is about 1 light-hour from Earth. Use
256 this to (almost) completely disable AQM actions. Equivalent to rtt
257 3600s.
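The keyword-to-RTT equivalences listed above can be collected into a small lookup. This is only an illustrative sketch: cake_rtt is a hypothetical helper name, and the device name and rate in the comments are placeholders.

```shell
#!/bin/sh
# cake_rtt KEYWORD
# Print the rtt value that each preset keyword is documented as equivalent to.
cake_rtt() {
    case "$1" in
        datacentre)     echo 100us ;;
        lan)            echo 1ms ;;
        metro)          echo 10ms ;;
        regional)       echo 30ms ;;
        internet)       echo 100ms ;;
        oceanic)        echo 300ms ;;
        satellite)      echo 1000ms ;;
        interplanetary) echo 3600s ;;
        *)              echo "unknown rtt keyword: $1" >&2; return 1 ;;
    esac
}

# Per the equivalences above, these two invocations should configure the
# same RTT (eth0 and the rate are placeholders):
#   tc qdisc replace dev eth0 root cake bandwidth 50Mbit metro
#   tc qdisc replace dev eth0 root cake bandwidth 50Mbit rtt "$(cake_rtt metro)"
cake_rtt metro    # prints 10ms
```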
258
259
FLOW ISOLATION PARAMETERS
261 With flow isolation enabled, CAKE places packets from different flows
262 into different queues, each of which carries its own AQM state. Pack‐
263 ets from each queue are then delivered fairly, according to a DRR++ al‐
264 gorithm which minimizes latency for "sparse" flows. CAKE uses a set-
265 associative hashing algorithm to minimize flow collisions.
266
267 These keywords specify whether fairness based on source address, desti‐
268 nation address, individual flows, or any combination of those is de‐
269 sired.
270
271 flowblind
272 Disables flow isolation; all traffic passes through a single queue
273 for each tin.
274
275 srchost
276 Flows are defined only by source address. Could be useful on the
277 egress path of an ISP backhaul.
278
279 dsthost
280 Flows are defined only by destination address. Could be useful on
281 the ingress path of an ISP backhaul.
282
283 hosts
284 Flows are defined by source-destination host pairs. This is host
285 isolation, rather than flow isolation.
286
287 flows
288 Flows are defined by the entire 5-tuple of source address, desti‐
289 nation address, transport protocol, source port and destination port.
290 This is the type of flow isolation performed by SFQ and fq_codel.
291
292 dual-srchost
293 Flows are defined by the 5-tuple, and fairness is applied first
294 over source addresses, then over individual flows. Good for use on
295 egress traffic from a LAN to the internet, where it'll prevent any one
296 LAN host from monopolising the uplink, regardless of the number of
297 flows they use.
298
299 dual-dsthost
300 Flows are defined by the 5-tuple, and fairness is applied first
301 over destination addresses, then over individual flows. Good for use
302 on ingress traffic to a LAN from the internet, where it'll prevent any
303 one LAN host from monopolising the downlink, regardless of the number
304 of flows they use.
305
306 triple-isolate (default)
Flows are defined by the 5-tuple, and fairness is applied over
source *and* destination addresses intelligently (i.e., not merely by
host-pairs), and also over individual flows. Use this if you're not
certain whether to use dual-srchost or dual-dsthost; it'll do both jobs
at once, preventing any one host on *either* side of the link from
monopolising it with a large number of flows.
313
314 nat
Instructs CAKE to perform a NAT lookup before applying flow-isolation
rules, to determine the true addresses and port numbers of the
packet, to improve fairness between hosts "inside" the NAT. This has
no practical effect in "flowblind" or "flows" modes, or if NAT is
performed on a different host.

nonat (default)
CAKE will not perform a NAT lookup. Flow isolation will be performed
using the addresses and port numbers directly visible to the
interface CAKE is attached to.
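A possible home-gateway setup using the keywords above is sketched below. It assumes that eth0 is the WAN interface, that NAT is performed on this same host, and that an IFB device (here called ifb0) is used to shape ingress traffic. The device names and rates are placeholders.

```shell
# Egress (LAN to internet): fairness first over source hosts, then over
# flows, with a NAT lookup so that internal hosts are told apart.
tc qdisc replace dev eth0 root cake bandwidth 20Mbit dual-srchost nat

# Ingress (internet to LAN): redirect arriving traffic through ifb0 and
# shape it there with fairness first over destination hosts.
ip link add name ifb0 type ifb
ip link set dev ifb0 up
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: matchall \
    action mirred egress redirect dev ifb0
tc qdisc replace dev ifb0 root cake bandwidth 90Mbit dual-dsthost nat ingress
```

If it is unclear which direction a given qdisc sees, the default triple-isolate mode covers both cases, as described above.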
325
326
PRIORITY QUEUE PARAMETERS
328 CAKE can divide traffic into "tins" based on the Diffserv field. Each
329 tin has its own independent set of flow-isolation queues, and is ser‐
330 viced based on a WRR algorithm. To avoid perverse Diffserv marking in‐
331 centives, tin weights have a "priority sharing" value when bandwidth
332 used by that tin is below a threshold, and a lower "bandwidth sharing"
333 value when above. Bandwidth is compared against the threshold using
334 the same algorithm as the deficit-mode shaper.
335
336 Detailed customisation of tin parameters is not provided. The follow‐
337 ing presets perform all necessary tuning, relative to the current
338 shaper bandwidth and RTT settings.
339
340 besteffort
341 Disables priority queuing by placing all traffic in one tin.
342
343 precedence
344 Enables legacy interpretation of TOS "Precedence" field. Use of
345 this preset on the modern Internet is firmly discouraged.
346
diffserv4
Provides a general-purpose Diffserv implementation with four tins:
    Bulk (CS1, LE in kernel v5.9+), 6.25% threshold, generally low priority.
    Best Effort (general), 100% threshold.
    Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50% threshold.
    Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold.

diffserv3 (default)
Provides a simple, general-purpose Diffserv implementation with three tins:
    Bulk (CS1, LE in kernel v5.9+), 6.25% threshold, generally low priority.
    Best Effort (general), 100% threshold.
    Voice (CS7, CS6, EF, VA, TOS4), 25% threshold, reduced Codel interval.
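The tin thresholds are percentages of the shaper bandwidth, so they can be worked out by hand. The sketch below does this for the diffserv4 percentages listed above; tin_thresholds is a hypothetical helper name, and the rate is assumed to be given in kbit/s.

```shell
#!/bin/sh
# tin_thresholds RATE_KBIT
# Print the diffserv4 tin thresholds for a shaper rate given in kbit/s,
# using the percentages listed above (integer arithmetic, rounds down).
tin_thresholds() {
    rate=$1
    echo "Bulk        $(( rate / 16 ))"    # 6.25%
    echo "Best Effort $(( rate ))"         # 100%
    echo "Video       $(( rate / 2 ))"     # 50%
    echo "Voice       $(( rate / 4 ))"     # 25%
}

tin_thresholds 100000
```

For a 100Mbit shaper this gives 6250, 100000, 50000 and 25000 kbit/s. The Bulk, Best Effort and Voice values agree with the "thresh" row (6250Kbit, 100Mbit, 25Mbit) of the diffserv3 example output near the end of this page.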
364
365
366 fwmark MASK
This option turns on fwmark-based overriding of CAKE's tin selection.
If set, the option specifies a bitmask that will be applied to
the fwmark associated with each packet. If the result of this masking
is non-zero, the result will be right-shifted by the number of
least-significant unset bits in the mask value, and the result will be
used as the tin number for that packet. This can be used to set policies
in a firewall script that will override CAKE's built-in tin selection.
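The mask-and-shift step described above can be checked with shell arithmetic. The helper below is a hypothetical illustration of that description only; it prints 0 when the masked value is zero, which is the case where the text says no override is applied.

```shell
#!/bin/sh
# fwmark_value MARK MASK
# Apply MASK to MARK, then shift right by the number of least-significant
# unset bits in MASK. Prints 0 when the masked value is zero.
fwmark_value() {
    mark=$(( $1 ))
    mask=$(( $2 ))
    masked=$(( mark & mask ))
    if [ "$masked" -eq 0 ]; then
        echo 0
        return
    fi
    # masked is non-zero, so mask has at least one set bit and the loop ends.
    shift_by=0
    while [ $(( (mask >> shift_by) & 1 )) -eq 0 ]; do
        shift_by=$(( shift_by + 1 ))
    done
    echo $(( masked >> shift_by ))
}

# With a qdisc configured as, for example (eth0 is a placeholder):
#   tc qdisc replace dev eth0 root cake diffserv4 fwmark 0xf0
fwmark_value 0x30 0xf0     # prints 3
fwmark_value 0x100 0xf0    # prints 0 (masked value is zero)
```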
374
375
OTHER PARAMETERS
377 memlimit LIMIT
Limit the memory consumed by CAKE to LIMIT bytes. This does not
translate directly to queue size, because the data structures
containing the packets add some overhead, especially for small packets.
Size the limit according to the worst-case acceptable memory
consumption rather than bandwidth-delay product considerations.
383
384 By default, the limit is calculated based on the bandwidth and RTT
385 settings.
386
387
388 wash
389
390 Traffic entering your diffserv domain is frequently mis-marked in
391 transit from the perspective of your network, and traffic exiting yours
392 may be mis-marked from the perspective of the transiting provider.
393
Apply the wash option to clear all extra diffserv bits (but not the
ECN bits) after priority queuing has taken place.

If you are shaping inbound and cannot trust the diffserv markings (as
is the case for Comcast Cable, among others), it is best to use the
single-tin "besteffort" mode with wash.
400
401
402 split-gso
403
This option controls whether CAKE will split Generic Segmentation
Offload (GSO) super-packets into their on-the-wire components and
dequeue them individually.
407
408 Super-packets are created by the networking stack to improve effi‐
409 ciency. However, because they are larger they take longer to dequeue,
410 which translates to higher latency for competing flows, especially at
411 lower bandwidths. CAKE defaults to splitting GSO packets to achieve the
412 lowest possible latency. At link speeds higher than 10 Gbps, setting
413 the no-split-gso parameter can increase the maximum achievable through‐
414 put by retaining the full GSO packets.
415
416
OVERRIDING CLASSIFICATION WITH TC FILTERS
418 CAKE supports overriding of its internal classification of packets
419 through the tc filter mechanism. Packets can be assigned to different
420 priority tins by setting the priority field on the skb, and the flow
421 hashing can be overridden by setting the classid parameter.
422
423
424 Tin override
425
426 To assign a priority tin, the major number of the priority
427 field needs to match the qdisc handle of the cake instance; if it does,
428 the minor number will be interpreted as the tin index. For example, to
429 classify all ICMP packets as 'bulk', the following filter can be used:
430
# tc qdisc replace dev eth0 handle 1: root cake diffserv3
# tc filter add dev eth0 parent 1: protocol ip prio 1 \
    u32 match icmp type 0 0 action skbedit priority 1:1
434
435
436 Flow hash override
437
438 To override flow hashing, the classid can be set. CAKE will in‐
439 terpret the major number of the classid as the host hash used in host
440 isolation mode, and the minor number as the flow hash used for flow-
441 based queueing. One or both of those can be set, and will be used if
442 the relevant flow isolation parameter is set (i.e., the major number
443 will be ignored if CAKE is not configured in hosts mode, and the minor
444 number will be ignored if CAKE is not configured in flows mode).
445
446 This example will assign all ICMP packets to the first queue:
447
# tc qdisc replace dev eth0 handle 1: root cake
# tc filter add dev eth0 parent 1: protocol ip prio 1 \
    u32 match icmp type 0 0 classid 0:1
451
452 If only one of the host and flow overrides is set, CAKE will compute
453 the other hash from the packet as normal. Note, however, that the host
454 isolation mode works by assigning a host ID to the flow queue; so if
455 overriding both host and flow, the same flow cannot have more than one
456 host assigned. In addition, it is not possible to assign different
457 source and destination host IDs through the override mechanism; if a
458 host ID is assigned, it will be used as both source and destination
459 host.
460
461
462
463
EXAMPLES
# tc qdisc delete root dev eth0
# tc qdisc add root dev eth0 cake bandwidth 100Mbit ethernet
# tc -s qdisc show dev eth0
qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 0b of 5000000b
 capacity estimate: 100Mbit
 min/max network layer size: 65535 / 0
 min/max overhead-adjusted size: 65535 / 0
 average network hdr offset: 0

                    Bulk  Best Effort        Voice
  thresh        6250Kbit      100Mbit       25Mbit
  target           5.0ms        5.0ms        5.0ms
  interval       100.0ms      100.0ms      100.0ms
  pk_delay           0us          0us          0us
  av_delay           0us          0us          0us
  sp_delay           0us          0us          0us
  pkts                 0            0            0
  bytes                0            0            0
  way_inds             0            0            0
  way_miss             0            0            0
  way_cols             0            0            0
  drops                0            0            0
  marks                0            0            0
  ack_drop             0            0            0
  sp_flows             0            0            0
  bk_flows             0            0            0
  un_flows             0            0            0
  max_len              0            0            0
  quantum            300         1514          762
498
499 After some use:
# tc -s qdisc show dev eth0
qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84
 Sent 44709231 bytes 31931 pkt (dropped 45, overlimits 93782 requeues 0)
 backlog 33308b 22p requeues 0
 memory used: 292352b of 5000000b
 capacity estimate: 100Mbit
 min/max network layer size: 28 / 1500
 min/max overhead-adjusted size: 84 / 1538
 average network hdr offset: 14

                    Bulk  Best Effort        Voice
  thresh        6250Kbit      100Mbit       25Mbit
  target           5.0ms        5.0ms        5.0ms
  interval       100.0ms      100.0ms      100.0ms
  pk_delay         8.7ms        6.9ms        5.0ms
  av_delay         4.9ms        5.3ms        3.8ms
  sp_delay         727us        1.4ms        511us
  pkts              2590        21271         8137
  bytes          3081804     30302659     11426206
  way_inds             0           46            0
  way_miss             3           17            4
  way_cols             0            0            0
  drops               20           15           10
  marks                0            0            0
  ack_drop             0            0            0
  sp_flows             2            4            1
  bk_flows             1            2            1
  un_flows             0            0            0
  max_len           1514         1514         1514
  quantum            300         1514          762
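When monitoring a running qdisc, the per-tin rows of this output are easy to post-process. The following sketch sums the "drops" row across tins with awk; sum_tin_drops is a hypothetical helper name, and eth0 in the comment is a placeholder.

```shell
#!/bin/sh
# sum_tin_drops
# Read `tc -s qdisc show` output on stdin and add up every value on the
# per-tin "drops" row(s). If the input covers several cake instances,
# their rows are all added together.
sum_tin_drops() {
    awk '$1 == "drops" { for (i = 2; i <= NF; i++) total += $i }
         END { print total + 0 }'
}

# Typical use (eth0 is a placeholder):
#   tc -s qdisc show dev eth0 | sum_tin_drops
printf '%s\n' '  drops  20  15  10' '  marks  0  0  0' | sum_tin_drops    # prints 45
```

For the "after some use" output above, the per-tin drops (20, 15 and 10) add up to the 45 reported as "dropped" on the "Sent" line.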
533
534
SEE ALSO
536 tc(8), tc-codel(8), tc-fq_codel(8), tc-htb(8)
537
538
AUTHORS
540 Cake's principal author is Jonathan Morton, with contributions from
541 Tony Ambardar, Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, Sebas‐
542 tian Moeller, Ryan Mounce, Dean Scarff, Nils Andreas Svee, and Dave
543 Täht.
544
545 This manual page was written by Loganaden Velvindron. Please report
546 corrections to the Linux Networking mailing list <netdev@vger.ker‐
547 nel.org>.
548
549
550
iproute2                            19 July 2018                          CAKE(8)