CAKE(8)                              Linux                             CAKE(8)

NAME
       CAKE - Common Applications Kept Enhanced (CAKE)

SYNOPSIS
       tc qdisc ... cake
              [ bandwidth RATE | unlimited* | autorate-ingress ]
              [ rtt TIME | datacentre | lan | metro | regional | internet* |
                oceanic | satellite | interplanetary ]
              [ besteffort | diffserv8 | diffserv4 | diffserv3* ]
              [ flowblind | srchost | dsthost | hosts | flows | dual-srchost |
                dual-dsthost | triple-isolate* ]
              [ nat | nonat* ]
              [ wash | nowash* ]
              [ split-gso* | no-split-gso ]
              [ ack-filter | ack-filter-aggressive | no-ack-filter* ]
              [ memlimit LIMIT ]
              [ ptm | atm | noatm* ]
              [ overhead N | conservative | raw* ]
              [ mpu N ]
              [ ingress | egress* ]
              (* marks defaults)

DESCRIPTION
       CAKE (Common Applications Kept Enhanced) is a shaping-capable queue
       discipline which uses both AQM and FQ. It combines three components:
       COBALT, an AQM algorithm that merges Codel and BLUE; a shaper which
       operates in deficit mode; and a variant of DRR++ for flow isolation.
       8-way set-associative hashing is used to virtually eliminate hash
       collisions. Priority queuing is available through a simplified
       diffserv implementation. Overhead compensation for various
       encapsulation schemes is tightly integrated.

       All settings are optional; the default settings are chosen to be
       sensible in most common deployments. Most people will only need to set
       the bandwidth parameter to get useful results, but reading the
       Overhead Compensation and Round Trip Time sections is strongly
       encouraged.

SHAPER PARAMETERS
       CAKE uses a deficit-mode shaper, which does not exhibit the initial
       burst typical of token-bucket shapers. It will automatically burst
       precisely as much as required to maintain the configured throughput.
       As such, it is very straightforward to configure.

       unlimited (default)
              No limit on the bandwidth.

       bandwidth RATE
              Set the shaper bandwidth. See tc(8) or examples below for
              details of the RATE value.

       autorate-ingress
              Automatic capacity estimation based on traffic arriving at this
              qdisc. This is most likely to be useful with cellular links,
              which tend to change quality randomly. A bandwidth parameter
              can be used in conjunction to specify an initial estimate. The
              shaper will periodically be set to a bandwidth slightly below
              the estimated rate. This estimator cannot estimate the
              bandwidth of links downstream of itself.
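
       As a minimal sketch of the two shaper modes described above, the
       following shell fragment shows a fixed-rate setup and, as an
       alternative, autorate-ingress with an initial estimate. The interface
       name eth0 and the rates are illustrative assumptions, and the TC
       variable is a hypothetical dry-run switch: commands are only printed
       unless TC=tc is set.

```shell
# Dry-run by default: each command is printed, not executed.
# Run with TC=tc (as root) to apply; eth0 and the rates are assumed values.
TC=${TC:-echo tc}
DEV=${DEV:-eth0}

# Alternative 1: shape to a fixed rate.
shape_fixed() {
    $TC qdisc replace dev "$DEV" root cake bandwidth "$1"
}

# Alternative 2: estimate capacity from arriving traffic,
# using the given rate only as the initial estimate.
shape_autorate() {
    $TC qdisc replace dev "$DEV" root cake autorate-ingress bandwidth "$1"
}

shape_fixed 20Mbit
shape_autorate 10Mbit
```

       The two functions are alternatives; each call replaces the root qdisc
       installed by the previous one.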

OVERHEAD COMPENSATION PARAMETERS
       The size of each packet on the wire may differ from that seen by
       Linux. The following parameters allow CAKE to compensate for this
       difference by internally considering each packet to be bigger than
       Linux informs it. To assist users who are not expert network
       engineers, keywords have been provided to represent a number of common
       link technologies.

   Manual Overhead Specification
       overhead BYTES
              Adds BYTES to the size of each packet. BYTES may be negative;
              values between -64 and 256 (inclusive) are accepted.

       mpu BYTES
              Rounds each packet (including overhead) up to a minimum length
              BYTES. BYTES may not be negative; values between 0 and 256
              (inclusive) are accepted.

       atm
              Compensates for ATM cell framing, which is normally found on
              ADSL links. This is performed after the overhead parameter
              above. ATM uses fixed 53-byte cells, each of which can carry 48
              bytes of payload.

       ptm
              Compensates for PTM encoding, which is normally found on VDSL2
              links and uses a 64b/65b encoding scheme. It is even more
              efficient to simply derate the specified shaper bandwidth by a
              factor of 64/65 (approximately 0.984) instead of using this
              keyword. See ITU G.992.3 Annex N and IEEE 802.3 Section 61.3
              for details.

       noatm
              Disables ATM and PTM compensation.
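
       The interaction of overhead, mpu, atm and ptm can be sketched as shell
       arithmetic. This illustrates the rules as described above and is not
       CAKE's actual code: the function name wire_size is hypothetical, and
       the order of operations (overhead, then mpu, then framing) is an
       assumption drawn from the text.

```shell
# usage: wire_size PACKET_LEN OVERHEAD MPU FRAMING    (FRAMING: atm | ptm | noatm)
# Prints the size the shaper would account for, under the assumptions above.
wire_size() {
    len=$(( $1 + $2 ))                  # add the per-packet overhead
    if [ "$len" -lt "$3" ]; then        # round up to the minimum packet unit
        len=$3
    fi
    case "$4" in
        atm) len=$(( (len + 47) / 48 * 53 )) ;;    # whole 53-byte cells, 48 bytes payload each
        ptm) len=$(( len + (len + 63) / 64 )) ;;   # 64b/65b: one extra byte per 64, rounded up
    esac
    echo "$len"
}

wire_size 1500 40 0 atm      # 1500-byte packet, overhead 40 atm
wire_size 1500 30 0 ptm      # 1500-byte packet, overhead 30 ptm
wire_size 28 38 84 noatm     # 28-byte packet, overhead 38 mpu 84 noatm
```

       Under these assumptions the three calls print 1749, 1554 and 84. The
       ATM case shows why small overhead errors matter: 1540 bytes need 33
       cells, so the packet occupies 33 * 53 bytes on the wire.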

   Failsafe Overhead Keywords
       These two keywords are provided for quick-and-dirty setup. Use them if
       you can't be bothered to read the rest of this section.

       raw (default)
              Turns off all overhead compensation in CAKE. The packet size
              reported by Linux will be used directly.

              Other overhead keywords may be added after "raw". The effect of
              this is to make the overhead compensation operate relative to
              the reported packet size, not the underlying IP packet size.

       conservative
              Compensates for more overhead than is likely to occur on any
              widely-deployed link technology.
              Equivalent to overhead 48 atm.

   ADSL Overhead Keywords
       Most ADSL modems have a way to check which framing scheme is in use.
       Often this is also specified in the settings document provided by the
       ISP. The keywords in this section are intended to correspond with
       these sources of information. All of them implicitly set the atm flag.

       pppoa-vcmux
              Equivalent to overhead 10 atm

       pppoa-llc
              Equivalent to overhead 14 atm

       pppoe-vcmux
              Equivalent to overhead 32 atm

       pppoe-llcsnap
              Equivalent to overhead 40 atm

       bridged-vcmux
              Equivalent to overhead 24 atm

       bridged-llcsnap
              Equivalent to overhead 32 atm

       ipoa-vcmux
              Equivalent to overhead 8 atm

       ipoa-llcsnap
              Equivalent to overhead 16 atm

       See also the Ethernet Overhead Keywords section below.

   VDSL2 Overhead Keywords
       ATM was dropped from VDSL2 in favour of PTM, which is a much more
       straightforward framing scheme. Some ISPs retained PPPoE for
       compatibility with their existing back-end systems.

       pppoe-ptm
              Equivalent to overhead 30 ptm

              PPPoE: 2B PPP + 6B PPPoE +
              ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame
              Check Sequence +
              PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC
              (PTM-FCS)

       bridged-ptm
              Equivalent to overhead 22 ptm

              ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame
              Check Sequence +
              PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC
              (PTM-FCS)

       See also the Ethernet Overhead Keywords section below.
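
       The overhead values of the two PTM keywords follow directly from the
       byte counts listed above. The short check below simply adds them up;
       the variable names are illustrative only.

```shell
# Byte counts taken from the pppoe-ptm and bridged-ptm breakdowns above.
ppp=2                          # PPP header
pppoe=6                        # PPPoE header
eth=$(( 6 + 6 + 2 + 4 ))       # dest MAC + src MAC + ethertype + Frame Check Sequence
ptm=$(( 1 + 1 + 2 ))           # Start of Frame + End of Frame + TC-CRC

pppoe_ptm=$(( ppp + pppoe + eth + ptm ))
bridged_ptm=$(( eth + ptm ))

echo "pppoe-ptm overhead:   $pppoe_ptm"
echo "bridged-ptm overhead: $bridged_ptm"
```

       The sums come to 30 and 22 bytes, matching the equivalents given for
       the two keywords.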

   DOCSIS Cable Overhead Keyword
       DOCSIS is the universal standard for providing Internet service over
       cable-TV infrastructure.

       In this case, the actual on-wire overhead is less important than the
       packet size the head-end equipment uses for shaping and metering. This
       is specified to be an Ethernet frame including the CRC (aka FCS).

       docsis
              Equivalent to overhead 18 mpu 64 noatm

   Ethernet Overhead Keywords
       ethernet
              Accounts for Ethernet's preamble, inter-frame gap, and Frame
              Check Sequence. Use this keyword when the bottleneck being
              shaped for is an actual Ethernet cable.
              Equivalent to overhead 38 mpu 84 noatm

       ether-vlan
              Adds 4 bytes to the overhead compensation, accounting for an
              IEEE 802.1Q VLAN header appended to the Ethernet frame header.
              NB: Some ISPs use one or even two of these within PPPoE; this
              keyword may be repeated as necessary to express this.

ROUND TRIP TIME PARAMETERS
       Active Queue Management (AQM) consists of embedding congestion signals
       in the packet flow, which receivers use to instruct senders to slow
       down when the queue is persistently occupied. CAKE uses ECN signalling
       when available, and packet drops otherwise, according to a combination
       of the Codel and BLUE AQM algorithms called COBALT.

       Very short latencies require a very rapid AQM response to adequately
       control latency. However, such a rapid response tends to impair
       throughput when the actual RTT is relatively long. CAKE allows
       specifying the RTT it assumes for tuning various parameters. Actual
       RTTs within an order of magnitude of this will generally work well for
       both throughput and latency management.

       At the 'lan' setting and below, the time constants are similar in
       magnitude to the jitter in the Linux kernel itself, so congestion
       might be signalled prematurely. The flows will then become sparse and
       total throughput reduced, leaving little or no back-pressure for the
       fairness logic to work against. Use the "metro" setting for local LANs
       unless you have a custom kernel.

       rtt TIME
              Manually specify an RTT.

       datacentre
              For extremely high-performance 10GigE+ networks only.
              Equivalent to rtt 100us.

       lan
              For pure Ethernet (not Wi-Fi) networks, at home or in the
              office. Don't use this when shaping for an Internet access
              link. Equivalent to rtt 1ms.

       metro
              For traffic mostly within a single city. Equivalent to rtt
              10ms.

       regional
              For traffic mostly within a European-sized country. Equivalent
              to rtt 30ms.

       internet (default)
              This is suitable for most Internet traffic. Equivalent to rtt
              100ms.

       oceanic
              For Internet traffic with generally above-average latency, such
              as that suffered by Australasian residents. Equivalent to rtt
              300ms.

       satellite
              For traffic via geostationary satellites. Equivalent to rtt
              1000ms.

       interplanetary
              So named because Jupiter is about 1 light-hour from Earth. Use
              this to (almost) completely disable AQM actions. Equivalent to
              rtt 3600s.
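
       A brief sketch of how an RTT keyword or an explicit rtt value might be
       passed on the command line. The interface name, the 50Mbit rate and
       the 20ms value are assumptions chosen for illustration, and TC is a
       hypothetical dry-run switch that prints the commands unless TC=tc is
       set.

```shell
# Dry-run by default; set TC=tc (as root) to apply. eth0 and 50Mbit are assumed.
TC=${TC:-echo tc}
DEV=${DEV:-eth0}

# Install cake with a fixed rate plus whatever RTT arguments are given.
set_rtt() {
    $TC qdisc replace dev "$DEV" root cake bandwidth 50Mbit "$@"
}

set_rtt metro         # preset keyword, equivalent to rtt 10ms
set_rtt rtt 20ms      # explicit value between the metro and regional presets
```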

FLOW ISOLATION PARAMETERS
       With flow isolation enabled, CAKE places packets from different flows
       into different queues, each of which carries its own AQM state.
       Packets from each queue are then delivered fairly, according to a
       DRR++ algorithm which minimises latency for "sparse" flows. CAKE uses
       a set-associative hashing algorithm to minimise flow collisions.

       These keywords specify whether fairness based on source address,
       destination address, individual flows, or any combination of those is
       desired.

       flowblind
              Disables flow isolation; all traffic passes through a single
              queue for each tin.

       srchost
              Flows are defined only by source address. Could be useful on
              the egress path of an ISP backhaul.

       dsthost
              Flows are defined only by destination address. Could be useful
              on the ingress path of an ISP backhaul.

       hosts
              Flows are defined by source-destination host pairs. This is
              host isolation, rather than flow isolation.

       flows
              Flows are defined by the entire 5-tuple of source address,
              destination address, transport protocol, source port and
              destination port. This is the type of flow isolation performed
              by SFQ and fq_codel.

       dual-srchost
              Flows are defined by the 5-tuple, and fairness is applied first
              over source addresses, then over individual flows. Good for use
              on egress traffic from a LAN to the internet, where it'll
              prevent any one LAN host from monopolising the uplink,
              regardless of the number of flows they use.

       dual-dsthost
              Flows are defined by the 5-tuple, and fairness is applied first
              over destination addresses, then over individual flows. Good
              for use on ingress traffic to a LAN from the internet, where
              it'll prevent any one LAN host from monopolising the downlink,
              regardless of the number of flows they use.

       triple-isolate (default)
              Flows are defined by the 5-tuple, and fairness is applied over
              source *and* destination addresses intelligently (i.e. not
              merely by host-pairs), and also over individual flows. Use this
              if you're not certain whether to use dual-srchost or
              dual-dsthost; it'll do both jobs at once, preventing any one
              host on *either* side of the link from monopolising it with a
              large number of flows.

       nat
              Instructs CAKE to perform a NAT lookup before applying
              flow-isolation rules, to determine the true addresses and port
              numbers of the packet, to improve fairness between hosts
              "inside" the NAT. This has no practical effect in "flowblind"
              or "flows" modes, or if NAT is performed on a different host.

       nonat (default)
              CAKE will not perform a NAT lookup. Flow isolation will be
              performed using the addresses and port numbers directly visible
              to the interface CAKE is attached to.
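
       One possible use of the dual modes on a NAT gateway is sketched below:
       dual-srchost on the uplink and dual-dsthost on the downlink, both with
       nat. Everything specific here is an assumption: eth0 as the WAN
       interface, ifb0 as an IFB device that already receives the redirected
       ingress traffic, and the two rates. TC is a hypothetical dry-run
       switch; commands are printed unless TC=tc is set.

```shell
# Dry-run by default; set TC=tc (as root) to apply.
TC=${TC:-echo tc}
WAN=${WAN:-eth0}      # assumed WAN-facing interface
IFB=${IFB:-ifb0}      # assumed IFB device, set up separately for ingress traffic

# Uplink: fairness between LAN hosts (source addresses) first, then flows.
egress_cmd() {
    $TC qdisc replace dev "$WAN" root cake bandwidth 20Mbit dual-srchost nat
}

# Downlink: fairness between LAN hosts (destination addresses) first, then flows.
ingress_cmd() {
    $TC qdisc replace dev "$IFB" root cake bandwidth 100Mbit dual-dsthost nat ingress
}

egress_cmd
ingress_cmd
```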

PRIORITY QUEUE PARAMETERS
       CAKE can divide traffic into "tins" based on the Diffserv field. Each
       tin has its own independent set of flow-isolation queues, and is
       serviced based on a WRR algorithm. To avoid perverse Diffserv marking
       incentives, tin weights have a "priority sharing" value when bandwidth
       used by that tin is below a threshold, and a lower "bandwidth sharing"
       value when above. Bandwidth is compared against the threshold using
       the same algorithm as the deficit-mode shaper.

       Detailed customisation of tin parameters is not provided. The
       following presets perform all necessary tuning, relative to the
       current shaper bandwidth and RTT settings.

       besteffort
              Disables priority queuing by placing all traffic in one tin.

       precedence
              Enables legacy interpretation of the TOS "Precedence" field.
              Use of this preset on the modern Internet is firmly
              discouraged.

       diffserv4
              Provides a general-purpose Diffserv implementation with four
              tins:
                     Bulk (CS1), 6.25% threshold, generally low priority.
                     Best Effort (general), 100% threshold.
                     Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50%
                     threshold.
                     Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold.

       diffserv3 (default)
              Provides a simple, general-purpose Diffserv implementation with
              three tins:
                     Bulk (CS1), 6.25% threshold, generally low priority.
                     Best Effort (general), 100% threshold.
                     Voice (CS7, CS6, EF, VA, TOS4), 25% threshold, reduced
                     Codel interval.
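
       The tin thresholds are plain percentages of the shaper bandwidth. The
       helper below, whose name and output format are hypothetical, applies
       the percentages listed above to a rate given in Kbit; it is an
       illustration of the arithmetic, not of how CAKE computes its tins
       internally.

```shell
# usage: tin_thresholds RATE_KBIT PRESET    (PRESET: diffserv3 | diffserv4)
# Prints one "tin threshold_in_Kbit" pair per line.
tin_thresholds() {
    rate_kbit=$1
    case "$2" in
        diffserv3)
            echo "Bulk $(( rate_kbit / 16 ))"       # 6.25%
            echo "BestEffort $rate_kbit"            # 100%
            echo "Voice $(( rate_kbit / 4 ))"       # 25%
            ;;
        diffserv4)
            echo "Bulk $(( rate_kbit / 16 ))"       # 6.25%
            echo "BestEffort $rate_kbit"            # 100%
            echo "Video $(( rate_kbit / 2 ))"       # 50%
            echo "Voice $(( rate_kbit / 4 ))"       # 25%
            ;;
        *)
            echo "unsupported preset: $2" >&2
            return 1
            ;;
    esac
}

tin_thresholds 100000 diffserv3
```

       For a 100Mbit shaper this gives 6250, 100000 and 25000 Kbit, which
       corresponds to the thresh row (6250Kbit, 100Mbit, 25Mbit) in the
       statistics shown in the examples.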

OTHER PARAMETERS
       memlimit LIMIT
              Limit the memory consumed by CAKE to LIMIT bytes. Note that
              this does not translate directly to queue size (so do not size
              this based on bandwidth delay product considerations, but
              rather on worst case acceptable memory consumption), as there
              is some overhead in the data structures containing the packets,
              especially for small packets.

              By default, the limit is calculated based on the bandwidth and
              RTT settings.

       wash
              Traffic entering your diffserv domain is frequently mis-marked
              in transit from the perspective of your network, and traffic
              exiting yours may be mis-marked from the perspective of the
              transiting provider.

              Apply the wash option to clear all extra diffserv bits (but not
              the ECN bits) after priority queuing has taken place.

              If you are shaping inbound, and cannot trust the diffserv
              markings (as is the case for Comcast Cable, among others), it
              is best to use a single queue "besteffort" mode with wash.
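
       The inbound recommendation above might look as follows on the command
       line. As before, this is only a sketch: ifb0 is assumed to be an IFB
       device already carrying the inbound traffic, the rate is arbitrary,
       and TC is a hypothetical dry-run switch that prints the command unless
       TC=tc is set.

```shell
# Dry-run by default; set TC=tc (as root) to apply.
TC=${TC:-echo tc}
IFB=${IFB:-ifb0}      # assumed IFB device for inbound traffic

# Single tin, with untrusted diffserv markings cleared after queuing.
wash_inbound() {
    $TC qdisc replace dev "$IFB" root cake bandwidth 100Mbit besteffort wash ingress
}

wash_inbound
```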

       split-gso
              This option controls whether CAKE will split Generic
              Segmentation Offload (GSO) super-packets into their on-the-wire
              components and dequeue them individually.

              Super-packets are created by the networking stack to improve
              efficiency. However, because they are larger they take longer
              to dequeue, which translates to higher latency for competing
              flows, especially at lower bandwidths. CAKE defaults to
              splitting GSO packets to achieve the lowest possible latency.
              At link speeds higher than 10 Gbps, setting the no-split-gso
              parameter can increase the maximum achievable throughput by
              retaining the full GSO packets.

OVERRIDING CLASSIFICATION WITH TC FILTERS
       CAKE supports overriding of its internal classification of packets
       through the tc filter mechanism. Packets can be assigned to different
       priority tins by setting the priority field on the skb, and the flow
       hashing can be overridden by setting the classid parameter.

   Tin override
       To assign a priority tin, the major number of the priority field needs
       to match the qdisc handle of the cake instance; if it does, the minor
       number will be interpreted as the tin index. For example, to classify
       all ICMP packets as 'bulk', the following filter can be used:

           # tc qdisc replace dev eth0 handle 1: root cake diffserv3
           # tc filter add dev eth0 parent 1: protocol ip prio 1 \
               u32 match icmp type 0 0 action skbedit priority 1:1

   Flow hash override
       To override flow hashing, the classid can be set. CAKE will interpret
       the major number of the classid as the host hash used in host
       isolation mode, and the minor number as the flow hash used for
       flow-based queueing. One or both of those can be set, and will be used
       if the relevant flow isolation parameter is set (i.e., the major
       number will be ignored if CAKE is not configured in hosts mode, and
       the minor number will be ignored if CAKE is not configured in flows
       mode).

       This example will assign all ICMP packets to the first queue:

           # tc qdisc replace dev eth0 handle 1: root cake
           # tc filter add dev eth0 parent 1: protocol ip prio 1 \
               u32 match icmp type 0 0 classid 0:1

       If only one of the host and flow overrides is set, CAKE will compute
       the other hash from the packet as normal. Note, however, that the host
       isolation mode works by assigning a host ID to the flow queue; so if
       overriding both host and flow, the same flow cannot have more than one
       host assigned. In addition, it is not possible to assign different
       source and destination host IDs through the override mechanism; if a
       host ID is assigned, it will be used as both source and destination
       host.

EXAMPLES
       # tc qdisc delete root dev eth0
       # tc qdisc add root dev eth0 cake bandwidth 100Mbit ethernet
       # tc -s qdisc show dev eth0
       qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate
        rtt 100.0ms noatm overhead 38 mpu 84
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
        memory used: 0b of 5000000b
        capacity estimate: 100Mbit
        min/max network layer size:        65535 /       0
        min/max overhead-adjusted size:    65535 /       0
        average network hdr offset:            0

                        Bulk  Best Effort        Voice
         thresh     6250Kbit      100Mbit       25Mbit
         target        5.0ms        5.0ms        5.0ms
         interval    100.0ms      100.0ms      100.0ms
         pk_delay        0us          0us          0us
         av_delay        0us          0us          0us
         sp_delay        0us          0us          0us
         pkts              0            0            0
         bytes             0            0            0
         way_inds          0            0            0
         way_miss          0            0            0
         way_cols          0            0            0
         drops             0            0            0
         marks             0            0            0
         ack_drop          0            0            0
         sp_flows          0            0            0
         bk_flows          0            0            0
         un_flows          0            0            0
         max_len           0            0            0
         quantum         300         1514          762

       After some use:
       # tc -s qdisc show dev eth0

       qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate
        rtt 100.0ms noatm overhead 38 mpu 84
        Sent 44709231 bytes 31931 pkt (dropped 45, overlimits 93782 requeues 0)
        backlog 33308b 22p requeues 0
        memory used: 292352b of 5000000b
        capacity estimate: 100Mbit
        min/max network layer size:           28 /    1500
        min/max overhead-adjusted size:       84 /    1538
        average network hdr offset:           14

                        Bulk  Best Effort        Voice
         thresh     6250Kbit      100Mbit       25Mbit
         target        5.0ms        5.0ms        5.0ms
         interval    100.0ms      100.0ms      100.0ms
         pk_delay      8.7ms        6.9ms        5.0ms
         av_delay      4.9ms        5.3ms        3.8ms
         sp_delay      727us        1.4ms        511us
         pkts           2590        21271         8137
         bytes       3081804     30302659     11426206
         way_inds          0           46            0
         way_miss          3           17            4
         way_cols          0            0            0
         drops            20           15           10
         marks             0            0            0
         ack_drop          0            0            0
         sp_flows          2            4            1
         bk_flows          1            2            1
         un_flows          0            0            0
         max_len        1514         1514         1514
         quantum         300         1514          762
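
       Output like the above can be post-processed with standard tools. The
       two helper functions below are hypothetical and only a sketch: they
       read "tc -s qdisc show" output on standard input, assume a single cake
       instance, and compare the qdisc-wide dropped counter with the sum of
       the per-tin drops row. The sample input is copied from the "after some
       use" statistics above.

```shell
# Sum the values of the per-tin "drops" row.
sum_tin_drops() {
    awk '$1 == "drops" { for (i = 2; i <= NF; i++) total += $i } END { print total + 0 }'
}

# Extract N from the qdisc-wide "(dropped N, ..." counter.
total_dropped() {
    sed -n 's/.*(dropped \([0-9][0-9]*\),.*/\1/p'
}

# Two lines taken from the statistics shown above.
sample='Sent 44709231 bytes 31931 pkt (dropped 45, overlimits 93782 requeues 0)
 drops            20           15           10'

printf '%s\n' "$sample" | sum_tin_drops
printf '%s\n' "$sample" | total_dropped
```

       Both commands print 45 for this sample. On a live system the input
       would come from a pipeline such as "tc -s qdisc show dev eth0 |
       sum_tin_drops".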

SEE ALSO
       tc(8), tc-codel(8), tc-fq_codel(8), tc-htb(8)

AUTHORS
       Cake's principal author is Jonathan Morton, with contributions from
       Tony Ambardar, Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen,
       Sebastian Moeller, Ryan Mounce, Dean Scarff, Nils Andreas Svee, and
       Dave Täht.

       This manual page was written by Loganaden Velvindron. Please report
       corrections to the Linux Networking mailing list
       <netdev@vger.kernel.org>.

iproute2                         19 July 2018                          CAKE(8)