TCP(7) Linux Programmer's Manual TCP(7)
2
3
4
6 tcp - TCP protocol
7
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
tcp_socket = socket(PF_INET, SOCK_STREAM, 0);
13
This is an implementation of the TCP protocol defined in RFC 793,
RFC 1122 and RFC 2001 with the NewReno and SACK extensions. It
provides a reliable, stream-oriented, full-duplex connection between
two sockets on top of ip(7), for both the v4 and v6 versions. TCP
guarantees that the data arrives in order and retransmits lost
packets. It generates and checks a per-packet checksum to catch
transmission errors. TCP does not preserve record boundaries.
22
23 A newly created TCP socket has no remote or local address and is not
24 fully specified. To create an outgoing TCP connection use connect(2)
25 to establish a connection to another TCP socket. To receive new incom‐
26 ing connections, first bind(2) the socket to a local address and port
27 and then call listen(2) to put the socket into the listening state.
28 After that a new socket for each incoming connection can be accepted
29 using accept(2). A socket which has had accept() or connect() success‐
30 fully called on it is fully specified and may transmit data. Data can‐
31 not be transmitted on listening or not yet connected sockets.
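
The sequence described above can be sketched in C as follows. This is a
minimal illustration, not a reference implementation: the helper names
listen_loopback() and connect_loopback() are hypothetical, and the use of
the loopback address with a kernel-chosen port is an assumption made only
to keep the example self-contained.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Passive side: bind(2) to 127.0.0.1 and listen(2). Port 0 asks the
 * kernel to pick a free port, which is reported back through *port in
 * host byte order. Returns the listening descriptor or -1 on error. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 8) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Active side: connect(2) to 127.0.0.1:port. Returns the connected
 * descriptor or -1 on error. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The server side then calls accept(2) on the listening descriptor to obtain
a new, fully specified socket for each incoming connection; only that
socket and the client's connected socket can carry data.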
32
33 Linux supports RFC 1323 TCP high performance extensions. These include
34 Protection Against Wrapped Sequence Numbers (PAWS), Window Scaling and
35 Timestamps. Window scaling allows the use of large (> 64K) TCP windows
36 in order to support links with high latency or bandwidth. To make use
37 of them, the send and receive buffer sizes must be increased. They can
38 be set globally with the net.ipv4.tcp_wmem and net.ipv4.tcp_rmem sysctl
39 variables, or on individual sockets by using the SO_SNDBUF and
40 SO_RCVBUF socket options with the setsockopt(2) call.
41
The maximum sizes for socket buffers declared via the SO_SNDBUF and
SO_RCVBUF mechanisms are limited by the global net.core.wmem_max and
net.core.rmem_max sysctls, respectively. Note that TCP actually
allocates twice the size of the buffer requested in the setsockopt(2)
call, and so a subsequent getsockopt(2) call will not return the same
size of buffer as requested in the setsockopt(2) call. TCP uses the
extra space for administrative purposes and internal kernel
structures, and the sysctl variables reflect the larger sizes compared
to the actual TCP windows. On individual connections, the socket
buffer size must be set prior to the listen() or connect() calls in
order to have it take effect. See socket(7) for more information.
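
A short sketch of the per-socket mechanism, assuming a freshly created
socket that has not yet been passed to listen() or connect(). The helper
name request_rcvbuf() is hypothetical. The value read back is whatever the
kernel reports; the example makes no assumption that it is exactly twice
the request, since the net.core.rmem_max limit may also apply.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Request a receive buffer of `bytes` and return the size the kernel
 * reports afterwards, or -1 on error. Call this before listen() or
 * connect() so that the setting takes effect for the connection. */
int request_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

Comparing the return value with the requested size shows the extra space
the kernel accounts for; SO_SNDBUF can be handled the same way.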
53
TCP supports urgent data. Urgent data is used to signal the receiver
that some important message is part of the data stream and that it
should be processed as soon as possible. To send urgent data, specify
the MSG_OOB option to send(2). When urgent data is received, the
kernel sends a SIGURG signal to the process or process group that has
been set as the socket "owner" using the SIOCSPGRP or FIOSETOWN ioctls
(or the POSIX.1-2001-specified fcntl(2) F_SETOWN operation). When the
SO_OOBINLINE socket option is enabled, urgent data is put into the
normal data stream (a program can test for its location using the
SIOCATMARK ioctl described below); otherwise it can only be received
when the MSG_OOB flag is set for recv(2) or recvmsg(2).
65
66 Linux 2.4 introduced a number of changes for improved throughput and
67 scaling, as well as enhanced functionality. Some of these features
68 include support for zero-copy sendfile(2), Explicit Congestion Notifi‐
69 cation, new management of TIME_WAIT sockets, keep-alive socket options
70 and support for Duplicate SACK extensions.
71
73 TCP is built on top of IP (see ip(7)). The address formats defined by
74 ip(7) apply to TCP. TCP only supports point-to-point communication;
75 broadcasting and multicasting are not supported.
76
78 These variables can be accessed by the /proc/sys/net/ipv4/* files or
79 with the sysctl(2) interface. In addition, most IP sysctls also apply
80 to TCP; see ip(7). Variables described as Boolean take an integer
81 value, with a non-zero value ("true") meaning that the corresponding
82 option is enabled, and a zero value ("false") meaning that the option
83 is disabled.
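
As an illustration of the /proc interface, the sketch below reads a
single-integer variable from a file. The helper name read_sysctl_long() is
hypothetical, and the sketch assumes the variable holds one integer;
vector-valued variables such as tcp_rmem would need a different parser.

```c
#include <stdio.h>

/* Read the first integer stored in a sysctl file such as
 * /proc/sys/net/ipv4/tcp_fin_timeout. Returns 0 and stores the value
 * in *value on success, -1 if the file cannot be opened or parsed. */
int read_sysctl_long(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For a Boolean variable, a non-zero result means the option is enabled, as
described above. Writing a new value normally requires privilege.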
84
85 tcp_abort_on_overflow (Boolean; default: disabled)
Enable resetting connections if the listening service is too
slow and unable to keep up and accept them. With the default
setting (disabled), a connection whose overflow was caused by a
burst will recover. Enable this option only if you are really
sure that the listening daemon cannot be tuned to accept
connections faster. Enabling this option can harm the clients
of your server.
92
93 tcp_adv_win_scale (integer; default: 2)
94 Count buffering overhead as bytes/2^tcp_adv_win_scale (if
95 tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale), if
96 it is <= 0.
97
The socket receive buffer space is shared between the
application and kernel. TCP maintains part of the buffer as the
TCP window; this is the size of the receive window advertised to
the other end. The rest of the space is used as the
"application" buffer, which isolates the network from scheduling
and application latencies. The tcp_adv_win_scale default value
of 2 implies that the space used for the application buffer is
one fourth that of the total.
106
107 tcp_app_win (integer; default: 31)
108 This variable defines how many bytes of the TCP window are
109 reserved for buffering overhead.
110
max(window/2^tcp_app_win, mss) bytes in the window are reserved
for the application buffer. A value of 0 implies that no amount
is reserved.
114
115 tcp_bic (Boolean; default: disabled)
116 Enable BIC TCP congestion control algorithm. BIC-TCP is a
117 sender-side only change that ensures a linear RTT fairness under
118 large windows while offering both scalability and bounded TCP-
119 friendliness. The protocol combines two schemes called additive
120 increase and binary search increase. When the congestion window
121 is large, additive increase with a large increment ensures lin‐
122 ear RTT fairness as well as good scalability. Under small con‐
123 gestion windows, binary search increase provides TCP friendli‐
124 ness.
125
126 tcp_bic_low_window (integer; default: 14)
127 Sets the threshold window (in packets) where BIC TCP starts to
128 adjust the congestion window. Below this threshold BIC TCP
129 behaves the same as the default TCP Reno.
130
131 tcp_bic_fast_convergence (Boolean; default: enabled)
132 Forces BIC TCP to more quickly respond to changes in congestion
133 window. Allows two flows sharing the same connection to converge
134 more rapidly.
135
136 tcp_dsack (Boolean; default: enabled)
137 Enable RFC 2883 TCP Duplicate SACK support.
138
139 tcp_ecn (Boolean; default: disabled)
Enable RFC 3168 Explicit Congestion Notification. When enabled,
connectivity to some destinations could be affected due to
older, misbehaving routers along the path causing connections to
be dropped.
144
145 tcp_fack (Boolean; default: enabled)
146 Enable TCP Forward Acknowledgement support.
147
148 tcp_fin_timeout (integer; default: 60)
149 This specifies how many seconds to wait for a final FIN packet
150 before the socket is forcibly closed. This is strictly a viola‐
151 tion of the TCP specification, but required to prevent denial-
152 of-service attacks. In Linux 2.2, the default value was 180.
153
154 tcp_frto (Boolean; default: disabled)
155 Enables F-RTO, an enhanced recovery algorithm for TCP retrans‐
156 mission timeouts. It is particularly beneficial in wireless
157 environments where packet loss is typically due to random radio
158 interference rather than intermediate router congestion.
159
160 tcp_keepalive_intvl (integer; default: 75)
161 The number of seconds between TCP keep-alive probes.
162
163 tcp_keepalive_probes (integer; default: 9)
164 The maximum number of TCP keep-alive probes to send before giv‐
165 ing up and killing the connection if no response is obtained
166 from the other end.
167
168 tcp_keepalive_time (integer; default: 7200)
The number of seconds a connection needs to be idle before TCP
begins sending out keep-alive probes. Keep-alives are only sent
when the SO_KEEPALIVE socket option is enabled. The default
value is 7200 seconds (2 hours). An idle connection is
terminated after approximately an additional 11 minutes (9
probes at intervals of 75 seconds) when keep-alive is enabled.
175
176 Note that underlying connection tracking mechanisms and applica‐
177 tion timeouts may be much shorter.
178
179 tcp_low_latency (Boolean; default: disabled)
If enabled, the TCP stack makes decisions that prefer lower
latency as opposed to higher throughput. If this option is
disabled, then higher throughput is preferred. An example of an
application where this default should be changed would be a
Beowulf compute cluster.
185
186 tcp_max_orphans (integer; default: see below)
187 The maximum number of orphaned (not attached to any user file
188 handle) TCP sockets allowed in the system. When this number is
189 exceeded, the orphaned connection is reset and a warning is
190 printed. This limit exists only to prevent simple denial-of-
191 service attacks. Lowering this limit is not recommended. Net‐
192 work conditions might require you to increase the number of
193 orphans allowed, but note that each orphan can eat up to ~64K of
194 unswappable memory. The default initial value is set equal to
195 the kernel parameter NR_FILE. This initial default is adjusted
196 depending on the memory in the system.
197
198 tcp_max_syn_backlog (integer; default: see below)
The maximum number of queued connection requests which have
still not received an acknowledgement from the connecting
client. If this number is exceeded, the kernel will begin
dropping requests. The default value of 256 is increased to
1024 when the memory present in the system is adequate or
greater (>= 128 MB), and reduced to 128 for those systems with
very low memory (<= 32 MB). It is recommended that if this
needs to be increased above 1024, TCP_SYNQ_HSIZE in
include/net/tcp.h be modified to keep
TCP_SYNQ_HSIZE*16 <= tcp_max_syn_backlog, and the kernel be
recompiled.
209
210 tcp_max_tw_buckets (integer; default: see below)
211 The maximum number of sockets in TIME_WAIT state allowed in the
212 system. This limit exists only to prevent simple denial-of-ser‐
213 vice attacks. The default value of NR_FILE*2 is adjusted
214 depending on the memory in the system. If this number is
215 exceeded, the socket is closed and a warning is printed.
216
217 tcp_mem
218 This is a vector of 3 integers: [low, pressure, high]. These
219 bounds are used by TCP to track its memory usage. The defaults
220 are calculated at boot time from the amount of available memory.
221 (TCP can only use low memory for this, which is limited to
222 around 900 megabytes on 32-bit systems. 64-bit systems do not
223 suffer this limitation.)
224
225 low - TCP doesn't regulate its memory allocation when the number
226 of pages it has allocated globally is below this number.
227
228 pressure - when the amount of memory allocated by TCP exceeds
229 this number of pages, TCP moderates its memory consumption.
230 This memory pressure state is exited once the number of pages
231 allocated falls below the low mark.
232
233 high - the maximum number of pages, globally, that TCP will
234 allocate. This value overrides any other limits imposed by the
235 kernel.
236
237 tcp_orphan_retries (integer; default: 8)
238 The maximum number of attempts made to probe the other end of a
239 connection which has been closed by our end.
240
241 tcp_reordering (integer; default: 3)
242 The maximum a packet can be reordered in a TCP packet stream
243 without TCP assuming packet loss and going into slow start. It
244 is not advisable to change this number. This is a packet
245 reordering detection metric designed to minimize unnecessary
246 back off and retransmits provoked by reordering of packets on a
247 connection.
248
249 tcp_retrans_collapse (Boolean; default: enabled)
250 Try to send full-sized packets during retransmit.
251
252 tcp_retries1 (integer; default: 3)
253 The number of times TCP will attempt to retransmit a packet on
254 an established connection normally, without the extra effort of
255 getting the network layers involved. Once we exceed this number
256 of retransmits, we first have the network layer update the route
257 if possible before each new retransmit. The default is the RFC
258 specified minimum of 3.
259
260 tcp_retries2 (integer; default: 15)
The maximum number of times a TCP packet is retransmitted in
established state before giving up. The default value is 15,
which corresponds to a duration of between approximately 13 and
30 minutes, depending on the retransmission timeout. The
RFC 1122 specified minimum limit of 100 seconds is typically
deemed too short.
267
268 tcp_rfc1337 (Boolean; default: disabled)
269 Enable TCP behaviour conformant with RFC 1337. When disabled,
270 if a RST is received in TIME_WAIT state, we close the socket
271 immediately without waiting for the end of the TIME_WAIT period.
272
273 tcp_rmem
274 This is a vector of 3 integers: [min, default, max]. These
275 parameters are used by TCP to regulate receive buffer sizes.
276 TCP dynamically adjusts the size of the receive buffer from the
277 defaults listed below, in the range of these sysctl variables,
278 depending on memory available in the system.
279
280 min - minimum size of the receive buffer used by each TCP
281 socket. The default value is 4K, and is lowered to PAGE_SIZE
282 bytes in low-memory systems. This value is used to ensure that
283 in memory pressure mode, allocations below this size will still
284 succeed. This is not used to bound the size of the receive buf‐
285 fer declared using SO_RCVBUF on a socket.
286
287 default - the default size of the receive buffer for a TCP
288 socket. This value overwrites the initial default buffer size
289 from the generic global net.core.rmem_default defined for all
290 protocols. The default value is 87380 bytes, and is lowered to
291 43689 in low-memory systems. If larger receive buffer sizes are
292 desired, this value should be increased (to affect all sockets).
293 To employ large TCP windows, the net.ipv4.tcp_window_scaling
294 must be enabled (default).
295
296 max - the maximum size of the receive buffer used by each TCP
297 socket. This value does not override the global
298 net.core.rmem_max. This is not used to limit the size of the
299 receive buffer declared using SO_RCVBUF on a socket. The
300 default value of 87380*2 bytes is lowered to 87380 in low-memory
301 systems.
302
303 tcp_sack (Boolean; default: enabled)
304 Enable RFC 2018 TCP Selective Acknowledgements.
305
306 tcp_stdurg (Boolean; default: disabled)
307 If this option is enabled, then use the RFC 1122 interpretation
308 of the TCP urgent-pointer field. According to this interpreta‐
309 tion, the urgent pointer points to the last byte of urgent data.
310 If this option is disabled, then use the BSD-compatible inter‐
311 pretation of the urgent pointer: the urgent pointer points to
312 the first byte after the urgent data. Enabling this option may
313 lead to interoperability problems.
314
315 tcp_synack_retries (integer; default: 5)
316 The maximum number of times a SYN/ACK segment for a passive TCP
317 connection will be retransmitted. This number should not be
318 higher than 255.
319
320 tcp_syncookies (Boolean)
Enable TCP syncookies. The kernel must be compiled with
CONFIG_SYN_COOKIES. Send out syncookies when the syn backlog
queue of a socket overflows. The syncookies feature attempts to
protect a socket from a SYN flood attack. This should be used
as a last resort, if at all. This is a violation of the TCP
protocol, and conflicts with other areas of TCP such as TCP
extensions. It can cause problems for clients and relays. It
is not recommended as a tuning mechanism for heavily loaded
servers to help with overloaded or misconfigured conditions.
For recommended alternatives see tcp_max_syn_backlog,
tcp_synack_retries, and tcp_abort_on_overflow.
332
333 tcp_syn_retries (integer; default: 5)
334 The maximum number of times initial SYNs for an active TCP con‐
335 nection attempt will be retransmitted. This value should not be
336 higher than 255. The default value is 5, which corresponds to
337 approximately 180 seconds.
338
339 tcp_timestamps (Boolean; default: enabled)
340 Enable RFC 1323 TCP timestamps.
341
342 tcp_tw_recycle (Boolean; default: disabled)
343 Enable fast recycling of TIME-WAIT sockets. Enabling this
344 option is not recommended since this causes problems when work‐
345 ing with NAT (Network Address Translation).
346
347 tcp_tw_reuse (Boolean; default: disabled)
Allow TIME-WAIT sockets to be reused for new connections when
it is safe from the protocol viewpoint. It should not be
changed without the advice or request of technical experts.
351
352 tcp_window_scaling (Boolean; default: enabled)
353 Enable RFC 1323 TCP window scaling. This feature allows the use
354 of a large window (> 64K) on a TCP connection, should the other
355 end support it. Normally, the 16 bit window length field in the
356 TCP header limits the window size to less than 64K bytes. If
357 larger windows are desired, applications can increase the size
358 of their socket buffers and the window scaling option will be
359 employed. If tcp_window_scaling is disabled, TCP will not nego‐
360 tiate the use of window scaling with the other end during con‐
361 nection setup.
362
363 tcp_vegas_cong_avoid (Boolean; default: disabled)
364 Enable TCP Vegas congestion avoidance algorithm. TCP Vegas is a
365 sender-side only change to TCP that anticipates the onset of
366 congestion by estimating the bandwidth. TCP Vegas adjusts the
367 sending rate by modifying the congestion window. TCP Vegas
368 should provide less packet loss, but it is not as aggressive as
369 TCP Reno.
370
371 tcp_westwood (Boolean; default: disabled)
372 Enable TCP Westwood+ congestion control algorithm. TCP West‐
373 wood+ is a sender-side only modification of the TCP Reno proto‐
374 col stack that optimizes the performance of TCP congestion con‐
375 trol. It is based on end-to-end bandwidth estimation to set con‐
376 gestion window and slow start threshold after a congestion
377 episode. Using this estimation, TCP Westwood+ adaptively sets a
378 slow start threshold and a congestion window which takes into
379 account the bandwidth used at the time congestion is experi‐
380 enced. TCP Westwood+ significantly increases fairness with
381 respect to TCP Reno in wired networks and throughput over wire‐
382 less links.
383
384 tcp_wmem
385 This is a vector of 3 integers: [min, default, max]. These
386 parameters are used by TCP to regulate send buffer sizes. TCP
387 dynamically adjusts the size of the send buffer from the default
388 values listed below, in the range of these sysctl variables,
389 depending on memory available.
390
391 min - minimum size of the send buffer used by each TCP socket.
392 The default value is 4K bytes. This value is used to ensure
393 that in memory pressure mode, allocations below this size will
394 still succeed. This is not used to bound the size of the send
395 buffer declared using SO_SNDBUF on a socket.
396
397 default - the default size of the send buffer for a TCP socket.
398 This value overwrites the initial default buffer size from the
399 generic global net.core.wmem_default defined for all protocols.
400 The default value is 16K bytes. If larger send buffer sizes are
401 desired, this value should be increased (to affect all sockets).
402 To employ large TCP windows, the sysctl variable
403 net.ipv4.tcp_window_scaling must be enabled (default).
404
405 max - the maximum size of the send buffer used by each TCP
406 socket. This value does not override the global
407 net.core.wmem_max. This is not used to limit the size of the
408 send buffer declared using SO_SNDBUF on a socket. The default
409 value is 128K bytes. It is lowered to 64K depending on the mem‐
410 ory available in the system.
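
The buffer-accounting formulas given above for tcp_adv_win_scale and
tcp_app_win can be written out as plain arithmetic. This is a sketch of
the documented formulas only, not kernel code: the function names are
hypothetical, the tcp_app_win formula is read as max(window/2^tcp_app_win,
mss), and shift counts are assumed to be smaller than the width of long.

```c
/* Bytes of a receive buffer counted as buffering overhead, following
 * the tcp_adv_win_scale description: bytes/2^scale if scale > 0,
 * otherwise bytes - bytes/2^(-scale). */
long adv_win_overhead(long bytes, int scale)
{
    if (scale > 0)
        return bytes >> scale;
    return bytes - (bytes >> -scale);
}

/* Bytes left over for the advertised TCP window. */
long adv_win_space(long bytes, int scale)
{
    return bytes - adv_win_overhead(bytes, scale);
}

/* Bytes of the window reserved for the application buffer, reading the
 * tcp_app_win formula as max(window/2^app_win, mss). A value of 0
 * reserves nothing. */
long app_win_reserve(long window, int app_win, long mss)
{
    long share;

    if (app_win == 0)
        return 0;
    share = window >> app_win;
    return share > mss ? share : mss;
}
```

With the default tcp_adv_win_scale of 2 and the default receive buffer of
87380 bytes, adv_win_overhead() gives 21845 bytes (one fourth) and
adv_win_space() gives 65535 bytes. With the default tcp_app_win of 31, the
shifted share is zero for any realistic window, so the reservation is one
MSS.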
411
413 To set or get a TCP socket option, call getsockopt(2) to read or set‐
414 sockopt(2) to write the option with the option level argument set to
415 IPPROTO_TCP. In addition, most IPPROTO_IP socket options are valid on
416 TCP sockets. For more information see ip(7).
417
418 TCP_CORK
419 If set, don't send out partial frames. All queued partial
420 frames are sent when the option is cleared again. This is use‐
421 ful for prepending headers before calling sendfile(2), or for
422 throughput optimization. As currently implemented, there is a
423 200 millisecond ceiling on the time for which output is corked
424 by TCP_CORK. If this ceiling is reached, then queued data is
425 automatically transmitted. This option can be combined with
426 TCP_NODELAY only since Linux 2.5.71. This option should not be
427 used in code intended to be portable.
428
429 TCP_DEFER_ACCEPT
Allows a listener to be awakened only when data arrives on the
socket. Takes an integer value (seconds), which can bound the
maximum number of attempts TCP will make to complete the
connection. This option should not be used in code intended to
be portable.
435
436 TCP_INFO
437 Used to collect information about this socket. The kernel
438 returns a struct tcp_info as defined in the file
439 /usr/include/linux/tcp.h. This option should not be used in
440 code intended to be portable.
441
442 TCP_KEEPCNT
443 The maximum number of keepalive probes TCP should send before
444 dropping the connection. This option should not be used in code
445 intended to be portable.
446
447 TCP_KEEPIDLE
448 The time (in seconds) the connection needs to remain idle before
449 TCP starts sending keepalive probes, if the socket option
450 SO_KEEPALIVE has been set on this socket. This option should
451 not be used in code intended to be portable.
452
453 TCP_KEEPINTVL
454 The time (in seconds) between individual keepalive probes. This
455 option should not be used in code intended to be portable.
456
457 TCP_LINGER2
458 The lifetime of orphaned FIN_WAIT2 state sockets. This option
459 can be used to override the system wide sysctl tcp_fin_timeout
460 on this socket. This is not to be confused with the socket(7)
461 level option SO_LINGER. This option should not be used in code
462 intended to be portable.
463
464 TCP_MAXSEG
465 The maximum segment size for outgoing TCP packets. If this
466 option is set before connection establishment, it also changes
467 the MSS value announced to the other end in the initial packet.
468 Values greater than the (eventual) interface MTU have no effect.
469 TCP will also impose its minimum and maximum bounds over the
470 value provided.
471
472 TCP_NODELAY
473 If set, disable the Nagle algorithm. This means that segments
474 are always sent as soon as possible, even if there is only a
475 small amount of data. When not set, data is buffered until
476 there is a sufficient amount to send out, thereby avoiding the
477 frequent sending of small packets, which results in poor uti‐
478 lization of the network. This option is overridden by TCP_CORK;
479 however, setting this option forces an explicit flush of pending
480 output, even if TCP_CORK is currently set.
481
482 TCP_QUICKACK
Enable quickack mode if set or disable quickack mode if cleared.
In quickack mode, acks are sent immediately, rather than delayed
if needed in accordance with normal TCP operation. This flag is
not permanent; it only enables a switch to or from quickack
mode. Subsequent operation of the TCP protocol will once again
enter/leave quickack mode depending on internal protocol
processing and factors such as delayed ack timeouts occurring
and data transfer. This option should not be used in code
intended to be portable.
492
493 TCP_SYNCNT
494 Set the number of SYN retransmits that TCP should send before
495 aborting the attempt to connect. It cannot exceed 255. This
496 option should not be used in code intended to be portable.
497
498 TCP_WINDOW_CLAMP
499 Bound the size of the advertised window to this value. The ker‐
500 nel imposes a minimum size of SOCK_MIN_RCVBUF/2. This option
501 should not be used in code intended to be portable.
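
A sketch of how some of the options above might be set on a socket. The
helper names set_keepalive(), keepalive_detect_secs() and set_cork() are
hypothetical. The detection-time estimate simply adds the idle time to the
time taken by all unanswered probes and ignores retransmission effects.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keep-alive on fd and override the system-wide keep-alive
 * sysctls for this socket only. Returns 0 on success, -1 on error. */
int set_keepalive(int fd, int idle_secs, int intvl_secs, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &intvl_secs, sizeof(intvl_secs)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}

/* Rough number of seconds before an idle connection to a dead peer is
 * dropped: the idle time plus every unanswered probe interval. */
long keepalive_detect_secs(int idle_secs, int intvl_secs, int probes)
{
    return idle_secs + (long)intvl_secs * probes;
}

/* Turn TCP_CORK on (non-zero) or off (zero). Clearing the option sends
 * any queued partial frames. Returns 0 on success, -1 on error. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}
```

With the system defaults (7200, 75, 9), keepalive_detect_secs() yields 7875
seconds: two hours of idle time plus roughly 11 minutes of probing. A
typical use of set_cork() is to cork the socket, write a header, call
sendfile(2), and then uncork.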
502
The following ioctl(2) calls return information in value. The
correct syntax is:
506
int value;
error = ioctl(tcp_socket, ioctl_type, &value);
509
510 ioctl_type is one of the following:
511
512 SIOCINQ
Returns the amount of queued unread data in the receive buffer.
The socket must not be in LISTEN state, otherwise an error
(EINVAL) is returned.
516
517 SIOCATMARK
518 Returns true (i.e., value is non-zero) if the inbound data
519 stream is at the urgent mark.
520
If the SO_OOBINLINE socket option is set, and SIOCATMARK returns
true, then the next read from the socket will return the urgent
data. If the SO_OOBINLINE socket option is not set, and
SIOCATMARK returns true, then the next read from the socket will
return the bytes following the urgent data (actually reading the
urgent data requires the MSG_OOB flag for recv(2)).
527
528 Note that a read never reads across the urgent mark. If an
529 application is informed of the presence of urgent data via
530 select(2) (using the exceptfds argument) or through delivery of
531 a SIGURG signal, then it can advance up to the mark using a loop
532 which repeatedly tests SIOCATMARK and performs a read (request‐
533 ing any number of bytes) as long as SIOCATMARK returns false.
534
535 SIOCOUTQ
536 Returns the amount of unsent data in the socket send queue. The
537 socket must not be in LISTEN state, otherwise an error (EINVAL)
538 is returned.
539
541 When a network error occurs, TCP tries to resend the packet. If it
542 doesn't succeed after some time, either ETIMEDOUT or the last received
543 error on this connection is reported.
544
Some applications require a quicker error notification. This can be
enabled with the IPPROTO_IP level IP_RECVERR socket option. When this
option is enabled, all incoming errors are immediately passed to the
user program. Use this option with care: it makes TCP less tolerant
to routing changes and other normal network conditions.
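
Enabling the option is a single setsockopt(2) call at the IPPROTO_IP
level, sketched below. The helper name enable_recverr() is hypothetical.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Ask the kernel to pass incoming errors to the program immediately.
 * Returns 0 on success, -1 on error. */
int enable_recverr(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on));
}
```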
550
TCP has no real out-of-band data; it has urgent data. In Linux this
means that if the other end sends newer out-of-band data, the older
urgent data is inserted as normal data into the stream (even when
SO_OOBINLINE is not set). This differs from BSD-based stacks.
556
557 Linux uses the BSD compatible interpretation of the urgent pointer
558 field by default. This violates RFC 1122, but is required for interop‐
559 erability with other stacks. It can be changed by the tcp_stdurg
560 sysctl.
561
563 EPIPE The other end closed the socket unexpectedly or a read is exe‐
564 cuted on a shut down socket.
565
566 ETIMEDOUT
567 The other end didn't acknowledge retransmitted data after some
568 time.
569
570 EAFNOTSUPPORT
The socket address type passed in sin_family was not AF_INET.
572
573 Any errors defined for ip(7) or the generic socket layer may also be
574 returned for TCP.
575
577 Not all errors are documented.
578 IPv6 is not described.
579
Support for Explicit Congestion Notification, zero-copy sendfile(),
reordering and some SACK extensions (DSACK) was introduced in 2.4.
Support for forward acknowledgement (FACK), TIME_WAIT recycling, and
per-connection keepalive socket options and sysctls was introduced in
2.3.
586
587 The default values and descriptions for the sysctl variables given
588 above are applicable for the 2.4 kernel.
589
591 This man page was originally written by Andi Kleen. It was updated for
592 2.4 by Nivedita Singhvi with input from Alexey Kuznetsov's Documenta‐
593 tion/networking/ip-sysctls.txt document.
594
596 accept(2), bind(2), connect(2), getsockopt(2), listen(2), recvmsg(2),
597 sendfile(2), sendmsg(2), socket(2), sysctl(2), ip(7), socket(7)
598
RFC 793 for the TCP specification.
RFC 1122 for the TCP requirements and a description of the Nagle
algorithm.
RFC 1323 for TCP timestamp and window scaling options.
RFC 1337 for a description of TIME_WAIT assassination hazards.
RFC 3168 for a description of Explicit Congestion Notification.
RFC 2581 for TCP congestion control algorithms.
RFC 2018 and RFC 2883 for SACK and extensions to SACK.
607
608
609
Linux Man Page 2005-06-15 TCP(7)