TCP(7)                     Linux Programmer's Manual                    TCP(7)

NAME

       tcp - TCP protocol

SYNOPSIS

       #include <sys/socket.h>
       #include <netinet/in.h>
       #include <netinet/tcp.h>
       tcp_socket = socket(PF_INET, SOCK_STREAM, 0);

DESCRIPTION

       This is an implementation of the TCP protocol defined in RFC 793,
       RFC 1122 and RFC 2001 with the NewReno and SACK extensions.  It
       provides a reliable, stream-oriented, full-duplex connection between
       two sockets on top of ip(7), for both the v4 and v6 versions.  TCP
       guarantees that data arrives in order and retransmits lost packets.
       It generates and checks a per-packet checksum to catch transmission
       errors.  TCP does not preserve record boundaries.

       A newly created TCP socket has no remote or local address and is not
       fully specified.  To create an outgoing TCP connection, use
       connect(2) to establish a connection to another TCP socket.  To
       receive new incoming connections, first bind(2) the socket to a local
       address and port and then call listen(2) to put the socket into the
       listening state.  After that, a new socket for each incoming
       connection can be accepted using accept(2).  A socket returned by
       accept(2), or on which connect(2) has succeeded, is fully specified
       and may transmit data.  Data cannot be transmitted on listening or
       not yet connected sockets.

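       The passive-open sequence described above (socket, bind, listen) can
       be sketched as follows.  This is only an illustration: the helper
       name make_listener() and its parameters are assumptions made for this
       example, and it binds to INADDR_ANY for simplicity.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening IPv4 TCP socket bound to
 * any local address, or -1 on error.  A port of 0 asks the kernel to
 * choose a free port. */
int make_listener(unsigned short port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* bind(2) gives the socket a local address; listen(2) moves it
     * into the listening state. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

       A caller would then invoke accept(2) on the returned descriptor to
       obtain one fully specified socket per incoming connection.
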
       Linux supports RFC 1323 TCP high performance extensions.  These
       include Protection Against Wrapped Sequence Numbers (PAWS), Window
       Scaling and Timestamps.  Window scaling allows the use of large
       (> 64K) TCP windows in order to support links with high latency or
       bandwidth.  To make use of them, the send and receive buffer sizes
       must be increased.  They can be set globally with the
       net.ipv4.tcp_wmem and net.ipv4.tcp_rmem sysctl variables, or on
       individual sockets by using the SO_SNDBUF and SO_RCVBUF socket
       options with the setsockopt(2) call.

       The maximum sizes for socket buffers declared via the SO_SNDBUF and
       SO_RCVBUF mechanisms are limited by the global net.core.wmem_max and
       net.core.rmem_max sysctls, respectively.  Note that TCP actually
       allocates twice the size of the buffer requested in the
       setsockopt(2) call, and so a subsequent getsockopt(2) call will not
       return the same size of buffer as requested in the setsockopt(2)
       call.  TCP uses the extra space for administrative purposes and
       internal kernel structures, and the sysctl variables reflect the
       larger sizes compared to the actual TCP windows.  On individual
       connections, the socket buffer size must be set prior to the
       listen(2) or connect(2) calls in order to have it take effect.  See
       socket(7) for more information.

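       The difference between the requested and the reported buffer size
       can be observed with a short sketch like the one below.  The helper
       name set_rcvbuf() is an assumption made for this example; the value
       returned depends on net.core.rmem_max on the running system.

```c
#include <sys/socket.h>

/* Hypothetical helper: requests a receive buffer of `requested` bytes
 * and returns the size the kernel reports afterwards, or -1 on error.
 * Call it before listen(2) or connect(2). */
int set_rcvbuf(int fd, int requested)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;  /* typically larger than `requested` */
}
```
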
       TCP supports urgent data.  Urgent data is used to signal the receiver
       that some important message is part of the data stream and that it
       should be processed as soon as possible.  To send urgent data,
       specify the MSG_OOB option to send(2).  When urgent data is received,
       the kernel sends a SIGURG signal to the process or process group that
       has been set as the socket "owner" using the SIOCSPGRP or FIOSETOWN
       ioctls (or the POSIX.1-2001-specified fcntl(2) F_SETOWN operation).
       When the SO_OOBINLINE socket option is enabled, urgent data is put
       into the normal data stream (a program can test for its location
       using the SIOCATMARK ioctl described below); otherwise it can be
       received only when the MSG_OOB flag is set for recv(2) or recvmsg(2).

       Linux 2.4 introduced a number of changes for improved throughput and
       scaling, as well as enhanced functionality.  Some of these features
       include support for zero-copy sendfile(2), Explicit Congestion
       Notification, new management of TIME_WAIT sockets, keep-alive socket
       options and support for Duplicate SACK extensions.

ADDRESS FORMATS

       TCP is built on top of IP (see ip(7)).  The address formats defined
       by ip(7) apply to TCP.  TCP only supports point-to-point
       communication; broadcasting and multicasting are not supported.

SYSCTLS

       These variables can be accessed through the /proc/sys/net/ipv4/*
       files or with the sysctl(2) interface.  In addition, most IP sysctls
       also apply to TCP; see ip(7).  Variables described as Boolean take an
       integer value, with a non-zero value ("true") meaning that the
       corresponding option is enabled, and a zero value ("false") meaning
       that the option is disabled.

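       Reading one of these variables through the /proc interface can be
       sketched as follows.  The helper name read_sysctl_long() is an
       assumption made for this example; it reads only the first integer,
       so for vector variables such as tcp_mem it returns the first element.

```c
#include <stdio.h>

/* Hypothetical helper: reads the first integer from a /proc/sys file.
 * Returns 0 and stores the value in *out on success; returns -1 if
 * the file cannot be opened or does not start with an integer. */
int read_sysctl_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

       For example, read_sysctl_long("/proc/sys/net/ipv4/tcp_fin_timeout",
       &v) would store the current tcp_fin_timeout value in v.
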
       tcp_abort_on_overflow (Boolean; default: disabled)
              Enable resetting connections if the listening service is too
              slow and unable to keep up and accept them.  With the default
              (disabled) setting, if the overflow occurred due to a burst,
              the connection will recover.  Enable this option only if you
              are really sure that the listening daemon cannot be tuned to
              accept connections faster.  Enabling this option can harm the
              clients of your server.

       tcp_adv_win_scale (integer; default: 2)
              Count buffering overhead as bytes/2^tcp_adv_win_scale (if
              tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale)
              (if tcp_adv_win_scale <= 0).

              The socket receive buffer space is shared between the
              application and the kernel.  TCP maintains part of the buffer
              as the TCP window; this is the size of the receive window
              advertised to the other end.  The rest of the space is used
              as the "application" buffer, which isolates the network from
              scheduling and application latencies.  The tcp_adv_win_scale
              default value of 2 implies that the space used for the
              application buffer is one fourth of the total.

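              The overhead formula above can be written out as a small
              function.  This is a sketch of the arithmetic only, not the
              kernel's code; the name buffering_overhead() is an assumption
              made for this example.

```c
/* Buffering overhead, in bytes, for a receive buffer of `bytes`
 * bytes, following the formula given for tcp_adv_win_scale:
 *   scale > 0:   bytes / 2^scale
 *   scale <= 0:  bytes - bytes / 2^(-scale) */
int buffering_overhead(int bytes, int adv_win_scale)
{
    if (adv_win_scale > 0)
        return bytes >> adv_win_scale;
    return bytes - (bytes >> -adv_win_scale);
}
```

              With the default scale of 2 and a buffer of 87380 bytes, the
              overhead is 21845 bytes, which leaves 65535 bytes for the TCP
              window.
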
       tcp_app_win (integer; default: 31)
              This variable defines how many bytes of the TCP window are
              reserved for buffering overhead.

              The maximum of window/2^tcp_app_win and mss bytes in the
              window is reserved for the application buffer.  A value of 0
              implies that no amount is reserved.

       tcp_bic (Boolean; default: disabled)
              Enable the BIC TCP congestion control algorithm.  BIC-TCP is
              a sender-side only change that ensures linear RTT fairness
              under large windows while offering both scalability and
              bounded TCP-friendliness.  The protocol combines two schemes
              called additive increase and binary search increase.  When
              the congestion window is large, additive increase with a
              large increment ensures linear RTT fairness as well as good
              scalability.  Under small congestion windows, binary search
              increase provides TCP friendliness.

       tcp_bic_low_window (integer; default: 14)
              Sets the threshold window (in packets) where BIC TCP starts
              to adjust the congestion window.  Below this threshold BIC
              TCP behaves the same as the default TCP Reno.

       tcp_bic_fast_convergence (Boolean; default: enabled)
              Forces BIC TCP to respond more quickly to changes in the
              congestion window.  Allows two flows sharing the same
              connection to converge more rapidly.

       tcp_dsack (Boolean; default: enabled)
              Enable RFC 2883 TCP Duplicate SACK support.

       tcp_ecn (Boolean; default: disabled)
              Enable RFC 3168 Explicit Congestion Notification.  When
              enabled, connectivity to some destinations could be affected
              due to older, misbehaving routers along the path causing
              connections to be dropped.

       tcp_fack (Boolean; default: enabled)
              Enable TCP Forward Acknowledgement support.

       tcp_fin_timeout (integer; default: 60)
              This specifies how many seconds to wait for a final FIN
              packet before the socket is forcibly closed.  This is
              strictly a violation of the TCP specification, but required
              to prevent denial-of-service attacks.  In Linux 2.2, the
              default value was 180.

       tcp_frto (Boolean; default: disabled)
              Enables F-RTO, an enhanced recovery algorithm for TCP
              retransmission timeouts.  It is particularly beneficial in
              wireless environments where packet loss is typically due to
              random radio interference rather than intermediate router
              congestion.

       tcp_keepalive_intvl (integer; default: 75)
              The number of seconds between TCP keep-alive probes.

       tcp_keepalive_probes (integer; default: 9)
              The maximum number of TCP keep-alive probes to send before
              giving up and killing the connection if no response is
              obtained from the other end.

       tcp_keepalive_time (integer; default: 7200)
              The number of seconds a connection needs to be idle before
              TCP begins sending out keep-alive probes.  Keep-alives are
              only sent when the SO_KEEPALIVE socket option is enabled.
              The default value is 7200 seconds (2 hours).  An idle
              connection is terminated after approximately an additional
              11 minutes (9 probes at intervals of 75 seconds) when
              keep-alive is enabled.

              Note that underlying connection tracking mechanisms and
              application timeouts may be much shorter.

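              The timing described above combines the three keep-alive
              variables.  A sketch of the arithmetic, with a function name
              chosen only for this example:

```c
/* Approximate number of seconds from the last activity on a
 * connection until an unresponsive peer is given up on:
 * the idle time plus one probe interval per unanswered probe. */
long keepalive_detection_secs(long idle, long intvl, long probes)
{
    return idle + intvl * probes;
}
```

              With the defaults (7200, 75, 9) this gives 7875 seconds: the
              2 hours of idle time plus 675 seconds (about 11 minutes) of
              probing.
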
       tcp_low_latency (Boolean; default: disabled)
              If enabled, the TCP stack makes decisions that prefer lower
              latency as opposed to higher throughput.  If this option is
              disabled, then higher throughput is preferred.  An example of
              an application where this default should be changed would be
              a Beowulf compute cluster.

       tcp_max_orphans (integer; default: see below)
              The maximum number of orphaned (not attached to any user file
              handle) TCP sockets allowed in the system.  When this number
              is exceeded, the orphaned connection is reset and a warning
              is printed.  This limit exists only to prevent simple
              denial-of-service attacks.  Lowering this limit is not
              recommended.  Network conditions might require you to
              increase the number of orphans allowed, but note that each
              orphan can eat up to ~64K of unswappable memory.  The default
              initial value is set equal to the kernel parameter NR_FILE.
              This initial default is adjusted depending on the memory in
              the system.

       tcp_max_syn_backlog (integer; default: see below)
              The maximum number of queued connection requests which have
              still not received an acknowledgement from the connecting
              client.  If this number is exceeded, the kernel will begin
              dropping requests.  The default value of 256 is increased to
              1024 when the memory present in the system is adequate or
              greater (>= 128MB), and reduced to 128 for those systems with
              very low memory (<= 32MB).  It is recommended that if this
              needs to be increased above 1024, TCP_SYNQ_HSIZE in
              include/net/tcp.h be modified to keep
              TCP_SYNQ_HSIZE*16<=tcp_max_syn_backlog, and the kernel be
              recompiled.

       tcp_max_tw_buckets (integer; default: see below)
              The maximum number of sockets in TIME_WAIT state allowed in
              the system.  This limit exists only to prevent simple
              denial-of-service attacks.  The default value of NR_FILE*2 is
              adjusted depending on the memory in the system.  If this
              number is exceeded, the socket is closed and a warning is
              printed.

       tcp_mem
              This is a vector of 3 integers: [low, pressure, high].  These
              bounds are used by TCP to track its memory usage.  The
              defaults are calculated at boot time from the amount of
              available memory.  (TCP can only use low memory for this,
              which is limited to around 900 megabytes on 32-bit systems.
              64-bit systems do not suffer this limitation.)

              low - TCP doesn't regulate its memory allocation when the
              number of pages it has allocated globally is below this
              number.

              pressure - when the amount of memory allocated by TCP exceeds
              this number of pages, TCP moderates its memory consumption.
              This memory pressure state is exited once the number of pages
              allocated falls below the low mark.

              high - the maximum number of pages, globally, that TCP will
              allocate.  This value overrides any other limits imposed by
              the kernel.

       tcp_orphan_retries (integer; default: 8)
              The maximum number of attempts made to probe the other end of
              a connection which has been closed by our end.

       tcp_reordering (integer; default: 3)
              The maximum a packet can be reordered in a TCP packet stream
              without TCP assuming packet loss and going into slow start.
              It is not advisable to change this number.  This is a packet
              reordering detection metric designed to minimize unnecessary
              back off and retransmits provoked by reordering of packets on
              a connection.

       tcp_retrans_collapse (Boolean; default: enabled)
              Try to send full-sized packets during retransmit.

       tcp_retries1 (integer; default: 3)
              The number of times TCP will attempt to retransmit a packet
              on an established connection normally, without the extra
              effort of getting the network layers involved.  Once this
              number of retransmits is exceeded, the network layer is first
              asked to update the route, if possible, before each new
              retransmit.  The default is the RFC specified minimum of 3.

       tcp_retries2 (integer; default: 15)
              The maximum number of times a TCP packet is retransmitted in
              established state before giving up.  The default value is 15,
              which corresponds to a duration of between approximately 13
              and 30 minutes, depending on the retransmission timeout.  The
              RFC 1122 specified minimum limit of 100 seconds is typically
              deemed too short.

       tcp_rfc1337 (Boolean; default: disabled)
              Enable TCP behaviour conformant with RFC 1337.  When
              disabled, if a RST is received in TIME_WAIT state, the socket
              is closed immediately without waiting for the end of the
              TIME_WAIT period.

       tcp_rmem
              This is a vector of 3 integers: [min, default, max].  These
              parameters are used by TCP to regulate receive buffer sizes.
              TCP dynamically adjusts the size of the receive buffer from
              the defaults listed below, in the range of these sysctl
              variables, depending on memory available in the system.

              min - minimum size of the receive buffer used by each TCP
              socket.  The default value is 4K, and is lowered to PAGE_SIZE
              bytes in low-memory systems.  This value is used to ensure
              that in memory pressure mode, allocations below this size
              will still succeed.  This is not used to bound the size of
              the receive buffer declared using SO_RCVBUF on a socket.

              default - the default size of the receive buffer for a TCP
              socket.  This value overwrites the initial default buffer
              size from the generic global net.core.rmem_default defined
              for all protocols.  The default value is 87380 bytes, and is
              lowered to 43689 in low-memory systems.  If larger receive
              buffer sizes are desired, this value should be increased (to
              affect all sockets).  To employ large TCP windows, the sysctl
              variable net.ipv4.tcp_window_scaling must be enabled
              (default).

              max - the maximum size of the receive buffer used by each TCP
              socket.  This value does not override the global
              net.core.rmem_max.  This is not used to limit the size of the
              receive buffer declared using SO_RCVBUF on a socket.  The
              default value of 87380*2 bytes is lowered to 87380 in
              low-memory systems.

       tcp_sack (Boolean; default: enabled)
              Enable RFC 2018 TCP Selective Acknowledgements.

       tcp_stdurg (Boolean; default: disabled)
              If this option is enabled, then use the RFC 1122
              interpretation of the TCP urgent-pointer field.  According to
              this interpretation, the urgent pointer points to the last
              byte of urgent data.  If this option is disabled, then use
              the BSD-compatible interpretation of the urgent pointer: the
              urgent pointer points to the first byte after the urgent
              data.  Enabling this option may lead to interoperability
              problems.

       tcp_synack_retries (integer; default: 5)
              The maximum number of times a SYN/ACK segment for a passive
              TCP connection will be retransmitted.  This number should not
              be higher than 255.

       tcp_syncookies (Boolean)
              Enable TCP syncookies.  The kernel must be compiled with
              CONFIG_SYN_COOKIES.  Send out syncookies when the SYN backlog
              queue of a socket overflows.  The syncookies feature attempts
              to protect a socket from a SYN flood attack.  This should be
              used as a last resort, if at all.  This is a violation of the
              TCP protocol, and conflicts with other areas of TCP such as
              TCP extensions.  It can cause problems for clients and
              relays.  It is not recommended as a tuning mechanism for
              heavily loaded servers to help with overloaded or
              misconfigured conditions.  For recommended alternatives see
              tcp_max_syn_backlog, tcp_synack_retries, and
              tcp_abort_on_overflow.

       tcp_syn_retries (integer; default: 5)
              The maximum number of times initial SYNs for an active TCP
              connection attempt will be retransmitted.  This value should
              not be higher than 255.  The default value is 5, which
              corresponds to approximately 180 seconds.

       tcp_timestamps (Boolean; default: enabled)
              Enable RFC 1323 TCP timestamps.

       tcp_tw_recycle (Boolean; default: disabled)
              Enable fast recycling of TIME_WAIT sockets.  Enabling this
              option is not recommended since this causes problems when
              working with NAT (Network Address Translation).

       tcp_tw_reuse (Boolean; default: disabled)
              Allow TIME_WAIT sockets to be reused for new connections when
              it is safe from the protocol viewpoint.  It should not be
              changed without advice/request of technical experts.

       tcp_window_scaling (Boolean; default: enabled)
              Enable RFC 1323 TCP window scaling.  This feature allows the
              use of a large window (> 64K) on a TCP connection, should the
              other end support it.  Normally, the 16 bit window length
              field in the TCP header limits the window size to less than
              64K bytes.  If larger windows are desired, applications can
              increase the size of their socket buffers and the window
              scaling option will be employed.  If tcp_window_scaling is
              disabled, TCP will not negotiate the use of window scaling
              with the other end during connection setup.

       tcp_vegas_cong_avoid (Boolean; default: disabled)
              Enable the TCP Vegas congestion avoidance algorithm.  TCP
              Vegas is a sender-side only change to TCP that anticipates
              the onset of congestion by estimating the bandwidth.  TCP
              Vegas adjusts the sending rate by modifying the congestion
              window.  TCP Vegas should provide less packet loss, but it is
              not as aggressive as TCP Reno.

       tcp_westwood (Boolean; default: disabled)
              Enable the TCP Westwood+ congestion control algorithm.  TCP
              Westwood+ is a sender-side only modification of the TCP Reno
              protocol stack that optimizes the performance of TCP
              congestion control.  It is based on end-to-end bandwidth
              estimation to set the congestion window and slow start
              threshold after a congestion episode.  Using this estimation,
              TCP Westwood+ adaptively sets a slow start threshold and a
              congestion window which take into account the bandwidth used
              at the time congestion is experienced.  TCP Westwood+
              significantly increases fairness with respect to TCP Reno in
              wired networks and throughput over wireless links.

       tcp_wmem
              This is a vector of 3 integers: [min, default, max].  These
              parameters are used by TCP to regulate send buffer sizes.
              TCP dynamically adjusts the size of the send buffer from the
              default values listed below, in the range of these sysctl
              variables, depending on memory available.

              min - minimum size of the send buffer used by each TCP
              socket.  The default value is 4K bytes.  This value is used
              to ensure that in memory pressure mode, allocations below
              this size will still succeed.  This is not used to bound the
              size of the send buffer declared using SO_SNDBUF on a socket.

              default - the default size of the send buffer for a TCP
              socket.  This value overwrites the initial default buffer
              size from the generic global net.core.wmem_default defined
              for all protocols.  The default value is 16K bytes.  If
              larger send buffer sizes are desired, this value should be
              increased (to affect all sockets).  To employ large TCP
              windows, the sysctl variable net.ipv4.tcp_window_scaling must
              be enabled (default).

              max - the maximum size of the send buffer used by each TCP
              socket.  This value does not override the global
              net.core.wmem_max.  This is not used to limit the size of the
              send buffer declared using SO_SNDBUF on a socket.  The
              default value is 128K bytes.  It is lowered to 64K depending
              on the memory available in the system.

SOCKET OPTIONS

       To get or set a TCP socket option, call getsockopt(2) to read or
       setsockopt(2) to write the option with the option level argument set
       to IPPROTO_TCP.  In addition, most IPPROTO_IP socket options are
       valid on TCP sockets.  For more information see ip(7).

       TCP_CORK
              If set, don't send out partial frames.  All queued partial
              frames are sent when the option is cleared again.  This is
              useful for prepending headers before calling sendfile(2), or
              for throughput optimization.  As currently implemented, there
              is a 200 millisecond ceiling on the time for which output is
              corked by TCP_CORK.  If this ceiling is reached, then queued
              data is automatically transmitted.  This option can be
              combined with TCP_NODELAY only since Linux 2.5.71.  This
              option should not be used in code intended to be portable.

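              The cork/uncork pattern described above might look like the
              following sketch.  The helper names set_cork(), send_all()
              and send_corked() are assumptions made for this example, and
              error handling (for instance of EINTR) is kept minimal.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sets (on = 1) or clears (on = 0) TCP_CORK; returns 0 or -1. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Sends all len bytes, retrying after short writes; returns 0 or -1. */
static int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, MSG_NOSIGNAL);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Queues a header and a body while the socket is corked, then clears
 * the option so that any remaining partial frame is sent. */
int send_corked(int fd, const char *hdr, size_t hlen,
                const char *body, size_t blen)
{
    if (set_cork(fd, 1) < 0)
        return -1;
    int rc = (send_all(fd, hdr, hlen) < 0 ||
              send_all(fd, body, blen) < 0) ? -1 : 0;
    if (set_cork(fd, 0) < 0)
        return -1;
    return rc;
}
```
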
       TCP_DEFER_ACCEPT
              Allows a listener to be awakened only when data arrives on
              the socket.  Takes an integer value (seconds), which can
              bound the maximum number of attempts TCP will make to
              complete the connection.  This option should not be used in
              code intended to be portable.

       TCP_INFO
              Used to collect information about this socket.  The kernel
              returns a struct tcp_info as defined in the file
              /usr/include/linux/tcp.h.  This option should not be used in
              code intended to be portable.

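              A minimal sketch of a TCP_INFO query follows.  It assumes
              that the C library's <netinet/tcp.h> also provides struct
              tcp_info and the TCP state constants (as glibc does); the
              helper name tcp_state() is made up for this example.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: returns the connection state reported in
 * tcpi_state (for example TCP_ESTABLISHED), or -1 on error. */
int tcp_state(int fd)
{
    struct tcp_info info;
    socklen_t len = sizeof(info);

    memset(&info, 0, sizeof(info));
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
        return -1;
    return info.tcpi_state;
}
```
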
       TCP_KEEPCNT
              The maximum number of keepalive probes TCP should send before
              dropping the connection.  This option should not be used in
              code intended to be portable.

       TCP_KEEPIDLE
              The time (in seconds) the connection needs to remain idle
              before TCP starts sending keepalive probes, if the socket
              option SO_KEEPALIVE has been set on this socket.  This option
              should not be used in code intended to be portable.

       TCP_KEEPINTVL
              The time (in seconds) between individual keepalive probes.
              This option should not be used in code intended to be
              portable.

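              The three keepalive options can be combined with SO_KEEPALIVE
              to override the system-wide settings for a single socket.
              The following is a sketch only; the helper name
              enable_keepalive() and its parameter names are assumptions
              made for this example.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turns on keepalive for one socket and sets the
 * idle time, probe interval (both in seconds) and probe count.
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int fd, int idle, int intvl, int cnt)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
        return -1;
    return 0;
}
```
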
       TCP_LINGER2
              The lifetime of orphaned FIN_WAIT2 state sockets.  This
              option can be used to override the system wide sysctl
              tcp_fin_timeout on this socket.  This is not to be confused
              with the socket(7) level option SO_LINGER.  This option
              should not be used in code intended to be portable.

       TCP_MAXSEG
              The maximum segment size for outgoing TCP packets.  If this
              option is set before connection establishment, it also
              changes the MSS value announced to the other end in the
              initial packet.  Values greater than the (eventual) interface
              MTU have no effect.  TCP will also impose its minimum and
              maximum bounds over the value provided.

       TCP_NODELAY
              If set, disable the Nagle algorithm.  This means that
              segments are always sent as soon as possible, even if there
              is only a small amount of data.  When not set, data is
              buffered until there is a sufficient amount to send out,
              thereby avoiding the frequent sending of small packets, which
              results in poor utilization of the network.  This option is
              overridden by TCP_CORK; however, setting this option forces
              an explicit flush of pending output, even if TCP_CORK is
              currently set.

       TCP_QUICKACK
              Enable quickack mode if set or disable quickack mode if
              cleared.  In quickack mode, acks are sent immediately, rather
              than delayed if needed in accordance with normal TCP
              operation.  This flag is not permanent; it only enables a
              switch to or from quickack mode.  Subsequent operation of the
              TCP protocol will once again enter/leave quickack mode
              depending on internal protocol processing and factors such as
              delayed ack timeouts occurring and data transfer.  This
              option should not be used in code intended to be portable.

       TCP_SYNCNT
              Set the number of SYN retransmits that TCP should send before
              aborting the attempt to connect.  It cannot exceed 255.  This
              option should not be used in code intended to be portable.

       TCP_WINDOW_CLAMP
              Bound the size of the advertised window to this value.  The
              kernel imposes a minimum size of SOCK_MIN_RCVBUF/2.  This
              option should not be used in code intended to be portable.

IOCTLS

       The following ioctl(2) calls return information in value.  The
       correct syntax is:

              int value;
              error = ioctl(tcp_socket, ioctl_type, &value);

       ioctl_type is one of the following:

       SIOCINQ
              Returns the amount of queued unread data in the receive
              buffer.  The socket must not be in LISTEN state, otherwise an
              error (EINVAL) is returned.

       SIOCATMARK
              Returns true (i.e., value is non-zero) if the inbound data
              stream is at the urgent mark.

              If the SO_OOBINLINE socket option is set, and SIOCATMARK
              returns true, then the next read from the socket will return
              the urgent data.  If the SO_OOBINLINE socket option is not
              set, and SIOCATMARK returns true, then the next read from the
              socket will return the bytes following the urgent data
              (actually reading the urgent data requires a recv(2) call
              with the MSG_OOB flag).

              Note that a read never reads across the urgent mark.  If an
              application is informed of the presence of urgent data via
              select(2) (using the exceptfds argument) or through delivery
              of a SIGURG signal, then it can advance up to the mark using
              a loop which repeatedly tests SIOCATMARK and performs a read
              (requesting any number of bytes) as long as SIOCATMARK
              returns false.

       SIOCOUTQ
              Returns the amount of unsent data in the socket send queue.
              The socket must not be in LISTEN state, otherwise an error
              (EINVAL) is returned.

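       The queue-length ioctls above can be wrapped as in the following
       sketch.  It assumes that SIOCINQ and SIOCOUTQ are made available by
       <linux/sockios.h>; the helper names unread_bytes() and
       unsent_bytes() are made up for this example.

```c
#include <linux/sockios.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Returns the amount of queued unread data, or -1 on error
 * (for example EINVAL on a listening socket). */
int unread_bytes(int fd)
{
    int value = 0;

    if (ioctl(fd, SIOCINQ, &value) < 0)
        return -1;
    return value;
}

/* Returns the amount of unsent data in the send queue, or -1 on
 * error. */
int unsent_bytes(int fd)
{
    int value = 0;

    if (ioctl(fd, SIOCOUTQ, &value) < 0)
        return -1;
    return value;
}
```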

ERROR HANDLING

       When a network error occurs, TCP tries to resend the packet.  If it
       doesn't succeed after some time, either ETIMEDOUT or the last
       received error on this connection is reported.

       Some applications require a quicker error notification.  This can be
       enabled with the IPPROTO_IP level IP_RECVERR socket option.  When
       this option is enabled, all incoming errors are immediately passed to
       the user program.  Use this option with care: it makes TCP less
       tolerant to routing changes and other normal network conditions.

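       Enabling the option mentioned above takes a single setsockopt(2)
       call.  A minimal sketch, with a helper name chosen only for this
       example:

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: asks for incoming errors to be passed to the
 * program immediately.  Returns 0 on success, -1 on error. */
int enable_recverr(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on));
}
```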

NOTES

       TCP has no real out-of-band data; it has urgent data.  In Linux this
       means that if the other end sends newer out-of-band data, the older
       urgent data is inserted as normal data into the stream (even when
       SO_OOBINLINE is not set).  This differs from BSD-based stacks.

       Linux uses the BSD-compatible interpretation of the urgent pointer
       field by default.  This violates RFC 1122, but is required for
       interoperability with other stacks.  It can be changed by the
       tcp_stdurg sysctl.

ERRORS

       EPIPE  The other end closed the socket unexpectedly or a read is
              executed on a shut down socket.

       ETIMEDOUT
              The other end didn't acknowledge retransmitted data after
              some time.

       EAFNOTSUPPORT
              Passed socket address type in sin_family was not AF_INET.

       Any errors defined for ip(7) or the generic socket layer may also be
       returned for TCP.

BUGS

       Not all errors are documented.
       IPv6 is not described.

VERSIONS

       Support for Explicit Congestion Notification, zero-copy sendfile(2),
       reordering support and some SACK extensions (DSACK) were introduced
       in 2.4.  Support for forward acknowledgement (FACK), TIME_WAIT
       recycling, per-connection keepalive socket options and sysctls were
       introduced in 2.3.

       The default values and descriptions for the sysctl variables given
       above are applicable for the 2.4 kernel.

AUTHORS

       This man page was originally written by Andi Kleen.  It was updated
       for 2.4 by Nivedita Singhvi with input from Alexey Kuznetsov's
       Documentation/networking/ip-sysctls.txt document.