tcp(7P)                            Protocols                           tcp(7P)

NAME

       tcp, TCP - Internet Transmission Control Protocol

SYNOPSIS

       #include <sys/socket.h>
       #include <netinet/in.h>

       s = socket(AF_INET, SOCK_STREAM, 0);

       s = socket(AF_INET6, SOCK_STREAM, 0);

       t = t_open("/dev/tcp", O_RDWR);

       t = t_open("/dev/tcp6", O_RDWR);

DESCRIPTION

       TCP is the virtual circuit protocol of the Internet protocol family. It
       provides reliable, flow-controlled, in order, two-way transmission of
       data. It is a byte-stream protocol layered above the Internet Protocol
       (IP), or the Internet Protocol Version 6 (IPv6), the Internet protocol
       family's internetwork datagram delivery protocol.

       Programs can access TCP using the socket interface as a SOCK_STREAM
       socket type, or using the Transport Level Interface (TLI) where it
       supports the connection-oriented (T_COTS_ORD) service type.

       TCP uses IP's host-level addressing and adds its own per-host
       collection of "port addresses." The endpoints of a TCP connection are
       identified by the combination of an IP or IPv6 address and a TCP port
       number. Although other protocols, such as the User Datagram Protocol
       (UDP), may use the same host and port address format, the port space of
       these protocols is distinct. See inet(7P) and inet6(7P) for details on
       the common aspects of addressing in the Internet protocol family.

       Sockets utilizing TCP are either "active" or "passive." Active sockets
       initiate connections to passive sockets. Both types of sockets must
       have their local IP or IPv6 address and TCP port number bound with the
       bind(3SOCKET) system call after the socket is created. By default, TCP
       sockets are active. A passive socket is created by calling the
       listen(3SOCKET) system call after binding the socket with bind(). This
       establishes a queueing parameter for the passive socket. After this,
       connections to the passive socket can be received with the
       accept(3SOCKET) system call. Active sockets use the connect(3SOCKET)
       call after binding to initiate connections.

       By using the special value INADDR_ANY with IP, or the unspecified
       address (all zeroes) with IPv6, the local IP address can be left
       unspecified in the bind() call by either active or passive TCP sockets.
       This feature is usually used if the local address is either unknown or
       irrelevant. If left unspecified, the local IP or IPv6 address will be
       bound at connection time to the address of the network interface used
       to service the connection.

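       The passive-socket sequence described above (socket, bind, listen) can
       be sketched as follows. This is a minimal illustration, not part of the
       documented interface: the helper names tcp_listen_any() and
       tcp_local_port() are hypothetical, and passing port 0 assumes the
       system is allowed to choose the port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a passive IPv4 TCP socket bound to INADDR_ANY
 * on the given port (0 lets the system choose one).
 * Returns the descriptor, or -1 on error. */
int tcp_listen_any(unsigned short port, int backlog)
{
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);   /* local address unspecified */
    sin.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(s, backlog) < 0) {              /* listen() makes it passive */
        close(s);
        return -1;
    }
    return s;
}

/* Hypothetical helper: report the port a bound socket ended up with,
 * or -1 on error. */
int tcp_local_port(int s)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);

    if (getsockname(s, (struct sockaddr *)&sin, &len) < 0)
        return -1;
    return ntohs(sin.sin_port);
}
```

       A caller would then pass the returned descriptor to accept(3SOCKET) to
       receive connections.
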
       Note that no two TCP sockets can be bound to the same port unless the
       bound IP addresses are different. IPv4 INADDR_ANY and IPv6 unspecified
       addresses compare as equal to any IPv4 or IPv6 address. For example, if
       a socket is bound to INADDR_ANY or unspecified address and port X, no
       other socket can bind to port X, regardless of the binding address.
       This special consideration of INADDR_ANY and unspecified address can be
       changed using the socket option SO_REUSEADDR. If SO_REUSEADDR is set on
       a socket doing a bind, IPv4 INADDR_ANY and IPv6 unspecified address do
       not compare as equal to any IP address. This means that as long as the
       two sockets are not both bound to INADDR_ANY/unspecified address or the
       same IP address, the two sockets can be bound to the same port.

       If an application does not want to allow another socket using the
       SO_REUSEADDR option to bind to a port its socket is bound to, the
       application can set the socket-level option SO_EXCLBIND on a socket.
       The option values of 0 and 1 disable and enable the option,
       respectively. Once this option is enabled on a socket, no other socket
       can be bound to the same port.

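       The two bind-related options might be set as in the sketch below. The
       helper names are hypothetical, and because SO_EXCLBIND is not defined
       on every platform, the second helper is guarded with #ifdef and
       reports when the option is unavailable. Both helpers are meant to be
       called before bind().

```c
#include <sys/socket.h>

/* Hypothetical helper: let this socket's bind() share a port with sockets
 * bound to other addresses. Returns 0 on success, -1 on error. */
int allow_port_sharing(int s)
{
    int on = 1;

    return setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}

/* Hypothetical helper: request exclusive use of the port.
 * Returns 0 on success, -1 on error, or 1 if SO_EXCLBIND is not defined
 * on this platform. */
int require_exclusive_port(int s)
{
#ifdef SO_EXCLBIND
    int on = 1;   /* assumption: a nonzero value turns the option on */

    return setsockopt(s, SOL_SOCKET, SO_EXCLBIND, &on, sizeof(on));
#else
    (void)s;
    return 1;
#endif
}
```
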
       Once a connection has been established, data can be exchanged using the
       read(2) and write(2) system calls.

       Under most circumstances, TCP sends data when it is presented. When
       outstanding data has not yet been acknowledged, TCP gathers small
       amounts of output to be sent in a single packet once an acknowledgement
       has been received. For a small number of clients, such as window
       systems that send a stream of mouse events which receive no replies,
       this packetization may cause significant delays. To circumvent this
       problem, TCP provides a socket-level boolean option, TCP_NODELAY.
       TCP_NODELAY is defined in <netinet/tcp.h>, and is set with
       setsockopt(3SOCKET) and tested with getsockopt(3SOCKET). The option
       level for the setsockopt() call is the protocol number for TCP,
       available from getprotobyname(3SOCKET).

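       Setting and testing TCP_NODELAY could look like the following sketch.
       The helper names are hypothetical. The option level is looked up with
       getprotobyname() as described above; falling back to IPPROTO_TCP when
       the lookup fails is an assumption of this example.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <sys/socket.h>

/* Look up the option level for TCP; fall back to IPPROTO_TCP if the
 * protocols database is unavailable (an assumption of this sketch). */
static int tcp_level(void)
{
    struct protoent *pe = getprotobyname("tcp");

    return pe != NULL ? pe->p_proto : IPPROTO_TCP;
}

/* Hypothetical helper: turn TCP_NODELAY on (nonzero) or off (0).
 * Returns 0 on success, -1 on error. */
int set_nodelay(int s, int on)
{
    return setsockopt(s, tcp_level(), TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: return 1 if TCP_NODELAY is set, 0 if it is clear,
 * or -1 on error. */
int get_nodelay(int s)
{
    int on = 0;
    socklen_t len = sizeof(on);

    if (getsockopt(s, tcp_level(), TCP_NODELAY, &on, &len) < 0)
        return -1;
    return on != 0;
}
```
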
       For some applications, it may be desirable for TCP not to send out data
       unless a full TCP segment can be sent. To enable this behavior, an
       application can use the TCP_CORK socket option. When TCP_CORK is set
       with a non-zero value, TCP sends out a full TCP segment only. When
       TCP_CORK is set to zero after it has been enabled, all buffered data is
       sent out (as permitted by the peer's receive window and the current
       congestion window). TCP_CORK is defined in <netinet/tcp.h>, and is set
       with setsockopt(3SOCKET) and tested with getsockopt(3SOCKET). The
       option level for the setsockopt() call is the protocol number for TCP,
       available from getprotobyname(3SOCKET).

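       One way an application might use TCP_CORK is to set it, issue several
       small writes, and then clear it so that any remaining partial segment
       is released. The sketch below assumes a connected socket and uses
       IPPROTO_TCP as the option level. The helper name send_corked() is
       hypothetical, short writes are not retried for brevity, and the
       option is skipped where <netinet/tcp.h> does not define it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send a header and a body so that TCP may coalesce
 * them into full segments. Returns 0 on success, -1 on error. */
int send_corked(int s, const void *hdr, size_t hlen,
                const void *body, size_t blen)
{
    int rc = 0;
#ifdef TCP_CORK
    int on = 1, off = 0;

    if (setsockopt(s, IPPROTO_TCP, TCP_CORK, &on, sizeof(on)) < 0)
        return -1;
#endif
    if (write(s, hdr, hlen) < 0 || write(s, body, blen) < 0)
        rc = -1;
#ifdef TCP_CORK
    /* Clearing the option releases any buffered partial segment. */
    if (setsockopt(s, IPPROTO_TCP, TCP_CORK, &off, sizeof(off)) < 0)
        rc = -1;
#endif
    return rc;
}
```
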
       The SO_RCVBUF socket level option can be used to control the window
       that TCP advertises to the peer. IP level options may also be used with
       TCP. See ip(7P) and ip6(7P).

       TCP provides an urgent data mechanism, which may be invoked using the
       out-of-band provisions of send(3SOCKET). The caller may mark one byte
       as "urgent" with the MSG_OOB flag to send(3SOCKET). This sets an
       "urgent pointer" pointing to this byte in the TCP stream. The receiver
       on the other side of the stream is notified of the urgent data by a
       SIGURG signal. The SIOCATMARK ioctl(2) request returns a value
       indicating whether the stream is at the urgent mark. Because the system
       never returns data across the urgent mark in a single read(2) call, it
       is possible to advance to the urgent data in a simple loop which reads
       data, testing the socket with the SIOCATMARK ioctl() request, until it
       reaches the mark.

       Incoming connection requests that include an IP source route option are
       noted, and the reverse source route is used in responding.

       A checksum over all data helps TCP implement reliability. Using a
       window-based flow control mechanism that makes use of positive
       acknowledgements, sequence numbers, and a retransmission strategy, TCP
       can usually recover when datagrams are damaged, delayed, duplicated or
       delivered out of order by the underlying communication medium.

       If the local TCP receives no acknowledgements from its peer for a
       period of time (for example, if the remote machine crashes), the
       connection is closed and an error is returned.

       TCP follows the congestion control algorithm described in RFC 2581, and
       also supports the initial congestion window (cwnd) changes in RFC 3390.
       The initial cwnd calculation can be overridden by the socket option
       TCP_INIT_CWND. An application can use this option to set the initial
       cwnd to a specified number of TCP segments. This applies to the cases
       when the connection first starts and restarts after an idle period.
       The process must have the PRIV_SYS_NET_CONFIG privilege if it wants to
       specify a number greater than that calculated by RFC 3390.

       SunOS supports TCP Extensions for High Performance (RFC 1323), which
       include the window scale and time stamp options and Protection Against
       Wrap Around Sequence Numbers (PAWS). SunOS also supports Selective
       Acknowledgment (SACK) capabilities (RFC 2018) and the Explicit
       Congestion Notification (ECN) mechanism (RFC 3168).

       Turn on the window scale option in one of the following ways:

           o      An application can set SO_SNDBUF or SO_RCVBUF size in the
                  setsockopt() option to be larger than 64K. This must be done
                  before the program calls listen() or connect(), because the
                  window scale option is negotiated when the connection is
                  established. Once the connection has been made, it is too
                  late to increase the send or receive window beyond the
                  default TCP limit of 64K.

           o      For all applications, use ndd(1M) to modify the
                  configuration parameter tcp_wscale_always. If
                  tcp_wscale_always is set to 1, the window scale option will
                  always be set when connecting to a remote system. If
                  tcp_wscale_always is 0, the window scale option will be set
                  only if the user has requested a send or receive window
                  larger than 64K. The default value of tcp_wscale_always is
                  1.

           o      Regardless of the value of tcp_wscale_always, the window
                  scale option will always be included in a connect
                  acknowledgement if the connecting system has used the
                  option.

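       The first method in the list above, requesting buffers larger than 64K
       before the connection is established, might be coded as follows. The
       helper names are hypothetical, and the system may grant a different
       size than the one requested, so the second helper reads back the
       value actually in effect.

```c
#include <sys/socket.h>

/* Hypothetical helper: ask for send and receive buffers of `bytes` each.
 * To influence window scaling this must run before listen() or connect().
 * Returns 0 on success, -1 on error. */
int request_large_windows(int s, int bytes)
{
    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    return setsockopt(s, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes));
}

/* Hypothetical helper: return the receive buffer size currently in
 * effect, or -1 on error. */
int granted_rcvbuf(int s)
{
    int bytes = 0;
    socklen_t len = sizeof(bytes);

    if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, &bytes, &len) < 0)
        return -1;
    return bytes;
}
```

       For example, an application could call
       request_large_windows(s, 128 * 1024) immediately after socket() and
       before connect().
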
       Turn on SACK capabilities in the following way:

           o      Use ndd to modify the configuration parameter
                  tcp_sack_permitted. If tcp_sack_permitted is set to 0, TCP
                  will not accept SACK or send out SACK information. If
                  tcp_sack_permitted is set to 1, TCP will not initiate a
                  connection with SACK permitted option in the SYN segment,
                  but will respond with SACK permitted option in the SYN|ACK
                  segment if an incoming connection request has the SACK
                  permitted option. This means that TCP will only accept SACK
                  information if the other side of the connection also accepts
                  SACK information. If tcp_sack_permitted is set to 2, it will
                  both initiate and accept connections with SACK information.
                  The default for tcp_sack_permitted is 2 (active enabled).

       Turn on TCP ECN mechanism in the following way:

           o      Use ndd to modify the configuration parameter
                  tcp_ecn_permitted. If tcp_ecn_permitted is set to 0, TCP
                  will not negotiate with a peer that supports ECN mechanism.
                  If tcp_ecn_permitted is set to 1 when initiating a
                  connection, TCP will not tell a peer that it supports ECN
                  mechanism. However, it will tell a peer that it supports ECN
                  mechanism when accepting a new incoming connection request
                  if the peer indicates that it supports ECN mechanism in the
                  SYN segment. If tcp_ecn_permitted is set to 2, in addition
                  to negotiating with a peer on ECN mechanism when accepting
                  connections, TCP will indicate in the outgoing SYN segment
                  that it supports ECN mechanism when TCP makes active
                  outgoing connections. The default for tcp_ecn_permitted is
                  1.

       Turn on the time stamp option in the following way:

           o      Use ndd to modify the configuration parameter
                  tcp_tstamp_always. If tcp_tstamp_always is 1, the time stamp
                  option will always be set when connecting to a remote
                  machine. If tcp_tstamp_always is 0, the time stamp option
                  will not be set when connecting to a remote system. The
                  default for tcp_tstamp_always is 0.

           o      Regardless of the value of tcp_tstamp_always, the time stamp
                  option will always be included in a connect acknowledgement
                  (and all succeeding packets) if the connecting system has
                  used the time stamp option.

       Use the following procedure to turn on the time stamp option only when
       the window scale option is in effect:

           o      Use ndd to modify the configuration parameter
                  tcp_tstamp_if_wscale. Setting tcp_tstamp_if_wscale to 1 will
                  cause the time stamp option to be set when connecting to a
                  remote system, if the window scale option has been set. If
                  tcp_tstamp_if_wscale is 0, the time stamp option will not be
                  set when connecting to a remote system. The default for
                  tcp_tstamp_if_wscale is 1.

       Protection Against Wrap Around Sequence Numbers (PAWS) is always used
       when the time stamp option is set.

       SunOS also supports multiple methods of generating initial sequence
       numbers. One of these methods is the improved technique suggested in
       RFC 1948. We HIGHLY recommend that you set sequence number generation
       parameters as close to boot time as possible. This prevents sequence
       number problems on connections that use the same connection-ID as ones
       that used a different sequence number generation. The
       svc:/network/initial:default service configures the initial sequence
       number generation. The service reads the value contained in the
       configuration file /etc/default/inetinit to determine which method to
       use.

       The /etc/default/inetinit file is an unstable interface, and may change
       in future releases.

       TCP may be configured to report some information on connections that
       terminate by means of an RST packet. By default, no logging is done. If
       the ndd(1M) parameter tcp_trace is set to 1, then trace data is
       collected for all new connections established after that time.

       The trace data consists of the TCP headers and IP source and
       destination addresses of the last few packets sent in each direction
       before RST occurred. Those packets are logged in a series of strlog(9F)
       calls. This trace facility has a very low overhead, and so is superior
       to such utilities as snoop(1M) for non-intrusive debugging for
       connections terminating by means of an RST.

       SunOS supports the keep-alive mechanism described in RFC 1122. It is
       enabled using the socket option SO_KEEPALIVE. When enabled, the first
       keep-alive probe is sent out after a TCP connection is idle for two
       hours. If the peer does not respond to the probe within eight minutes,
       the TCP connection is aborted. You can alter the interval for sending
       out the first probe using the socket option TCP_KEEPALIVE_THRESHOLD.
       The option value is an unsigned integer in milliseconds. The system
       default is controlled by the TCP ndd parameter tcp_keepalive_interval.
       The minimum value is ten seconds. The maximum is ten days, while the
       default is two hours. If you receive no response to the probe, you can
       use the TCP_KEEPALIVE_ABORT_THRESHOLD socket option to change the time
       threshold for aborting a TCP connection. The option value is an
       unsigned integer in milliseconds. The value zero indicates that TCP
       should never time out and abort the connection when probing. The system
       default is controlled by the TCP ndd parameter
       tcp_keepalive_abort_interval. The default is eight minutes.

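       The keep-alive options described above could be combined as in the
       following sketch. The helper name enable_keepalive() is hypothetical,
       and IPPROTO_TCP is assumed as the option level for the two threshold
       options. Because those options are specific to this implementation,
       they are guarded with #ifdef and silently skipped elsewhere.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keep-alive probing and, where the threshold
 * options are defined, set the idle and abort thresholds (milliseconds).
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int s, unsigned int idle_ms, unsigned int abort_ms)
{
    int on = 1;

    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
#ifdef TCP_KEEPALIVE_THRESHOLD
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPALIVE_THRESHOLD,
                   &idle_ms, sizeof(idle_ms)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPALIVE_ABORT_THRESHOLD
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPALIVE_ABORT_THRESHOLD,
                   &abort_ms, sizeof(abort_ms)) < 0)
        return -1;
#endif
    (void)idle_ms;    /* unused where the threshold options are absent */
    (void)abort_ms;
    return 0;
}
```

       For example, enable_keepalive(s, 600000, 120000) would ask for a first
       probe after ten minutes of idle time and an abort after two minutes
       without a response.
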

SEE ALSO

       svcs(1), ndd(1M), ioctl(2), read(2), svcadm(1M), write(2),
       accept(3SOCKET), bind(3SOCKET), connect(3SOCKET),
       getprotobyname(3SOCKET), getsockopt(3SOCKET), listen(3SOCKET),
       send(3SOCKET), smf(5), inet(7P), inet6(7P), ip(7P), ip6(7P)

       Ramakrishnan, K., Floyd, S., Black, D., RFC 3168, The Addition of
       Explicit Congestion Notification (ECN) to IP, September 2001.

       Mathis, M. and Mahdavi, J., Pittsburgh Supercomputing Center; Floyd,
       S., Lawrence Berkeley National Laboratory; Romanow, A., Sun
       Microsystems, Inc., RFC 2018, TCP Selective Acknowledgment Options,
       October 1996.

       Bellovin, S., RFC 1948, Defending Against Sequence Number Attacks, May
       1996.

       Jacobson, V., Braden, R., and Borman, D., RFC 1323, TCP Extensions for
       High Performance, May 1992.

       Postel, Jon, RFC 793, Transmission Control Protocol - DARPA Internet
       Program Protocol Specification, Network Information Center, SRI
       International, Menlo Park, CA., September 1981.

DIAGNOSTICS

       A socket operation may fail if:

       EISCONN          A connect() operation was attempted on a socket on
                        which a connect() operation had already been
                        performed.

       ETIMEDOUT        A connection was dropped due to excessive
                        retransmissions.

       ECONNRESET       The remote peer forced the connection to be closed
                        (usually because the remote machine has lost state
                        information about the connection due to a crash).

       ECONNREFUSED     The remote peer actively refused connection
                        establishment (usually because no process is
                        listening to the port).

       EADDRINUSE       A bind() operation was attempted on a socket with a
                        network address/port pair that has already been bound
                        to another socket.

       EADDRNOTAVAIL    A bind() operation was attempted on a socket with a
                        network address for which no network interface exists.

       EACCES           A bind() operation was attempted with a "reserved"
                        port number and the effective user ID of the process
                        was not the privileged user.

       ENOBUFS          The system ran out of memory for internal data
                        structures.

NOTES

       The tcp service is managed by the service management facility, smf(5),
       under the service identifier:

         svc:/network/initial:default

       Administrative actions on this service, such as enabling, disabling, or
       requesting restart, can be performed using svcadm(1M). The service's
       status can be queried using the svcs(1) command.

SunOS 5.11                       30 June 2006                          tcp(7P)