Sortix nightly manual
This manual documents Sortix nightly, a development build that has not been officially released.
TCP(4) | Device Drivers Manual | TCP(4) |
NAME
tcp - transmission control protocol
SYNOPSIS
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
int
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
DESCRIPTION
The Transmission Control Protocol (TCP) is a connection-oriented
transport layer for the Internet Protocol
ip(4) that provides a reliable
byte stream connection between two hosts. It is designed for packet-switched
networks and provides sequenced data, retransmissions on packet loss,
handling of duplicated packets, flow control, basic data integrity checks,
multiplexing with a 16-bit port number, support for out-of-band urgent data,
and detection of lost connections. TCP provides the SOCK_STREAM
abstraction for the inet(4) protocol family.
TCP sockets are made with socket(2) by passing an appropriate domain
(AF_INET), SOCK_STREAM as the type, and 0 or IPPROTO_TCP as the
protocol. Newly created TCP sockets are neither bound to a
local address nor connected to a remote socket.
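As a minimal sketch of the call described above (the helper name make_tcp_socket is hypothetical and not part of any system interface), a TCP socket could be created like this:

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Create an IPv4 TCP socket that is neither bound nor connected.
   make_tcp_socket is a hypothetical helper name used for illustration.
   Returns the file descriptor, or -1 with errno set by socket(2). */
int make_tcp_socket(void)
{
    /* Passing 0 as the protocol would select TCP as well. */
    return socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
}
```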
Port numbers are 16-bit and range from 1 to 65535. Port 0 is not
valid. Binding to port 0 will assign an available port on the requested
address. Connecting to port 0 will fail with EADDRNOTAVAIL. Received packets whose source or
destination port is 0 will be silently dropped. TCP ports are
distinct from ports in other transport layer protocols.
Packets contain a 16-bit ones' complement checksum. Received packets will be silently discarded if their checksum does not match the contents.
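The following is a sketch of how a 16-bit ones' complement checksum can be computed over a buffer, in the style of RFC 1071. It is illustrative only and is not taken from the Sortix implementation; the function name is hypothetical. Note that the real TCP checksum also covers a pseudo-header with the addresses, protocol, and length, which this sketch leaves to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the ones' complement of the ones' complement sum of the buffer,
   read as big-endian 16-bit words. An odd trailing byte is padded with a
   zero byte. The 32-bit accumulator is sufficient for buffers up to the
   64 KiB that a single segment could occupy. */
uint16_t ones_complement_checksum(const unsigned char* data, size_t length)
{
    uint32_t sum = 0;
    size_t i = 0;
    for ( ; i + 1 < length; i += 2 )
        sum += ((uint32_t) data[i] << 8) | data[i + 1];
    if ( i < length )
        sum += (uint32_t) data[i] << 8;
    /* Fold the carries back into the low 16 bits. */
    while ( sum >> 16 )
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t) ~sum;
}
```

A receiver can run the same function over the data with the transmitted checksum included; a result of 0 means the checksum matches.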
Sockets can be bound to a local address and port with bind(2) (if not already bound), or a local address and port will be automatically assigned when connected. The local address and port can be read with getsockname(2). If the socket hasn't been bound, the local address and port are reported as the any address on port 0. Binding to a well-known port (port 1 through port 1023) requires superuser privileges.
Sockets can be bound to the any address, the broadcast address,
the address of a network interface, or the broadcast address of a network
interface. Binding to port 0 will automatically assign an available port on
the requested local address or fail with EAGAIN if
no port is available. No two sockets can bind to the same local address and
port. No two sockets can be bound such that one is bound to the any address
and a port, and the other socket is bound to another address and the same
port, unless both sockets had the SO_REUSEADDR
socket option set when the second socket was bound, and the current user is
the same user that bound the first socket or the current user has superuser
privileges.
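A sketch of binding to port 0 and reading back the assigned port with getsockname(2) might look as follows. The helper name bind_any_port is hypothetical, and error handling is kept to a minimum.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Bind a new TCP socket to the given IPv4 address (network byte order) and
   let the system choose the port by requesting port 0. On success, returns
   the socket and stores the assigned port (host byte order) in *port.
   Returns -1 on failure. bind_any_port is a hypothetical helper name. */
int bind_any_port(in_addr_t address, in_port_t* port)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if ( fd < 0 )
        return -1;
    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = address;
    sin.sin_port = htons(0);
    socklen_t sinlen = sizeof(sin);
    if ( bind(fd, (const struct sockaddr*) &sin, sizeof(sin)) < 0 ||
         getsockname(fd, (struct sockaddr*) &sin, &sinlen) < 0 )
    {
        close(fd);
        return -1;
    }
    *port = ntohs(sin.sin_port);
    return fd;
}
```

For example, calling bind_any_port(htonl(INADDR_ANY), &port) binds to the any address and reports which port was assigned.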
A connection to a remote TCP socket can be established with connect(2). A connection can also be established when both sides call connect(2) on each other. If the socket is not bound, connect(2) will determine which network interface will be used to send to the remote address, and then bind to the address of that network interface together with an available port. connect(2) will fail if there is no route from the local address to the requested remote address.
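The following sketch connects an unbound socket and relies on connect(2) to choose the local address and port, as described above. The helper name tcp_connect_ipv4 is hypothetical, and the use of inet_pton(3) to parse the address is a choice made for this example.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Connect to a dotted-quad IPv4 address and a port in host byte order.
   Returns a connected socket, or -1 on failure.
   tcp_connect_ipv4 is a hypothetical helper name. */
int tcp_connect_ipv4(const char* ip, in_port_t port)
{
    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    if ( inet_pton(AF_INET, ip, &sin.sin_addr) != 1 )
        return -1;
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if ( fd < 0 )
        return -1;
    /* No bind(2) here: connect(2) picks the local address and port. */
    if ( connect(fd, (const struct sockaddr*) &sin, sizeof(sin)) < 0 )
    {
        close(fd);
        return -1;
    }
    return fd;
}
```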
Incoming connections can be received by binding to a local address with bind(2) and listening for connections with listen(2), after which incoming connections can be retrieved with accept(2).
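The bind(2), listen(2), and accept(2) sequence can be sketched as below. The helper names and the choice of the any address are assumptions made for illustration.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Create a socket listening on the any address and the given port (host
   byte order). Returns the listening socket, or -1 on failure.
   tcp_listen is a hypothetical helper name. */
int tcp_listen(in_port_t port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if ( fd < 0 )
        return -1;
    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(port);
    if ( bind(fd, (const struct sockaddr*) &sin, sizeof(sin)) < 0 ||
         listen(fd, backlog) < 0 )
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block until an incoming connection is available and return its socket.
   The address of the remote socket is not requested here. */
int tcp_accept_one(int listen_fd)
{
    return accept(listen_fd, NULL, NULL);
}
```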
Bytes can be received from the remote TCP socket with
recv(2),
recvmsg(2),
recvfrom(2),
read(2), or
readv(2). Bytes can be
transmitted to the remote TCP socket with
send(2),
sendmsg(2),
sendto(2),
write(2), or
writev(2). Transmitting when
the connection is broken sends the SIGPIPE signal to the process
and fails with EPIPE.
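A sketch of a transmit loop that handles partial sends and reports a broken connection through EPIPE rather than a signal is shown below. It assumes the MSG_NOSIGNAL flag from POSIX.1-2008 is available on the target system; the helper name send_all is hypothetical.

```c
#include <sys/socket.h>
#include <errno.h>
#include <stddef.h>

/* Send the whole buffer, retrying after partial sends and EINTR.
   MSG_NOSIGNAL (assumed to be available) requests EPIPE without SIGPIPE.
   Returns 0 on success, or -1 with errno set on failure.
   send_all is a hypothetical helper name. */
int send_all(int fd, const void* buffer, size_t length)
{
    const unsigned char* data = buffer;
    size_t sent = 0;
    while ( sent < length )
    {
        ssize_t amount = send(fd, data + sent, length - sent, MSG_NOSIGNAL);
        if ( amount < 0 )
        {
            if ( errno == EINTR )
                continue;
            return -1;
        }
        sent += (size_t) amount;
    }
    return 0;
}
```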
The receiving socket will acknowledge any received data. If no acknowledgement is received in a timely manner, the transmitting socket will transmit the data again. If an acknowledgement still isn't received after a while, the connection is considered broken and no further receipt or transmission is possible.
The condition of the socket can be tested with poll(2), where
POLLIN signifies that new data has been received, the
remote socket has shut down for writing, or an incoming connection can be
retrieved with accept(2);
POLLOUT signifies that new data can be sent now (and the
socket is not shut down for writing); POLLHUP
signifies that the socket is shut down for writing; and
POLLERR signifies that an asynchronous error is
pending.
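As a sketch, waiting for POLLIN and fetching the pending asynchronous error when POLLERR is reported could be written as below. The helper name wait_readable and its return convention are assumptions made for this example.

```c
#include <sys/socket.h>
#include <errno.h>
#include <poll.h>

/* Wait up to timeout_ms milliseconds for POLLIN on the socket.
   Returns 1 if POLLIN was reported, 0 if it was not (for instance on
   timeout), and -1 on failure with errno set. If POLLERR is reported,
   errno is set to the pending error read through SO_ERROR.
   wait_readable is a hypothetical helper name. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ready = poll(&pfd, 1, timeout_ms);
    if ( ready <= 0 )
        return ready;
    if ( pfd.revents & POLLNVAL )
        return errno = EBADF, -1;
    if ( pfd.revents & POLLERR )
    {
        int error = 0;
        socklen_t size = sizeof(error);
        if ( getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &size) == 0 &&
             error != 0 )
            errno = error;
        return -1;
    }
    return (pfd.revents & POLLIN) ? 1 : 0;
}
```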
The connection can be shut down with shutdown(2) in either the reading direction (discarding further received data) or the writing direction (which sends the finish control flag). The connection is closed when both sockets have sent and acknowledged the finish control flag. Upon the close(2) of the last file descriptor for a connected socket, the socket is shut down in both directions.
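A sketch of an orderly shutdown follows: the socket is shut down for writing, and incoming data is then read and discarded until the remote socket shuts down for writing as well. The helper name finish_and_drain is hypothetical.

```c
#include <sys/socket.h>

/* Shut the socket down for writing, then read and discard data until the
   remote socket has also shut down for writing. Returns the number of
   bytes discarded, or -1 on failure.
   finish_and_drain is a hypothetical helper name. */
long finish_and_drain(int fd)
{
    if ( shutdown(fd, SHUT_WR) < 0 )
        return -1;
    char buffer[4096];
    long total = 0;
    ssize_t amount;
    /* recv(2) returns 0 once the remote side has finished sending. */
    while ( 0 < (amount = recv(fd, buffer, sizeof(buffer), 0)) )
        total += amount;
    return amount < 0 ? -1 : total;
}
```

The caller is still expected to close(2) the file descriptor afterwards.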
Socket options can be set with setsockopt(2) and read with getsockopt(2).
They exist at the IPPROTO_TCP level as well as at the applicable
underlying protocol levels.
SOCKET OPTIONS
TCP sockets support these setsockopt(2) / getsockopt(2) options at level SOL_SOCKET:
SO_BINDTODEVICE char[] - Bind to a network interface by its name. (Described in if(4))
SO_BINDTOINDEX unsigned int - Bind to a network interface by its index number. (Described in if(4))
SO_DEBUG int - Whether the socket is in debug mode. This option is not implemented and is initially 0. Attempting to set it to non-zero will fail with EPERM. (Described in if(4))
SO_DOMAIN sa_family_t - The socket domain (the address family). This option can only be read. (Described in if(4))
SO_ERROR int - The asynchronous pending error (an errno(3) value). Errors are permanent. This option can only be read. (Described in if(4))
SO_PROTOCOL int - The socket protocol (IPPROTO_TCP). This option can only be read. (Described in if(4))
SO_RCVBUF int - How many bytes the receive queue can use (default is 64 KiB). (Described in if(4))
SO_REUSEADDR int - Whether binding to the any address on a port is allowed to coexist with a binding to another address on the same port. This applies only if both sockets have this option set and the user binding the second socket is the same user that bound the first socket or has superuser privileges. (Described in if(4))
SO_SNDBUF int - How many bytes the send queue can use (default is 64 KiB). (Described in if(4))
SO_TYPE int - The socket type (SOCK_STREAM). This option can only be read. (Described in if(4))
TCP sockets currently implement no setsockopt(2) / getsockopt(2) options at level IPPROTO_TCP.
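Reading and setting the int-valued SOL_SOCKET options could be sketched as follows. Both helper names are hypothetical.

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Read an int-valued option at level SOL_SOCKET.
   Returns 0 on success, or -1 with errno set on failure.
   get_socket_int is a hypothetical helper name. */
int get_socket_int(int fd, int option, int* value)
{
    socklen_t size = sizeof(*value);
    return getsockopt(fd, SOL_SOCKET, option, value, &size);
}

/* Set an int-valued option at level SOL_SOCKET.
   Returns 0 on success, or -1 with errno set on failure.
   set_socket_int is a hypothetical helper name. */
int set_socket_int(int fd, int option, int value)
{
    return setsockopt(fd, SOL_SOCKET, option, &value, sizeof(value));
}
```

For example, set_socket_int(fd, SO_REUSEADDR, 1) would be called on both sockets before the second one is bound, and get_socket_int(fd, SO_TYPE, &type) is expected to report SOCK_STREAM for a TCP socket.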
IMPLEMENTATION NOTES
Connections time out when a segment has not been acknowledged by the remote socket after 6 attempts to deliver the segment. Each retransmission happens after 1 second plus 1 second per failed transmission so far. Successful delivery of any segment resets the retransmission count to 0.
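The retransmission schedule can be worked through with a little arithmetic. The sketch below rests on an assumption that the manual does not settle: the first retransmission is taken to wait 1 second and each later one waits 1 second longer. Whether 6 attempts correspond to 5 or 6 waits before the timeout is likewise not stated, so the count is left as a parameter. Both function names are hypothetical.

```c
/* Seconds to wait before the next retransmission, given how many
   retransmissions of the segment have already been made.
   ASSUMPTION: the first retransmission waits 1 second, the next 2, etc. */
unsigned int retransmit_delay(unsigned int retransmissions_so_far)
{
    return 1 + retransmissions_so_far;
}

/* Total seconds spent waiting across the given number of retransmissions
   under the assumption above. */
unsigned int total_retransmit_wait(unsigned int retransmissions)
{
    unsigned int total = 0;
    for ( unsigned int i = 0; i < retransmissions; i++ )
        total += retransmit_delay(i);
    return total;
}
```

Under this reading, 5 retransmissions wait 1 + 2 + 3 + 4 + 5 = 15 seconds in total.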
The receive and transmission buffers are both 64 KiB by default.
If no specific port is requested, one is randomly selected in the dynamic port range 32768 (inclusive) through 61000 (exclusive).
The Maximum Segment Lifetime (MSL) is set to 30 seconds and the quiet time of two MSLs before reusing sockets is 60 seconds.
ERRORS
Socket operations can fail due to these error conditions, in addition to the error conditions of the network and link layer, and the error conditions of the invoked function.
- [EADDRINUSE] - The socket cannot be bound to the requested address and port because another socket was already bound to 1) the same address and port, 2) the any address and the same port (and SO_REUSEADDR was not set on both sockets), or 3) some address and the same port but the requested address was the any address (and SO_REUSEADDR was not set on both sockets).
- [EADDRNOTAVAIL] - The socket cannot be bound to the requested address because no network interface had that address or broadcast address.
- [EADDRNOTAVAIL] - The socket was connected to port 0.
- [EAGAIN] - A port could not be assigned because each port in the dynamic port range had already been bound to a socket in a conflicting manner.
- [ECONNREFUSED] - The destination host refused the connection.
- [ECONNRESET] - The connection was reset by the remote socket.
- [EHOSTDOWN] - The destination host is not up. This error can happen asynchronously.
- [EHOSTUNREACH] - The destination host was unreachable. This error can happen asynchronously.
- [ENETDOWN] - The network interface isn't up. This error can happen asynchronously.
- [ENETUNREACH] - The destination network was unreachable. This error can happen asynchronously.
- [ENETUNREACH] - The remote address could not be connected because there was no route from the local address to the remote address.
- [ENOBUFS] - There was not enough memory available for network packets.
- [EPERM] - An attempt was made to set the unimplemented SO_DEBUG socket option to a non-zero value.
- [EPIPE] - The transmission failed because the connection is broken. The SIGPIPE signal is sent as well unless disabled.
- [ETIMEDOUT] - The connection timed out delivering a segment. This error can happen asynchronously.
SEE ALSO
accept(2), bind(2), connect(2), getpeername(2), getsockname(2), getsockopt(2), poll(2), recv(2), recvfrom(2), recvmsg(2), send(2), sendmsg(2), sendto(2), setsockopt(2), shutdown(2), socket(2), if(4), inet(4), ip(4), kernel(7)
STANDARDS
J. Postel (ed.), Transmission Control Protocol, STD 7, RFC 793, USC/Information Sciences Institute, September 1981.
Internet Engineering Task Force and R. Braden (ed.), Requirements for Internet Hosts -- Communication Layers, STD 3, RFC 1122, USC/Information Sciences Institute, October 1989.
IEEE Std 1003.1-2008 (“POSIX.1”) specifies the TCP socket programming interface.
BUGS
The implementation is incomplete and has known bugs.
Out-of-band data is not yet supported and is ignored on receipt.
The round trip time is not estimated, which prevents efficient retransmission when data is lost. Retransmissions happen after a second, which means unnecessary retransmissions happen if the round trip time is more than a second.
Options are not supported and are ignored on receipt.
No extensions are implemented yet that improve efficiency for long fast networks with large bandwidth * delay products.
There is not yet any support for sending keep-alive packets.
There is not yet any support for respecting icmp(4) conditions such as destination unreachable or source quench.
Half-open connections use memory, but until the handshake is complete, it is not confirmed whether the remote is actually able to transmit from the source address. An attacker may be able to transmit many packets from forged addresses, reaching the limit on pending TCP sockets in the listen queue and thus deny service to further legitimate connections. A SYN queue or SYN cookies would mitigate this problem, but neither is yet implemented.
bind(2) does not yet enforce that binding to a well-known port (port 1 through port 1023) requires superuser privileges.
The automatic assignment of ports is random, but is statistically biased. A random port is picked, and if it is taken, the search sequentially iterates ports in ascending order until an available port is found or the search terminates.
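The bias can be illustrated with a small sketch of an ascending search. This is not the Sortix implementation: the function name, the array that represents taken ports, and the decision to stop at the top of the range without wrapping around are all assumptions made for this example, since the manual does not say how the search terminates.

```c
#include <stdbool.h>

#define PORT_MIN 32768 /* inclusive, per IMPLEMENTATION NOTES */
#define PORT_MAX 61000 /* exclusive, per IMPLEMENTATION NOTES */

/* Starting at the given port (picked at random by the caller), search in
   ascending order for the first port that is not taken. taken[i] says
   whether port PORT_MIN + i is in use. Returns the port found, or 0 if
   none was found.
   ASSUMPTION: the search stops at PORT_MAX without wrapping around. */
unsigned int probe_port(unsigned int start, const bool* taken)
{
    for ( unsigned int port = start; port < PORT_MAX; port++ )
        if ( !taken[port - PORT_MIN] )
            return port;
    return 0;
}
```

If ports 32768 through 32770 are taken, then four different starting points (32768, 32769, 32770, and 32771) all yield port 32771, while port 32772 is only chosen when it is itself the starting point. A free port that follows a run of taken ports is therefore selected more often than others.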
June 3, 2017 | Sortix 1.1.0-dev |