Linux’s Internet domain socket, transport protocols (TCP/UDP)’s socket and port

Tags: port, socket, tcp, udp

I have long been confused about how the following three concepts relate to each other:

  • the internet domain socket provided by Linux,
  • the socket of the transport protocols (TCP/UDP), and
  • the port of the transport protocols (TCP/UDP).

Replies to some related posts on SO are ambiguous and inconsistent, which has only added to my confusion.

  1. Both Linux and the transport protocols (TCP/UDP) have a concept called "socket".
    How do the two concepts differ?
    Is the internet domain socket provided by Linux (represented as a file?) a faithful implementation of the socket in the transport protocols (TCP/UDP)? (I guess yes, and if that is true, the two terms can be used interchangeably.)

  2. Conceptually, is it correct to think of a port in a transport protocol (TCP/UDP) as a tuple (IP address, transport protocol, port number), or as just the port number? (I guess a port is the tuple (IP address, transport protocol, port number), because I have been told several times that the same port number with a different IP address or a different transport protocol represents a different port. In that sense, port and socket (in the transport protocols) seem to be the same concept.) The established name "port" seems to mean "port number" only, so I will explicitly write "port number" below to avoid confusion.

  3. What is the relation between a socket (in the transport protocols) and a tuple (IP address, transport protocol, port number)? Is there a bijective mapping between the set of sockets and the set of tuples (IP address, transport protocol, port number)? Must there be one or more sockets for each tuple, and must there be one or more tuples for each socket? Can two sockets share the same tuple (IP address, transport protocol, port number)? Can two tuples (IP address, transport protocol, port number) share the same socket?

  4. I heard that two processes can share the same socket (which I understand in the same way that two processes can share a file, assuming Linux's internet domain socket and the transport protocols' (TCP/UDP) socket can be used interchangeably). Can two processes share the same tuple (IP address, transport protocol, port number)?

  5. I heard that two connections can't share the same socket (assuming Linux's internet domain socket and transport protocols (TCP/UDP)'s socket can be used interchangeably). Can two connections share the same tuple (IP address, transport protocol, port number)?

Thanks.

Best Answer

  1. Sockets are an operating system API. This API lets applications on the same or different systems communicate over TCP, UDP and other protocols. Internet domain sockets (AF_INET/AF_INET6) are the variant of this API that carries TCP and UDP; UNIX domain sockets provide similar functionality for communicating with applications on the same system only. The concepts for both are similar: the API provides ways to create a socket, to bind, listen+accept and connect a socket, to read and write on it and to shut it down. For reading and writing, sockets behave like other file descriptors, such as those for regular files, named pipes and anonymous pipes, but the file descriptor is created differently and it supports some additional operations compared to, for example, a regular file.
  2. A port number in TCP and UDP is a 16-bit integer; the values usable for communication are 1 to 65535 (0 is reserved). The word "port" is used as short for "port number". The tuple of IP address, port number and protocol describes the endpoint address. Calling that tuple a port will cause confusion when reading other literature.
  3. An unconnected (but already bound) socket represents only a single endpoint (IP address, port number, protocol). A connected socket represents a local endpoint and another (local or remote) endpoint, i.e. a connection. One cannot have multiple in-kernel sockets for the same connection, but one can have multiple file descriptors for the same in-kernel socket. The same endpoint can appear in multiple connected sockets, but not for the same connection, i.e. the other endpoint of each connection must be different. One can actually have multiple unconnected sockets representing the same endpoint, but this is very unusual.
  4. Sockets can be shared between processes, since sockets are file descriptors and file descriptors can be shared. Sharing is typically done by forking, i.e. the parent opens some file or socket and the child inherits it. There are also ways to send a file descriptor/socket from one process to another. Sharing means that both processes can write and read, but no data is duplicated: if the parent reads some data, that data is taken from the socket and cannot also be read by the child. However, a process cannot create a new socket (instead of sharing an existing one) that represents exactly the same connection as an existing socket on the same system.
  5. Two sockets/connections can share the same port on one endpoint but they cannot share both endpoints, i.e. at least one of source IP, source port, destination IP, destination port or protocol needs to be different.
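
Points 3 to 5 can be checked with a small experiment. The following is only a minimal sketch in Python, assuming a loopback TCP server on a kernel-chosen port; the function name `demo` and the result keys are made up for this illustration. It opens two connections to the same server endpoint, and then duplicates one file descriptor with `dup()` to stand in for the sharing that normally happens through `fork()`.

```python
import socket


def demo():
    """Two connections sharing one server endpoint, plus a shared descriptor."""
    opened = []
    try:
        # A listening socket is bound to a single endpoint (127.0.0.1, port, TCP).
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        opened.append(server)
        server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        server.listen(2)
        server_endpoint = server.getsockname()

        # Two clients connect to the same server endpoint.
        c1 = socket.create_connection(server_endpoint)
        opened.append(c1)
        c2 = socket.create_connection(server_endpoint)
        opened.append(c2)

        # Each accept() yields a new connected socket for one connection.
        accepted = []
        for _ in range(2):
            conn, _peer = server.accept()
            opened.append(conn)
            accepted.append(conn)

        local_endpoints = [s.getsockname() for s in accepted]
        remote_endpoints = [s.getpeername() for s in accepted]

        # Pick the accepted socket that belongs to c1's connection.
        a1 = next(s for s in accepted if s.getpeername() == c1.getsockname())

        # dup() creates a second file descriptor for the SAME in-kernel socket,
        # similar to what a child process inherits after fork().
        a1_dup = a1.dup()
        opened.append(a1_dup)

        c1.sendall(b"hello")
        # Data read through one descriptor is gone for the other one.
        first = a1.recv(2, socket.MSG_WAITALL)
        second = a1_dup.recv(3, socket.MSG_WAITALL)

        return {
            "server_endpoint": server_endpoint,
            "local_endpoints": local_endpoints,
            "remote_endpoints": remote_endpoints,
            "first": first,
            "second": second,
        }
    finally:
        for s in opened:
            s.close()


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, value)
```

Both accepted sockets report the same local endpoint as the listening socket, while their remote endpoints differ in the client port, so the two connections remain distinct. The two reads together return the bytes that were sent exactly once, which matches the statement that shared descriptors do not duplicate data.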