We all know by now that the basic function of TCP is to reliably send a stream of bytes, one that has no shape or fixed size, over a network to a receiver.
We also know that reliable delivery involves building a connection between two end hosts, and this is the first step the TCP stack takes before it exchanges any data.
The figure above has a host on one side and a server on the other. The host needs to download a file from the server via FTP, which runs on top of TCP.
The host will start off by allocating the required socket and buffers for this connection. But what is a socket anyway?
Imagine that the host talks to the server to download a file via FTP, and also talks to the same server to view a web page via HTTP. If you think about it, the same host (X) is talking to the same server (Y) to request two different data streams using two different protocols, FTP and HTTP, both of which run on top of TCP. If there were no way of multiplexing here, the two data streams could get mixed up, and neither the host nor the server would be able to process them correctly. So, how do the host and the server tell these two data streams apart? The answer is socket binding.
A socket is an endpoint of a TCP connection. In our example, the host will initialize a socket to the server. You can think of the socket as a virtual placeholder in the host's memory, referenced by a pointer, that holds four pieces of information which together make this connection unique.
Continuing our example: the host with IP address (X) will try to communicate with the server with IP address (Y), from source port (Px) to destination port (Py). These four values are basically the socket that TCP needs to allocate and open.
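To make the multiplexing idea concrete, here is a minimal sketch of a connection table keyed by those four values. The IP addresses, the source ports, the FourTuple class and the demultiplex function are all made up for illustration; a real TCP stack keeps a similar lookup inside the kernel, not in a Python dictionary.

```python
from typing import Dict, NamedTuple


class FourTuple(NamedTuple):
    """The four values that together identify one TCP connection."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


# Hypothetical connection table on host X: one entry per open connection.
# Same host, same server, but different ports, so the keys differ.
connections: Dict[FourTuple, str] = {
    FourTuple("192.0.2.10", 50001, "198.51.100.20", 21): "FTP control stream",
    FourTuple("192.0.2.10", 50002, "198.51.100.20", 80): "HTTP stream",
}


def demultiplex(local_ip: str, local_port: int,
                remote_ip: str, remote_port: int) -> str:
    """Find which stream a segment belongs to, from the host's point of view."""
    return connections[FourTuple(local_ip, local_port, remote_ip, remote_port)]


print(demultiplex("192.0.2.10", 50001, "198.51.100.20", 21))
print(demultiplex("192.0.2.10", 50002, "198.51.100.20", 80))
```

Because the ports differ, the two streams never collide even though both IP addresses are identical.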
Host (X) will check whether its source port (Px) is already in use. If it is not, TCP will move the port from closed to open, bind it to the socket allocated earlier, and then send the first message that builds the TCP connection: the TCP SYN packet.
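If you want to see the operating system do this for you, the sketch below uses Python's standard socket module over the loopback interface. It assumes both "host" and "server" live on the same machine (127.0.0.1), which is only a stand-in for the two machines in our example; the variable names are my own.

```python
import socket

# A listening socket stands in for the server (Y).
# Port 0 asks the OS to pick any free port for us.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_addr = server.getsockname()

# The client socket stands in for the host (X). connect() lets the OS choose
# a free source port (Px) and carry out the handshake, starting with the SYN.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)
accepted, peer_addr = server.accept()

local_ip, local_port = client.getsockname()
remote_ip, remote_port = client.getpeername()
print(f"host side:   {local_ip}:{local_port} -> {remote_ip}:{remote_port}")
print(f"server side: peer is {peer_addr[0]}:{peer_addr[1]}")

accepted.close()
client.close()
server.close()
```

The two printed lines describe the same connection from both ends: the host's local address and port are exactly what the server sees as its peer.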
A TCP SYN packet is a TCP packet with its “SYN” flag bit set. It works as a polite request from the host to the server to build a virtual connection.
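The flags live in a single field of the TCP header, one bit per flag. Here is a small sketch that decodes that field; the bit values are the standard ones, while the flag_names helper is just something I made up for illustration.

```python
from typing import Dict, List

# Standard bit positions of the six classic TCP flags.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

FLAG_BITS: Dict[str, int] = {
    "FIN": FIN, "SYN": SYN, "RST": RST,
    "PSH": PSH, "ACK": ACK, "URG": URG,
}


def flag_names(flags: int) -> List[str]:
    """Return the names of all flags whose bit is set in the flags field."""
    return [name for name, bit in FLAG_BITS.items() if flags & bit]


print(flag_names(SYN))        # a plain SYN packet
print(flag_names(SYN | ACK))  # a packet with both SYN and ACK set
```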
Let us stop here and think about it. When we humans interact with each other, we have implicitly agreed upon some words, gestures and tones of voice so that we understand each other. Whoever is speaking is not interrupted until they finish; then the listener may reply, or, if that was the end of the discussion, may not reply at all.
The host and server in this example do not have this implicit agreement with one another. Basically, the host and server don’t know who is going to talk first, because the host might upload files to or download files from the server, and the server plays a different role in each case.
Also, the host and the server have to agree upon the amount of data they can both handle, because the buffers allocated on the host may not be the same size as the buffers allocated on the server.
They also need to agree upon how the receiver will acknowledge the data it received, and how the receiver will notify the sender if it didn’t receive the data because of drops somewhere in the middle.
These questions can be answered using Sequence and Acknowledgement Numbers. Sequence and Acknowledgement numbers are 32-bit numbers (ranging from 0 to 4294967295, which is 2^32 - 1). The concept and usage of these numbers will be covered in the next posts. However, you should know that at this stage the TCP SYN packet is sent to synchronize these numbers.
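Since the numbers are 32 bits wide, they wrap around to 0 after the maximum value. A tiny sketch of that arithmetic follows; SEQ_SPACE and seq_add are names I picked for illustration.

```python
SEQ_SPACE = 2 ** 32          # number of distinct 32-bit values
MAX_SEQ = SEQ_SPACE - 1      # largest sequence number


def seq_add(seq: int, n: int) -> int:
    """Advance a sequence number by n, wrapping around at 2^32."""
    return (seq + n) % SEQ_SPACE


print(MAX_SEQ)               # 4294967295
print(seq_add(MAX_SEQ, 1))   # 0, wrapped around
print(seq_add(1234, 1))      # 1235
```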
Back to our example: host (X) will pick a random sequence number, let’s say “1234”, and put it in its SYN packet, sent from source port (Px) to server (Y) on port (Py), to politely ask the server to build a connection and synchronize their sequence numbers. Synchronize here means the host is telling the server: the ISN (Initial Sequence Number) I will use is 1234, and you, as the server, should track my sequence numbers counting up from there.
The server, on the other hand, will accept the SYN packet only if port (Py) is open, meaning an application is listening on it. After receiving this packet, the server, as the passive end, will use the same information and send a TCP ACK packet back to the host. Server (Y) sends the ACK from source port (Py) to host (X) on destination port (Px), to politely grant access to the host and acknowledge that it will track the host's sequence numbers starting from the ISN the host sent. The server will also bind its local port (Py) to a socket allocated for this connection.
Either in a separate packet or, as is usually the case, in the same TCP ACK packet it sends back to the host, the server will also tell the host to synchronize with the server's own ISN.
Yes, the ISN chosen by the host can be different from the ISN chosen by the server. That’s why tracking raw sequence numbers in Wireshark is somewhat hard, and that’s why Wireshark by default shows a “relative sequence number”, which is not the real sequence number the host and server used. In this relative view, Wireshark starts each side's ISN at 0.
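The idea behind a relative sequence number is simply "raw number minus that side's ISN". The sketch below shows that idea only; relative_seq is a name I made up, and this is not Wireshark's actual code.

```python
SEQ_SPACE = 2 ** 32


def relative_seq(raw_seq: int, isn: int) -> int:
    """Distance of raw_seq from the ISN, taking 32-bit wrap-around into account."""
    return (raw_seq - isn) % SEQ_SPACE


print(relative_seq(1234, 1234))    # 0: the SYN itself
print(relative_seq(1235, 1234))    # 1: the next sequence number
print(relative_seq(5, 4294967290)) # 11: still correct after wrapping past 2^32 - 1
```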
Let us suppose that the server chose ISN 5678. The server will send a SYN packet carrying its ISN to the host, and the host will reply with an ACK packet confirming that it is ready to use this ISN.
As I mentioned above, the server might in principle send two different packets: first an ACK for the host’s SYN packet, followed by its own SYN packet. In the usual case, however, the server sends a single TCP SYN+ACK packet, which works as both the ACK packet and the SYN packet.
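As a small preview, here is a sketch of the usual three-packet exchange with the ISNs from our example (1234 for the host, 5678 for the server). It relies on one fact about TCP: a SYN uses up one sequence number, so each side acknowledges the other's ISN plus one. The Segment class and three_way_handshake function are made up for illustration and only compute the numbers; nothing is sent on the wire.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_SPACE = 2 ** 32


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int]  # None when the ACK flag is not set


def three_way_handshake(host_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments of a normal connection setup."""
    host_next = (host_isn + 1) % SEQ_SPACE      # the SYN consumed one number
    server_next = (server_isn + 1) % SEQ_SPACE
    return [
        Segment("host", "SYN", host_isn, None),
        Segment("server", "SYN+ACK", server_isn, host_next),
        Segment("host", "ACK", host_next, server_next),
    ]


trace = three_way_handshake(1234, 5678)
for segment in trace:
    print(segment)
```

The second segment is the combined SYN+ACK: it carries the server's ISN in the sequence field and the host's ISN plus one in the acknowledgement field.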
This is the end of today’s post. In the next post I will explain the details of the TCP 3-way handshake, along with the TCP finite-state machine and how TCP increments the sequence numbers in the packets that follow.