- 1 TCP
- 1.1 Overview
- 1.2 Introduction
- 1.3 The TCP protocol
- 1.4 The TCP segment Header
- 1.5 TCP Segment
- 1.6 TCP Connection
- 1.7 Sequence Numbers
- 1.8 Error Control
- 1.9 Flow Control and Congestion Control
- 1.10 Conclusion
- 1.11 Self Test
- 1.12 References
1.1 Overview
Since the transport layer service looks very similar to the network layer service, an obvious question is: "Why are there two separate protocols instead of a single IP-layer protocol?" The need for a separate transport layer protocol is subtle but crucial. IP is partly implemented on third-party machines (routers). If part of the network provides poor service (e.g., too many errors), users cannot simply replace it with a better one or choose a better route. Users therefore need their own mechanism to handle errors and control the flow of data. This is why TCP is a separate protocol implemented entirely on the users' machines. TCP hides the problems of the network from the processes communicating over the internet.
1.2 Introduction
Transmission Control Protocol (commonly known as TCP) is a transport layer protocol which resides on top of IP in the TCP/IP protocol stack. While IP is responsible for host-to-host (i.e., from source machine to destination machine) delivery of data over the internet, TCP is responsible for process-to-process communication. TCP is a reliable, connection-oriented service, unlike IP (which is merely a best-effort service). The lack of reliability and the connectionless nature of IP create the need for a different protocol that can deliver data reliably over the internet even when the channel is noisy or congested. Besides providing reliability, error control and flow control, TCP also provides a mechanism to deliver data segments to the right process on a particular host. Just as IP uses an IP address to deliver packets to the right destination machine, TCP uses a port number for process-to-process delivery.
1.3 The TCP protocol
1.3.1 Process to Process Communication and Port Numbers:
As mentioned in the introduction, IP is solely responsible for delivering datagrams (or packets) from the source machine to the destination machine. It does not provide any information about which process on the destination machine the packet should be handed to. Consider a client (which could be a web browser) with IP address x.y.x.v (machine A) that requests a page from a server with IP address w.y.h.v (machine B). IP routes the request to the destination successfully. But the destination (machine B) has several processes running (e.g., HTTP, FTP, SMTP) along with the one that is supposed to respond to machine A. Now the question is: "Who hands the datagram (request) to the correct destination process on the destination machine?" TCP solves this problem with the help of a special kind of address known as a port number. Port numbers are 16 bits long. Each process on a host machine (using the services of TCP/IP) is assigned a unique port number. Whenever packets arrive at the destination machine, IP hands them to TCP for the subsequent steps. TCP then looks into the TCP segment header (we will look into the details of the TCP segment header in subsequent sections), finds the intended port number and maps the port number to the destination process (or application). The whole process looks simple at first glance, but an observant reader might have already realized that several questions arise at this point. The first reasonable question is: "Why can't we directly embed the name of the destination process in the TCP header, rather than providing a number which is mapped to the process anyway?" The reason for not using the process name (or identifier) directly is to keep the TCP header as short as possible. As mentioned above, port numbers are 16 bits long. Had names been used directly in place of port numbers, the TCP header would have been significantly longer.
For example, suppose a client machine (or more specifically a web browser) needs to send a request to an application called HTTPS. Encoding the name HTTPS within the TCP header would require at least 5 bytes (i.e., 40 bits). These extra bits (40 instead of 16) incur extra overhead. Another question that one may ask is: "How does the client machine know which port numbers are assigned to which processes?" To resolve this problem, the Internet Assigned Numbers Authority (IANA) has divided the port numbers (a port number uniquely identifies a process) into three categories, namely well-known port numbers, registered port numbers, and dynamic or private port numbers.
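The overhead argument and the three IANA categories can be sketched in a few lines of Python. This is only an illustration: the function name `port_category` is hypothetical, and the range boundaries (0-1023, 1024-49151, 49152-65535) are the ones IANA uses for the three categories.

```python
def port_category(port: int) -> str:
    """Return the IANA category of a 16-bit port number."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port numbers are 16 bits long: 0..65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


# Bits needed to carry the name "HTTPS" versus a port number.
name_bits = len("HTTPS".encode("ascii")) * 8   # 5 bytes * 8 bits
port_bits = 16

print(name_bits, port_bits)
for p in (80, 8080, 51000):
    print(p, port_category(p))
```

The comparison of `name_bits` with `port_bits` mirrors the 40-bit versus 16-bit argument made in the text.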
Well Known Port Numbers
Well-known port numbers (numbers 0 to 1023, RFC 1700) are reserved by IANA for the services and applications most commonly used over the Internet (e.g., HTTP, SMTP, FTP, POP). For example, the most heavily used application, HTTP, has been assigned port number 80. So whoever runs an HTTP server on the internet is expected to use port number 80 for it. Thus a client application running on any machine can communicate with an HTTP server if it knows the server's IP address (or alternatively its domain name) and uses destination port number 80.
Registered Port Numbers
" The Registered Ports are not controlled by the IANA and on most systems can be used by ordinary user processes or programs executed by ordinary users. " -- RFC 1700. Port numbers between 1024 and 49151 are called registered ports. The IANA maintains a listing of services using ports in this range to minimize conflicting uses. Unlike ports with lower numbers (0-1023), developers of new TCP/UDP services can select a specific number to register with IANA rather than having a number assigned to them.
Dynamic or Private Port Numbers
Dynamic or private ports (numbers 49152 to 65535) are usually assigned dynamically to client applications by the operating system of the client machine when a connection is initiated. For example, when a client application (HTTP or SMTP) communicates with a server application, the client has to provide a source port number so that the server can use it as the destination port number when replying. This port number is valid only for a short period of time and is picked randomly from the range 49152-65535. One can readily ask: "Why don't we use 80 as the source port number of an HTTP client?" To answer this question we have to move back to the network layer. NAT (Network Address Translation), a ubiquitous network layer mechanism, uses these dynamic port numbers to find the corresponding source IP addresses. To understand this mechanism readers should refer to NAT. Figure 1 (below) roughly shows what a NAT table looks like. To understand the use of port numbers, consider entries 1 and 5. The machines with IP addresses 192.168.8.2 (A) and 192.168.8.4 (B) both make TCP connections to the same server. Since 192.168.8.2 and 192.168.8.4 are local IP addresses (not visible outside the local network), the server uses the public IP address of the NAT box (the address through which the local network is connected to the external world) as the destination IP address when sending data to A as well as to B. So when data arrives from the server, NAT looks at its table and finds two local addresses (192.168.8.2 and 192.168.8.4) that recently made TCP connections to that server. Based on the IP address alone, NAT cannot decide which machine the data is destined for. Together with the port number (marked by the red circle in Figure 1), however, NAT can resolve which machine the packet is destined for.
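A rough sketch of how a port-based NAT table can tell the two local machines apart is given here. It is a simplified model, not how any particular NAT implementation works: the public address `203.0.113.1`, the allocator starting at 49152, the local source port 50001 and the function names `outbound` and `inbound` are all assumptions made for illustration.

```python
import itertools

NAT_PUBLIC_IP = "203.0.113.1"            # hypothetical public address of the NAT box
_public_ports = itertools.count(49152)   # hypothetical allocator for public-side ports
nat_table = {}                           # public port -> (private IP, private port)


def outbound(private_ip, private_port):
    """Record a mapping for an outgoing connection; return the rewritten source."""
    public_port = next(_public_ports)
    nat_table[public_port] = (private_ip, private_port)
    return NAT_PUBLIC_IP, public_port


def inbound(dst_port):
    """Look up which local machine a reply arriving on dst_port belongs to."""
    return nat_table.get(dst_port)


# Machines A and B contact the same server, even with the same local port.
a = outbound("192.168.8.2", 50001)
b = outbound("192.168.8.4", 50001)

# Both replies arrive at the same public IP; only the port tells them apart.
print(a, "->", inbound(a[1]))
print(b, "->", inbound(b[1]))
```

Both connections leave with the same public IP address, so the lookup in `inbound` has to rely on the port number alone, which is the point made above.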
1.4 The TCP segment Header
In the previous section we learned what a TCP address (port number) is and how it is used for process-to-process communication. In this section we dig deeper into the protocol to understand how TCP provides its services. For that we must first refer to the TCP segment header (RFC 793). As you might have realized from the earlier protocols, the header is the heart and soul of any protocol. Every TCP segment contains a 20-byte fixed-format header in addition to the optional fields. This is why a 'header length' field is required; in other words, the length field is used to identify the start of data in a TCP segment. As can be seen from the figure, the TCP header contains several fields, so only the relevant ones will be discussed. The Source Port and Destination Port fields are self-explanatory; we have already seen them in detail.
Header Length: This 4-bit field (called Data Offset in RFC 793) gives the length of the TCP header in 32-bit words, i.e., how many 32-bit words there are in the segment header. For example, if the value of this field is 6, then the header contains 6*32 bits = 6*4 bytes, i.e., 24 bytes.
The Unused 4-bit Field: These 4 bits are reserved for future modification or improvement of the protocol.
Eight One-bit Flags: Out of these eight flags, SYN, FIN, ACK, ECE and URG are the most relevant.
SYN: Used to establish a TCP connection. When a machine wants to communicate with another machine, it first sends a connection segment (TCP connections will be discussed later in this chapter). A connection segment is a normal TCP segment except that it has the SYN flag set to 1.
ACK: The ACK bit indicates whether the Acknowledgement Number field (another TCP header field) contains a valid acknowledgement number or a garbage value. For example, when a client machine connects to a server, it sets ACK=0 in the first segment (the connection request segment) that it sends. The reason is obvious: "the client has not received anything from the server yet. So what would it acknowledge?"
FIN: This flag is used to release the TCP connection. During the entire communication it is set to 0. When a machine sets FIN=1, it indicates that it has no more data to send.
CWR: Signals Congestion Window Reduced. When CWR is set to 1, the sender tells the receiver that it is slowing down by reducing its congestion window. (The concept will be discussed with the sliding window protocol in the Flow Control section.)
ECE: ECN-Echo. It is set to 1 to tell the sender to slow down.
URG: Set to 1 to indicate that the Urgent Pointer field is valid. If set to 0, the Urgent Pointer field is ignored.
RST: Used to close the connection when an error occurs.
PSH: Set to 1 when data from the sending application needs to be delivered to the receiving application immediately rather than waiting in the buffer. This flag is handy for real-time applications like online video streaming.
Urgent Pointer Field: It is an offset (informally: distance) from the current sequence number to where the urgent data is found. This lets the sending application signal to the receiving application that certain data (as indicated by the pointer) is more urgent than the rest.
Window Size: It is a flow control tool used by TCP. This field will be discussed in the Flow Control section.
Checksum: Serves the same error-detection purpose as the one seen in the Ethernet protocol. It covers the TCP header, the data and an IP pseudo-header.
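The fixed 20-byte header described above can be unpacked with Python's standard `struct` module. The following is a minimal sketch rather than a complete parser: the function name `parse_tcp_header`, the dictionary keys and the sample SYN segment (source port 49152, sequence number 1023, one 4-byte option) are assumptions chosen for illustration.

```python
import struct

# Flag names from least significant bit (FIN) to most significant bit (CWR).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the 20-byte fixed part of a TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4      # upper 4 bits count 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_byte & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent": urgent,
        "data": segment[header_len:],        # data starts right after the header
    }


# A hand-built SYN segment: header length 6 words (24 bytes), SYN flag only.
fixed = struct.pack("!HHIIBBHHH", 49152, 80, 1023, 0, 6 << 4, 0x02, 4096, 0, 0)
mss_option = bytes([2, 4]) + (1460).to_bytes(2, "big")   # MSS option, 4 bytes
parsed = parse_tcp_header(fixed + mss_option)
print(parsed["header_len"], parsed["flags"], parsed["dst_port"])
```

Because the header length field is 6, the parser treats the first 24 bytes as header and everything after them as data, which is exactly the role the text assigns to that field.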
1.5 TCP Segment
Now that we have learned what a TCP header looks like, it is worth discussing how TCP segment sizes are decided. Small segment sizes incur overhead (a minimum of 20 bytes of header is attached to each segment, which is of no use to the application layer). So how big should a segment be? Can we encapsulate an entire file (be it 10 KB or 100 MB) in a single segment? The answer is no. The reasons are associated with IP and Ethernet (or any other link layer protocol) rather than with TCP directly. Two limits restrict the segment size. First, each segment (data + header) must fit in the 65,515-byte IP payload (this is the maximum capacity of the IP payload field). The second limit comes from the link layer. As you might have learned from the Ethernet protocol, an Ethernet frame can carry at most 1500 bytes (known as the MTU: Maximum Transmission Unit) in its payload field (the place where IP packets are encapsulated). Each segment must fit in the MTU so that it can be sent and received in a single unfragmented packet (from IP you might have learned that fragmentation is an overhead and degrades performance).
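The arithmetic behind the second limit can be sketched as follows. The sketch assumes an Ethernet MTU of 1500 bytes and minimum 20-byte IP and TCP headers with no options; the function name `split_into_segments` is hypothetical.

```python
MTU = 1500          # Ethernet payload limit in bytes
IP_HEADER = 20      # minimum IP header, assuming no options
TCP_HEADER = 20     # minimum TCP header, assuming no options

# Largest amount of application data that fits in one unfragmented packet.
MSS = MTU - IP_HEADER - TCP_HEADER


def split_into_segments(data: bytes, mss: int = MSS) -> list:
    """Chop a byte stream into payloads of at most mss bytes each."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]


payloads = split_into_segments(bytes(10 * 1024))      # a 10 KB file
header_overhead = len(payloads) * TCP_HEADER          # TCP header bytes added
print(MSS, len(payloads), len(payloads[-1]), header_overhead)
```

Making `mss` smaller in the call increases the number of payloads and therefore the total header overhead, which is the trade-off described in the paragraph above.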
1.6 TCP Connection
1.6.1 Connection Establishment
As discussed earlier, TCP is a connection-oriented protocol, so establishing a connection is the very first step in any communication using TCP. The connection is established by a process called the three-way handshake (RFC 793). First the client sends a SYN message (refer to the SYN flag discussed earlier). A SYN segment is one that has the SYN flag set to 1. Since this is the very first segment of the communication session, it does not carry any acknowledgement number, so the ACK bit is set to 0. In the digital world no field can stay empty; every bit has to be either 1 or 0. So the Acknowledgement Number field cannot be sent empty. But the sender can tell the receiver that the value in the Acknowledgement Number field is a garbage value and hence should be ignored.
The sender does that by setting the ACK bit to 0. This SYN segment also informs the receiver about various parameters of the connection such as the sequence number, window size and maximum segment size (an optional field). Upon receiving this SYN segment from the client, the server replies with an ACK (indicating that it has received the connection request and agreed to the connection). It also sends its own SYN request to get connected with the client. Since it is inefficient to send the SYN and the ACK separately, a single TCP segment is used for both. In practice every acknowledgement in TCP is piggybacked on a usual TCP segment. It should be noted that the sequence number for the server-to-client direction (more specifically, the reply from the server end) need not be the same as the one used by the client to send its request to the server. The acknowledgement number is not an entirely different thing. During the handshake it is merely the other side's sequence number incremented by one. For example, when a server receives a SYN segment with sequence number x, the server acknowledges its arrival with acknowledgement number x+1 (by setting the Acknowledgement Number field to x+1 in its next reply segment). Figure 3 (above) demonstrates this three-way handshake.
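The numbering rules of the handshake can be traced with a small simulation. This is a sketch of the bookkeeping only, not of real socket code: the function name `three_way_handshake`, the dictionary layout and the two initial sequence numbers are assumptions for illustration.

```python
MOD = 2 ** 32   # sequence and acknowledgement numbers are 32 bits wide


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged, with their seq/ack numbers."""
    syn = {"from": "client", "flags": "SYN",
           "seq": client_isn, "ack": None}             # ACK bit 0: field ignored
    syn_ack = {"from": "server", "flags": "SYN+ACK",
               "seq": server_isn,
               "ack": (client_isn + 1) % MOD}          # acknowledges the client's SYN
    ack = {"from": "client", "flags": "ACK",
           "seq": (client_isn + 1) % MOD,
           "ack": (server_isn + 1) % MOD}              # acknowledges the server's SYN
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1023, server_isn=2131691):
    print(segment)
```

Note that the two directions use unrelated initial sequence numbers, and that the second segment carries both the server's SYN and its acknowledgement of the client's SYN.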
1.6.2 Connection Termination
TCP connections are full duplex. To release a TCP connection, both sides have to terminate their direction of the connection individually. For example, as shown in Figure 4, when host A has no more data to send it sends a FIN segment (i.e., a segment with the FIN flag set to 1). On receiving this FIN segment, host B acknowledges it and frees the allocated resources (e.g., clearing the buffer). Similarly, when host B has nothing more to send, it also sends a FIN segment to A. When the acknowledgement is received from A, B knows that the connection has been successfully terminated.
1.7 Sequence Numbers
So far we have heard about sequence numbers several times. In this section we will learn why sequence numbers are so important and how they serve their purpose.
TCP sequence numbers are 32-bit numbers maintained by both client and server throughout a session. When a host initiates a TCP session, its initial sequence number is effectively random; it may be any value between 0 and 4,294,967,295, inclusive. So sequence numbers can be regarded as an identifier for a particular connection. They are used to keep track of which segments have been sent and which segments have to be retransmitted. To make the subsequent discussion clearer, consider a practical scenario where machine A (a client) sends an HTTP request to machine B (the server) asking for an HTML file. The server replies by sending the requested page and the communication ends. Now let's look at the various steps more closely. First, machine A sends a connection request (SYN) using sequence number 1023 (chosen randomly). To deliver the requested page, server B needs to get connected to machine A. Suppose B uses initial sequence number 2131691. From TCP's point of view, the HTTP data (of either side) is simply a raw stream of bits. TCP chops this stream into smaller pieces and prepares the corresponding segments. To maintain the order of these pieces, TCP identifies each segment with a sequence number in increasing order, as shown in the figure. As we already know, TCP uses the services of IP. IP is a packet-switched protocol (it does not have a dedicated path between source and destination), so it is not guaranteed that the segments will reach the destination in the correct order. Such a situation is shown in Figure 5. But by looking at the sequence numbers of the segments, TCP can easily put them back in the correct order.
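The reordering step can be sketched as below. The sketch assumes that a segment's sequence number is the position of its first data byte in the stream, which is how TCP numbers data, and it ignores wraparound for brevity; the function name `reassemble` and the sample payloads are hypothetical.

```python
def reassemble(segments, first_seq: int) -> bytes:
    """Rebuild the byte stream from (sequence number, payload) pairs.

    Segments may arrive in any order; duplicates are dropped. Reassembly
    stops at the first gap, because the missing segment has not arrived.
    """
    buffered = dict(segments)        # seq -> payload (a duplicate seq overwrites)
    expected = first_seq
    stream = bytearray()
    while expected in buffered:
        payload = buffered.pop(expected)
        stream += payload
        expected += len(payload)     # next byte we are waiting for
    return bytes(stream)


# Three segments of a reply arriving out of order, plus one duplicate.
arrived = [
    (2131702, b"<body>"),
    (2131692, b"<html>"),
    (2131698, b"<p/>"),
    (2131692, b"<html>"),
]
print(reassemble(arrived, first_seq=2131692))
```

If the middle segment were missing, the function would return only the first piece, which corresponds to the receiver waiting for a retransmission before delivering the rest.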
It is essential to remember that the actual sequence number space is finite, though very large. This space ranges from 0 to 2^32 - 1, i.e., 4294967295. Since the space is finite, all arithmetic dealing with sequence numbers must be performed modulo 2^32. What happens if a sequence number is reused within the same connection? Avoiding reuse of sequence numbers within the same connection is simple in principle: enforce a segment lifetime shorter than the time it takes to cycle the sequence space, whose size is effectively 2^31. If the maximum effective bandwidth at which TCP is able to transmit over a particular path is B bytes per second, then the following constraint must be satisfied for error-free operation: 2^31 / B > MSL (secs), where MSL is the "Maximum Segment Lifetime" (RFC 1323). An MSL is generally required by any reliable transport protocol, since every sequence number field must be finite, and therefore any sequence number may eventually be reused. In the Internet protocol suite, the MSL bound is enforced by an IP-layer mechanism, the "Time-to-Live" or TTL field (refer to the IP header).
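The modulo arithmetic and the bandwidth constraint can be checked numerically. The sketch below is illustrative only: the helper names `seq_add`, `seq_lt` and `wrap_time_seconds` are hypothetical, and the MSL value of 120 seconds is an assumption (RFC 793 takes the MSL to be 2 minutes).

```python
MOD = 2 ** 32            # size of the sequence number space
HALF = 2 ** 31           # effective size used in the constraint above
MSL_SECONDS = 120        # assumed Maximum Segment Lifetime


def seq_add(seq: int, n: int) -> int:
    """Advance a sequence number by n bytes, wrapping modulo 2^32."""
    return (seq + n) % MOD


def seq_lt(a: int, b: int) -> bool:
    """True if a comes before b, taking wraparound into account."""
    return a != b and (b - a) % MOD < HALF


def wrap_time_seconds(bandwidth_bytes_per_sec: float) -> float:
    """Time needed to send 2^31 bytes at the given bandwidth B."""
    return HALF / bandwidth_bytes_per_sec


print(seq_add(4294967290, 10))           # wraps around past 2^32 - 1
print(seq_lt(4294967290, 4))             # a number after the wrap still counts as later

gigabit = 125_000_000                    # 1 Gbit/s expressed in bytes per second
print(wrap_time_seconds(gigabit) > MSL_SECONDS)   # is 2^31 / B > MSL satisfied?
```

For slow links the constraint holds comfortably; at the gigabit rate used in the last line the cycle time drops well below the assumed MSL, which is the situation RFC 1323 is concerned with.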
1.8 Error Control
Reliability: TCP ensures reliability with the help of a checksum, acknowledgements and retransmission. If a bit is flipped, a byte mangled, or some other damage happens to a packet, it is highly likely that the receiver of the broken packet will notice the problem through a checksum mismatch. If you recall, IP uses a checksum that covers only the IP header. So if a bit inside the IP payload gets garbled, the IP checksum cannot detect the error. It is therefore necessary for the transport layer to include a checksum over its own header and data.
This is the reason TCP has its own checksum. If a bit is flipped due to noise in the channel or any other reason (multi-bit errors are extremely rare), the receiver does not acknowledge the corresponding segment. Note that TCP uses a positive acknowledgement policy, that is, the receiver cannot explicitly tell the sender about the error. Rather, the sender itself gets the hint that something is wrong with the transmission when it does not receive the ACK within a specified time interval, and so it retransmits the segment. Note that retransmission occurs in the same fashion even if the segment is lost entirely.
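The checksum mismatch mentioned above can be demonstrated with the 16-bit one's-complement sum that TCP uses. This is a simplified sketch: it runs the checksum over a payload only and leaves out the TCP header and IP pseudo-header that the real checksum also covers; the function name `internet_checksum` and the sample payload are assumptions.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                           # pad to a whole number of words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                            # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = b"hello world!"                         # even length keeps words aligned
checksum = internet_checksum(payload)
sent = payload + checksum.to_bytes(2, "big")

# Receiver side: summing data plus checksum gives 0 when nothing changed.
intact = internet_checksum(sent) == 0

# Flip a single bit in transit; the check no longer comes out as 0.
damaged = bytes([sent[0] ^ 0x01]) + sent[1:]
detected = internet_checksum(damaged) != 0
print(intact, detected)
```

When the check fails, the receiver simply withholds the acknowledgement, and the sender's timeout triggers the retransmission described in the text.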
1.9 Flow Control and Congestion Control
Flow control ensures that the sender only sends what the receiver can handle. Think of a situation where on one side there is a high-capacity server with a fast fiber connection and on the other side there is a mobile phone or something similar. The sender would be able to send packets very quickly, but that would be useless to the receiver (because the mobile device cannot process the data as fast as it receives it), so there needs to be a way to throttle what the sending side can send. Flow control is a mechanism to slow down an over-enthusiastic sender.
TCP flow control is a window-based mechanism, which means that only a limited amount of unacknowledged data is allowed in transit. If there is congestion, TCP reduces the window size; otherwise it increases it. This reduction and increase are handled dynamically.
As we have seen in earlier sections, both the client and the server share their window sizes at the time of connection, so the sender already knows how much data the receiver can handle. For example, in the above figure the receiver has a window size of 4096 bytes (4 KB). Initially the sender sends a segment of 2 KB. But the application does not consume the data immediately (maybe because it is busy doing something else or waiting for a CPU burst). So the receiver buffer has 2 KB of empty space left, and hence the receiver changes its window size to 2048 bytes and sends this reduced window size along with the acknowledgement. The sender sends another 2 KB segment after the ACK arrives from the receiver. The application process at the receiving end is still not responding, and thus no space is left in the buffer. In other words, the window size reduces to zero. Since the receiver cannot hold any more segments, it immediately advertises this information along with an acknowledgement and blocks the sender. After a while the receiving application wakes up and frees 2 KB from the buffer. The receiver's TCP immediately sends this new window size to the sender so that the sender starts sending again.
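The window bookkeeping in this example can be replayed with a toy receiver. This is a minimal model under stated assumptions, not TCP's actual implementation: the class name `Receiver` and its methods are hypothetical, and acknowledgements are reduced to the advertised window value they carry.

```python
class Receiver:
    """Toy receiver that advertises how much buffer space is left."""

    def __init__(self, buffer_size: int):
        self.buffer_size = buffer_size
        self.buffered = 0                 # bytes waiting for the application

    def window(self) -> int:
        return self.buffer_size - self.buffered

    def receive(self, nbytes: int) -> int:
        """Accept a segment and return the window advertised with the ACK."""
        if nbytes > self.window():
            raise ValueError("sender exceeded the advertised window")
        self.buffered += nbytes
        return self.window()

    def app_read(self, nbytes: int) -> int:
        """The application consumes data; return the new window to advertise."""
        self.buffered -= min(nbytes, self.buffered)
        return self.window()


rx = Receiver(buffer_size=4096)
print(rx.receive(2048))    # first 2 KB segment: window shrinks to 2048
print(rx.receive(2048))    # second 2 KB segment: window is 0, sender is blocked
print(rx.app_read(2048))   # application frees 2 KB: window reopens to 2048
```

A sender that respects the returned value never transmits more than the receiver can buffer, which is all that flow control is meant to guarantee.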
1.10 Conclusion
TCP is a connection-oriented protocol which ensures error-free communication between hosts over an unreliable and connectionless protocol like IP. It also provides a mechanism to control the flow of data so that machines or devices with different capacities can communicate with each other without the faster one overwhelming the slower one. We have seen how port numbers are used to deliver data to the right process or application on a particular host. A 32-bit number, known as the sequence number, is embedded in every single segment to identify a particular communication session. Besides serving as a session identifier, sequence numbers also help in maintaining the order of the data segments. Even though TCP has some huge advantages, it also has disadvantages in terms of overhead, inefficiency and security threats (e.g., TCP sequence number prediction). For example, even when a host needs to transfer only a few bits, TCP will perform the three-way handshake, associate a 32-bit sequence number, append a checksum and so on. In other words, TCP is highly inefficient for short communications (e.g., DNS queries) or for applications (like Voice over IP or video streaming) where speed is preferred over error control.
1.12 References
- Andrew S. Tanenbaum, Computer Networks.
- Behrouz Forouzan and Firouz Mosharraf, Computer Networks: A Top-Down Approach.
- Al Anderson and Ryan Benedetti, Head First Networking.