Transport layer
Services
- Provide logical communication between app processes running on different hosts
- Transport protocols run on end systems
- Send side: messages broken into segments, passed to network layer
- Receive side: reassembles segments into messages, passes to the app layer
 
- More than one transport protocol available
- Network layer provides logical communication between hosts, transport layer adds logical communication between processes
Internet transport protocols:
- Reliable, in-order delivery: TCP
- Congestion control (network itself not overloaded)
- Flow control (receiver not overloaded)
 
- Unreliable, unordered delivery: UDP
- Services not available: bandwidth/delay guarantees
Multiplexing
- Multiplexing at sender: handle data from multiple sockets, add transport header
- Demultiplexing at receiver: use header info to deliver received segments to correct socket
- Host receives IP datagrams
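- Each datagram carries one transport-layer segment; the datagram has source and destination IP addresses, and the segment has source and destination port numbers
- UDP demultiplexes using only the destination IP address and port number, so segments from different sources with the same destination port go to the same socket
- TCP demultiplexes using the full 4-tuple: source IP, source port, destination IP, destination port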
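The difference between the two demultiplexing rules can be sketched as a choice of lookup key. This is only an illustration: the function names and the addresses are hypothetical, and a real stack keeps these keys in kernel socket tables.

```python
def udp_demux_key(src_ip, src_port, dst_ip, dst_port):
    """UDP sketch: only the destination address and port select the socket."""
    return (dst_ip, dst_port)


def tcp_demux_key(src_ip, src_port, dst_ip, dst_port):
    """TCP sketch: the full 4-tuple selects the socket."""
    return (src_ip, src_port, dst_ip, dst_port)


# Two clients (hypothetical addresses) sending to the same server port.
seg_a = ("10.0.0.1", 9157, "10.0.0.9", 6428)
seg_b = ("10.0.0.2", 5775, "10.0.0.9", 6428)

same_udp_socket = udp_demux_key(*seg_a) == udp_demux_key(*seg_b)
same_tcp_socket = tcp_demux_key(*seg_a) == tcp_demux_key(*seg_b)
print(same_udp_socket, same_tcp_socket)  # True False
```

Under UDP both segments land in one socket; under TCP each client gets its own connection socket.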
 
UDP checksum
- If the receiver calculates the checksum and it doesn't match the checksum in the header, there was an error
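- Checksum (1's complement sum): treat the segment as a sequence of 16-bit integers and add them. If an addition produces a carry out of the most significant bit, wrap it around and add it to the least significant bit. The checksum is the bitwise inverse of the final sum.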
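A minimal sketch of the checksum calculation, assuming the data has already been split into 16-bit words (the function names are illustrative, not part of any library):

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry back into the low bit."""
    total = 0
    for word in words:
        total += word & 0xFFFF
        # Fold a carry out of bit 15 back into the least significant bit.
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(words):
    """Sender side: the checksum is the inverted one's-complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


def looks_valid(words, received_checksum):
    """Receiver side: data words plus checksum must sum to all ones."""
    return ones_complement_sum(list(words) + [received_checksum]) == 0xFFFF


data = [0xE666, 0xD555]
print(hex(checksum(data)))                # 0x4443
print(looks_valid(data, checksum(data)))  # True
```

Here 0xE666 + 0xD555 overflows 16 bits, so the carry wraps around to give 0xBBBC, and inverting that yields 0x4443.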
Reliable Data Transfer
Stop-and-wait
- Reliable Data Transfer (RDT) protocol
- Sender application calls rdt_send(), which makes calls to udt_send()
- udt_send is unreliable data transfer, used internally by RDT
 
- rdt_rcv() is called when a packet arrives on the receive side of the channel; it calls deliver_data() to hand the data off to the application
- Uses checksum for error detection
- Receiver sends an ACK when a packet is received correctly
- Receiver sends a NAK when a packet is received with an error
- ACKs and NAKs can be corrupted, too, so we add a sequence number to each packet.
- If an ACK or NAK is corrupted, the sender treats it as a NAK and retransmits
- If the receiver had actually sent an ACK, it sees the same sequence number arrive again, discards the duplicate, and re-sends the ACK
- Sequence number can just be 0 or 1, alternating between them
- NAKs can be dropped entirely: the receiver ACKs the sequence number of the last correctly received packet, so an ACK for the other sequence number acts as a NAK
 
- Packet loss is possible: use a timer. If a response hasn't been received by the time the timer runs out, resend.
- This is an example of an Automatic Repeat Request (ARQ) protocol
 
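- Performance is poor: the sender sits idle for most of each round trip
- Utilization: \(U_{sender} = \frac{L/R}{RTT + L/R}\)
- \(D_{trans} = \frac{L}{R}\) is the transmission delay for a packet of \(L\) bits on a link of rate \(R\)
- \(L/R\) is the time spent sending actual useful information, but then we also have to wait for a round trip to receive an ACK
- Do it better with pipelined protocols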
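To see how low stop-and-wait utilization can get, here is a small worked example. The numbers (an 8000-bit packet, a 1 Gbps link, a 30 ms RTT) are assumed for illustration only.

```python
def stop_and_wait_utilization(packet_bits, link_rate_bps, rtt_s):
    """U_sender = (L/R) / (RTT + L/R)."""
    d_trans = packet_bits / link_rate_bps  # time to push the packet onto the link
    return d_trans / (rtt_s + d_trans)


# Assumed example values: L = 8000 bits, R = 1 Gbps, RTT = 30 ms.
u = stop_and_wait_utilization(8000, 1e9, 0.030)
print(f"utilization = {u:.6f}")
```

With these values the sender is busy for 8 microseconds out of roughly 30 milliseconds, so the link is idle more than 99.9% of the time.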
Go-back-\(N\)
- Sender can have up to \(N\) packets in pipeline
- Receiver only sends cumulative ACKs
- doesn't ACK a packet if there's a gap
- sender has a timer for the oldest un-acked packet
- when the timer expires, ALL un-acked packets are resent
 
Selective Repeat
- Sender can have up to \(N\) un-acked packets in the pipeline
- receiver sends individual ack for each packet
- sender maintains a timer for each un-acked packet
- When a timer expires, retransmit only that un-acked packet
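The two pipelined schemes differ in what the receiver acknowledges and what the sender resends. The following is a rough sketch using packet-numbered (not byte-numbered) sequence numbers starting at 0; the function names are hypothetical, and windowing and sequence-number wraparound are ignored.

```python
def gbn_receiver_acks(arrivals):
    """Go-back-N: ACK the highest in-order packet; discard out-of-order ones."""
    expected = 0
    acks = []
    for seq in arrivals:
        if seq == expected:
            expected += 1
        if expected > 0:  # nothing to ACK until packet 0 has arrived
            acks.append(expected - 1)
    return acks


def sr_receiver_acks(arrivals):
    """Selective repeat: every correctly received packet is ACKed individually."""
    return list(arrivals)


def packets_resent_on_timeout(unacked, timed_out_seq, protocol):
    """Which packets the sender retransmits when a timer expires."""
    if protocol == "gbn":
        return sorted(unacked)   # everything still outstanding
    if protocol == "sr":
        return [timed_out_seq]   # only the packet whose timer fired
    raise ValueError(f"unknown protocol: {protocol}")


arrivals = [0, 1, 3, 4, 2]  # packet 2 is delayed
print(gbn_receiver_acks(arrivals))  # [0, 1, 1, 1, 2]
print(sr_receiver_acks(arrivals))   # [0, 1, 3, 4, 2]
```

In the Go-back-N trace, packets 3 and 4 are thrown away, so the sender must resend them after 2 is finally acknowledged.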
 
Say the sender \(T_x\) and receiver \(R_x\) have agreed on a window size \(W\), \(L\) is the frame size in bits, and \(C\) is the link rate in bits per second. If \(t_T\) is the total elapsed time to send a data frame and receive the corresponding ack, then in each period \(t_T\) the sender will send:
- \(W\) frames if \(\frac{WL}{C} \le t_T\), or
- if \(\frac{WL}{C} \gt t_T\), the "link" will be full and the number of sent frames will be \(\frac{t_T}{t_I}\), where \(t_I\) is the time to transmit one frame
- \(t_I = \frac{L}{C}\)
- Utilization = \(\min\left(1, \frac{W \frac{L}{C}}{t_T}\right)\)
 
- Best case performance happens when the first ack is received while still sending packets, so we are sending packets all the time.
Maximum useful window size (the smallest \(W\) that keeps the link full):
\[W_{max} = \frac{t_T}{t_I} = \frac{\underbrace{\frac{L}{C} + RTT}_\text{period of time considered}}{\underbrace{\frac{L}{C}}_\text{transmission time of one segment}}\]
Throughput = utilization \(\times\) link rate
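The window formulas can be checked numerically. This sketch assumes \(t_T = \frac{L}{C} + RTT\) (ack transmission time ignored, as in the window-size formula above) and uses made-up link parameters.

```python
def window_utilization(window, frame_bits, link_rate_bps, rtt_s):
    """Utilization = min(1, W * (L/C) / t_T), assuming t_T = L/C + RTT."""
    t_i = frame_bits / link_rate_bps  # time to transmit one frame
    t_t = t_i + rtt_s                 # send one frame and get its ack back
    return min(1.0, window * t_i / t_t)


def window_to_fill_link(frame_bits, link_rate_bps, rtt_s):
    """W_max = (L/C + RTT) / (L/C)."""
    t_i = frame_bits / link_rate_bps
    return (t_i + rtt_s) / t_i


# Assumed example values: L = 8000 bits, C = 1 Gbps, RTT = 30 ms.
L, C, RTT = 8000, 1e9, 0.030
u_small = window_utilization(3, L, C, RTT)
u_large = window_utilization(5000, L, C, RTT)
w_max = window_to_fill_link(L, C, RTT)
throughput_bps = u_small * C  # throughput = utilization * link rate
print(u_small, u_large, w_max, throughput_bps)
```

A window of 3 only triples the stop-and-wait utilization, while any window at or above roughly 3751 frames saturates this link.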
TCP
- connection-oriented: sender and receiver have to handshake to initialize parameters, but not circuit switching because intermediate nodes don't keep state
- point-to-point: one sender, one receiver
- reliable, in-order byte stream: no message boundaries
- full duplex data: bi-directional data flow in same connection
- MSS: maximum segment size
- pipelined: TCP congestion and flow control set window size
- flow controlled: sender will not overwhelm receiver
Segment structure
| width: 32 bits | 
|---|
| Source port # | Dest port # | 
| Sequence number | 
| acknowledgement number | 
| head len | not used | `U A P R S F` | receive window | 
| checksum | Urg data pointer | 
| options (variable length) | 
| application data (variable length) | 
- URG: urgent data (usually not used)
- ACK: indicates that the acknowledgement number field is valid
- PSH: push data now (usually not used)
- RST, SYN, FIN: connection management (setup and teardown commands)
- sequence number: byte-stream number of the first byte in the segment's data
- acknowledgements:
- the acknowledgement number is the sequence number of the next byte expected from the other side
- ACKs are cumulative
- an ACK can be piggybacked on a segment carrying data in the other direction
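Because sequence numbers count bytes and ACKs are cumulative, the acknowledgement number only advances when the gap at the front of the stream is filled. The sketch below assumes the receiver buffers out-of-order segments (a common choice, but not the only one) and ignores sequence-number wraparound; the function name is hypothetical.

```python
def cumulative_acks(initial_seq, segments):
    """Return the ack number sent after each arriving (seq, length) segment."""
    buffered = {}             # seq of first byte -> segment length
    next_expected = initial_seq
    acks = []
    for seq, length in segments:
        if seq >= next_expected:
            buffered[seq] = length
        # Slide forward over every contiguous segment we now hold.
        while next_expected in buffered:
            next_expected += buffered.pop(next_expected)
        acks.append(next_expected)
    return acks


# Three 100-byte segments; the one starting at byte 100 arrives last.
print(cumulative_acks(0, [(0, 100), (200, 100), (100, 100)]))  # [100, 100, 300]
```

The second ACK repeats 100 because byte 100 is still missing; once it arrives, a single ACK of 300 covers both buffered segments.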
 
Estimating round trip time
- What to set the timeout value to?
- Longer than RTT, but RTT varies
- If it's too short, premature timeout happens, and there can be unnecessary retransmissions
- if it's too long, it reacts slowly when packets are lost
Round trip time estimate
- SampleRTT: measured time from segment transmission until ACK receipt, ignoring retransmissions
- EstimatedRTT \(= (1 - \alpha) *\) EstimatedRTT \(+ \alpha *\) SampleRTT
- Exponential weighted moving average (EWMA)
- influence of a past sample decreases exponentially fast, so recent samples better reflect current congestion in the network
- Typical \(\alpha\): 0.125
 
- Timeout interval: EstimatedRTT + safety margin
- Large variation in estimation yields a larger safety margin
- \(DevRTT = (1-\beta)*DevRTT + \beta * |SampleRTT-EstimatedRTT|\)
- Typically, \(\beta=0.25\)
- \(TimeoutInterval = EstimatedRTT + 4*DevRTT\)
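The estimator can be sketched as a small class. Two details are assumptions not stated in these notes: the deviation is updated using the previous EstimatedRTT, and the first sample initializes EstimatedRTT with DevRTT set to half of it (the convention in RFC 6298). The defaults \(\alpha = 0.125\) and \(\beta = 0.25\) are the commonly recommended values.

```python
class RttEstimator:
    """EWMA-based timeout estimator (sketch; class name is illustrative)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumed initial value

    def update(self, sample_rtt):
        # Deviation uses the estimate from before this sample (assumption).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=0.100)
est.update(0.200)  # one slow sample, in seconds
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)
```

A single slow sample moves the estimate only slightly but widens the safety margin, which is the intended behavior when RTT is fluctuating.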
 
Reliable data transfer in TCP
- Pipelined segments
- cumulative acks
- single retransmission timer
- retransmissions triggered by timeout events and duplicate acks
Sender events
Initialization:
- NextSeqNum = InitialSeqNum
- SendBase = InitialSeqNum
- Data received from app
- Create segment with sequence number NextSeqNum
- sequence number is the byte-stream number of the first data byte in the segment
- NextSeqNum = NextSeqNum + length(data)
 
- start timer if not already running (timer is for oldest unacked segment) with interval TimeOutInterval
 
- Timeout
- Retransmit segment that caused timeout
- Restart timer
 
- Ack received
- If the ack acknowledges previously unacked segments, update SendBase to the acked value and restart the timer if there are still unacked segments
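The three sender events can be sketched as methods on a small class. This is a simplified model under stated assumptions: the timer is just a boolean flag, the `sent` list stands in for handing segments to the network layer, and sequence-number wraparound, flow control, and congestion control are all ignored. The class and attribute names are hypothetical.

```python
class SimplifiedTcpSender:
    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq of first byte -> payload
        self.timer_running = False
        self.sent = []              # log of (seq, payload) handed to the network

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        self.sent.append((seq, data))
        self.next_seq_num += len(data)
        if not self.timer_running:  # timer tracks the oldest unacked segment
            self.timer_running = True

    def on_timeout(self):
        if self.unacked:
            oldest = min(self.unacked)
            self.sent.append((oldest, self.unacked[oldest]))
            self.timer_running = True  # restart timer

    def on_ack(self, ack_num):
        if ack_num > self.send_base:   # acknowledges new data
            self.send_base = ack_num
            for seq in list(self.unacked):
                if seq + len(self.unacked[seq]) <= ack_num:
                    del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"12345678")   # bytes 92..99
sender.on_app_data(b"x" * 20)     # bytes 100..119
sender.on_timeout()               # resends the segment starting at 92
sender.on_ack(100)                # cumulative ack for the first segment
print(sender.send_base, sorted(sender.unacked), sender.timer_running)
```

After the ACK of 100, only the second segment remains outstanding, so the timer keeps running for it.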
 
Fast retransmission
- the timeout period is often long, so there is a long delay before a lost packet is resent
- detect lost segments via duplicate ACKs
- sender often sends many segments back to back
- if a segment is lost, there will likely be many duplicate ACKs
 
- If sender receives 3 duplicate acks (the fourth ack for that same piece of data), resend unacked segment with smallest sequence number
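Counting duplicate ACKs takes only a little state. A minimal sketch, with a hypothetical class name and the threshold of 3 taken from the rule above:

```python
class DupAckDetector:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_num):
        """Return True exactly when a fast retransmit should be triggered."""
        if ack_num == self.last_ack:
            self.dup_count += 1
            return self.dup_count == self.threshold
        # A new ack number resets the duplicate counter.
        self.last_ack = ack_num
        self.dup_count = 0
        return False


detector = DupAckDetector()
decisions = [detector.on_ack(a) for a in [100, 100, 100, 100, 100]]
print(decisions)  # [False, False, False, True, False]
```

The retransmission fires on the fourth ACK carrying the same number, which is the third duplicate, and does not fire again for later duplicates.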
Flow control
- receiver "advertises" free buffer space by including the rwnd (receive window) value in the TCP header of receiver-to-sender segments
- RcvBuffer size set via socket options (default is typically 4096 bytes)
- many operating systems auto-adjust RcvBuffer
 
- sender limits the amount of unacked data to the receiver's rwnd value (16 bits, so max value is \(2^{16}-1\))
- guarantees receive buffer will not overflow
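Both sides of flow control reduce to simple arithmetic. The sketch below uses the usual textbook bookkeeping variables (LastByteRcvd, LastByteRead, LastByteSent, LastByteAcked), which are assumptions in the sense that these notes do not define the receiver-side ones.

```python
def advertised_rwnd(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver: free space = buffer size minus data not yet read by the app."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def bytes_sender_may_send(last_byte_sent, last_byte_acked, rwnd):
    """Sender: keep unacked (in-flight) data at or below rwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)


rwnd = advertised_rwnd(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(rwnd)                                     # 2096
print(bytes_sender_may_send(5000, 4000, rwnd))  # 1096
```

When the application stops reading, rwnd shrinks toward zero and the sender is forced to pause.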
Connection management
Handshake:
- agree to establish connection
- agree on connection params
Simple 2-way handshake won't work:
- variable delays
- retransmitted messages due to loss
- message reordering
- can't "see" other side
Use a 3-way handshake instead:
- Establish connection
- client and server start in listening state
- Client sends a request to connect via TCP with SYN = 1 (synchronize) set in the header, with a random initial sequence number
- Server chooses another random sequence number, and then sends a TCP SYNACK (SYN = 1, ACK = 1), acknowledging the SYN
- Client receives the SYNACK, so it knows the server is live. It sends an ACK to acknowledge that the SYNACK was received, plus maybe additional data.
- Server receives the ack, so it knows the client is live
 
- close connection
- Client sends FIN = 1, with sequence number \(x\)
- Server replies with ACK = 1, ACKNUM = \(x + 1\)
- Client waits for server to close
- Server sends FIN = 1, with its own current sequence number \(y\)
- Client replies with ACK = 1, ACKNUM = \(y + 1\)
- Client waits for \(2*\) max segment lifetime as a safeguard
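The header fields exchanged during connection establishment can be written out explicitly. This sketch models each segment as a plain dictionary (an illustrative representation, not a real packet format) and relies on the fact that a SYN consumes one sequence number, so each side acknowledges the other's initial sequence number plus one.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the SYN, SYNACK, and ACK segments as dictionaries."""
    if client_isn is None:
        client_isn = random.randrange(SEQ_SPACE)
    if server_isn is None:
        server_isn = random.randrange(SEQ_SPACE)
    syn = {"SYN": 1, "ACK": 0, "seq": client_isn}
    synack = {"SYN": 1, "ACK": 1, "seq": server_isn,
              "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"SYN": 0, "ACK": 1, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, synack, ack]


# Fixed initial sequence numbers so the output is reproducible.
for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

The teardown steps follow the same pattern: each FIN also consumes one sequence number, which is why the ACKs carry \(x + 1\) and \(y + 1\).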
 
Congestion Control
- Congestion is when there are too many sources sending data too fast for the network to handle
- Problems due to congestion:
- Queueing delays as the packet arrival rate reaches the link cap (output link capacity is \(R\), so maximum per-connection throughput is \(\frac{R}{n}\) where there are \(n\) "customers" on the link)
- Packets can be lost when the router buffers are full and packets are dropped. Sender only resends if the packet is known to be lost.
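- Sender can time out prematurely and send two copies of a packet, both of which get delivered, so link capacity is spent on duplicates
- when a packet is dropped, the upstream capacity already used to carry it is wasted, so more routers between source and destination means more wasted resources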
 
TCP protocol
- Sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
- additive increase: increase cwnd by 1 MSS (maximum segment size) every RTT until loss is detected
- multiplicative decrease: cut cwnd in half after loss
- Sender limits transmission: LastByteSent - LastByteAcked <= min(cwnd, rwnd)
- rwnd is usually very large at the receiver
 
- Slow Start
- When connection begins, increase rate exponentially until first loss event
- Initially cwnd is 1 MSS
- double cwnd every RTT
- Done by incrementing cwnd by 1 MSS for every ACK received
 
- Initial rate is slow but ramps up exponentially fast
 
- Dealing with packet loss
- Loss indicated by timeout:
- cwnd set to 1 MSS
- window then grows exponentially (as in slow start) to the threshold ssthresh, then grows linearly
 
- Loss indicated by 3 duplicate ACKs: TCP Reno
- dup ACKs indicate network capable of delivering some segments
- cwnd is cut in half, then the window grows linearly
 
- TCP Tahoe (older version) always sets cwnd to 1 MSS (timeout or 3 duplicate acks)
- when cwnd gets to one half of its value before timeout, switch from exponential to linear growth
- Above ssthresh, growth is linear; below, it is exponential
- on a loss event, ssthresh is set to one half of cwnd just before the loss event
 
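- Avg throughput = \(\frac{3}{4}\frac{W}{RTT}\), where \(W\) is the window size when loss occurs: the window oscillates between \(\frac{W}{2}\) and \(W\), so it averages \(\frac{3}{4}W\) per RTT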
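The window rules above can be traced with a coarse per-RTT simulation. This is a sketch under several simplifying assumptions: cwnd is counted in whole MSS units, each event stands for one full RTT, slow start is capped at ssthresh, Reno's temporary window inflation during fast recovery is omitted, and the initial ssthresh of 8 MSS is arbitrary. The function name and event labels are hypothetical.

```python
def simulate_cwnd(events, variant="reno", initial_ssthresh=8):
    """Return cwnd (in MSS) after each per-RTT event.

    events: sequence of "ack" (a loss-free RTT), "timeout", or "dupack"
    (loss detected by 3 duplicate ACKs).
    """
    cwnd = 1
    ssthresh = initial_ssthresh
    history = []
    for event in events:
        if event == "ack":
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: double per RTT
            else:
                cwnd += 1                       # additive increase: +1 MSS per RTT
        elif event in ("timeout", "dupack"):
            ssthresh = max(cwnd // 2, 1)        # half of cwnd before the loss
            if event == "timeout" or variant == "tahoe":
                cwnd = 1                        # restart from slow start
            else:
                cwnd = ssthresh                 # Reno: multiplicative decrease
        else:
            raise ValueError(f"unknown event: {event}")
        history.append(cwnd)
    return history


events = ["ack"] * 4 + ["dupack"] + ["ack"] * 2 + ["timeout"] + ["ack"] * 3
print(simulate_cwnd(events, "reno"))   # [2, 4, 8, 9, 4, 5, 6, 1, 2, 3, 4]
print(simulate_cwnd(events, "tahoe"))  # [2, 4, 8, 9, 1, 2, 4, 1, 2, 3, 4]
```

The two traces differ only after the duplicate-ACK loss: Reno resumes linear growth from half the window, while Tahoe drops to 1 MSS and slow-starts again.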