UP Paper 1651 US-T-HDOWN
Concurrent Multipath Transfer Using Transport Layer Multihoming: Performance during Network Failures
Natarajan, Preethi (University of Delaware)
Iyengar, Janardhan (Connecticut College)
Stewart, Randall (Cisco Systems)
Amer, Paul (University of Delaware)
Multihoming provides added fault tolerance at the network layer. This network-level fault tolerance is crucial for mission-critical systems such as Future Combat Systems (FCS) networks, where interrupted communication may be a matter of life or death. For example, in a battlefield network, multihomed end hosts can remain accessible even when one of their IP addresses becomes unreachable, perhaps due to an intermediate routing failure on the path to the interface, failure of the interface itself, radio channel interference, or movement out of range. Additionally, end hosts may be simultaneously connected through multiple access technologies, and even multiple end-to-end paths, to increase resilience to path failures. TCP does not support multihoming between two endpoints. When an endpoint’s IP address becomes unreachable, existing TCP connections time out and abort, forcing the application to recover. This recovery overhead and the associated delay can be unacceptable during critical battlefield communications, where responsiveness is vital. The Stream Control Transmission Protocol (SCTP) [RFC2960] is an IETF (Internet Engineering Task Force) standards-track transport layer protocol that provides TCP-like reliable, congestion-controlled, and flow-controlled data transfer to applications. Additionally, SCTP supports fault tolerance through transport layer multihoming. An SCTP association (SCTP’s term for a transport layer connection) binds multiple interfaces at each end of the association. The current SCTP specification designates one interface at each destination host as the primary interface, and all new data is transmitted to the primary interface. If the primary interface fails, new data transmission fails over to an alternate reachable destination interface.
An SCTP sender infers reachability of the receiver’s interfaces in two ways: (i) through acks of data, and (ii) through acks of heartbeats, which are periodic probes sent specifically to check a destination interface’s status. A sender detects interface failures using a tunable threshold called Path Maximum Retransmit (PMR). After (PMR + 1) consecutive timeouts while trying to reach a destination interface (via data, probes, or a combination of the two), the sender marks the interface as failed. RFC2960 proposes a default PMR value of 5, which translates to 63 seconds (6 consecutive timeouts) for failure detection. Concurrent multipath transfer (CMT) aims to achieve higher throughput in an SCTP association by concurrently using all independent paths between the sender and receiver for new data transfer. Previous work explored the receive buffer (rbuf) blocking problem in CMT, where TPDU losses halt the sender because the SCTP receiver’s buffer is filled with out-of-order data. Even though the congestion window would allow new data to be transmitted, rbuf blocking (i.e., flow control) stalls the sender, resulting in throughput degradation. However, previous work does not consider rbuf blocking’s impact on throughput during complete or short-term network failures, which is the focus of this paper. To improve CMT’s performance during failure, we introduce a new state for each destination called the “Potentially Failed” (PF) state, and propose a retransmission policy that takes the PF state into account. We implemented our solution, called CMT-PF, in the University of Delaware’s SCTP/CMT module for the ns-2 network simulator. Using simulation, we evaluate CMT-PF and demonstrate its improved throughput over CMT in failure-prone networks such as FCS battlefield networks. We also evaluate CMT and CMT-PF when the paths have widely differing loss rates.
Our study shows that CMT-PF performs better than CMT during such congestion scenarios since CMT-PF is able to avoid back-to-back timeouts on data, and thus the associated recurring receive buffer blocking instances.
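The 63-second failure detection figure quoted above for the default PMR of 5 can be reproduced with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes the retransmission timeout (RTO) for the destination has settled at a minimum of 1 second, doubles after every timeout, and is capped at 60 seconds, and that the sender always has data outstanding so that heartbeat intervals add no extra delay. The function name and parameter names are hypothetical.

```python
def failure_detection_time(pmr=5, rto_min=1.0, rto_max=60.0):
    """Seconds until a destination is marked failed (illustrative sketch).

    Assumptions (not stated in the abstract): the first timeout fires after
    rto_min seconds, the RTO doubles after each timeout, and it never
    exceeds rto_max.
    """
    rto = rto_min
    total = 0.0
    # The sender marks the destination as failed after (PMR + 1)
    # consecutive timeouts.
    for _ in range(pmr + 1):
        total += rto
        rto = min(2 * rto, rto_max)  # exponential backoff with a cap
    return total


if __name__ == "__main__":
    # Default PMR = 5: 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds
    print(failure_detection_time())
```

Under these assumptions, lowering PMR shortens detection time sharply (for example, PMR = 2 gives 1 + 2 + 4 = 7 seconds), at the cost of more spurious failure detections on lossy paths.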

Preethi Natarajan received her MS in computer science in 2003 from the University of Delaware, Newark. She is currently pursuing her PhD in computer science, also at the University of Delaware, advised by Dr. Paul Amer. Her areas of interest are in networking, especially end-to-end transport layer issues. She is currently exploring aspects of the Stream Control Transmission Protocol (SCTP) related to performance improvements in end-to-end communications, such as using multiple independent paths between end hosts for concurrent data transfer, and using SCTP instead of TCP for faster application delivery of independent web objects.