:: percik renik ::

October 22, 2007

Why TCP Over TCP Is A Bad Idea

Filed under: percikan teknologi — IndraEva @ 7:32 am

taken from http://sites.inka.de/~W1011/devel/tcp-tcp.html

A frequently occurring idea for IP tunneling applications is to run a protocol like PPP, which encapsulates IP packets in a format suited for a stream transport (like a modem line), over a TCP-based connection. This would be an easy way to build encrypted tunnels by running PPP over SSH, for which several recommendations already exist (one in the Linux HOWTO base, one on my own website, and surely several others). It would also be an easy way to compress arbitrary IP traffic, whereas datagram-based compression faces efficiency limits that are hard to overcome.

Unfortunately, it doesn’t work well. Long delays and frequent connection aborts are to be expected. Here is why.

TCP’s retransmission algorithm

TCP divides the data stream into segments which are sent as individual IP datagrams. The segments carry a sequence number, which numbers the bytes in the stream, and an acknowledgment number, which tells the other side how far the received stream extends. [RFC793]

Since IP datagrams may be lost, duplicated or reordered, the sequence numbers are used to reassemble the stream. The acknowledgment number tells the sender, indirectly, whether a segment was lost: when an acknowledgment for a recently sent segment does not arrive within a certain amount of time, the sender assumes the packet was lost and re-sends that segment.
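To make this concrete, here is a minimal sketch (in Python, purely illustrative and not taken from any real TCP implementation) of how a receiver can use the sequence numbers to reassemble the stream and derive the acknowledgment number it sends back:

    # Sketch: reassembling a byte stream from out-of-order TCP-like segments.
    # The segment tuples and numbers below are invented for illustration.

    def reassemble(segments, initial_seq=0):
        """Return (in-order data, acknowledgment number).

        segments: (seq, data) pairs that may arrive out of order, duplicated,
        or with gaps. Only bytes contiguous from initial_seq are delivered;
        the acknowledgment number points at the first byte still missing.
        """
        buffered = {seq: data for seq, data in segments}   # last copy wins
        stream = b""
        next_expected = initial_seq
        while next_expected in buffered:
            data = buffered.pop(next_expected)
            stream += data
            next_expected += len(data)
        return stream, next_expected

    # Segments arrive reordered and one is duplicated; bytes 10..14 are missing.
    arrived = [(5, b"world"), (0, b"hello"), (5, b"world"), (15, b"!")]
    data, ack = reassemble(arrived)
    print(data, ack)   # b'helloworld' 10 -> the sender will eventually resend seq 10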

Many other protocols using a similar approach, designed mostly for use over lines with relatively fixed bandwidth, have this “certain amount of time” fixed or configurable. In the Internet, however, parameters like bandwidth, delay and loss rate differ vastly from one connection to another and even change over time on a single connection. A fixed timeout in the seconds range would be inappropriate on a fast LAN and likewise inappropriate on a congested international link. In fact, it would increase the congestion and lead to an effect known as “meltdown”.

For this reason, TCP uses adaptive timeouts for all timing-related parameters. They start at conservative estimates and change dynamically with every received segment. The actual algorithms used are described in [RFC2001]. The details are not important here, but one property is critical: when a segment times out, the following timeout is increased (exponentially, in fact, because that has been shown to avoid the meltdown effect).
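A rough sketch of that behaviour (heavily simplified; the constants, the floor and the cap here are illustrative only, the authoritative rules are in the RFCs cited above):

    # Sketch of TCP-style adaptive retransmission timeout, heavily simplified.
    # Constants follow the classic smoothed-RTT scheme in spirit only.

    class RtoEstimator:
        def __init__(self, initial_rto=3.0):
            self.srtt = None        # smoothed round-trip time estimate
            self.rttvar = None      # round-trip time variation estimate
            self.rto = initial_rto  # conservative starting value (seconds)

        def on_ack(self, measured_rtt):
            """A segment was acknowledged: adapt the estimates to the path."""
            if self.srtt is None:
                self.srtt = measured_rtt
                self.rttvar = measured_rtt / 2
            else:
                self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - measured_rtt)
                self.srtt = 0.875 * self.srtt + 0.125 * measured_rtt
            self.rto = max(1.0, self.srtt + 4 * self.rttvar)

        def on_timeout(self):
            """A segment timed out: back off exponentially to avoid meltdown."""
            self.rto = min(self.rto * 2, 120.0)   # cap in the minutes range

    est = RtoEstimator()
    est.on_ack(0.2); est.on_ack(0.25)    # fast path: timeout settles near its floor
    print(est.rto)                       # 1.0
    est.on_timeout(); est.on_timeout()   # losses: the timeout doubles each time
    print(est.rto)                       # 4.0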

Stacking TCPs

The TCP timeout policy works fine in the Internet over a vast range of different connection characteristics. Because TCP tries very hard not to break connections, the timeout can increase up to the range of several minutes. This is just what is sensible for unattended bulk data transfer. (For interactive applications, such slow connections are of course undesirable and likely the user will terminate them.)

This optimization for reliability breaks when stacking one TCP connection on top of another, which was never anticipated by the TCP designers. But it happens when running PPP over SSH or another TCP-based protocol, because the PPP-encapsulated IP datagrams likely carry TCP-based payload, like this:

(TCP over IP over PPP over SSH over TCP over IP)

Note that the upper and the lower layer TCP have different timers. When an upper layer connection starts fast, its timers are fast too. Now it can happen that the lower connection has slower timers, perhaps as a leftover from a period with a slow or unreliable base connection.

Imagine what happens when, in this situation, the base connection starts losing packets. The lower layer TCP queues up a retransmission and increases its timeouts. Since the connection is blocked for this amount of time, the upper layer (i.e. payload) TCP won’t get a timely ACK, and will also queue a retransmission. Because the timeout is still less than the lower layer timeout, the upper layer will queue up more retransmissions faster than the lower layer can process them. This makes the upper layer connection stall very quickly and every retransmission just adds to the problem – an internal meltdown effect.
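A back-of-the-envelope sketch of that mismatch (the timeout values are invented for illustration, not taken from a real trace):

    # Sketch: how retransmissions pile up when two TCPs with different timers
    # are stacked. All numbers are invented for illustration.

    def retransmission_times(first_timeout, horizon):
        """Times at which a TCP re-sends a stalled segment, doubling the
        timeout after every expiry, up to the given horizon (seconds)."""
        times, t, timeout = [], 0.0, first_timeout
        while t + timeout <= horizon:
            t += timeout
            times.append(t)
            timeout *= 2
        return times

    # The lower (carrier) TCP still has slow timers from an earlier bad period;
    # the upper (payload) TCP started fast while the tunnel looked healthy.
    carrier = retransmission_times(first_timeout=8.0, horizon=60)
    payload = retransmission_times(first_timeout=0.5, horizon=60)

    print("carrier retransmits:", carrier)   # [8.0, 24.0, 56.0]
    print("payload retransmits:", payload)   # [0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
    print("payload retransmissions queued before the carrier's first one:",
          sum(1 for t in payload if t < carrier[0]))   # 4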

TCP’s reliability provisions backfire here. The upper layer retransmissions are completely unnecessary, since the carrier guarantees delivery – but the upper layer TCP can’t know this, because TCP always assumes an unreliable carrier.

Practical experience

The whole problem was the original incentive to start the CIPE project, because I used a PPP over SSH solution for some time and it proved to be fairly unusable. At that time it had to run over an optical link which suffered frequent packet loss, sometimes 10-20% over an extended period of time. With plain TCP, this was just bearable (because the link was not congested), but with the stacked protocols, connections would get really slow and then break very frequently.

This is the detailed reason why CIPE uses a datagram carrier. (UDP was chosen instead of a raw IP-level protocol, as IPsec uses, for several reasons: it makes it possible to distinguish tunnels by their port number, and it adds the ability to run over SOCKS.) The datagram carrier has exactly the same characteristics as plain IP, which TCP was designed to run over.
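For illustration, a minimal sketch of a datagram carrier (the port number and the framing are made up for this example and are not CIPE's actual wire format):

    # Sketch: tunnelled packets sent as individual UDP datagrams. A lost
    # datagram is simply gone; retransmission is left to the payload TCP.
    # The port and the fake packet bytes are placeholders for illustration.

    import socket

    TUNNEL_PORT = 9999   # made-up port, not CIPE's default

    def send_encapsulated(sock, peer, packet_bytes):
        # One tunnelled packet per UDP datagram: no carrier-side queueing,
        # no retransmission, no head-of-line blocking.
        sock.sendto(packet_bytes, peer)

    def receive_encapsulated(sock):
        packet_bytes, peer = sock.recvfrom(65535)
        return packet_bytes, peer

    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", TUNNEL_PORT))

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_encapsulated(sender, ("127.0.0.1", TUNNEL_PORT), b"fake encapsulated packet")

    print(receive_encapsulated(receiver)[0])   # b'fake encapsulated packet'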

October 26, 2006

d*mn nameserver!

Filed under: percikan teknologi — IndraEva @ 8:13 am

For a unix user, ssh is a commonly used application that provides remote access to servers. I’ve used it since my first experience with a unix-based OS, and from that day on I’ve never faced any serious problems with it.

But late last night, this stuff started to bother me. I had just reconfigured one of my experimental servers, an old hp-compaq ProLiant ML370 running Debian. Everything had been running normally before. It had been sitting behind NAT, but I wanted to move it directly into my campus network with a public IP. The problem arose right after I changed its IP parameters and restarted the machine: it was no longer accessible remotely. The machine was alive, though, and ping worked fine.

But wait a minute, there were more anomalies. Suddenly I could log in to the machine using the very same ssh method! *what the…* Not wanting to waste time, I went through the sshd configuration file and checked all the network configuration files too. Then my connection suddenly dropped. *now what…* I tried to reconnect: I entered the username, the server challenged me for the password, I typed it in, and the machine rejected it, saying “wrong password..” *hello…???* I retried three times, without any positive result.

Then I walked to the server room and logged in to it directly. I rechecked the whole network configuration once more, to make sure I hadn’t missed anything, but everything seemed okay. I made an ssh connection to another server from this machine: passed. Then ssh from that machine back to the problematic one: also passed. Hmmm…, how weird. I looked at the clock: it was already 23.40. D*mn!

The next two hours were filled with googling. 😀 It felt like my daily work at the office, doing some kind of Proof of Concept (PoC), with Root Cause Analysis as the sub-theme. Unlike the usual sessions, though, this one was held in sleepy mode. 😛 *well, how could I show my sleepy face in front of my manager at work?*

From my point of view (after analysing syslog), the main problem was that the machine periodically changed its DSA host key. That was weird, since one machine with one IP address (and its related MAC address) should only ever present one DSA key. This key is accepted by the other side (in this case the ssh client) and saved. The behaviour shows up more clearly when running the ssh connection with the debug option. My early assumption was that it was related to some ssh configuration or firewall issue.

After randomly following instructions and methods from various websites, I finally decided to put the work aside for a while. Enough for tonight, I thought. Let’s bring it all into my dreams, and hope that by tomorrow morning, when I wake up, I’ll have some enlightenment… *silly thought* 😛

In fact, even waking up late in the morning brought neither enlightenment nor improvement. 😛 In my typically lazy fashion, I got up, checked my e-mail first, and while doing that tried to arrange my time; besides this, I also had a software test to do for my company work. After a quick bath, I got back to the work left pending from yesterday.

Another two hours went by until I realised one thing (which turned out to be the ultimate hint). I ran nslookup, and was shocked when I read the result: my server’s IP had two different host names listed in the DNS server. Well, no wonder this could happen. One IP was registered to two different names, on two different machines. That is the only reason the DSA key kept changing: at one moment the first machine answered, and at another moment the other one did. It also explains why syslog kept complaining about something related to ip_spoofing. Yeah… what else?

I couldn’t figure out how this silly thing had happened. I mean, two host names listed in the nameserver for the same IP? Well.., the DNS admin made a mistake somewhere. After changing the host’s IP address, the ssh service ran normally again. I also reported the situation to the DNS admin, one of my campus colleagues, and asked him to clean up the records. 😉
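For the record, here is roughly the sanity check that would have exposed the problem much earlier: compare the forward and reverse DNS records of the server. A small Python sketch (the hostname and IP below are placeholders, not the real campus addresses):

    # Sketch: check that forward (name -> IP) and reverse (IP -> name) DNS
    # records agree. Hostname and address are placeholders.

    import socket

    EXPECTED_NAME = "myserver.campus.example"
    SERVER_IP = "192.0.2.10"

    # Forward lookup: what does the expected name resolve to?
    _, _, forward_addrs = socket.gethostbyname_ex(EXPECTED_NAME)

    # Reverse lookup: which name(s) does the nameserver return for the IP?
    reverse_name, reverse_aliases, _ = socket.gethostbyaddr(SERVER_IP)

    print("forward:", EXPECTED_NAME, "->", forward_addrs)
    print("reverse:", SERVER_IP, "->", [reverse_name] + reverse_aliases)

    if SERVER_IP not in forward_addrs or reverse_name != EXPECTED_NAME:
        print("forward and reverse records disagree -- time to call the DNS admin")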

Anyway, this is my first piece of writing about technical stuff. 😀
