
reduce uTP PPS by Path MTU discovery (for 2.04)


rafi


During the recent 2.03 beta development, there was an attempt to minimize fragmentation of uTP packets by forcing the DF flag. The motivation was to reduce the increase in PPS caused by this fragmentation.

This seemed to me a bit problematic and premature (to say the least) w/o some form of PMTUD to go with it.

Since connection-oriented PMTUD for uTP is intended to be added in the next 2.04, I'll list down here some ideas & questions. I hope they can trigger more thoughts on the issue by others.

Some points were posted in this forum-thread : http://forum.bittorrent.org/viewtopic.php?id=119

Some other issues/points:

1) Is using ICMP type 3 code 4 (datagram too big) on Windows possible at all under regular user privilege ? I couldn't see those incoming packets at all in Wireshark from the Internet. Only from my local router/gateway .

2) Provide a setting for the MTU (default to 0 = auto detect)

3) On startup, try to check the MTU to the local gateway and set it as the start size (if the setting is 0)

4) If using the don't-fragment (DF) flag, start with the above size (as in 3) and modify the uTP data-retransmit mechanism. If you see that the connection itself was OK and only the data wouldn't go through, make the second try with the DF flag cleared (= do fragment). If the peer responds, set DF again and fall back from the initial size until you get a response.

5) It would be good to collect stats on the fragmented packet count in 2.03/2.02, so as to be able to compare them with the future performance of 2.04.

What are the numbers right now ?
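To make points 2 to 4 concrete, here is a minimal sketch of the proposed fallback, not anything the client actually does. The names `start_size` and `fall_back`, the `send(size, df)` callback, the 576-byte floor and the 8-byte step are all assumptions made up for illustration.

```python
from typing import Callable, Optional


def start_size(mtu_setting: int, gateway_mtu: int) -> int:
    """Points 2 and 3: a setting of 0 means auto-detect from the local gateway."""
    return mtu_setting if mtu_setting > 0 else gateway_mtu


def fall_back(send: Callable[[int, bool], bool], size: int,
              floor: int = 576, step: int = 8) -> Optional[int]:
    """Point 4. `send(size, df)` is a hypothetical callback that sends one
    packet of `size` bytes with the DF flag set or cleared, and returns True
    if the peer responded. Returns a size that works with DF set, or None."""
    if send(size, True):
        return size  # the start size already fits the path
    # The DF packet vanished: retry fragmentable to see whether the peer is alive.
    if not send(size, False):
        return None  # looks like a connection problem, not an MTU problem
    # The peer is reachable, so set DF again and step the size down.
    while size > floor:
        size = max(floor, size - step)
        if send(size, True):
            return size
    return None


# Simulated path with an MTU of 1492: DF packets above that size are dropped.
path = lambda size, df: (not df) or size <= 1492
print(fall_back(path, start_size(0, 1500)))
```

A fixed step is the simplest choice here; a real implementation would more likely search the range than walk it linearly.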


I have some more points to add. Regarding ICMP 3-4: Wireshark on Windows is known to have some limitations in what it can capture, so I tested large pings on both Windows and a Linux box. On Linux, pinging an AT&T DSL node made an intermediate AT&T router send an ICMP message just as expected, including the maximum supported MTU (1492, which may be non-optimal). Windows ping, on the other hand, simply gave a "request timed out", and Wireshark didn't show any incoming ICMP. No software firewalls were involved. Windows is known to perform PMTUD, so this seems to be a failure on the OS's part.
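For reference, the MTU value the router included sits in a fixed place in the ICMP message: per RFC 1191, a type 3 code 4 message carries the next-hop MTU in bytes 6 and 7 of the ICMP header. A minimal sketch of reading it, assuming the buffer starts at the ICMP header (any IP header already stripped); the function name is made up for illustration.

```python
import struct
from typing import Optional

ICMP_DEST_UNREACHABLE = 3   # ICMP type
ICMP_FRAG_NEEDED = 4        # ICMP code: fragmentation needed and DF set


def next_hop_mtu(icmp: bytes) -> Optional[int]:
    """Return the next-hop MTU from an ICMP 3/4 message, or None.

    `icmp` must start at the ICMP header. Routers that predate RFC 1191
    leave the MTU field as zero, which is reported as None here.
    """
    if len(icmp) < 8:
        return None
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type != ICMP_DEST_UNREACHABLE or code != ICMP_FRAG_NEEDED:
        return None
    return mtu or None


# Hand-built example message: header followed by 28 bytes of quoted datagram.
msg = struct.pack("!BBHHH", 3, 4, 0, 0, 1492) + bytes(28)
print(next_hop_mtu(msg))
```

The checksum is not verified in this sketch.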

I've heard of plans to use these messages. It seems the only way to do this is with raw ICMP sockets, which need admin privileges (and hopefully actually show the relevant messages). Is this correct? What about Vista/7, where constant admin access is discouraged?

Also, for reporting your MTU to others, can you access Windows PMTUD data somehow? If something more appropriate doesn't exist, I found some code, which seems to indicate a packet too big error can be returned when using ping functions in an application. If that does work you could ping a host like bittorrent but with a TTL of 3 or 4 so that it only travels a few hops onto the ISP network, testing whether something like an ISP supplied modem/router has a low MTU that might get in the way. If that could then be communicated in uTP a connection could use the lowest of the two values. This is roundabout but I'm not finding a more appropriate method in my searching with regards to Windows.


Firon:

We will only implement the ICMP method

More issues/questions to consider:

- Does Wine fully support this functionality ?

- Does Mac fully support this functionality ?

- Do all Windows versions (7/Vista/XP) support it, and at the same privilege level ?

The point is that maybe it's best to stick to the application level for implementation/debug/test (like #4 above) when aiming for multiple platforms/OSes... :)


All my searches say that admin privs are needed for the ICMP method, which would mean Vista and Win 7 users are screwed by default. I also looked up Wine details: icmp.dll comes up as 75% complete, and raw socket support (needed for the ICMP method, afaict) does not appear to be among the available functions.


  • 3 weeks later...

There are two main RFCs describing how to do PMTUD in TCP. The original one (RFC 1191) relies entirely on ICMP messages. The second one (RFC 4821) describes how to also take packet loss into account to guess the PMTU.

My understanding is that no mainstream TCP implementation implements the latter.

The reason to want loss-based PMTUD is that some firewalls block ICMP messages.

Our implementation in uTP sends one probe packet every RTT. A probe is a packet with the don't-fragment bit set. It uses an ICMP socket (SOCK_RAW, IPPROTO_ICMP) to receive packet-too-big messages, and adjusts the assumed PMTU accordingly.

If we experience packet loss specifically for the probe packets, that is taken as a signal to lower the assumed PMTU as well.

If you can receive ICMP messages, the MTU will converge very quickly. If you can't, it will take some loss and a bit longer.
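The privilege question raised earlier in the thread shows up right at socket creation. As a rough sketch (in Python rather than the client's own code, and with a made-up function name), opening the ICMP socket can be allowed to fail, leaving only the loss-based signal:

```python
import socket
from typing import Optional


def open_icmp_listener() -> Optional[socket.socket]:
    """Try to open a raw ICMP socket (SOCK_RAW, IPPROTO_ICMP).

    Raw sockets normally need elevated privileges, so a failure is expected
    for ordinary users and is reported as None instead of raised.
    """
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError:  # includes PermissionError
        return None


listener = open_icmp_listener()
mode = "ICMP + probe loss" if listener is not None else "probe loss only"
print("PMTUD signals available:", mode)
if listener is not None:
    listener.close()
```

Whether such a socket then actually sees the packet-too-big messages is a separate, platform-dependent question, as the Windows tests above suggest.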


Will MTU be dynamically adjusted when path MTU has changed ?

How can you tell if communication breaks temporarily due to MTU change, or some other unrelated permanent reason ?

What would be the overhead (sending redundant probe/packets) ?

More details on the above, and on how/when the DF flag is being changed during the connect period would be nice.


  • 1 month later...
Will MTU be dynamically adjusted when path MTU has changed ?

Yes. As described above, we send probe packets regularly (once every RTT), which means that we'll adjust to changing PMTUs very quickly.

How can you tell if communication breaks temporarily due to MTU change, or some other unrelated permanent reason ?

We distinguish the case where _only_ the probe packets were lost, and no other packet. If communication breaks for other reasons or just congestive packet loss, chances are that some other packet is lost, or all packets in the RTT are lost.
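Read literally, that rule is a check on which packets of one RTT went missing. A tiny sketch, assuming the sender tracks lost sequence numbers per RTT (the function and parameter names are made up for illustration):

```python
from typing import AbstractSet


def is_mtu_signal(lost_seqs: AbstractSet[int], probe_seq: int) -> bool:
    """True only when the probe was lost and nothing else in that RTT was.

    Any additional loss is treated as congestion or a broken connection
    rather than as evidence about the path MTU.
    """
    return set(lost_seqs) == {probe_seq}


print(is_mtu_signal({17}, probe_seq=17))      # only the probe was lost
print(is_mtu_signal({15, 17}, probe_seq=17))  # other packets were lost too
```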

What would be the overhead (sending redundant probe/packets) ?

There's no overhead at all in the stable state. Each probe packet that's lost constitutes some overhead. This is relatively small, and I would imagine that in practice it ends up being a handful of packets per connection. I have not measured this myself.

More details on the above, and on how/when the DF flag is being changed during the connect period would be nice.

I'm not sure what you mean by "the connect period". We set the DF flag on one packet; once that packet is ACKed, we set it on the next packet we send. When a probe is ACKed, we look at its size and adjust our MTU search range accordingly. If we receive duplicate ACKs, suggesting that a packet was lost, and that packet was a probe, we adjust our MTU search range again by capping it below the size of the probe packet.
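One way to picture the search range is as a floor and a ceiling that close in on the path MTU. The sketch below is a plain binary search under assumed bounds of 576 and 1500 bytes. The class, its method names, and the midpoint probe size are illustrative assumptions, not the actual uTP code.

```python
class MtuSearch:
    """Search range for the path MTU: floor is known to work, ceiling may not."""

    def __init__(self, floor: int = 576, ceiling: int = 1500) -> None:
        self.floor = floor
        self.ceiling = ceiling

    def done(self) -> bool:
        return self.floor >= self.ceiling

    def next_probe_size(self) -> int:
        # Round up so the probe is always larger than the current floor.
        return (self.floor + self.ceiling + 1) // 2

    def on_probe_acked(self, size: int) -> None:
        self.floor = max(self.floor, size)

    def on_probe_lost(self, size: int) -> None:
        # Cap the range below the size of the lost probe.
        self.ceiling = min(self.ceiling, size - 1)

    def on_icmp_too_big(self, mtu: int) -> None:
        # An ICMP packet-too-big message gives the ceiling directly.
        self.ceiling = min(self.ceiling, mtu)
        self.floor = min(self.floor, self.ceiling)


def simulate(path_mtu: int, search: MtuSearch) -> tuple:
    """Run probes against a simulated path; return (found MTU, probes sent)."""
    probes = 0
    while not search.done():
        size = search.next_probe_size()
        probes += 1
        if size <= path_mtu:
            search.on_probe_acked(size)
        else:
            search.on_probe_lost(size)
    return search.floor, probes


print(simulate(1492, MtuSearch()))
```

With one probe per RTT, a loss-only search over this range settles in about ten RTTs, while a single ICMP message would set the ceiling in one step.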


@arvid:

what was the end result , compared to w/o PMTUD ? I mean:

a. what was the average packet size compared to the fixed max - 1444 we have now ?

b. When you ran with and w/o it - how many fragmented packets did you monitor in both cases ? (any noticeable difference in PPS ?)


Archived

This topic is now archived and is closed to further replies.
