reduce uTP PPS by Path MTU discovery (for 2.04)

rafi · July 15, 2010

During the recent 2.03 beta development, there was an attempt to minimize fragmentation of uTP packets by forcing the DF flag. The motivation was to reduce the increase in PPS caused by this fragmentation.

This seemed to me a bit problematic and premature (to say the least) w/o some form of PMTUD to go with it.

Since connection-oriented PMTUD for uTP is intended to be added in the upcoming 2.04, I'll list some ideas and questions here. I hope they can trigger more thoughts on the issue by others.

Some points were posted in this forum thread: http://forum.bittorrent.org/viewtopic.php?id=119

Some other issues/points:

1) Is using ICMP type 3 code 4 (fragmentation needed, i.e. datagram too big) possible at all on Windows under regular user privileges? I couldn't see those incoming packets from the Internet in Wireshark at all, only ones from my local router/gateway.

2) Provide a setting for the MTU (default to 0 = auto detect)

3) On startup, try to check the MTU to the local gateway and use it as the starting size (if the setting is 0)

4) If using the don't-fragment flag: start with the above size (as in 3) and modify the uTP data-retransmit mechanism. If the connection was otherwise OK and only the data wouldn't go through, make the second try with the DF flag cleared (fragmentation allowed). If the peer responds, set DF again and fall back from the initial size until you get a response.

5) It would be good to collect stats on the fragmented packet count in 2.03/2.02, so as to be able to compare them with the future performance of 2.04.

What are the numbers right now ?
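
Points 2 to 4 could be sketched roughly as follows. This is only an illustration in Python: the names `start_size` and `next_smaller` and the list of fallback sizes are assumptions made up for the example, not anything taken from the client.

```python
# Illustrative fallback sizes, largest first (an assumption for this sketch).
FALLBACK_SIZES = [1500, 1492, 1400, 1280, 576]


def start_size(mtu_setting, gateway_mtu):
    """Points 2 and 3: a setting of 0 means auto-detect, so use the gateway MTU."""
    return gateway_mtu if mtu_setting == 0 else mtu_setting


def next_smaller(size):
    """Point 4: return the next candidate size below `size`, or None if none is left."""
    for candidate in FALLBACK_SIZES:
        if candidate < size:
            return candidate
    return None
```

With the setting left at 0 and a gateway MTU of 1500, the sender would start at 1500 and step down through the list each time a DF packet of the current size gets no response.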

GTHK · July 15, 2010

I have some more points to add. Regarding ICMP 3-4: Wireshark on Windows is known to have some limitations in what it can capture, so I tested large pings on it and on a Linux box. The Linux box showed that when pinging an AT&T DSL node, an intermediate AT&T router sent an ICMP message just as expected, and also included the maximum supported MTU in the message (1492, which may be non-optimal). Windows ping, on the other hand, simply gave a "request timed out", and Wireshark didn't show any incoming ICMP. No software firewalls were running. Windows is known to perform PMTUD, so this seems to be a failure on the OS's part.

I've heard of plans to use these messages. It seems as though the only way to do this is with raw ICMP sockets, which need admin privileges (and hopefully actually show the relevant messages). Is this correct? What about Vista/7, where constant admin access is discouraged?

Also, for reporting your MTU to others, can you access Windows PMTUD data somehow? If nothing more appropriate exists, I found some code which seems to indicate that a packet-too-big error can be returned when using ping functions in an application. If that works, you could ping a host like bittorrent but with a TTL of 3 or 4, so that it only travels a few hops onto the ISP network, testing whether something like an ISP-supplied modem/router has a low MTU that might get in the way. If that could then be communicated in uTP, a connection could use the lower of the two values. This is roundabout, but I'm not finding a more appropriate method for Windows in my searching.

Firon · July 15, 2010

We will only implement the ICMP method. It's too difficult to do anything else. There's no known TCP implementation that does anything but the ICMP method (at least by default).

GTHK · July 15, 2010

Certainly unclean. I did read on some random website, though, that Microsoft did it at one point on their website (not in the OS), with a large size reduction for the retry. Grain of salt.

rafi · July 17, 2010

Firon:
We will only implement the ICMP method

More issues/questions to consider:

- Does Wine fully support this functionality ?

- Does Mac fully support this functionality ?

- Do all Windows versions (7/Vista/XP) support it, and at the same privilege level ?

The point is that maybe it's best to stick to the application level for the implementation/debug/test (like #4 above) when aiming for multiple platforms/OSes...

GTHK · July 17, 2010

All my searches say that admin privileges are needed for the ICMP method, which would mean Vista and Win 7 users are screwed by default. I also looked up Wine details: icmp.dll comes up as 75% complete, and raw socket support (needed for the ICMP method, afaict) does not appear to be among the available functions.

psorcerer · July 19, 2010

Hi,

PMTU handling for UDP traffic is not well defined in any socket implementation, AFAIK.

The only working solution I know of is L2TP PMTU

It's described here (draft from 1998...)

http://tools.ietf.org/html/draft-ietf-pppext-l2tpmtu-00

arvid · August 9, 2010

There are two main RFCs describing how to do PMTUD in TCP. The original one (RFC 1191) relies entirely on ICMP messages. The second one (RFC 4821) describes how to also take packet loss into account to guess the PMTU.

My understanding is that no mainstream TCP implementation implements the latter.

The reason why you would like to have the loss based PMTUD is because there are firewalls that block ICMP messages.

Our implementation in uTP sends one probe packet every RTT. A probe is a packet with the don't-fragment bit set. It uses an ICMP socket (SOCK_RAW, IPPROTO_ICMP) to receive packet-too-big messages, and adjusts the assumed PMTU accordingly.

If we experience packet loss specifically for the probe packets, that is taken as a signal to lower the assumed PMTU as well.

If you can receive ICMP messages, the MTU will converge very quickly. If you can't, it will take some loss and a bit longer.
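
For reference, the ICMP "fragmentation needed" message (type 3, code 4) carries the next-hop MTU in the last two bytes of its 8-byte header, as defined in RFC 1191. A minimal Python sketch of reading that field might look like this. `next_hop_mtu` is a hypothetical helper name, and the input is assumed to start at the ICMP header (a raw socket typically hands back the IP header first, which would need to be skipped).

```python
import struct

ICMP_DEST_UNREACHABLE = 3  # ICMP type
ICMP_FRAG_NEEDED = 4       # ICMP code: fragmentation needed and DF set


def next_hop_mtu(icmp_message):
    """Return the next-hop MTU from a 'fragmentation needed' message, else None."""
    if len(icmp_message) < 8:
        return None
    # Header layout: type, code, checksum, unused, next-hop MTU (network byte order).
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_message[:8]
    )
    if icmp_type != ICMP_DEST_UNREACHABLE or code != ICMP_FRAG_NEEDED:
        return None
    # Routers that predate RFC 1191 leave the field as zero: no usable hint.
    return mtu if mtu != 0 else None
```

When the field is zero or the message never arrives, the sender has only the loss signal to go on.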

rafi · August 13, 2010

Will MTU be dynamically adjusted when path MTU has changed ?

How can you tell if communication breaks temporarily due to MTU change, or some other unrelated permanent reason ?

What would be the overhead (sending redundant probe/packets) ?

More details on the above, and on how/when the DF flag is being changed during the connect period would be nice.

arvid · September 17, 2010

Will MTU be dynamically adjusted when path MTU has changed ?

Yes. As described above, we send probe packets regularly (once every RTT), which means that we'll adjust to changing PMTUs very quickly.

How can you tell if communication breaks temporarily due to MTU change, or some other unrelated permanent reason ?

We distinguish the case where _only_ the probe packets were lost, and no other packet. If communication breaks for other reasons or just congestive packet loss, chances are that some other packet is lost, or all packets in the RTT are lost.
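
As a rough illustration of that check (a Python sketch only; `probe_only_loss` and the use of sequence numbers to identify packets are assumptions for the example, not the actual uTP code):

```python
def probe_only_loss(lost_seqs, probe_seq):
    """True when the probe is the one and only packet lost in this round trip."""
    return set(lost_seqs) == {probe_seq}
```

Only a `True` result would be treated as an MTU signal. Any other loss pattern would be left to the normal congestion handling.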

What would be the overhead (sending redundant probe/packets) ?

There's no overhead at all in the stable state. Each probe packet that's lost will constitute some overhead. This is relatively small, and I would imagine that in practice it ends up being a handful of packets per connection. I have not measured this myself.

More details on the above, and on how/when the DF flag is being changed during the connect period would be nice.

I'm not sure what you mean by "the connect period". We set the DF flag on one packet; once that packet is ACKed, we set it on the next packet we send. When a probe is ACKed, we look at its size and adjust our MTU search range accordingly. If we receive duplicate ACKs, suggesting that a packet was lost, and that packet was a probe, we adjust our MTU search range again by capping it below the size of the probe packet.
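
The search range described above could be modelled along these lines. This is a hedged Python sketch, not the real implementation: the class name, the default bounds of 576 and 1500, and the choice to probe the midpoint of the range are all assumptions made for the example.

```python
class MtuSearch:
    """Tracks the range of packet sizes that might still be the path MTU."""

    def __init__(self, floor=576, ceiling=1500):
        self.floor = floor      # largest size assumed or known to get through
        self.ceiling = ceiling  # largest size that might still get through

    def next_probe_size(self):
        # Midpoint, rounded up so a range of two sizes still probes the larger one.
        return (self.floor + self.ceiling + 1) // 2

    def on_probe_acked(self, size):
        # The probe got through, so the path MTU is at least this big.
        self.floor = max(self.floor, size)

    def on_probe_lost(self, size):
        # Only the probe was lost: cap the range below the probe size.
        self.ceiling = min(self.ceiling, size - 1)

    def done(self):
        return self.floor >= self.ceiling
```

Each ACKed probe raises the floor and each lost probe lowers the ceiling, so the range narrows with every probe until both bounds meet.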

GTHK · September 17, 2010

If you can do it ok in the code, probing around specific sizes first may be a nice optimization. 1460 or something for DSL? Also whatever MTU results from various VPN tunnels.

That aside, proposal looks pretty good.

rafi · September 17, 2010

@arvid:

What was the end result, compared to w/o PMTUD ? I mean:

a. what was the average packet size compared to the fixed max - 1444 we have now ?

b. When you ran with and w/o it - how many fragmented packets did you monitor in both cases ? (any noticeable difference in PPS ?)

GTHK · September 18, 2010

It'll definitely be better with PMTUD: rather than using a fixed length, it'll detect the optimal size.

I don't know if you accept MTU suggestions from ICMP, but AT&T and probably others will recommend MTUs, 1490 in the case of the AT&T DSL network.

rafi · September 18, 2010

I was asking for test results, not speculations/theories...

edit:

Plus - it just does not work in 2.2 so I cancel the request for test-results (using 2.2...)

http://forum.utorrent.com/viewtopic.php?pid=522864#p522864

And I expect the starting MTU to be the NIC MTU of ~1500... not 1444.
