DHT node ID value.

zorgiev · May 7, 2009

Hello everyone.

I have an interesting question to be asked and will ask it right now. bitTorrent DHT protocol spec clearly states about metric distance and so on. But look closer. It states that metric is calculated as |A xor B| and the _SMALLER_ values are closer. Consider SMALLER statement. Since it was written by someone we can say following: Smaller means - there exists a binary relationship between two distance values like <(d1,d2). I'll write it in more usual manner d1 < d2. Now remind the DISTANCE is calculated by |A xor B| where A and B are the elements of 160-bit space. And thereby in conjunction to "bitwise XOR on two elements gives the element from the same space as XOR operands" statement we can say that distance metric is from the same 160-bit space. Let us take a closer look at an ordering statement mentioned so far: a < b. We can adopt existing arithmetic value comparison algorithms to these 160-bit values after a bit of modification. But,

QUESTION: How do we have to treat a 160-bit value transfered over the bitTorrent clients? Shall we treat it as a big-endian or little-endian? I.e. what is the byte-ordering of such a value transferred over the network so far?

Yours, sincerely.

Ultima · May 7, 2009

I can't say I know the exact answer to your question (or really, the entire context given with it ), but um... I would assume anything transferred would be using network byte order unless otherwise explicitly stated, meaning "probably big-endian."

Edit: You'll also notice that anything mentioned in the DHT draft spec regarding byte order specify network byte order as the endianness.

zorgiev · May 8, 2009

Ok, I completedly agree with you. But, by the way, I have investigated both libTorrent(rakshasa) and libTorrent(rasterbar) and even original KAD eMule client sources. All of them compare IDs in a loop like this:

// Pseodocode
bool isLessThen(ElementType * a,
                ElementType * 
{
    int const bufSiz = ...;//Some ID buffer constant size.
    for (int i = 0; i < bufSiz; ++i)
    {
        if ( a[ i ] < b[ i ] )
        {
            return true;
        }
        else if ( a[ i ] > b[ i ] )
        {
            return false;
        }
    }

    return false;
}

This solution is incorrect in case of little-endian(Intel x86 and various other less-used CPUs).

It is correct in case of big-endian architecture. Consider little-endian architecture. You need first to reverse-order bytes in buffer and each of them should be converted from network order to little-endian.

I.e. on ia32 and other little-endians the lowest byte is the first - a[ 0 ] rather then last. Thus you need to iterate in backward-direction to achieve desired result.

It is like comparing bytes in a big-endian manner while each byte is compared to other one in a little-endian manner. So what is the result of such a comparison? Such code won't be working fine on big-endians because produced buckets and comparison results will differ from the same achieved with little-endians.

This issue also makes bucket division more complicated because you have to implement long-integer division algorithm instead of just shifting bits.

So I think it is worthful to add some explicity about this to protocol spec.

Yours, sincerely.

Sign In

DHT node ID value.

Recommended Posts

zorgiev

Link to comment

Share on other sites

Ultima

Link to comment

Share on other sites

zorgiev

Link to comment

Share on other sites

Archived

Browse

Activity