new protocal

pondini · December 2, 2006

firstly, i'm not trying to act like i'm a genious so please don't attack me for this post. i just need some feedback on my request. if i'm misguided just say so and i'll go away;)

here's a humble request to the utorrent coders:

create a new client protacal that patches all the known exploits, to be coded on two versions of utorrent; one for uploaders [a] and one for downloaders . [a] would send GETs to requesting an answer to it's question. the question could be 'whats the announce url' OR [a] asks to reply back with a four byte hash sig!

this part is tricky to explain so bear with me.

here's a GET packet i just invented as an example: 01 21 55

01=send a four byte hash sig [key] back to client [a]

21=byte one to be included in hash algorythm [random byte1]

55=byte two to be included in hash algorythm [random byte2]

client gets the request and perfoms a hash that uses the two random bytes from the GET packet, the torrent url, and EVERY byte of data that comprises client 's execution code. client [a] would obviously know the correct four byte key and if a SINGLE byte of data is altered in client 's code it will fail the hash and be disconnected.

here's an extremly primative algo to explain how the random bytes [rb1]/[rb2], the announce url , and the executable data [ex] from gets hashed:

1. muliply [rb1] with and put result in box-1

2. subtract [ex] from box-1 and put the result in box-2

3. add box-1 to box-2 and put the result in box-3

4. add boxes 1-3 together and put the result in box-4

5. repete steps 1-5 untill every byte of client 's executable data has been hashed

6. send boxes 1-4 [the four byte key] to [a]

we keep client honest by forcing it to use random bytes from our GET packet [to which already knows the answer.] the whole reason for performing the hash is to prevent anyone from re-writing the executable code used by client to report announce url's, up/down ratios, etc. remember, we are forcing to prove that none of its code has been altered.

the weekest link is client [a], but who really cares? if client [a] is decompiled and hacked, the worst thing that could happen is that paticular app could be used to share files for free...? so as long as an uploader aquires his app from a trusted source he will always be safe.

i truely believe that whoever creates a new protocal like this will be the prefered client as sites will ban all other clients.

edit: actually, there's no need for two different clients. each client could contain both the Q and A routines.

DreadWingKnight · December 2, 2006

And yet, once you have calculated/captured the hash value of a given binary, it's a simple matter to spoof that.

Sorry, but there's no point to this.

Firon · December 2, 2006

I moved it back for discussion purposes, flawed or not. Maybe someone else can improve on it, or something.

DreadWingKnight · December 2, 2006

The instant you put the trust in any other remote clients, you cannot verify any information about a given client.

Additionally, because you are not always connected to the same peers that each individual peer in your peerlist is connected to, you can't verify information from those outside peers.

Switeck · December 3, 2006

The only thing you can sort-of trust is the tracker (well, hopefully!) and the peers/seeds you've already tested.

However even with "untrusted" peers/seeds, you could tell how much they download/upload with you. That could be reported to the tracker and the tracker could cross-check it...but that assumes only a small fraction would lie about those numbers. You could also report the same info to all the peers/seeds you're connected to, but then the bandwidth overhead goes up immensely for very little gain in "reliability checking".

How to deal with hash fails is always a problem...either try to fall back to 1 seed or peer per chunk and be very slow downloading if near the end. Or try to record mulitple failed hash areas and see if a good one can be assembled by downloading the same section from different ips and reassembling them in random ways. ...Which is likely to be CPU-intensive and/or ram/hdd-intensive.

The network itself has to be both insular and open. Open in the sense that you can find a million files and millions of sources for those files. Insular in the sense that 1 million people cannot find you at once...or worse, try to download from you at once!

pondini · December 3, 2006

thanks for reopening the thread, firon:)

although dreadwing is generally correct in stating that a static hash check can be spoofed, it isn't because of this "once you have calculated/captured the hash value of a given binary, it's a simple matter to spoof that." in my post i outlined that the hash must include two bytes chosen by client [a], so client would need to account for that by spoofing every GET packet. i have PLENTY of experience in spoofing hash requests so i know how i would do it, but that's water under the bridge now since i re-evaluated my first post and came to the conclusion that it would just be a matter of time before it was hacked, resulting in a waste of time/effort.

however, i have a different appraoch that would be virtually impossible to spoof 'on the fly.' it's a trick that was deployed by d*recttv to detect code changes on their smartcards. in a nutshell, [a] sends a packet of dynamic data that must be ran as executable code to return an answer to [a]. a hacked version of client will never be able to process the hash [executable code] correctly, hence client [a] logs the user and disconnects.

the dynamic hash string packets could be easilly written by hobbyists and openly distributed on the net --trusted sources only as executable code can be dangerous-- for users to paste into their clients.

anyway, that's my attempt at a solution. hopfully someone else can build onto it.

ps. my original idea wasn't to address the issue of ratio cheating --i know virtually nothing about the inner-working of ratio reporting anyway. it was to address ghost-leechers. from what i've been told, all clients only request the announce url upon peer connection, so wouldn't second, third, or random queries be useful in preventing a peer from connecting then changing the announce? it just sounds too simple, so i must be wrong.

DreadWingKnight · December 3, 2006

The fundamental flaw in your protocol proposal is that it basically requires that peers form connection clusters that isolate themselves from the rest of the swarm in order to maintain connections that can validate.

Connection clusters are one of the things that a Microsoft "expert" claimed to be a major problem in the BitTorrent protocol (in their "Avalanche" research paper). Such clusters don't actually form in BitTorrent swarms in the wild, but are an extremely severe problem when they do happen.

The "ghost leecher" problem isn't always intentionally generated by downloaders. The big problem is that trackers aren't always equipped to handle the issue.

Peers are supposed to announce to the tracker at regular intervals (defined by the tracker, not the client) to inform them of their continued presence in the swarm and request new peerlists (if they don't have enough peers already).

from what i've been told, all clients only request the announce url upon peer connection

You've been told incorrectly. http://wiki.theory.org/BitTorrentSpecification and http://www.bittorrent.org have the correct information.

pondini · December 4, 2006

http://wiki.theory.org/BitTorrentSpecification

jeeze, thanks for the 'haystack' lol:) the only thing resembling a 'needle' that i found was this...

tracker id: A string that the client should send back on its next announcements. If absent and a previous announce sent a tracker id, do not discard the old value; keep using it.

that is from the TALK section. so yes the clients DO request the tracker url in subsiquent queries --i couldn't find a reference to the announce url that related to my question--, but if the info is 'not present' then it falls back to the initial url. i hope that is the info you intended me to find, otherwise a c&p would help me alot;) to my --limited-- knowledge this doesn't seem to hard to address.

The fundamental flaw in your protocol proposal is that it basically requires that peers form connection clusters that isolate themselves from the rest of the swarm in order to maintain connections that can validate.

my thought was that the protocal can be determined on a client-by-client basis. if i'm not mistaken there is something like this in place now to enable cross-client intermingling between cliets lit utorrent and shadow's client. and the commincation packets allow for bits to be set in the headers to determine protocals.

a hypothetical new client could operate dually with the old and new protocals, yes? if so the clients could choose to 'prefer' the new protocal were it available from the session initiator and vica versa.

it's obvios you guys know your stuff so i won't posture over my ideas. i just wanted to present a few ideas i had.

anyway, thanks for the input and have a good day:)

Sign In

new protocal

Recommended Posts

pondini

Link to comment

Share on other sites

DreadWingKnight

Link to comment

Share on other sites

Firon

Link to comment

Share on other sites

DreadWingKnight

Link to comment

Share on other sites

Switeck

Link to comment

Share on other sites

pondini

Link to comment

Share on other sites

DreadWingKnight

Link to comment

Share on other sites

pondini

Link to comment

Share on other sites

Archived

Browse

Activity