This topic is now archived and is closed to further replies.

fareedrizkalla

Torrent file structure

I'm writing a program that parses the bencoded contents of a torrent file.

I've looked in several places, but some parts are still unclear to me.

How should I interpret the hashes, and how are they stored?

I know bencoding uses colons to structure the data.

The hashes come after the last "pieces" entry.

A SHA-1 hash is 20 bytes long, and I presume a colon separates each piece hash from the next.

They don't seem to be of fixed length, and they are encoded in binary form.

If anyone could provide any assistance, it would be greatly appreciated.


This was one of the earlier resources I used.

But I can't understand how "pieces" relates to the number following it, which isn't wrapped in i and e the way an integer such as i3e is.

Should I count off every 20 bytes after "pieces" as one hash? If so, how are the hashes organized in a multiple-file torrent?


12:piece lengthi524288e6:pieces14020:

So we have 12 and a colon introducing the string "piece length", followed by an integer giving the number of bytes in each piece. Then we have 6 and a colon introducing "pieces". After "pieces" there is a number that is not wrapped in the integer markers, followed by a colon. Is the colon after "14020" where the hash data starts, and should I interpret every 20 bytes after it as one SHA-1 hash? I also know that UTF-8 uses a variable number of bytes per character.
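To make the length-prefix reading above concrete, here is a minimal Python sketch that reads one bencoded byte string from the sample. The helper name read_byte_string is made up for illustration and is not from any library; the sample is cut off right after the "pieces" key because the real hash data is not shown in this thread.

```python
def read_byte_string(data: bytes, pos: int):
    """Read one bencoded byte string starting at pos; return (value, next_pos)."""
    colon = data.index(b":", pos)      # the colon ends the decimal length prefix
    length = int(data[pos:colon])      # e.g. b"12" -> 12
    start = colon + 1
    end = start + length
    if end > len(data):
        raise ValueError("byte string runs past the end of the data")
    return data[start:end], end

# Sample from the post, truncated after the "pieces" key.
sample = b"12:piece lengthi524288e6:pieces"

key, pos = read_byte_string(sample, 0)
print(key, pos)  # b'piece length' 15
```

Under this reading, "14020:" would be the same kind of length prefix: the next 14020 bytes after that colon are the value, whatever bytes they happen to contain.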


My English isn't bad, but it isn't good enough for technical documents. What do you mean by it isn't UTF-8 encoded? What bothers me most is whether I am supposed to parse the value 14020 after "pieces". If so, why isn't it surrounded by an i and an e? I'm starting to understand that the last piece of information in the metadata structure is the hashes, and that I have to count off every 20 bytes for one hash.

"am I supposed to parse the value 14020 after pieces. If so why isn't an i and e surrounding it"

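You need to re-read the bencode specification.

The "i" before a value is ONLY for integers.

The "e" after a value is ONLY used to close integers, lists and dictionaries.

Byte strings are written as <length>:<data> with no closing "e", so 14020 is the length in bytes of the "pieces" string, not an integer value.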
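The rules above can be sketched as a small recursive decoder in Python. This is only an illustration under my own assumptions: the function name bdecode and its (value, next_pos) return shape are hypothetical, and it does not validate everything a real parser should (leading zeros, key ordering, and so on).

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                       # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                       # list: l<values>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                       # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                     # byte string: <length>:<bytes>, no "e"
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {pos}")

# Made-up dictionary with a 40-byte "pieces" value (two placeholder hashes).
sample = b"d12:piece lengthi524288e6:pieces40:" + bytes(range(40)) + b"e"
info, end = bdecode(sample)
print(info[b"piece length"], len(info[b"pieces"]))  # 524288 40
```

Note that the string branch never looks at the content of the bytes it reads, which is why binary hash data can sit inside the file without confusing the parser.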


I managed to figure out how the hashes are stored. I had always thought that any letters before a colon were meant to be readable text.

So the hashes are stored in an odd character format. Do they have to be decoded or reinterpreted?


Those are byte strings, which may or may not be plaintext.

It may look odd, but the hashes are stored this way because it is a more space-efficient representation. Instead of spending 1 byte per hex character (0-F), it spends 1 byte per pair of hex characters (since 1 byte can represent 00-FF), so each SHA-1 hash takes 20 bytes instead of 40 characters.
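As a rough Python illustration of that point, the raw "pieces" value can be cut into 20-byte chunks, and each chunk only needs converting to hex if you want to display it. The helper names split_piece_hashes and piece_matches are my own, and the data below is a placeholder rather than a real torrent. With the 14020 from this thread, the same split would give 14020 / 20 = 701 hashes.

```python
import hashlib

HASH_LEN = 20  # a SHA-1 digest is 20 raw bytes (40 hex characters)

def split_piece_hashes(pieces: bytes) -> list:
    """Split the raw "pieces" byte string into one 20-byte digest per piece."""
    if len(pieces) % HASH_LEN != 0:
        raise ValueError("pieces length is not a multiple of 20")
    return [pieces[i:i + HASH_LEN] for i in range(0, len(pieces), HASH_LEN)]

def piece_matches(piece_data: bytes, expected: bytes) -> bool:
    """Compare the SHA-1 of downloaded piece data with its stored digest."""
    return hashlib.sha1(piece_data).digest() == expected

# Placeholder data standing in for a real "pieces" value.
fake_pieces = hashlib.sha1(b"piece 0").digest() + hashlib.sha1(b"piece 1").digest()
hashes = split_piece_hashes(fake_pieces)
print(len(hashes))          # 2
print(hashes[0].hex())      # 40 hex characters, for display only
```

No decoding is needed to verify data: the stored bytes are compared directly with the digest of the downloaded piece.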
