rnr

Merkle hash torrent extension

I hope it won't ever happen exactly as described in BEP30.

The No. 1 candidate for future p2p protocols is indeed TTH. It is fast and is already used in DC and Gnutella2, two networks only weakly related to each other (not counting ADC and Gnutella1).

AICH and the SHA1-based Merkle torrent hash just add pain to otherwise comfortable file management. The SHA1-based MTH is, IMHO, a reinvented wheel, and a bad one. Let it die unborn. Compatibility should be preferred unless there are strong reasons against it. I can see no reason to abandon TTH (which respects file borders) in favor of a SHA1 tree hash that is used nowhere else and does not respect file borders.

A better approach is to compute the TTH of every file. If a torrent has more than one file, compute a hash of the directory. The idea is described here: http://www.adcportal.com/wiki/index.php/Proposed_Extensions#Hash_set_extension . The reference implementation can be taken from DC clients.

TTH, being a Merkle tree, allows cuts at different layers. Put simply, one can choose blocks of different sizes, and the root hash remains the same. We can put a small hashset into .torrent files and let clients expand it by sending a deeper layer of the Merkle tree. An embedded hashset allows webseeding (when there is no 'pieces'), and sending deeper layers allows fine-grained block checking. Integrity is guaranteed by the Merkle tree structure.
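To illustrate the point about layers, here is a minimal Python sketch. It is not real TTH: Python's hashlib has no Tiger, so SHA-256 is used as a stand-in, and the 1024-byte leaves, the 0x00/0x01 prefixes and the promotion of an unpaired node follow the THEX conventions as I understand them. All function names are made up for the example.

```python
import hashlib

LEAF_SIZE = 1024  # assumed THEX-style leaf block size, in bytes


def _h(data: bytes) -> bytes:
    # Stand-in hash: real TTH uses Tiger, which hashlib does not provide.
    return hashlib.sha256(data).digest()


def leaf_hashes(data: bytes) -> list:
    """Hash every 1024-byte block with a 0x00 prefix (leaf marker)."""
    blocks = [data[i:i + LEAF_SIZE] for i in range(0, len(data), LEAF_SIZE)] or [b""]
    return [_h(b"\x00" + block) for block in blocks]


def parent_layer(layer: list) -> list:
    """Hash pairs with a 0x01 prefix; an unpaired last node is promoted unchanged."""
    out = []
    for i in range(0, len(layer), 2):
        if i + 1 < len(layer):
            out.append(_h(b"\x01" + layer[i] + layer[i + 1]))
        else:
            out.append(layer[i])
    return out


def all_layers(data: bytes) -> list:
    """Return every layer, from the leaves (index 0) up to the root (last)."""
    layers = [leaf_hashes(data)]
    while len(layers[-1]) > 1:
        layers.append(parent_layer(layers[-1]))
    return layers


def root_from_layer(layer: list) -> bytes:
    """Fold any single layer (any block size) up to the root hash."""
    while len(layer) > 1:
        layer = parent_layer(layer)
    return layer[0]


data = bytes(range(256)) * 37          # 9472 bytes, i.e. 10 leaves
layers = all_layers(data)
root = layers[-1][0]
# A coarse hashset (4096-byte blocks) and the full leaf layer give the same root.
print(root_from_layer(layers[2]) == root, root_from_layer(layers[0]) == root)
```

A .torrent could carry a short layer such as `layers[2]`, and a client receiving `layers[0]` from a peer can check it by folding it up and comparing against the known root.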

The root hash must be present inside 'info', but the hashset must be embedded somewhere outside of 'info' so that hashsets with different block sizes don't change the BTIH.
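A rough Python sketch of why the placement matters: the BTIH is the SHA-1 of the bencoded 'info' dictionary, so anything outside 'info' can vary freely. The 'root hash' key is the one BEP30 uses; the 'tth hashset' key, the tracker URL and the all-zero hashes are placeholders invented for this example, and the bencoder is a minimal one written just for the demonstration.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, byte strings, text, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be sorted as raw bytes.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def btih(torrent: dict) -> str:
    """BTIH covers only the bencoded 'info' dictionary."""
    return hashlib.sha1(bencode(torrent["info"])).hexdigest()


root = bytes(24)  # placeholder for a 24-byte TTH root
info = {"name": "example.bin", "length": 4096, "root hash": root}

# 'tth hashset' is a hypothetical key name that lives outside 'info'.
coarse = {"announce": "http://tracker.example/announce", "info": info,
          "tth hashset": bytes(24) * 2}
fine = {"announce": "http://tracker.example/announce", "info": info,
        "tth hashset": bytes(24) * 4}

print(btih(coarse) == btih(fine))  # same BTIH for both hashset sizes
```

If the hashset were moved inside 'info', every choice of block size would produce a different BTIH and therefore a different swarm.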

One of the problems is that a Tiger digest (and therefore a TTH root) is 192 bits, longer than the 160-bit SHA1 digest used for BTIH, so server software should be modified to handle longer announce arguments. Of course, the BTIH can still be calculated and used as always, but the directory hash is more universal, so I think clients should try to use it in the DHT and in tracker announces.

Or maybe torrent clients should use DC's DHT and Gnutella-style Content-Addressable Web (CAW) N2R requests instead of TTH announces? The CAW draft can be found here: http://www.peeep.us/cfa0a518 . This question was left open in BEP30, and I leave it open as well.

Compatibility with older clients is achieved by including both 'pieces' and the TTH root hash. This doesn't mean the amount of information is doubled: the TTH hashset is only as big as required for webseeding. Web sources are less likely to fail. The TTH hashset must actually be used when there is no 'pieces' part. In the absence of webseeding it should be enough to embed one root hash and one file permutation per .torrent. The permutation is used in directory torrents. I haven't checked, but the TTHs are likely to be sorted before the directory hash is created, so the permutation must restore the mapping of TTHs to file names. As you can see, the .torrent size is not doubled at all.
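A small Python sketch of the permutation idea, with loud caveats: as said above, I have not checked how the directory hash is really built, so the "sort the per-file hashes, concatenate, hash" rule here is purely an assumption, SHA-256 stands in for Tiger, and the function names are invented.

```python
import hashlib


def directory_hash(file_hashes: list) -> tuple:
    """Return (directory digest, permutation) for per-file hashes in file order.

    permutation[k] is the index of the file whose hash is k-th after sorting.
    """
    permutation = sorted(range(len(file_hashes)), key=lambda i: file_hashes[i])
    sorted_hashes = [file_hashes[i] for i in permutation]
    # Assumed construction: hash the concatenation of the sorted per-file hashes.
    digest = hashlib.sha256(b"".join(sorted_hashes)).digest()
    return digest, permutation


def restore_order(sorted_hashes: list, permutation: list) -> list:
    """Map sorted hashes back to the original file order."""
    restored = [b""] * len(sorted_hashes)
    for k, file_index in enumerate(permutation):
        restored[file_index] = sorted_hashes[k]
    return restored


# Placeholder 24-byte "TTHs" for three files, listed in file-name order.
hashes = [bytes([3]) * 24, bytes([1]) * 24, bytes([2]) * 24]
digest, permutation = directory_hash(hashes)
print(permutation)
print(restore_order(sorted(hashes), permutation) == hashes)
```

Under this assumption the directory digest does not depend on file order, which is why the permutation has to be stored to get the hash-to-file-name mapping back.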

Also, consider that TorrentBuild can create TTH-rich torrents right now, but it does so in a slightly different manner. When there is just one file, everything is OK: we have a root hash, which is the file hash, and no permutation is needed. However, when there are several files, it creates a TTH for every file. Torrent clients should be able to reuse torrents created by TorrentBuild. Personally, I think per-file TTH is mostly fine except when there are plenty of files. In that case a directory hash is preferred to avoid doubling the size. But this doesn't happen very often, so torrents made by TorrentBuild are pretty fine right now. Even when there are plenty of files, I think per-file TTH is OK, since all this is about decreasing the total size of .torrent files stored on a server, and if a few rare ones are big, it doesn't matter much.

When there is no TTH-enabled BitTorrent client online, TTH can be used to look for files in other p2p networks and in other torrents (via yet another to-be-done protocol extension).

WDYT?
