
List of all torrents downloaded to be included in a database


neoultimatum


Hey..

Thought it would be neat if utorrent kept a list of all torrents downloaded in the past, stored in a small, smart database that it would check whenever a file is downloaded. That way, if we've already downloaded that file in the past, utorrent would pop up with a "Hey, idiot.. You've already downloaded this file in the past. Are u sure u wanna download it again?" message, or something similar! :P

That'd save me from having to keep referring to my huge collection of CDs whenever I'm in doubt as to whether I've already got something!

In the future...such a resource might be extended to have a smart search for similar titles, so that even if we got an avi with a similar name, utorrent would alert us...
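A rough sketch of what such a history database could look like, using Python's built-in sqlite3 and difflib modules. This is only an illustration of the idea, not anything uTorrent actually does: the table layout, the function names and the choice of keying on the torrent's infohash are all assumptions.

```python
import difflib
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) a hypothetical download-history database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "infohash TEXT PRIMARY KEY, "
        "name TEXT NOT NULL, "
        "added TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return db


def record_download(db, infohash, name):
    """Remember a torrent once it has been added; duplicates are ignored."""
    db.execute(
        "INSERT OR IGNORE INTO history (infohash, name) VALUES (?, ?)",
        (infohash, name),
    )
    db.commit()


def already_downloaded(db, infohash):
    """Return the stored name if this exact torrent was seen before, else None."""
    row = db.execute(
        "SELECT name FROM history WHERE infohash = ?", (infohash,)
    ).fetchone()
    return row[0] if row else None


def similar_titles(db, name, cutoff=0.8):
    """Return previously downloaded names that look similar to `name`."""
    names = [row[0] for row in db.execute("SELECT name FROM history")]
    return difflib.get_close_matches(name, names, n=5, cutoff=cutoff)
```

The exact lookup covers the "you've already downloaded this" popup, and the fuzzy lookup covers the "similar titles" extension. The 0.8 cutoff is an arbitrary starting point and would need tuning.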

Just thought it'd make utorrent even smarter than it already is!!


Normally uTorrent makes a copy of every torrent file you use and stores it in a certain folder (usually ...\Application Data\uTorrent) so if you want to know if you've already downloaded something (or you want to download something again) the corresponding torrent should be there. Also check out the "Store .torrents in:" function under preferences.
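If you'd rather not eyeball that folder by hand, a few lines of Python can search it. This is just a sketch: the function name is made up, and you have to pass in whatever folder your own install actually uses.

```python
from pathlib import Path


def find_saved_torrents(folder, keyword):
    """Return the names of .torrent files in `folder` whose name contains `keyword`.

    `folder` is whatever directory your client stores its .torrent copies in;
    the match is a simple case-insensitive substring test.
    """
    needle = keyword.lower()
    return sorted(
        p.name for p in Path(folder).glob("*.torrent") if needle in p.name.lower()
    )
```

Since this only looks at the .torrent file names, a renamed torrent will be missed.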


  • 1 month later...

I just discovered that folder and it has a copy of every torrent I've downloaded since Christmas morning 2005 (no I didn't get Utorrent for Christmas :P)! I didn't know that folder even existed until now. How do I clear it other than manually deleting the files? I don't see a setting for it in the Utorrent settings.


  • 1 month later...

I'm not clear on why this requested feature -- comparing a current torrent to a DB of previously downloaded files and indicating (e.g., with color coding) what files in the torrent have already been DL'd -- is being summarily rejected.

Torrent files get renamed a lot, so having something like a Tiger-Tree Hash per file and the necessary uT functionality would allow a user to see only new files in a torrent. So that's the personal user benefit.

But here's the BIG public benefit: if you use a DB like DC++ (or even just use an existing DC++ DB, so uT doesn't even have to add code to build one, just code to read one), then the files a user already has MUST be available for DL'g to other users (in other words, a file in a torrent would be path-checked using the DB info to make sure it exists before it could be marked as existing) -- thus a user with existing files becomes a "super re-seeder".

There are many times where I've DL'd a new torrent where I have many (sometimes most) of the files, so I'm only interested in new files. I have to laboriously go through such torrents and find the new ones -- but other torrent users get no benefit from my existing collection of DLs. It would be a benefit to me AND to others if uT had a comparison DB function, so that, while I'm DL'g the 10% or so of new files from a torrent, others can UL from the 90% of the files I already have. By using something like TTH, file name changes become irrelevant -- uT would allow uploading of an identical hash file, but "lie" about the file name (even if it had to be so brute force as to copy the identical file to a temp file with the "correct" torrent name).


The name of the .torrent file itself is unimportant. The filenames of the files inside the torrent are specified along with their sizes, and the SHA1 hashes, computed per piece according to the piece size, sit alongside them in the INFO dictionary. I wouldn't say it has been summarily rejected, but the fact remains that resume.dat stores all the data for loaded torrents. You're free to make an exporter to your DB of choice; uT is not a file manager. It just downloads data from torrents.
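To make the INFO dictionary point concrete, here is a minimal sketch of how a multi-file info dictionary is put together and hashed. The bencoder is a bare-bones one written for this example, and the helper names (`bencode`, `build_info`, `infohash`) are made up; it illustrates the layout only and is not uT's code.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def build_info(name, files, piece_length):
    """Build an info dictionary from (path_parts, data) pairs.

    All files are treated as one continuous byte stream that is cut into
    fixed-size pieces, so a piece can span the boundary between two files.
    """
    stream = b"".join(data for _, data in files)
    pieces = b"".join(
        hashlib.sha1(stream[i:i + piece_length]).digest()
        for i in range(0, len(stream), piece_length)
    )
    return {
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,  # concatenated 20-byte SHA1 digests
        "files": [{"length": len(data), "path": list(path)} for path, data in files],
    }


def infohash(info):
    """The torrent's identity: SHA1 of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()
```

Note that nothing in here is a per-file hash: the SHA1s cover pieces of the combined stream, which is why a client can't tell from the .torrent alone that a renamed file on disk is the same file.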

Your super re-seeder idea is a bit easier to accommodate in 1.8, as you can now re-path individual files in multi-file torrents to separate locations. I don't think any more aggressive checking is required in a torrent client.. it uses SHA on data to verify integrity... if you think your file is in the torrent already, right click in Files tab, relocate it, right click on the torrent, force recheck... and uT checks it for you.
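For anyone curious what a force recheck boils down to, here is a simplified sketch: read the files in torrent order, cut the combined data into pieces, and compare each piece's SHA1 with the hash stored in the torrent. The function name and arguments are made up for the example, and a real client would stream from disk rather than load everything into memory.

```python
import hashlib
from pathlib import Path


def recheck(local_paths, pieces, piece_length):
    """Return one True/False per piece.

    local_paths  -- the files on disk, in the same order as in the torrent
    pieces       -- the concatenated 20-byte SHA1 digests from the torrent
    piece_length -- the torrent's piece size in bytes

    Assumes every file has the size the torrent expects; a wrong-sized file
    would shift all later piece boundaries.
    """
    stream = b"".join(Path(p).read_bytes() for p in local_paths)
    expected = [pieces[i:i + 20] for i in range(0, len(pieces), 20)]
    results = []
    for index, want in enumerate(expected):
        chunk = stream[index * piece_length:(index + 1) * piece_length]
        results.append(hashlib.sha1(chunk).digest() == want)
    return results
```

A piece that straddles two files only passes if both neighbouring files are correct, so a single relocated file may leave its first and last pieces unverified until its neighbours are present too.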


It's simply not practical to manually check each file in a new torrent by right clicking, finding the directory and file that I think might be the same, and forcing a recheck -- for 100 files (the torrents I DL are mostly large file collections, NOT a few large files). Repetitive operations are for machines to do.

I'm not looking for uT to be a file manager, and the function I (and others) have suggested isn't really file management. It's a user friendly way of increasing the number of files shared to all torrent users while easing manual duplicate checking by each particular user.

The suggested feature has been implemented in a number of DC++ clients (e.g., ApexDC++) -- they color code which files in a share have already been DL'd, to cut down on bandwidth wasted in DL'g duplicate files. While obviously not a necessary feature in uT, not only would it make ease of use greater, BUT it would also SIGNIFICANTLY amplify the number of files shared in torrents. I go back to my example -- while I'm DL'g only new files in an updated torrent, all other users can see all of my existing files (in "shared" directories, of course) and UL from those. Thus, an initial seeder INSTANTLY gets multiple "re-seeders" for the portions of a torrent that other users already have.

See also ComicTree 1.3.2, which I use as a partial solution. It reads a torrent file and compares the file names and sizes listed in the torrent to a DB (that it generates) of file names and sizes of selected directories. However, since it doesn't use hashes, it's not 100% accurate -- it gets tripped up on name changes (e.g., some files use spaces for spaces, but some use underscores as substitutes for spaces, and in other cases, file names have been significantly altered, which counts as a non-match).
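A name-and-size comparison of that kind can be sketched in a few lines. To be clear, this is not ComicTree's actual code, just an illustration of the approach with made-up function names; it treats spaces and underscores as the same thing, which covers the simplest renames but still fails on significantly altered names.

```python
import re


def normalize(name):
    """Lower-case a file name and treat runs of spaces/underscores as one space."""
    return re.sub(r"[\s_]+", " ", name).strip().lower()


def match_by_name_and_size(torrent_files, local_files):
    """Return the torrent file names that have a local file of the same
    (normalized) name and the same size.

    Both arguments are lists of (name, size_in_bytes) pairs.
    """
    local_index = {(normalize(name), size) for name, size in local_files}
    return [
        name for name, size in torrent_files
        if (normalize(name), size) in local_index
    ]
```

Without a content hash, a match here is only a strong hint; two different files can share a name and a size.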

I'm new here -- is it the norm in this Feature Requests forum to respond essentially with "program it yourself if you want it so bad" when a feature is requested and the benefits pointed out? I would have thought that it would have been sufficient to say "that's an interesting/bad/technically challenging idea, and we're not likely to do it."

Anyway, the message is clear, so I'll move on.


uTorrent is not a gnutella/direct connect client :/ It may be true that other p2p-network(s)/clients use, handle, or apply multiple hashes, but bittorrent only works on SHA1. I can understand the brusqueness coming off as insincerity, but in this case I think the request goes against the idea of a feature-filled, low-resource client.

Bittorrent does not care about any other hashing mechanism for data. uTorrent does not deal with any protocol other than bittorrent. Even if tiger hash was applied (and the same goes for any other algorithm), the standard tiger tree mechanism you are referencing, which applies in Gnutella and DC networks and works at a max data granularity of 1 KiB, adds immense overhead without any applicable use. Sure, it could be applied per file for this function, adding another .dat file to create a "list of loaded files database", but that's totally outside bittorrent. It's not that it's not interesting or too technically challenging... it just does not apply to a bittorrent client.


1) Please note that I'm not trying to make uT into some other p2p system. I'm suggesting that the bittorrent system can be made more efficient by NOT "forgetting" existing state (i.e., ignoring already downloaded files that don't happen to be in a designated DL directory).

2) I understand that torrents deal in blocks, not files per se, and that the SHA-1 hash is applied per block.

3) I understand that what I'm suggesting requires a revision to the torrent file data structure -- and thus would need community support to be fully useful. However, a TTH root is 39 characters long in its usual Base32 text form (24 bytes raw), so adding a per-file TTH hash to a 100-file torrent file would increase the file size by about 4Kbytes -- less than a 10% increase and thus negligible. Computation time for the TTH hash would hardly be noticeable for 100 files, and is a one-time event per torrent file. As you note, the hashes could be consolidated into a single ".dat" type file, or simply included at the end of the torrent file (with suitable delimiters).

4) The TTH hash has the advantage of being in use by DC++ clients, so the code is readily available for constructing a hash DB for existing files. But it could be any hash, so long as it's per file. Torrent clients could use the added hashes or not, as they chose, and torrent builders could add the hashes or not, as they chose. But when the hashes exist, a new opportunity for greatly increasing the efficiency of the entire torrent system arises.

5) There IS an applicable use -- decreasing wasted bandwidth due to duplicated downloads, making DL management easier for users, and, most importantly, dramatically increasing the number of files available for UL when a torrent file includes "old" material that many people already have.
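The size estimate in point 3 is easy to sanity-check. A Tiger digest is 192 bits (24 bytes), and Base32 text of that length comes to 39 characters once the padding is dropped. The snippet below only does the arithmetic; it ignores any bencoding or delimiter overhead, and the function names are made up.

```python
import base64

TIGER_DIGEST_BYTES = 24  # a Tiger / TTH root digest is 192 bits


def tth_text_length():
    """Length of a Base32-encoded 24-byte digest with the '=' padding removed."""
    return len(base64.b32encode(bytes(TIGER_DIGEST_BYTES)).rstrip(b"="))


def added_bytes(n_files, per_hash=None):
    """Bytes added to a torrent by storing one hash per file (hash data only)."""
    if per_hash is None:
        per_hash = tth_text_length()
    return n_files * per_hash
```

For 100 files that works out to 3900 bytes as Base32 text, or 2400 bytes if the raw digests were stored instead.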

However, I'm resigned to the fact that it ain't going to happen.


No, the DB would be persistent, similar to the DB that DC++ clients use (see ApexDC++ for an example). In fact, uT could parasitize an existing DC++ share DB -- just set a local path to an existing HashIndex.xml file (which contains a root TTH per file, and the corresponding local path for each file).

The process (assuming that torrent files include a per-file hash) would be rather simple: IF the torrent file has a per-file hash (either bencoded at the end, for example, or in a distinct file, such as BTFileHashes.dat), then read a file hash & look it up in the designated share DB. If matched, set an attribute (e.g., color) for the corresponding torrent file to indicate match AND set the local path to the matched file as the UL path for that torrent file. Repeat until done with all file hashes, then run re-check for good measure (to deal with torrents being concerned with blocks, not files).
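In rough Python, the lookup loop would be something like the sketch below. The share DB is represented as a plain dict of hash to local path; how that dict gets filled (from HashIndex.xml or anything else) is left out, and all the names here are hypothetical.

```python
import os


def match_against_share(torrent_hashes, share_index, exists=os.path.exists):
    """Map torrent files to local files that are already in the share DB.

    torrent_hashes -- {path inside the torrent: per-file hash}
    share_index    -- {per-file hash: local path}, a stand-in for the share DB
    exists         -- path check, so a stale DB entry is never marked as present

    Returns {path inside the torrent: local path} for every confirmed match.
    The caller would then color those files, set their UL paths, and run a
    normal re-check so that the piece hashes have the final say.
    """
    matches = {}
    for torrent_path, file_hash in torrent_hashes.items():
        local_path = share_index.get(file_hash)
        if local_path is not None and exists(local_path):
            matches[torrent_path] = local_path
    return matches
```

A torrent with no per-file hashes simply yields an empty `torrent_hashes`, so nothing matches and the torrent is handled as it is today.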


Yes, as I stated above, I know that; the proposal would require adding per file hashes when creating a torrent, such as by a bencoded addition to the .torrent file (while computing SHA1 hashes per block, TTH or the like hashes would be computed for the constituent files). If the hashes existed, then uT would use them; otherwise, processing of a torrent file lacking such hashes would be the same as the current version of uT.
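As a sketch of the creation side: each file entry gets one extra, optional field, and a client that doesn't find it just carries on as before. The key name "filehash" is made up for this example, and SHA1 stands in for TTH only because Python's hashlib has no Tiger implementation.

```python
import hashlib


def file_entries_with_hashes(files):
    """Build per-file entries from (path_parts, data) pairs, adding a
    hypothetical "filehash" field next to the usual length and path."""
    return [
        {
            "length": len(data),
            "path": list(path),
            "filehash": hashlib.sha1(data).hexdigest(),  # stand-in for a TTH root
        }
        for path, data in files
    ]


def per_file_hashes(entries):
    """Return {path: hash} for the entries that carry a per-file hash.

    An empty result means the torrent was built without them, so the client
    falls back to its normal behavior.
    """
    return {
        "/".join(entry["path"]): entry["filehash"]
        for entry in entries
        if "filehash" in entry
    }
```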

Obviously all of this isn't much use for torrents comprising one file, such as movies or the like. The real use is for torrents that generally have dozens to hundreds of files, like comics.


What I would like is an option to select the properties on a specific file and select what part or parts U want to download. I've often found parts of a series to D/L, but not all, unless U find it in its entirety. Then U have to D/L the whole thing, whereas if U could select, U would be able to un-check what U already have.


Archived

This topic is now archived and is closed to further replies.
