Better handling of Unix-made torrents in Windows (same name but different letter case)


mariushm

I'm currently downloading (or at least trying to download) the 600 GB Geocities torrent from tpb and it's really a pain.

First of all, the torrent was made with a 16 MB chunk size, and while uTorrent works, it breaks down way too often with messages like this in the log:

[2010-12-05 08:34:57] IO Error:1168 line:395 align:-99 pos:-99 count:131072 actual:66767

... and then comes the joy of doing a hash check again on 200 GB of incomplete files for 1 hour, only to download for 20 minutes and have it break down again. This doesn't happen with torrents with regular chunk sizes.

Now, I'm not 100% sure this is caused by the chunk size, because a far bigger problem is that the torrent was created on Linux and the person who made it used file names that differ only in upper/lowercase letters:

[2010-12-05 08:34:57] ReadFile error: LOWERCASE\geocities-e-g.7z.001:0:2520484:2520484:3

This is because there's also a file LOWERCASE\geocities-e-G.7z.001 in the same folder.

And of course, when this happens uTorrent breaks down, and I don't see any option to just let it continue with a warning and simply disable downloading that file.

About 130 of the roughly 2000 files in the torrent have an upper/lowercase conflict, so disabling each file individually is a pain. I have to disable BOTH versions AND the files BEFORE and AFTER these two bad files (because of partial chunks), not to mention that uTorrent forgets what was disabled when I have to do a re-check.
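A minimal sketch of how such conflicts could be found from the torrent's file list. The function name `find_case_conflicts` is hypothetical, and `casefold()` is only an approximation of how Windows compares file names:

```python
from collections import defaultdict

def find_case_conflicts(paths):
    """Group paths that become identical once letter case is ignored."""
    groups = defaultdict(list)
    for path in paths:
        # Normalize separators, then fold case. This only approximates
        # the comparison a case-insensitive filesystem performs.
        key = path.replace("\\", "/").casefold()
        groups[key].append(path)
    # Only groups with more than one spelling are conflicts.
    return [names for names in groups.values() if len(names) > 1]

files = [
    "LOWERCASE\\geocities-e-g.7z.001",
    "LOWERCASE\\geocities-e-G.7z.001",
    "LOWERCASE\\geocities-e-h.7z.001",
]
conflicts = find_case_conflicts(files)
for group in conflicts:
    print(" <-> ".join(group))
```

Each returned group lists every spelling that would land on the same file on a case-insensitive drive.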

I do however have to applaud uTorrent for actually working. Deluge just crashes, BitComet didn't even start to hash check, Vuze I couldn't get to work (the torrent didn't appear in the interface), and the others I tried were attempting to allocate 600 GB of files on my 500 GB drive. (I'm selecting subfolders from the torrent, waiting for completion, downloading through FTP to another system, deleting completed files from the server, rinse and repeat.)

So the suggestion is... add an option to detect those uppercase/lowercase conflicts AND give option to:

- set automatically to "Don't download"

- rename those files in some way:

For example, rename LOWERCASE\geocities-e-G.7z.001 to LOWERCASE\geocities-e-G.7z.001_ but this wouldn't work if there's a similar file with another letter uppercase in the same folder, for example:

LOWERCASE\geocities-e-g.7z.001

LOWERCASE\geocities-e-G.7z.001

LOWERCASE\geocitieS-e-G.7z.001

and maybe create a .bat file in the root folder that would have the commands to rename the files to the original names.
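A rough sketch of the underscore idea, extended so it still works when three or more names collide: keep appending "_" until the name is unique ignoring case, and record `ren` commands that map back to the original names. The helpers `plan_renames` and `restore_script` are made up for illustration. Note that on a case-insensitive drive the restore commands would collide again, so the script is mostly a record of the original names:

```python
import ntpath

def plan_renames(paths):
    """Return {original: safe}, appending "_" until unique ignoring case."""
    taken = set()
    plan = {}
    for path in paths:
        candidate = path
        while candidate.lower() in taken:
            candidate += "_"
        taken.add(candidate.lower())
        plan[path] = candidate
    return plan

def restore_script(plan):
    """Build .bat text that renames each changed file back to its original name."""
    lines = ["@echo off"]
    for original, safe in plan.items():
        if original != safe:
            # ren takes a source path but only a bare target file name.
            lines.append(f'ren "{safe}" "{ntpath.basename(original)}"')
    return "\r\n".join(lines) + "\r\n"

files = [
    "LOWERCASE\\geocities-e-g.7z.001",
    "LOWERCASE\\geocities-e-G.7z.001",
    "LOWERCASE\\geocitieS-e-G.7z.001",
]
plan = plan_renames(files)
print(restore_script(plan))
```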

So I really don't know how it would be the easiest, maybe create a folder titled "Renamed Files" inside the torrent folder and there save the conflicting letters in hexadecimal or in some way, for example:

LOWERCASE\geocities-e-g.7z.001

LOWERCASE\geocities-e-x0047.7z.001 (x0047 = 47 hexadecimal, G)

LOWERCASE\geocitiex0053-e-x0047.7z.001 (x0053 = S, x0047 = G)

I've used x00 in front so that characters in other languages could be used, UTF-16 style.
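The hexadecimal scheme could look roughly like the sketch below. It assumes the conflicting name has the same length as the name that is kept, that all characters fit in four hex digits, and that no original file name already contains something that looks like `xNNNN` (otherwise decoding is ambiguous). `escape_case` and `unescape_case` are hypothetical helper names:

```python
import re

def escape_case(name, kept):
    """Replace each character of `name` that differs from `kept` with xNNNN."""
    out = []
    for ch, ref in zip(name, kept):
        # Four uppercase hex digits of the character code, e.g. G -> x0047.
        out.append(ch if ch == ref else f"x{ord(ch):04X}")
    return "".join(out)

def unescape_case(name):
    """Turn every xNNNN sequence back into the character it stands for."""
    return re.sub(r"x([0-9A-F]{4})", lambda m: chr(int(m.group(1), 16)), name)

kept = "geocities-e-g.7z.001"
second = escape_case("geocities-e-G.7z.001", kept)
third = escape_case("geocitieS-e-G.7z.001", kept)
print(second)
print(third)
print(unescape_case(third))
```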


Yes, I may be able, I already said that.

But you have to disable all files that have chunks shared with those other files. For example, when you have this:


[File_xyz.dat][ File_abc.dat ][ File_lower.dat ][ File_lOwer.dat ][ Another file.dat]
[.. chunk 1 ... ][.. chunk 2 ... ][.. chunk 3 ... ][.. chunk 4 ... ][.. chunk 5 ... ]

You can see chunk 2 is shared by File_abc.dat and File_lower.dat, so you also have to disable File_abc.dat. Depending on how uTorrent holds data in memory, when it tries to hash check the last piece of File_abc.dat, it will try to read from File_lOwer.dat instead of File_lower.dat or something like that, and it will fail with a disk read error (well, that's my theory anyway).

"Another file.dat" is not affected because it doesn't share chunks with the two "bad" files, but you can't tell that from the file list, where it simply shows up below File_lOwer.dat.

You can't tell for sure which file is before File_lower.dat or after File_lOwer.dat, as they're not always sorted by file name in the torrent.
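Since the file list can't be trusted for this, the neighbours have to be worked out from the torrent's own file order and the piece size. A sketch under those assumptions (file sizes below are invented for the example, every file is assumed to be non-empty, and `files_to_skip` is a hypothetical helper):

```python
MIB = 1024 * 1024

def piece_ranges(files, piece_length):
    """For (name, size) pairs in torrent order, return (name, first_piece, last_piece)."""
    ranges = []
    offset = 0
    for name, size in files:
        first = offset // piece_length
        last = (offset + size - 1) // piece_length  # assumes size > 0
        ranges.append((name, first, last))
        offset += size
    return ranges

def files_to_skip(files, piece_length, bad):
    """Return the bad files plus every file sharing a piece with one of them."""
    ranges = piece_ranges(files, piece_length)
    bad_spans = [(first, last) for name, first, last in ranges if name in bad]
    skip = []
    for name, first, last in ranges:
        # Two piece intervals overlap when each starts before the other ends.
        if any(first <= b_last and last >= b_first for b_first, b_last in bad_spans):
            skip.append(name)
    return skip

files = [
    ("File_xyz.dat", 16 * MIB),
    ("File_abc.dat", 24 * MIB),
    ("File_lower.dat", 24 * MIB),
    ("File_lOwer.dat", 16 * MIB),
    ("Another file.dat", 16 * MIB),
]
skip = files_to_skip(files, 16 * MIB, {"File_lower.dat", "File_lOwer.dat"})
print(skip)
```

With these made-up sizes, File_abc.dat ends inside a piece that File_lower.dat starts in, so it gets skipped too, while the files on either end are left alone.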

Either way, does uTorrent really have to fail so badly, with a critical error, and force me to re-hash hundreds of GB every time? It really takes hours to do it every time.


It's a 650GB torrent, the hard drive on the server is only 500 GB, with about 250 GB free space available for torrents. So I have no choice but to select about 20-50 GB of files, download, move completed files, select another set of files, hash check and set don't download on completed files, repeat.

Which reminds me how annoying it is to do hash check simply because I moved/deleted completed files and set them to "Don't download"/skip - when it tries to read those completed files and doesn't find them anymore, uTorrent should just give you a warning and continue, not stop with a critical error and enforce a hash check for the whole 200+ GB.


Sorry but I have to disagree, it does work. Not perfectly but it does.

I currently have set uTorrent to use !ut as extension for incomplete files so I know which files are completed.

The whole torrent is currently seeded at an average of 46%, so I select about 20-40 GB of files, let it download for an hour or so and get about 5-10 GB of complete files.

- stop the torrent

- download those 5-10 GB of completed files to my home computer

- set completed files to "Don't download" and delete them from the hard drive

- set the other previous files to Low Priority

- pick another 20-40 GB of files and set them to "Normal Priority"

- run a hash check to invalidate the completed files from before so that uTorrent won't fail with "disk read error"

- resume the torrent.
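Picking the next set of files for each cycle could be sketched like this: walk the remaining files and take whatever still fits in the space budget. The file names, sizes and the `next_batch` helper are invented for the example, and it ignores chunks shared between neighbouring files:

```python
GB = 10**9

def next_batch(remaining, budget):
    """Pick files in order, skipping any that no longer fit in the byte budget."""
    batch, used = [], 0
    for name, size in remaining:
        if used + size > budget:
            continue  # too big for the space left; a later, smaller file may fit
        batch.append(name)
        used += size
    return batch, used

remaining = [
    ("a.7z.001", 30 * GB),
    ("b.7z.001", 15 * GB),
    ("c.7z.001", 10 * GB),
    ("d.7z.001", 5 * GB),
]
batch, used = next_batch(remaining, 50 * GB)
print(batch, used // GB, "GB")
```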

So in the end, after about 10-20 cycles of this, I have about 60 GB of complete files and 200 GB of incomplete files, and I can just delete the incomplete files and start all over again with another batch of 200 GB. It's a waste of bandwidth for me and the people I download from (but I also upload stuff to people while downloading, so it's not that bad), but the server is on a gigabit port with a huge bandwidth allowance so I don't care.

uTorrent will have some random partial chunks in that temporary .dat file but after hash checking it will just pop warnings in the log in the form of:

[2010-12-05 10:09:35] geocities.archiveteam.torrent: No longer have piece: 15986

but will keep happily downloading.

What you're saying about relocating files with issues to another folder is possible only with v2.2, which is impossible to use with this torrent - it's much more unstable than this 2.0.4 version.

It's another "bug" I've reported, see my other posts... Basically it just stalls when hash checking the torrent, every 20-30 GB of content, so I have to manually press Pause and Start to make the hash checking resume. With this 2.0.4 version it happens less often; it stalls maybe once every 15-20 minutes when hash checking. And if I manage to hash check it completely, it just gives IO errors as if the hard drive is faulty, like this one:

[2010-12-05 08:34:57] IO Error:1168 line:395 align:-99 pos:-99 count:131072 actual:66767

This 2.0.4 version still pops them but much less often.


Archived

This topic is now archived and is closed to further replies.
