
the poisoning problem


nightshifted


It hurts to bring this up again, but the problem is getting worse. Rather than sprinkle it here and there it's probably better to start a topic and have discussion of it in one place.

It's not even 6:00 PM in my time zone as I start to type this (it probably will be some hours as I sneak a chance here and there to finish it), and already today I've had to take down three torrents on the tracker where I'm a moderator because seeders using µTorrent 1.5 were unintentionally poisoning them with the Open for Seeding function. (These were new torrents where on each the original uploader was so far the sole seeder and there were no completed downloads yet.)

There's always been a small poisoning risk when seeders use µTorrent. It doesn't recheck each piece before sending. BitTornado, ABC, and BitComet do: if a piece they're about to send fails its hash, they will not send it, and they lower their reported percentage of completion, even demoting themselves from seeder to leecher. µTorrent, Mainline, and Azureus do not. If you alter your source after the initial file check when you start to seed, µTorrent (and Mainline and Azureus) will gleefully send out hash-failing copies. I've brought that up before, and someone -- perhaps Firon? -- pointed out the CPU overhead of rechecking every piece before sending. I asked about keeping the source locked against writes by other processes from the start of the analysis until the torrent is stopped, and never got a response to that.
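To make the recheck-before-send idea concrete, here is a minimal sketch. It is not code from any of the clients named above; the function names and sample data are hypothetical. The only thing it relies on is that a .torrent file stores a SHA-1 hash for each piece.

```python
import hashlib

def piece_matches(data: bytes, expected_sha1: bytes) -> bool:
    """Return True if the piece data hashes to the digest stored in the .torrent."""
    return hashlib.sha1(data).digest() == expected_sha1

def split_by_hash(pieces, expected_hashes):
    """Split piece indexes into those that still verify and those that do not."""
    good, bad = [], []
    for index, (data, expected) in enumerate(zip(pieces, expected_hashes)):
        (good if piece_matches(data, expected) else bad).append(index)
    return good, bad

# Hypothetical three-piece torrent; the source was altered after hashing.
original = [b"A" * 16, b"B" * 16, b"C" * 16]
hashes = [hashlib.sha1(p).digest() for p in original]
altered = [original[0], b"X" * 16, original[2]]

good, bad = split_by_hash(altered, hashes)
print(good, bad)  # [0, 2] [1]
```

A client that checks this way would refuse to send piece 1 and could report itself as incomplete, which is the behavior described for BitTornado, ABC, and BitComet.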

But that happened only when users changed their sources after starting to seed.

Now, though, it's much worse, thanks to the Open for Seeding choice in µTorrent 1.5 (and several of the last betas on the way to it). So long as a file at the right pathname *exists* on the seeder's hard drive, µTorrent will skip the file check, announce as a seed to the tracker, connect as a seed to other peers, and send out hash-fails. So left and right, we're seeing poisoning, and we're running around putting out fires and telling the poisoners never to use "open for seeding" in µTorrent unless some day Ludde releases a version without this fault. We ask them to do a forced recheck and they're utterly surprised to see they didn't have 100% ... if they do the forced recheck we ask for.

[To make things more confusing, Azureus and BitComet both have functions called "open for seeding," which are different from µTorrent's and from each other.]

So an inexperienced seeder who doesn't know how to point the client to the right directory, instead of announcing as a leecher with 0%, seeing that something is wrong, and asking us for help (for which we send out our prepared text about that problem), poisons the torrent in blissful ignorance until a leecher notifies us that his/her time is being wasted and ratio eroded for nothing. Likewise, every inexperienced seeder who screws with his/her source after creating the .torrent file but before starting to seed, instead of being stuck as a leecher at 99%, coming to us for help, uploading only the good pieces, and not harming the downloaders, just keeps passing out those hashfails while nobody completes.

We don't get many cases of users who alter their source after starting to seed, but we do of users who try to seed from the wrong place or who alter their source before starting to seed.

This is, in my opinion, the one major flaw in µTorrent, and "open for seeding" has made it much, much worse. Is there any hope of a fix in a future release?

Link to comment
Share on other sites

If they seed from the wrong place, it obviously won't work. The moment they start the torrent, µTorrent will error with files missing. Any file missing will give that error. If there's any change in filesize, the same thing will happen. All this happens before it ever gets a chance to announce.

Link to comment
Share on other sites

Are you positive about incorrect filesizes, Firon? I know that missing files will prevent announcing, because we get asked "what does 'files missing' mean?" all the time. But I've yet to see anyone ask us to explain an error message from µTorrent about files being the wrong size. I'll try it myself.

If that's the case, then there are only three ways this can be kicking in, and yesterday was just coincidental:

1. the would-be seeder picks the wrong directory but does not select "open for seeding," µTorrent creates the files and announces at 0%, the user notices something is wrong, tries again, spots "open for seeding," and selects it then;

or

2. the would-be seeder is selecting the right directory but has altered a file since creating the .torrent file, but the alteration did not change the size -- and from our experiences with that, it's very rare that the size does not change;

or

3. just as with Mainline, Azureus, or older versions of µTorrent, the would-be seeder screwed with the source after the initial announce as a seed, and the client doesn't care: this has not gotten any worse with release 1.5.

I'm going to try the wrong size test for myself and edit this post with my results.

Link to comment
Share on other sites

yeah, it doesn't say anything about wrong sizes, it just gives the generic files missing error. (or it should at least)

1) When they select open for seeding and it's still the wrong folder, when they start it it would immediately stop with files missing

2) This would go through with open for seeding

3) Most likely, µT would error with access denied, but if they started it again I think it'd just go through. If they moved out or renamed any file, then it wouldn't work

Link to comment
Share on other sites

If µTorrent gives a "files missing" error for a wrong size (I never did get to test it), that's good enough, because there still is an alert to the user and it still won't announce.

1. µTorrent won't stop with a "files missing" error if empty files are there created by previously trying to use that folder without "open for seeding."

2. We're in agreement, nothing more to say there.

3. In nearly all cases, they don't move or delete or rename a file; rather, they muck with the contents of a file (#3, again, is independent of "open for seeding"), so there's no missing file or missing name, and the would-be seeder merrily flings out hashfails.

BitComet's Open for Seeding does the same thing: it jumps right in without checking the contents. But it rehashes each piece before sending, so if the would-be seeder picked the wrong directory, we get a funny result. Upon trying to send a piece, BC realizes it doesn't really have it and demotes itself to leecher at 99+% (still believing it has the rest). Now it doesn't try to send that piece any more, but with each other piece it tries to send, it notices that it doesn't have that piece and lowers its percentage further. The would-be seeder watches his/her completion percentage drop, thinks the files are getting gradually erased, and panics. For people who don't like BitComet it's hilarious; for the user to whom it's happening, it's unnerving. Still, there's no poisoning and it doesn't result in hashfails, and the user will ask for help instead of believing everything's hunky-dory. Now that the tracker where I'm a moderator (would be easier to name it, but I guess that's frowned on as advertising) has banned BitComet and BitLord, we don't see that any more.

Of the three poisoning cases I saw Monday caused by µTorrent 1.5, two were at 99+% with only one piece failing its hash and could have been #3 or #2. On the third torrent, leechers were stuck at 0% and all pieces sent by the supposed seeder failed their hashes, so it looked for all the world like #1.

Link to comment
Share on other sites

1) That depends on whether a file of the wrong size will stop "Open for Seeding." I never did get a chance to test that.

And why the heck does my subscription to this topic not send me email when you post? I always have to check back here manually (and worse, to remember to do it).

Link to comment
Share on other sites

Makes me wonder how many of my other subscribed topics I'm not getting notices on. Some come; it's not as if there were none at all. And it's not because you're the one writing the new post; I get mail about your posts to other topics all the time. Maybe it's for this forum?

Anyhow, when I get a chance to see what happens if the path exists but the size is wrong, then I'll be back.

====

And I'm back. Testing completed: wrong sizes are not detected by "open for seeding." As long as a filename at the expected path already exists, it's happy, claims to be a seed, and jumps right in. There may be a glitch along the way if a file is smaller than the .torrent file says it should be, but I expect that there would be none if the real file is larger than the .torrent file says it should be. So "open for seeding" does not notice alterations to source file sizes between creating and opening the .torrent file and just merrily poisons.
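The check that was missing in this test can be sketched in a few lines. This is only an illustration, assuming the expected lengths come from the .torrent metadata; `size_mismatches` and the file names are hypothetical, not anything from µTorrent.

```python
import os
import tempfile

def size_mismatches(root, expected_sizes):
    """Return {relative_path: (expected, actual)} for each file of the wrong size.

    actual is None when the file does not exist at all.
    """
    problems = {}
    for rel_path, expected in expected_sizes.items():
        full = os.path.join(root, rel_path)
        actual = os.path.getsize(full) if os.path.isfile(full) else None
        if actual != expected:
            problems[rel_path] = (expected, actual)
    return problems

# Hypothetical source folder: one good file, one zero-byte dummy, one missing.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "notes.txt"), "wb") as f:
        f.write(b"short")
    open(os.path.join(root, "empty.bin"), "wb").close()
    expected = {"notes.txt": 5, "empty.bin": 1024, "gone.dat": 10}
    problems = size_mismatches(root, expected)

print(problems)  # {'empty.bin': (1024, 0), 'gone.dat': (10, None)}
```

A size check like this catches the wrong-folder and dummy-file cases, but not an edit that leaves the size unchanged; only a hash check catches that.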

That misfeature needs to be fixed or removed. I just had to deal with another µTorrent poisoning case yesterday and one more this morning. The one yesterday was a 99% case, where the uploader had altered the .txt file, perhaps after starting to seed so maybe "open for seeding" wasn't at fault. The one today, though, was a 0% case; the user had created a dummy fileset either by not using "open for seeding" or by using another client, and then went back with µTorrent 1.5 and "open for seeding" and kept sending out whatever garbage bytes were there, or were not there, as he admitted that his source files were all 0 KiB.

A peer should never make its first announce to join or rejoin a torrent without checking the files. Most users just aren't informed or attentive enough to do things right on their own and avoid the traps, and those who are can still make mistakes.

Link to comment
Share on other sites

  • 2 weeks later...

I didn't try zero-byte files. I made a torrent of a folder that included, among other things, some text files. I edited one of them to be shorter and one to be longer, chose Open for Seeding, and watched µTorrent jump right in unfazed. A stop and a forced recheck, of course, showed it to be missing two pieces.

Maybe it skips zero-byte files ... we'll have to see. Then one-byte files, and so forth ...

Here's a miracle: I received a notification about post #168765!

Link to comment
Share on other sites

Without rechecking every piece you send, I don't think there is a foolproof way to protect against this. But if users modify or replace a file, doesn't that mean the file's attributes change? Why not record and check the last-modification time, in addition to the path and file size? (Still not foolproof.)
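A rough sketch of the record-and-compare idea, assuming the client saves a snapshot at its last full check. The helper names `snapshot` and `changed_since` are made up for illustration, and, as said, a timestamp can be reset, so this is a cheap early warning rather than a guarantee.

```python
import os
import tempfile

def snapshot(paths):
    """Record (size, mtime in nanoseconds) for each path at the last full check."""
    result = {}
    for path in paths:
        st = os.stat(path)
        result[path] = (st.st_size, st.st_mtime_ns)
    return result

def changed_since(snap):
    """Return the paths whose size or mtime differs from the snapshot, or that vanished."""
    changed = []
    for path, recorded in snap.items():
        try:
            st = os.stat(path)
        except FileNotFoundError:
            changed.append(path)
            continue
        if (st.st_size, st.st_mtime_ns) != recorded:
            changed.append(path)
    return changed

with tempfile.TemporaryDirectory() as root:
    a = os.path.join(root, "a.txt")
    b = os.path.join(root, "b.txt")
    for p in (a, b):
        with open(p, "wb") as f:
            f.write(b"hello")
    snap = snapshot([a, b])
    with open(b, "wb") as f:
        f.write(b"HELLO")  # same size, different contents
    st = os.stat(b)
    # Force a clearly later mtime so the demo does not depend on clock resolution.
    os.utime(b, ns=(st.st_atime_ns, snap[b][1] + 10_000_000_000))
    changed = [os.path.basename(p) for p in changed_since(snap)]

print(changed)  # ['b.txt']
```

Here the size of b.txt is unchanged, so a size-only check would miss the edit; the timestamp comparison flags it and the client could then force a recheck of that file.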

It would be nice to see an advanced option like "check piece hash on send", not turned on by default. But still, modern hashing and hash-check mechanisms are very fast (see campers hashing code for Shareaza); would it really be such a performance penalty?

Link to comment
Share on other sites

Yes, it would, because you have to read the entire piece before sending it out. This increases reads significantly, and would be a pretty large hit for any semi-decent upload rate, especially without a read cache. The CPU use also adds up.
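A back-of-the-envelope version of that read cost, as a worst case only: it assumes no read cache at all, so every block sent forces a read of the whole piece it belongs to. The 16 KiB block size is the conventional request size; the 1 MiB piece size and 100 KiB/s upload rate are example figures, not measurements.

```python
def worst_case_read_rate(upload_rate, piece_size, block_size):
    """Disk read rate (bytes/s) if each block sent requires reading its whole piece."""
    blocks_per_second = upload_rate / block_size
    return blocks_per_second * piece_size

KIB = 1024
rate = worst_case_read_rate(
    upload_rate=100 * KIB,   # example upload speed
    piece_size=1024 * KIB,   # example piece size (1 MiB)
    block_size=16 * KIB,     # conventional request block size
)
print(rate / KIB)  # 6400.0 (KiB/s of disk reads for 100 KiB/s of upload)
```

With these example figures the amplification is piece size divided by block size, or 64 times; a read cache that keeps recently verified pieces would bring the real figure well below that.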

Also, I talked to ludde and Open for seeding will now, at the very least, check to make sure the filesizes match.

Link to comment
Share on other sites

If Open for Seeding will check file sizes, that will be a huge help. Thank Ludde and you, Firon. An initial full rehash would be even better -- and that's not asking to rehash each piece before sending, just once before the initial announce.

As to other ways to avoid it short of rehashing each piece before sending: Klaus suggested rereading the file update timestamp (should not be newer than the .torrent file[*]). Another possibility might be write-locking the source file(s) while the torrent is active.

[*] That won't help in this scenario: user prepares .torrent file, alters source, posts to members-only tracker that uses passkeys, gets personalized .torrent file that is saved with newer timestamp than the altered source file. So one could say, don't use the modification timestamp of the .torrent file but rather the creation date inside it: however, if one is not the original seeder, but a former leecher with a complete copy returning as a seeder and using Open for Seeding, all files in the source will be newer than the creation date of the .torrent file.

Link to comment
Share on other sites

Yes, it would, because you have to read the entire piece before sending it out. This increases reads significantly, and would be a pretty large hit for any semi-decent upload rate, especially without a read cache. The CPU use also adds up.

Also, I talked to ludde and Open for seeding will now, at the very least, check to make sure the filesizes match.

The read cache should be able to help out here. CPU usage should still be low, as most hash (checking) algorithms outrun the HD's speed and I have yet to find anyone with a gigabit upload connection. "Check Hash on Open for Seeding" could also be an advanced option, turned off by default.

But still, there must be some semi-efficient way of solving this? Whenever you send out a block, you read the entire piece, hash-check it, put all the other blocks in the piece cache (I guess it does that already), and mark the piece internally as OK / hash-checked.
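Roughly what I have in mind, as a sketch rather than anything taken from a real client: `VerifiedPieceCache` and the `read_piece` callable are hypothetical names, and the cache here is unbounded, where a real one would need eviction.

```python
import hashlib

class VerifiedPieceCache:
    """Read a piece once, hash-check it, then serve its blocks from memory."""

    def __init__(self, read_piece, expected_hashes):
        self._read_piece = read_piece      # hypothetical callable: piece index -> bytes
        self._expected = expected_hashes   # SHA-1 digests from the .torrent
        self._cache = {}                   # piece index -> verified piece data
        self.disk_reads = 0

    def get_block(self, piece_index, offset, length):
        """Return the requested block, or None if its piece fails the hash check."""
        data = self._cache.get(piece_index)
        if data is None:
            data = self._read_piece(piece_index)
            self.disk_reads += 1
            if hashlib.sha1(data).digest() != self._expected[piece_index]:
                return None  # never send data from a failing piece
            self._cache[piece_index] = data
        return data[offset:offset + length]

# Hypothetical two-piece torrent; piece 1 was altered on disk after hashing.
original = [b"a" * 32, b"b" * 32]
hashes = [hashlib.sha1(p).digest() for p in original]
on_disk = [original[0], b"z" * 32]

cache = VerifiedPieceCache(lambda i: on_disk[i], hashes)
first = cache.get_block(0, 0, 16)
second = cache.get_block(0, 16, 16)   # served from the cache, no second disk read
bad = cache.get_block(1, 0, 16)
print(first == second, bad, cache.disk_reads)  # True None 2
```

The whole-piece read and the hash are paid once per piece instead of once per block. The catch is that a piece stays trusted for as long as it sits in the cache, so an edit made after verification still goes unnoticed until the piece is evicted and read again.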

Link to comment
Share on other sites

It doesn't lock the files while seeding because of people complaining they couldn't watch/use whatever they downloaded 'till they completely stopped the torrent.

True. That doesn't apply to those using "open for seeding," but it does apply to others who change their sources after entering as seeds without "open for seeding" or after completing their downloads and becoming seeds. I guess some of those programs insist on getting their own write locks even though they make no changes, just to prevent changes by yet other programs.

And if you want a full rehash, just don't use open for seeding.

Firon, you're forgetting that I'm not raising the issue as a µTorrent user. Of course, for my own uploads, I can avoid "open for seeding," resist changing the source during seeding, and stop and do a Force Recheck whenever I feel unsure. It was clear from the basenote of the thread that I already knew that.

Rather, I'm raising it as a moderator of a tracker, dealing with users. We can't grab them by the forearm in advance to stop them from using "open for seeding" nor to make them force a recheck.

Maybe "open for seeding" could just be renamed "open without checking," which is what it basically does (except for looking for the names to exist and be nonempty). Then we can tell users, "well, you decided not to check the source." After Ludde implements size checks, it can be named "open without checking data."

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
