Archived

This topic is now archived and is closed to further replies.

BlindFreddie

Clients that fetch torrents sequentially from the start - happening!


See my post just now in "Index » Feature Requests » Prioritizing first/last pieces - enhancements".

The "Transmission" client seems to be downloading torrents sequentially from the start. I'm a BT newbie, but I suspect that it's violating protocol.

Should the protocol state that clients should ban other clients that are detected doing that?


Thanks, Switeck. Giving v low pri/choking is essentially what I meant.

BTW, a "Transmission" client is on one of my peer lists again. It's one of the offenders, and its version is shown as 0.72Z (letter 'Z').

Since you're a moderator, I'll leave it to you to pass on the message to the uT programming team.

(Later) I Googled for Transmission, and came up with pay dirt in a transmission.m0k.org forum article, where member Charles Kerr, one of the program's developers, states his piece-fetching priorities.

See http://transmission.m0k.org/forum/viewtopic.php?t=1950&highlight=first

If the article is no longer online when/if you need it, I have kept a copy.


Bleh, me being a "moderator" just means I post a lot here.

Doesn't mean I have any pull with the uT programming team...barely know any of their names even.


This causes piece availability to be front-heavy, but it shouldn't cause problems if the client doesn't insist on the order. Of course, the best piece availability would be at the front as well. The protocol itself doesn't prohibit this kind of behaviour, but it is discouraged, as Firon said. The only real flaw is that the bandwidth isn't distributed across the peers, since any peer can fulfill any (sequential) request. It also takes longer to have other full seeds out there, since the likelihood of a peer staying to seed after finishing isn't that high (on public trackers, a good majority drop out soon after receiving the whole thing), so this is where it harms the swarm.


I know this is an old debate, but I'm unconvinced that sequential downloading harms torrents.

In an environment where every peer downloads pieces sequentially without exception, the rarest pieces will be at the end of the torrent. In that case, the end of a torrent is at risk of falling out of the network altogether.

But in reality, I see a self-correcting situation. While downloading a file, I noticed on my graphic indicator that my download was "end heavy". The further a piece was from the beginning of the file, the more likely I was to have received it.

In a mixed environment of sequential downloaders (SDs) and non-sequential downloaders (NSDs), wouldn't the activity of the SDs cause the NSDs to preferentially trade end pieces, thus balancing piece availability on the network?

I am presuming that the NSDs prefer to download rarest pieces first, and that they know what pieces are rare by polling the swarm (or asking the tracker, as the case may be). Whatever pieces the SDs make rare, the NSDs will want to trade.
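The rarest-first preference presumed above can be sketched roughly as follows. This is only an illustration, not any client's actual piece picker: the function name `pick_rarest` and the modelling of each peer as a set of piece indices are assumptions made for the example, and real clients typically break ties randomly rather than by lowest index.

```python
from collections import Counter

def pick_rarest(my_pieces, peer_pieces):
    """Return the rarest piece index we lack that some peer holds, or None.

    my_pieces: set of piece indices we already have.
    peer_pieces: list of sets, one per connected peer (hypothetical model).
    """
    counts = Counter()
    for have in peer_pieces:
        counts.update(have)
    candidates = [p for p in counts if p not in my_pieces]
    if not candidates:
        return None
    # Fewest holders first; ties broken by lowest index so the result is deterministic.
    return min(candidates, key=lambda p: (counts[p], p))

# Sequential downloaders hold prefixes, so the later pieces are the rare ones.
peers = [{0, 1, 2, 3}, {0, 1, 2}, {0, 1}, {0, 5}]
print(pick_rarest(set(), peers))
```

With these example peers, the picker skips the widely held early pieces and goes for a piece that only one peer has, which is the balancing effect described above.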

Sequential downloading is preferred by users who want to preview the files to decide whether they want to finish downloading them.

If a sequential option is built into a Bittorrent client, it may be wise to design it so that the client alternates between requesting sequential pieces and rare pieces. It could boost piece availability and give "file insurance" to the individual user against losing the ability to download rare pieces toward the end.
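One way to picture the alternating idea is the sketch below. It is hypothetical: `pick_alternating`, the strict even/odd alternation, and the set-per-peer model are all assumptions for illustration, not a description of how any existing client behaves.

```python
from collections import Counter

def pick_alternating(my_pieces, peer_pieces, num_pieces, turn):
    """Pick the next piece: sequential on even turns, rarest on odd turns."""
    counts = Counter()
    for have in peer_pieces:
        counts.update(have)
    # Only pieces we lack and that at least one peer can supply.
    wanted = [p for p in range(num_pieces)
              if p not in my_pieces and counts[p] > 0]
    if not wanted:
        return None
    if turn % 2 == 0:
        return wanted[0]  # lowest missing index (sequential step)
    return min(wanted, key=lambda p: (counts[p], p))  # rarest step

peers = [{0, 1, 2, 3, 4, 5}, {0, 1, 2}, {0, 1, 5}]
mine, order = set(), []
for turn in range(6):
    piece = pick_alternating(mine, peers, 6, turn)
    if piece is None:
        break
    mine.add(piece)
    order.append(piece)
print(order)
```

The resulting order interleaves early pieces with scarce ones, so the user still gets a playable beginning while also picking up the pieces most at risk of disappearing.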


Sequential downloaders DEMAND only the next piece they don't have. In a mixed torrent swarm environment, the odds of other peers having the piece they demand is pretty low ...which means ONLY seeds will likely have it! That forces seeds to upload the same "early" pieces of a torrent multiple times to multiple peers, KILLING the swarm when seeds leave with a >4:1 UL/DL ratio and availability is still less than 1. Even having bt.prio_first_last_piece enabled in uTorrent causes this to a lesser degree, as other peers and seeds are forced to spend more of their upload bandwidth sending out the first and last pieces of the torrent to new peer arrivals.

Also, if there are multiple sequential downloaders, any 2 will never download from each other at the same time, because at least one of the 2 will always want pieces the other doesn't have. This causes BitTorrent's tit-for-tat logic to fail miserably!
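The point about two sequential downloaders can be illustrated with a small sketch. It assumes strictly sequential peers that each hold a contiguous prefix of the torrent; the helper names `interested` and `can_trade` are made up for this example.

```python
def interested(a, b):
    """True if peer a lacks at least one piece that peer b holds."""
    return bool(b - a)

def can_trade(a, b):
    """A tit-for-tat exchange needs interest in both directions."""
    return interested(a, b) and interested(b, a)

seq_a = set(range(30))        # strictly sequential peer with the first 30 pieces
seq_b = set(range(55))        # strictly sequential peer with the first 55 pieces
rand_c = {2, 9, 17, 40, 71}   # peer holding scattered pieces

print(can_trade(seq_a, seq_b))   # one prefix contains the other: one-way only
print(can_trade(seq_a, rand_c))  # each side has something the other lacks
```

Because one prefix always contains the other, the peer that is further along has nothing to gain from the peer behind it, so the exchange can only ever run in one direction.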

Even a low percentage of sequential downloaders on a torrent, such as 10%, can cause a huge concentration of the "early" pieces to be repeated by other peers...because the only thing the other peers can download from them is the first few pieces, and then that may be all they can share too for a while. The torrent swarm can end up stalling, and then the overall download speed often falls to no greater than the upload speed of the seed/s...even if the seed/s previously uploaded the torrent multiple times already.

The damage that sequential downloading causes is very similar to having LOTS of hit-and-run downloaders. (They leave the moment it finishes downloading.) This is probably the leading cause of torrent death.


Thank you.

The behavior looks clearer to me now.

Speaking of hit-and-run downloaders, sequential downloading may cause them to be worse sharers than they already are in a random upload/download environment. The last pieces that they receive may be least likely to get uploaded by them before they leave.

Do you think that sequential downloading by other peers is the reason why my download was extremely "end heavy", or do you think it is a coincidence?

The last dead torrent in which I participated seemed to be "dead all over", not dead at the end. But I don't have much experience with Bittorrent to know how torrents usually die.

The "bt.prio_first_last_piece" option seems silly. There is usually no motive to prioritize the end pieces, and the option implies sequential downloading on both ends. It seems better to mix sequential downloading of the beginning with preferentially requesting rare pieces.


A download that's lacking the last few sequential pieces has quite likely been the victim of sequential downloading.

"Dead all over" torrents are probably just that way due to ordinary hit-and-runners.

I've seen numerous torrents with BitComet clients that had strong availability of all end-of-file pieces for every large file in the torrent. I'm pretty sure they are prioritizing start/end pieces for every file by default!

It is better not to do sequential downloading, period.


I'm still not sure that sequential downloading kills torrents. But disappearance of seeds certainly does.

I recognize the problem of stratification, which is an inability of peers to trade. Stratification is a condition like this:

Peer A has some fraction of the torrent, and has all of the pieces available on the network at the moment. Peer B has some fraction of A's pieces, and nothing more. Peer C has some fraction of B's pieces, and nothing more. Peer D has some fraction of C's pieces, and nothing more. Every peer has a different amount of the torrent, and none of the peers are capable of swapping pieces.

Stratification would eventually happen in an environment where there is no seed, and every peer insists on getting pieces quickly in exchange for its bandwidth. If the torrent had sequential downloaders, the lower strata would tend only to have the beginning of the file. The higher strata would tend to be missing only the end of the file.

If you see a stratified dead torrent with evidence of sequential downloading, it is easy to blame sequential downloading for the death of the torrent. But the torrent actually died from stratification. If the torrent does not have sequential downloaders, stratification can still happen. The only difference is that the downloaded pieces will look more scattered on each peer's map of the file.

If a seed returns to the network, all of the stratified peers will be newly motivated to send pieces downstream because seeds reward peers for providing bandwidth. A torrent should not become stratified as long as seeds are uploading.

It seems as if torrents need seeds, period. Peers that share while downloading will reduce the bandwidth burden on the seeds. But the torrent concept does not work unless seeds remain available.

Well-designed seeds will distribute pieces to peers that are good at sharing. They will preferentially distribute rare pieces on the network. They will try to send the pieces to low strata, because this increases the number of times that the pieces will be copied as they "bubble up" to the higher strata through swapping.


"Peer A has some fraction of the torrent..."

That devolves into ALL peers having the same pieces as A (all stuck on the same percent complete.) Seen it, been there, done that. Except for very huge torrents, this often happens within 0-4 hours of the last seed leaving. It even happens when the lone seed is very slow and/or only appears intermittently.

"Stratification would eventually happen in an environment where there is no seed, and every peer insists on getting pieces quickly in exchange for its bandwidth."

So long as availability is >=1 even with no seeds connected, it's possible for the torrent to complete for all participants IF they "play fair".

There is *NO* peer-to-peer tit-for-tat exchange between sequential downloaders! They aren't playing fair. Sequential downloaders would quickly KILL such a torrent with no seeders!

"If the torrent had sequential downloaders, the lower strata would tend only to have the beginning of the file. The higher strata would tend to be missing only the end of the file."

And that shouldn't happen because it's hopelessly bad!

Take for instance a torrent swarm initially with 1 seed and 10 peers.

1st instance, 10 sequential downloader peers...with each having 10% more than the last starting with the 1st having 0%.

Seed leaves.

Even though all-combined there has been 10% + 20% + 30% + 40% + 50% + 60% + 70% + 80% + 90% = 450% downloaded relative to the size of the torrent (possibly a lot from the 50% to 90% peers), the torrent CANNOT complete for the 10 peers left on the torrent. Availability is exactly 0.9 (90%). Everyone can only get to 90%. If the peer with 90% gets disgusted and leaves immediately as well, then nobody else can get more than 80%.

2nd instance, 10 regular peers...with each having 10% more than the last starting with the 1st having 0%.

Seed leaves AND the peer with 90% gets disgusted and leaves immediately as well. (That peer probably uploaded far out of proportion to how much it got from other peers and the seed by that point.)

Of the remainder there is 10% + 20% + 30% + 40% + 50% + 60% + 70% + 80% = 360% downloaded relative to the size of the torrent. Availability is still VERY likely to be greater than 1. I'd put a high probability that the remaining 9 peers can still complete the torrent given enough sharing between them.

The 90% peer took much longer to make in the 1st instance than in the 2nd instance. This means the lone seed had to upload more and for longer...only for the torrent to instantly die when it left.
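The numbers in the 1st instance can be checked with a short sketch. It assumes a hypothetical 100-piece torrent and an availability measure defined as the number of complete copies plus the fraction of pieces held beyond that; that definition and the name `availability` are assumptions for this example. The 2nd instance is not modelled, because its outcome depends on how the non-sequential peers' pieces happen to overlap.

```python
def availability(peers, num_pieces):
    """Complete copies in the swarm plus the fraction of pieces held beyond that."""
    counts = [sum(1 for have in peers if p in have) for p in range(num_pieces)]
    floor = min(counts)
    extra = sum(1 for c in counts if c > floor) / num_pieces
    return floor + extra

NUM_PIECES = 100  # hypothetical piece count
# Sequential peer k holds the first k*10% of the torrent, for k = 0..9.
sequential = [set(range(k * 10)) for k in range(10)]

total_downloaded = sum(len(p) for p in sequential) / NUM_PIECES
print(total_downloaded)                            # copies' worth of data downloaded
print(availability(sequential, NUM_PIECES))        # after the seed leaves
print(availability(sequential[:-1], NUM_PIECES))   # after the 90% peer leaves too
```

Under these assumptions the swarm holds 4.5 torrents' worth of data yet availability is only 0.9, dropping to 0.8 once the 90% peer leaves, since nobody holds the final pieces.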

"Well-designed seeds will distribute pieces to peers that are good at sharing. They will preferentially distribute rare pieces on the network."

Peers CHOOSE what pieces they want to download from other peers/seeds. The only choices a seed has is to NOT upload to peers...via initial seeding mode and/or limited upload slots...or to upload slowly. This is automatically done, with the user not getting a choice in the matter.

"It seems as if torrents need seeds, period. Peers that share while downloading will reduce the bandwidth burden on the seeds. But the torrent concept does not work unless seeds remain available."

No. Torrents need availabilities above 1.0 to work. Seeds are not absolutely necessary beyond the initial seeding, provided the availability remains at or above 1.0.

"No. Torrents need availabilities above 1.0 to work. Seeds are not absolutely necessary beyond the initial seeding, provided the availability remains at or above 1.0."

That is correct, technically. But I expect that a torrent availability above 1.0 would seldom be present for a large, unseeded torrent. Given the giant number of pieces in some files, it is improbable that every piece would be available.

The probability of the whole thing being available looks something like this:

P(A)*P(B)*P(C)*P(D)*P(E)*P(F)*... where P(A) is the probability that piece A is available.

It is a product of very, very many numbers between 0 and 1 which becomes smaller as the number of pieces increases.
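As a rough numerical illustration of that product, the sketch below treats every piece as independently available with the same probability. The 0.999 figure is purely hypothetical, and independence is a simplifying assumption (the product above implies it too); a swarm of rarest-first peers would not behave like independent coin flips.

```python
import math

def prob_all_available(piece_probs):
    """Probability that every piece is available, assuming independence."""
    return math.prod(piece_probs)

# Same hypothetical per-piece probability, increasing piece counts.
for n in (100, 1000, 10000):
    print(n, prob_all_available([0.999] * n))
```

Even with each piece 99.9% likely to be present, the chance of the whole torrent being present falls quickly as the piece count grows, which is the effect described above.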

Of course, a person can use three ip addresses, hosting a different third of the torrent with each address. Then the torrent can have zero seeds, and a forced availability of 1.0. But for this discussion, I consider that example to be a method of seeding by using a workaround.

The high odds that a whole torrent wouldn't be available without seeds illustrates the importance of developing seeds which preferentially distribute pieces that are scarce among peers. Seeds should not be told which pieces to distribute. Seeds should act as if unfinished peers are willing to take any of their missing pieces. That way, a seed's role in keeping the torrent alive is unlikely to be wasted by sequential downloaders.

It may also be a good idea to establish a certain minimum number of pieces into which a very large torrent would be divided.


It's not as rare as it sounds on large torrents for peers to NOT see any seeds but still have an apparent availability of >1.0 due to the peers they are connected to. Not quite what you mean...but edge conditions are important too! :P

uTorrent seeks out rarest pieces first. This is precisely why it's possible for a large torrent swarm of peers to hold together even if the only seeder leaves. And this is EXACTLY what Sequential downloading does NOT do!


Switeck: can you provide another link, if possible, for the sequential downloading matter? I don't know if the problem is mine, but Azureus wiki seems to be offline all the time. TIA


Semi-sequential downloading is quite a useful option.

If you download a video or an archive with images, you absolutely need the beginning of the file and preferably the end too. If the video is missing a few chunks, that does not matter much; it will still play fine with a few interruptions, and you may be able to partially extract or repair the archive too.

This also does some good at the end of the torrent's life cycle, when the torrent is stuck at 98% with no seeds. If it is a movie, it will play quite well, and leechers hoping for a seed to appear will stay online almost forever, acting as semi-seeds.

If the torrent is well balanced, it is more likely to get stuck at some random 20-80%, becoming completely useless.


Wouldn't an easy solution to this be to have this as a manual feature (i.e. you have to select it on each torrent or something), have it only be able to be used on 1 active torrent (as chances are that the only reason you would want to sequentially download is to be able to view the file while it is downloading), and have it only be able to be used when the seed:peer ratio is >#?

For example, if I were on a torrent with 80 seeds and 2 peers, it is extremely unlikely that sequential downloading by either or both of the peers would result in the torrent's death.

If I were on a torrent with 3 seeds and 10 peers, sequential downloading could indeed have adverse effects, and thus should not be able to be enabled.

Many people download multiple torrents at a time, so limiting it to one active torrent sequentially downloading would greatly reduce the number of people downloading sequentially on any given torrent.

Many people download single torrents, but in a queue, so having the feature be manual (i.e. each time a new torrent starts, the feature would have to be enabled, and further, each time the client starts the feature could have to be re-enabled on a specific torrent) would greatly reduce the number of sequential downloaders.

And lastly, limiting it to torrents with a minimum seed:peer ratio would ensure that sequential downloading could only be used on torrents where it presented little or no risk to the integrity of the swarm.

This would essentially cripple this feature to a point where its existence would be inconsequential to the swarm's integrity, yet still be an extremely useful feature for many users.
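The three restrictions proposed above could be combined into a single check, sketched below. Everything here is hypothetical: the function name, the parameters, and especially the default threshold of 2.0, since the proposal leaves the seed:peer ratio unspecified ("#").

```python
def may_enable_sequential(seeds, peers, active_sequential_torrents,
                          min_seed_peer_ratio=2.0):
    """Decide whether the user may manually switch on sequential mode.

    min_seed_peer_ratio is a made-up default; the proposal does not fix a value.
    """
    if active_sequential_torrents >= 1:
        return False          # only one sequential torrent at a time
    if peers == 0:
        return seeds > 0      # nobody else is downloading, so no swarm to harm
    return seeds / peers > min_seed_peer_ratio

print(may_enable_sequential(80, 2, 0))   # 80 seeds, 2 peers
print(may_enable_sequential(3, 10, 0))   # 3 seeds, 10 peers
print(may_enable_sequential(80, 2, 1))   # another torrent is already sequential
```

The "manual" part of the proposal would live outside this function: the client would simply never persist the setting, so the check runs only when the user asks for it.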


"if I were on a torrent with 80 seeds and 2 peers, it is extremely unlikely that sequential downloading by either or both of the peers would result in the torrent's death."

Initial seeding scenarios often have the majority of peers become seeds almost at the same time.

And very shortly afterwards, probably more than half the new seeds leave.

Many of the remaining seeds may be slow, firewalled, or conditionally slow if on hostile ISPs.

The reported seed/peer numbers can also be artificially inflated due to miscounting caused by multiple ips describing a single peer/seed. Typically this is when they have both an IPv4 and an IPv6 (via Teredo) address.

Considering the default settings for uTorrent is to allow up to 50 connections per torrent and have 4 upload slots per torrent, that means on a busy torrent a seed could be connected to 40 peers but only uploading to 4 -- 1/10th (10%) of all of them. On top of that, their average upload speed per upload slot is likely to be less than 3 KB/sec.

To recap:

80 seeds + 2 peers

Only 50% of the seeds remaining just minutes after being last seen.

On average, about 50% are firewalled.

Maybe 20% are on hostile ISPs or have really pathetic max upload speeds.

Maybe 25% of the ips are duplicates of the same seed/peer.

Each averages about 3 KB/sec upload per upload slot, but only uploads to a particular peer maybe 1/10th of the time.

82 total peers+seeds x 50% remaining x 50% unfirewalled x 80% not on hopelessly hostile ISPs

x 75% aren't duplicated ips x 3 KB/sec upload speed x 10% =

About 3.69 KB/sec upload from all that.
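The recap multiplication can be reproduced with a few lines. The function name and the layout of the factor list are just for illustration; the percentages themselves are the guesses listed above.

```python
def estimated_upload_kb_s(sources, factors, per_slot_kb_s):
    """Multiply the source count by each discount factor and the per-slot speed."""
    rate = sources * per_slot_kb_s
    for f in factors:
        rate *= f
    return rate

# remaining, unfirewalled, not on hostile ISPs, not duplicate ips, share of time uploading to you
public_factors = [0.50, 0.50, 0.80, 0.75, 0.10]
print(round(estimated_upload_kb_s(82, public_factors, 3), 2))
```

Changing any single factor shows how sensitive the final figure is to each guess.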

All that talk about redundancy assumes uTorrent can know when someone will quit.

It doesn't.


"And very shortly afterwards, probably more than half the new seeds leave."

Not sure what scenarios you have experienced, but the majority of torrents I download are on private trackers with sustained high seed:peer ratios, with many of the seeds being seedboxes, even one of which could max out my connection. Thus the following changes would be made to your calculation (note that the majority of these numbers are just estimated from what is required of the average private tracker user):

80 seeds + 2 peers

In a worst case scenario, 90% of seeds are remaining just minutes after being last seen

On average, about 5% are firewalled (if you can't seed, you can't maintain ratio, so you can't stay on the site)

Maybe 5% are on hostile ISPs or have really pathetic max upload speeds (see above)

Maybe 0% of the ips are duplicates of the same seed/peer (on private trackers, each user can only have 1 IP associated with their account at any given time, I believe).

Each averages about 50KB/sec upload per upload slot (this is a conservatively low estimate, as the seedboxes would likely skew that number up by a large amount), and using the correct fraction since we've only got 2 peers, not 40, they only upload to a particular peer maybe 1/2 of the time.

82 total peers+seeds x 90% remaining x 100% unfirewalled x 95% not on hopelessly hostile ISPs x 100% aren't duplicated ips x 50KB/s upload speed x 50% =

About 1.75 MB/sec upload from all that.
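The same arithmetic for the private-tracker estimate is sketched below. One caveat: the multiplication above uses 100% unfirewalled, while the list says about 5% are firewalled, so the sketch computes both variants. The dictionary keys are hypothetical labels added for readability.

```python
import math

def estimate_kb_s(sources, per_slot_kb_s, factors):
    """Source count times per-slot speed times every discount factor."""
    return sources * per_slot_kb_s * math.prod(factors.values())

as_multiplied = {
    "remaining": 0.90,
    "unfirewalled": 1.00,      # value used in the multiplication
    "not_hostile_isp": 0.95,
    "not_duplicate_ip": 1.00,
    "share_of_time": 0.50,
}
# Variant using the 5% firewalled figure from the list instead.
as_listed = dict(as_multiplied, unfirewalled=0.95)

print(estimate_kb_s(82, 50, as_multiplied))  # KB/s
print(estimate_kb_s(82, 50, as_listed))      # KB/s
```

Either way the estimate stays well above 1.5 MB/sec, so the mismatch does not change the conclusion.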

For your calculation to hold true, the following conditions would have to apply:

1. A torrent on a public tracker. The situation you're describing, where 50+% of users hit and run, does not happen on private trackers due to DL:UL ratio requirements.

2. A torrent which has, moments ago, finished initial seeding and converted the majority of peers to seeds. This period is brief, is it not? i.e. the chances that a significant number of people will join a torrent just after it has finished initial seeding, but just before seeds drop off like flies, and then have that as the torrent they choose to have download sequentially, then manually set it to download sequentially, thus crippling the torrent... Those chances are quite remote.

3. A torrent where the only remaining seeds are peers seeding from a home connection with an upload limit set at 12KB/s (am I right in assuming that's what you meant by 4 upload slots, max 3KB/s/slot?). If there were servers as seeds, or even users with fast home connections, the torrent could survive even with sequential downloaders for the period of time between tracker polls by the SDs, who then would stop being SDs, thus removing the threat to the torrent.

From this, I assume two things:

1. Sequential downloading implemented in the manner I described in my previous post would not pose a threat to private trackers, which many people use.

2. Sequential downloading implemented in the manner I described in my previous post would pose a threat to torrents on public trackers, but the threat would be very unlikely to kill a torrent in the worst case scenario, and in average or best case scenarios would pose insignificant or no threat.


A public torrent with 1 slow-ish seed often has nearly every peer finish downloading around the same time. They often quit within 0-12 hours of that but still get recorded as "part" of the swarm incorrectly by either the tracker or other peers/seeds for days. (Mainly by peers/seeds via PEX or DHT.) So if you're on a public torrent that claims 80 seeds, don't expect to connect to more than 40. Even on private torrents reporting 80 seeds, don't expect to connect to more than 70 -- some may have reached global or per torrent max connections.

I screwed up, indeed if there's only 2 reported peers...then the seeds (and those peers) can only upload to you and the peers.

But that's only for that torrent!

They may also be running 5+ other torrents and/or have priority/upload speed max set for this torrent really low.

Even if not, then an "average" ADSL or Cable line that has only ~60 KB/sec max upload speed is certainly splitting it thin if it's running 5 torrents with 4 upload slots each. That's 18 or 19 total upload slots.

60 KB/sec / 18 upload slots = 3 1/3 KB/sec each

Even if they only had 2 other torrents with 4 upload slots each, then there'd still be 10 upload slots active...so:

60 KB/sec / 10 upload slots = 6 KB/sec each

No joke -- if "ordinary" seeds are seeding multiple torrents, you're not going to get a lot from them even when you ARE downloading from them. So if all 82 sources upload to you, that's about 500 KB/sec combined download speed. That should be enough for ordinary video playback but not enough for upper-end HD video.
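The per-slot figures work out as in the sketch below. The slot counts (18 and 10) and the 60 KB/sec line speed are taken as given from the text above; the helper name is made up for the example.

```python
def per_slot_kb_s(max_upload_kb_s, active_slots):
    """Split a line's maximum upload speed evenly across active upload slots."""
    return max_upload_kb_s / active_slots

print(round(per_slot_kb_s(60, 18), 2))   # 5 torrents running
print(per_slot_kb_s(60, 10))             # only 2 other torrents running
print(82 * per_slot_kb_s(60, 10))        # all 82 sources uploading to you at once
```

The last line gives a little under 500 KB/sec, matching the rough combined figure quoted above.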

The risk to private trackers is probably minimal, as they typically have ratio enforcement, very few firewalled peers/seeds, and higher-than-average (by FAR for some!) upload speeds per seed.

The risk to public torrents depends on the 1.accuracy of the torrent's stats (how many reported seeds/peers), 2.firewall ratio, 3.average upload speeds, and 4.whether seeds stick around to seed at least 1:1 after downloading.

1.I've seen numerous times where reported seeds/peers were at least an order of magnitude higher than the seeds/peers that were really there. Some reported seeds/peers were even blatantly impossible ips, such as 255.255.255.255 or 0.0.0.0! Any estimate for whether a torrent swarm can sustain heavy leeching may be doomed to failure due to lack of real seeds.

2.Many ADSL ISPs give out modem-routers to their unwitting customers. Antivirus products now often include software firewalls. Software firewalls often have to be explicitly configured to allow incoming connections. Wireless and Satellite ISPs are almost always hopelessly firewalled. Lastly, a large percentage of broadband customers have routers. In all, it's not unreasonable to assume at LEAST 40% are firewalled on public torrents...some hopelessly so. My guess is the firewalled percentage is probably >50%. :(

3.With few ISPs giving more than 1 megabit/second upload speeds even on their "premium" speed tiers, most people have less than 120 KB/sec usable upload. ISPs may throttle BitTorrent uploads heavily, at least to other ISPs. People may lower upload speed max as well to only a small fraction of their line's max. Or they may run lots of torrents at once. Or they may have lots of upload slots PER torrent. Or they may allow lots of peer/seed connections at once.

4.On public torrents, seeding duration seems to be limited to only a few hours after completion for the vast majority of seeds. Seldom is there any appreciable number of seeds (despite tracker/peers reporting so!) except on super-popular torrents.

In the BitTorrent protocol original specification, seeds were sources of LAST resort -- to be used only for pieces no peers have. By demanding a common piece instead of a rare one from the seed, the seed is spending its bandwidth but not increasing the torrent's availability. If the seed or other seeds leave, the rare piece gets rarer and the sequential downloader has not offset that by doing its part.

Sequential downloading rewards bad behavior and punishes good behavior. ...because those that initially seek rare pieces will likely have nothing to offer them, and likewise will get little-to-nothing in return.

If the sequential downloading isn't as fast or faster than video playback, it's nearly unworkable. For HD video, this can require download speeds faster than the average upload speed max for most ADSL and cable lines. This means it cannot scale so that everyone can do it UNLESS everyone seeds heavily after they're done downloading, which sadly isn't the case.

In short, HD video sequential downloading is pretty much unworkable on public torrents even with very favorable seed-to-peer ratios.

If you're doing sequential downloading on something besides video, then much lower download speeds would "work"...but likewise there are fewer and weaker reasons to do so. For multi-file torrents, do-not-download and High/Normal/Low priority can cover those cases well enough already.

A torrent that's mis-marked as private on a public tracker is even worse off from sequential downloading and hit and runners.

A feature that will *ONLY* work well with private trackers has nearly nil chance of being implemented.

Example of 1 BitTorrent peer (out of 4 peers) using high priority on Beginning/Ending of each file in a torrent:

http://img197.imageshack.us/img197/5263/bitcometpeerpresent.png

There are also hints of sequential downloading across the whole torrent, with the far left side availability being much higher than the right.


I generally think of myself as intelligent, but in these forums there is so much talk that is above my head. I understand most of what is being said here, but not all. Kindly permit me to lay my exact scenario out, in the hope of someone giving me some guidance.

First I must confess that I have been guilty of sequential file downloading. I'm sorry. I got this software when a friend "hipped" me to it, but I'll be honest, I didn't try to find out anything more than how to make it download fast. In the process, I discovered seeding articles here and have since tried to seed to at least 3 on all my downloads. The problem is, when downloading a series of videos, I select the first video only, then when it's done I clear the torrent and reload it. Then I select 1 and 2 (and then repeat, so at the end I will just select all of them). I never knew I was doing anything wrong. I thought that by seeding like crazy I was "helping out" (when I do this process on a slow torrent I get much better upload speeds than download, so by the time a file is complete the ratio is usually over 2 or 3).

Okay, finally to the question. When I get to the end, I intend to leave the download seeding for a long time. (6 or so) Is this still bad (I'm afraid I already know the answer, but this torrent is super slow AND I'm on HughesNet satellite (no port forwarding) AND I really wanna see these videos before the year 2012)? Sorry, I searched the forum and used the FAQ, but I couldn't find an answer that I could understand. Thankfully I can understand yes and no fairly well.

"The problem is, when downloading a series of videos, I select the first video only, then when it's done I clear the torrent and reload it. Then I select 1 and 2 (and then repeat so at the end I will just select all of them)"

It is unnecessary to remove and re-add the torrent to have subsequent files download, but you're better off just downloading everything if you're going to do so anyway.

It does take at LEAST an upload factor of 6 to recover from the damage that sequential downloading can potentially cause.


Thanks for your help (and not condemning me to die for making a mistake) DreadWingKnight. The Moderators here are very helpful (if a little intimidating sometimes XD) We newbies are usually less interested in all these technical aspects and need guidance to understand some of these concepts. I appreciate your taking time to help me.
