
rjanssen

Reproducible crash when adding a large number (75+) of torrent files.

All,

Just wanted to say thank you for providing a Linux client. So far, so good. Things run relatively well considering the alpha state of the code. I have stumbled upon one issue that I believe is a bug.

Currently, I use an SMB share to auto-load torrent files. The feature works, but utserver crashes when I add a large number of torrent files at one time. As I migrate my torrent data and files over to the new Linux box, I've found I can consistently crash utserver by adding 75 or more torrents at a time. The easy workaround is to load a few files at a time. Easy enough, but I assume this may be something you'd like to chase down?
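For anyone applying the same workaround, here is a minimal sketch of feeding the auto-load directory in small batches instead of all at once. The function name, the default batch size of 25 and the 60-second pause are my own assumptions, not values confirmed anywhere in this thread.

```shell
# Hypothetical helper: copy .torrent files into a watched directory in batches.
# Usage: copy_in_batches SRC_DIR WATCH_DIR [BATCH_SIZE] [PAUSE_SECONDS]
# Prints the number of files copied.
copy_in_batches() {
    src=$1
    dst=$2
    batch=${3:-25}     # assumed default, tune for your setup
    pause=${4:-60}     # assumed default, seconds between batches
    count=0
    for f in "$src"/*.torrent; do
        [ -e "$f" ] || continue          # the glob matched nothing
        cp -- "$f" "$dst"/ || return 1
        count=$((count + 1))
        # After each full batch, wait so the client can pick the files up.
        if [ $((count % batch)) -eq 0 ]; then
            sleep "$pause"
        fi
    done
    echo "$count"
}
```

Something like `copy_in_batches ~/torrents/receipts ~/torrents/incoming 25 60` would then trickle the files in; whether that avoids the crash is exactly what this thread is trying to establish.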

I don't have many details, as I'm not able to get the /LOGFILE option to output a log. I'm running 32-bit Ubuntu Server 10.04.

root@loot:/usr/local/bin/utserver/l# uname -a

Linux loot 2.6.32-24-generic-pae #42-Ubuntu SMP Fri Aug 20 15:37:22 UTC 2010 i686 GNU/Linux

I'd be happy to help out in any way I can. I assume first I'd need some help getting the logfile to output correctly. Currently I'm starting uTorrent using this command:

screen -dmS uTorrent /usr/local/bin/utserver/utserver

I've tried using:

screen -dmS uTorrent /usr/local/bin/utserver/utserver /LOGFILE

and

screen -dmS uTorrent /usr/local/bin/utserver/utserver /LOGFILE /usr/local/bin/utserver/utserv.log

Neither of which output a log file in the working directory.

-Ryan


Update:

Everything was going swimmingly until I added around 1000 torrent files. Now utserver crashes within 10 seconds of starting, every time. After deleting resume.dat and resume.dat.old, everything works fine. Could the resume.dat file be getting corrupted?
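A small sketch for anyone hitting the same thing: rather than deleting the resume files outright, move them aside so a copy survives for later inspection. The function name and the backup directory naming are hypothetical, and I am only assuming a preserved copy would be useful for debugging.

```shell
# Hypothetical helper: move resume.dat and resume.dat.old into a
# timestamped backup directory instead of deleting them.
# Usage: stash_resume_files UTSERVER_DIR
# Prints how many files were moved.
stash_resume_files() {
    dir=$1
    backup="$dir/resume-backup-$(date +%Y%m%d-%H%M%S)"
    moved=0
    for name in resume.dat resume.dat.old; do
        if [ -f "$dir/$name" ]; then
            mkdir -p "$backup" || return 1
            mv -- "$dir/$name" "$backup"/ || return 1
            moved=$((moved + 1))
        fi
    done
    echo "$moved"
}
```

Run it only while utserver is stopped, for example `stash_resume_files /usr/local/bin/utserver`.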

What is the correct way to gracefully kill the server? Currently I'm killing the screen session to stop the server.


Thank you for the response.

Currently ulimit for open files is set to 1024. Is it fair to say that uTorrent doesn't actually have all of the files open at once or do I need to change something?

I don't have any errors relating to ulimit in daemon.log, syslog, messages or dmesg. Any suggestions?

root@loot:/var/log# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
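On the question of whether every file is held open at once: rather than guessing, it can be measured. A rough sketch, assuming a Linux system with /proc mounted; the function names are mine.

```shell
# Count the file descriptors a process currently holds (Linux /proc only).
# Usage: count_open_fds PID
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Print the soft "open files" limit that applies to a running process,
# which can differ from what ulimit -n reports in your login shell.
# Usage: nofile_limit_of PID
nofile_limit_of() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}
```

Comparing `count_open_fds "$(pidof utserver)"` with `nofile_limit_of "$(pidof utserver)"` while torrents load should show whether the process is creeping toward its limit.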


uT is probably crashing from the file handle limit and not getting a chance to trigger a log event in any of the logs.

Try setting ulimit -n to somewhere around 2048-4096 when you approach the 1000-torrent mark.
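A minimal sketch of applying that limit to just the one process, without changing it for the whole login session. The wrapper name is hypothetical.

```shell
# Hypothetical wrapper: run a command with a specific open-files limit.
# The limit is set in a subshell, so the calling shell is not affected.
# Usage: run_with_nofile_limit LIMIT COMMAND [ARGS...]
run_with_nofile_limit() {
    limit=$1
    shift
    (
        ulimit -n "$limit" || exit 1   # fails if LIMIT exceeds the hard limit
        exec "$@"
    )
}
```

For example, `run_with_nofile_limit 4096 /usr/local/bin/utserver/utserver`. A non-root user can only raise the value up to the hard limit, so going higher needs root or a system-wide limits change.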

I'd need some help getting the logfile to output correctly.

I've tried using:

screen -dmS uTorrent /usr/local/bin/utserver/utserver /LOGFILE

and

screen -dmS uTorrent /usr/local/bin/utserver/utserver /LOGFILE /usr/local/bin/utserver/utserv.log

Neither of which output a log file in the working directory.

-Ryan

Thanks for your report.

Use a dash instead of a forward slash (-logfile). The argument name is case-insensitive, BTW. The documentation has since been corrected but we haven't released a new revision yet.
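To make that concrete, a small sketch of a launcher that passes the dash form. That -logfile takes the log path as its next argument is my assumption, based on how the option was being used earlier in the thread; the function name is hypothetical.

```shell
# Hypothetical launcher: start the server with logging enabled.
# Assumes the option is "-logfile PATH" (dash, not slash).
# Usage: start_utserver_logged /path/to/utserver /path/to/logfile
start_utserver_logged() {
    bin=$1
    log=$2
    [ -x "$bin" ] || { echo "not executable: $bin" >&2; return 1; }
    "$bin" -logfile "$log"
}

# The equivalent of the original screen invocation would then be:
#   screen -dmS uTorrent /usr/local/bin/utserver/utserver \
#       -logfile /usr/local/bin/utserver/utserv.log
```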

What is the correct way to gracefully kill the server? Currently I'm killing the screen session to stop the server.

$ kill pid

where pid is the process ID of the utserver process. The TERM signal is the default signal sent by kill and that will work.
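A sketch of a stop helper built on that: send TERM, then wait for the process to actually exit so that it has a chance to finish writing its state. The function name and the 30-second default timeout are assumptions on my part.

```shell
# Hypothetical helper: send SIGTERM and wait for the process to exit.
# Usage: stop_gracefully PID [TIMEOUT_SECONDS]
# Returns 0 once the process is gone, 1 if it is still alive at the timeout.
stop_gracefully() {
    pid=$1
    timeout=${2:-30}
    kill -TERM "$pid" 2>/dev/null || return 0   # nothing to stop
    waited=0
    # kill -0 sends no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Usage would be along the lines of `stop_gracefully "$(pidof utserver)" 60`. It deliberately never escalates to KILL, so a timeout is left for you to investigate.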

If the shell kills your process due to ulimit issues, I would think you would see a kill signal report on the controlling terminal (that's what I've seen when that has happened).


Thank you both. I think we're getting closer. I bumped the ulimit up to 51200 and got to around 1800 torrents loaded before it choked this time. Going to try 102400 after deleting the resume.dat files to get uT to fire up again. Logging is set up now.


Well damn. I woke up this morning hoping all my files would be properly checked, but no dice. We do have logs this time. Any suggestions?

ryan@loot:~/utserver$ ulimit -n

102400

ryan@loot:~/utserver$ tail -35 ut.log

[08:44:23] *** coupler - sue?o sonado: PIECE 137 FAILED HASH CHECK

[08:46:19] Banned 110.***.*.**:59435

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

ryan@loot:~/utserver$

Well damn. I woke up this morning hoping all my files would be properly checked, but no dice. We do have logs this time. Any suggestions?

[10:38:05] IO Error:21 line:337 align:-99 pos:-99 count:131072 actual:0

Thanks for the detailed report.

#define EISDIR 21 /* Is a directory */

Can you review the organization of your file system with respect to the directories you are configuring the server to use? It looks like there are probably some reads/writes from/to what the server thinks are files, but which are actually directories.
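As a starting point for that review, a sketch that reports whether a given path is a directory, a regular file, or missing. The function name is hypothetical, and which paths the server actually treats as files is not something this thread establishes.

```shell
# Hypothetical helper: report what kind of object a path is.
# Reading or writing a directory as if it were a file is what
# produces errno 21 (EISDIR).
# Usage: classify_path PATH
classify_path() {
    if [ -d "$1" ]; then
        echo directory
    elif [ -f "$1" ]; then
        echo file
    elif [ -e "$1" ]; then
        echo other
    else
        echo missing
    fi
}
```

Looping it over the entries of the download directories, for example `for p in /path/to/working/*; do printf '%s\t%s\n' "$(classify_path "$p")" "$p"; done`, gives a quick listing to compare against what each torrent expects.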


Aha.. In the configuration should there be a trailing / or no? Here is the listing of the directory structure uTorrent is using and a screenshot of the directory config. Look good?

ryan@loot:~/torrents$ ls -alh
total 1.7M
drwxrwxrwx    7 root root 4.0K 2010-09-14 12:24 .
drwxr-xr-x    4 root root 4.0K 2010-09-12 13:19 ..
drwxr-xr-x    2 ryan ryan 164K 2010-09-14 12:25 complete
-rwxr--r--    1 ryan ryan  16K 2010-09-13 07:11 .DS_Store
drwxr-xr-x    2 ryan ryan 480K 2010-09-14 11:24 incoming
drwxr-xr-x    7 ryan ryan 4.0K 2010-09-13 08:22 news
drwxr-xr-x    2 ryan ryan 676K 2010-09-14 11:24 receipts
-rwxr--r--    1 ryan ryan 106K 2010-09-12 13:43 .VolumeIcon.icns
drwxr-xr-x 2047 ryan ryan 220K 2010-09-14 12:21 working

ss1.png

Aha.. In the configuration should there be a trailing / or no? Here is the listing of the directory structure uTorrent is using and a screenshot of the directory config. Look good?

It looks OK. I don't think you need to supply a trailing slash. I seem to remember testing that recently, but I haven't had time to confirm that. I would think that the software should accommodate either the presence or absence of a trailing slash, so I will check that.

Aha.. In the configuration should there be a trailing / or no? Here is the listing of the directory structure uTorrent is using and a screenshot of the directory config. Look good?

It looks OK. I don't think you need to supply a trailing slash. I seem to remember testing that recently, but I haven't had time to confirm that. I would think that the software should accommodate either the presence or absence of a trailing slash, so I will check that.

I'd agree. I tried it both ways and the behavior was the same.

At this point I'm not really sure what to do. There's no additional information in the log file and everything works great until I attempt to import my .torrent files. I'm not getting any errors in any of the system logs so I'm inclined to think it's something with uTorrent.

I still wouldn't rule a system configuration issue out, but I'm not really sure where to turn as I don't really have any troubleshooting info to go off of.

-Ryan


It looks OK. I don't think you need to supply a trailing slash. I seem to remember testing that recently, but I haven't had time to confirm that. I would think that the software should accommodate either the presence or absence of a trailing slash, so I will check that.

I'd agree. I tried it both ways and the behavior was the same.

At this point I'm not really sure what to do. There's no additional information in the log file and everything works great until I attempt to import my .torrent files. I'm not getting any errors in any of the system logs so I'm inclined to think it's something with uTorrent.

I still wouldn't rule a system configuration issue out, but I'm not really sure where to turn as I don't really have any troubleshooting info to go off of.

I'll try this myself when I have some time.


Thank you.

FYI, my total torrent count is around 2300. I don't know if this is the root cause of the issue, but it seems like it dies after around 1700 torrents imported. In each case uT will not start until after I delete the resume.dat files.

-Ryan

FYI, my total torrent count is around 2300. I don't know if this is the root cause of the issue, but it seems like it dies after around 1700 torrents imported. In each case uT will not start until after I delete the resume.dat files.

That helps. I forgot to read back in this thread to see the early part about ulimits, which may come into play here. Although, it sounds like it hangs rather than crashes at this stage, and I'm assuming you are boosting the ulimit appropriately given the earlier discussion in this thread. To clarify, when you say

uT will not start

do you mean the program exits immediately, or the program stays alive (doesn't crash) but the web UI is unresponsive or unavailable, or the program stays alive but there is no torrent activity seen on the web UI unless you first delete resume.dat (I'm guessing the last choice)?

If the problem depends on how many torrents the server is handling, then it's not the directory specification that is the problem.


Currently ulimit -n is set at 102400. I can go higher if needed, but I figure the chance of anything reaching 102400 is slim.

I'll copy the torrent files from my receipts directory to the incoming directory. Everything goes great until hours later when I come back and uT has crashed. No running process in ps, and the only thing left in the log (besides torrent already exists errors) are the errors posted above.

After the initial crash I'll try to run uT again. If I run ./utserver the screen opens for a few seconds, displays "Aborted" and closes. There is no additional info written to ut.log. Until I delete the resume files this behavior is consistent.

I may have been wrong about the number of torrent files it takes to crash the server. I'm guessing I was initially running into a ulimit issue, but that this may be something different. If you'd like to poke around the server, let me know. I can create a login and allow ssh access through my firewall if you can share the IP you'll be connecting from.

Currently ulimit -n is set at 102400. I can go higher if needed, but I figure the chance of anything reaching 102400 is slim.

I would think that would be plenty. To confirm the shell is accepting those settings, did you run ulimit -n to verify after changing? There are system limits and a non-root user can increase values up to those system limits. However, I don't think that's the problem - see below.

After the initial crash I'll try to run uT again. If I run ./utserver the screen opens for a few seconds, displays "Aborted" and closes.

If the process was killed due to a ulimit issue, I would expect you to see a "Killed" message as the result of utserver being sent a KILL (9) signal. For an "Aborted" message, the server is likely to present an assertion failure message which would be valuable. Maybe you are invoking it using a non-persistent terminal and that's why the terminal window opens and closes? If so, try running the program from a persistent terminal window so any assertion failure message persists.
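If the terminal closes before the message can be read, the exit status still tells the two cases apart, because the shell reports a signal death as 128 plus the signal number. A sketch, with a hypothetical function name and the Linux signal numbers 6, 9 and 15:

```shell
# Hypothetical helper: translate a shell exit status into a short description.
# Statuses above 128 mean the process was terminated by signal (status - 128).
# Usage: describe_exit_status STATUS
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        case $sig in
            6)  echo "SIGABRT" ;;   # abort(), typically a failed assertion
            9)  echo "SIGKILL" ;;   # killed from outside
            15) echo "SIGTERM" ;;   # ordinary kill
            *)  echo "signal $sig" ;;
        esac
    else
        echo "exit $status"
    fi
}
```

Running `./utserver; describe_exit_status $?` from a shell that stays open would then print SIGABRT for the "Aborted" case described above and SIGKILL for a "Killed" one.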


mcdonald,

I did verify ulimit -n and it shows 102400. I have this configured just for the user that's running uT. I also verified that the setting remained after rebooting.

By persistent do you mean using screen to start the process?

For an "Aborted" message, the server is likely to present an assertion failure message which would be valuable. Maybe you are invoking it using a non-persistent terminal and that's why the terminal window opens and closes? If so, try running the program from a persistent terminal window so any assertion failure message persists.

Or, try running the program within gdb at those times.

mcdonald,

I did verify ulimit -n and it shows 102400. I have this configured just for the user that's running uT. I also verified that the setting remained after rebooting.

OK.

By persistent do you mean using screen to start the process?

If you are running a GUI on that machine, you could invoke a terminal window. If not, you could use screen, or perform a remote login via ssh. As long as whatever you use doesn't close the window (and thus lose the output) when the process exits.


I'm new to gdb. I tried running gdb ./utserver and this is what happened:

ryan@loot:/opt/utserver$ gdb ./utserver

GNU gdb (GDB) 7.1-ubuntu

Copyright (C) 2010 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law. Type "show copying"

and "show warranty" for details.

This GDB was configured as "i486-linux-gnu".

For bug reporting instructions, please see:

<http://www.gnu.org/software/gdb/bugs/>...

Reading symbols from /opt/utserver/utserver...(no debugging symbols found)...done.

I assume I'm not doing it correctly.

I'm new to gdb. I tried running gdb ./utserver and this is what happened:


I assume I'm not doing it correctly.

Looks OK to me so far, except I don't see a (gdb) prompt in the output you posted. If you see one, type r and hit enter and the program will run. If the program aborts, type bt and hit enter. If you want to stop the program, hit control-C in that window. When the program terminates or you are done, type quit and hit enter.
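The same steps can be run unattended, which helps when the crash only shows up hours later. A minimal sketch using gdb's batch mode; the wrapper name and the GDB override variable are my own additions.

```shell
# Hypothetical wrapper: run a program under gdb non-interactively and
# print a backtrace when it stops.
# -batch   exit gdb after the commands have run
# -ex run  start the program (same as typing r)
# -ex bt   print the backtrace once it stops (same as typing bt)
# --args   everything after this is the program and its arguments
# Usage: run_under_gdb PROGRAM [ARGS...]
run_under_gdb() {
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
}
```

For example, `run_under_gdb ./utserver 2>&1 | tee gdb-session.txt` keeps the whole session in a file. If the program exits normally instead of aborting, the bt step simply reports that there is no stack.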


Okay, it crashed and I was able to get a gdb dump. Asterisks were added by me; they were the folder names.

[Thread 0xbb806b70 (LWP 8413) exited]

[New Thread 0xbc007b70 (LWP 8414)]

[New Thread 0xbc808b70 (LWP 8415)]

[New Thread 0xbd009b70 (LWP 8416)]

[Thread 0xbc007b70 (LWP 8414) exited]

[Thread 0xbd009b70 (LWP 8416) exited]

[Thread 0xbc808b70 (LWP 8415) exited]

[New Thread 0xbd80ab70 (LWP 8417)]

[New Thread 0xbe00bb70 (LWP 8418)]

[Thread 0xbd80ab70 (LWP 8417) exited]

[Thread 0xbe00bb70 (LWP 8418) exited]

[New Thread 0xbe80cb70 (LWP 8419)]

[New Thread 0xbf00db70 (LWP 8420)]

[New Thread 0xbf80eb70 (LWP 8421)]

[Thread 0xbf00db70 (LWP 8420) exited]

[Thread 0xbf80eb70 (LWP 8421) exited]

[Thread 0xbe80cb70 (LWP 8419) exited]

*** *****************: PIECE 254 FAILED HASH CHECK

*** *****************: PIECE 240 FAILED HASH CHECK

*** *****************: PIECE 259 FAILED HASH CHECK

Moving files from '/mnt/external/working/*******' to '/mnt/external/complete/*******'

Moving files from '/mnt/external/working/*******' to '/mnt/external/complete/*******'

Moving files from '/mnt/external/working/*******' to '/mnt/external/complete/*******'

Moving files from '/mnt/external/working/*******' to '/mnt/external/complete/*******'

Moving files from '/mnt/external/working/*******' to '/mnt/external/complete/*******'

Moving files from '/mnt/external/working/*******' to '/mnt/external/complete/*******'

Program received signal SIGABRT, Aborted.

0x0012d422 in __kernel_vsyscall ()
