aMule Forum
English => Backtraces => Topic started by: HipHopPunkSuperStar on January 19, 2005, 05:17:38 PM
-
Okay, I've tried running amuled under Mac OS X several times now, but it keeps crashing after 20-30 minutes. I've tried the rc8 version, yesterday's CVS and today's CVS; always the same.
I hope I can help your development efforts by posting the backtraces here:
Okay, here we go:
This comes from rc8:
crash:
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 8946 thread 0x2e03]
0x0164eb34 in wxSocketBase::_Wait(long, long, int) ()
bt
(gdb) bt
#0 0x0164eb34 in wxSocketBase::_Wait(long, long, int) ()
#1 0x000116a4 in CSocketGlobalThread::Entry() (this=0x4d0e268) at ListenSocket.cpp:2567
#2 0x02269888 in wxThreadInternal::MacThreadStart(void*) ()
#3 0x902c6d88 in PrivateMPEntryPoint ()
#4 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x0164eb34 in wxSocketBase::_Wait(long, long, int) ()
No symbol table info available.
#1 0x000116a4 in CSocketGlobalThread::Entry() (this=0x4d0e268) at ListenSocket.cpp:2567
cur_sock = (CClientReqSocket *) 0x240f220
it =
locker = {
m_isOk = true,
m_mutex = @0x240e8a0
}
#2 0x02269888 in wxThreadInternal::MacThreadStart(void*) ()
No symbol table info available.
#3 0x902c6d88 in PrivateMPEntryPoint ()
No symbol table info available.
#4 0x900246e8 in _pthread_body ()
No symbol table info available.
This comes from yesterday's CVS:
crash
Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand.
[Switching to process 7676 thread 0x2ddf]
0x00001994 in ?? ()
bt
(gdb) bt
#0 0x00001994 in ?? ()
#1 0x00001994 in ?? ()
#2 0x0165a384 in wxSocketBase::Read(void*, unsigned) ()
#3 0x00013bb4 in ECSocket::ReadBuffer(wxSocketBase*, void*, unsigned) (this=0x240e8e0, sock=0x4d3c6c0, buffer=0xf058ac78, len=1) at ECSocket.cpp:136
#4 0x00013a6c in ECSocket::ReadNumber(wxSocketBase*, void*, unsigned) (this=0x240e8e0, sock=0x4d3c6c0, buffer=0xf058ac78, len=1) at ECSocket.cpp:92
#5 0x00013f58 in ECSocket::ReadFlags(wxSocketBase*) (this=0x240e8e0, sock=0x4d3c6c0) at ECSocket.cpp:212
#6 0x00013ffc in ECSocket::ReadPacket(wxSocketBase*) (this=0x240e8e0, sock=0x4d3c6c0) at ECSocket.cpp:234
#7 0x0004db40 in ExternalConnClientThread::Entry() (this=0x4d11980) at ExternalConn.cpp:1608
#8 0x02269888 in wxThreadInternal::MacThreadStart(void*) ()
#9 0x902c6d88 in PrivateMPEntryPoint ()
#10 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x00001994 in ?? ()
No symbol table info available.
#1 0x00001994 in ?? ()
No symbol table info available.
#2 0x0165a384 in wxSocketBase::Read(void*, unsigned) ()
No symbol table info available.
#3 0x00013bb4 in ECSocket::ReadBuffer(wxSocketBase*, void*, unsigned) (this=0x240e8e0, sock=0x4d3c6c0, buffer=0xf058ac78, len=1) at ECSocket.cpp:136
msgRemain = 1
LastIO = 0
iobuf = 0xf058ac78 ""
error = false
#4 0x00013a6c in ECSocket::ReadNumber(wxSocketBase*, void*, unsigned) (this=0x240e8e0, sock=0x4d3c6c0, buffer=0xf058ac78, len=1) at ECSocket.cpp:92
No locals.
#5 0x00013f58 in ECSocket::ReadFlags(wxSocketBase*) (this=0x240e8e0, sock=0x4d3c6c0) at ECSocket.cpp:212
i = 0
flags = 0
b = 0 '\0'
#6 0x00013ffc in ECSocket::ReadPacket(wxSocketBase*) (this=0x240e8e0, sock=0x4d3c6c0) at ECSocket.cpp:234
flags = 4032343504
p = (CECPacket *) 0x0
#7 0x0004db40 in ExternalConnClientThread::Entry() (this=0x4d11980) at ExternalConn.cpp:1608
request = (CECPacket *) 0xf058ad90
response = (CECPacket *) 0x0
#8 0x02269888 in wxThreadInternal::MacThreadStart(void*) ()
No symbol table info available.
#9 0x902c6d88 in PrivateMPEntryPoint ()
No symbol table info available.
#10 0x900246e8 in _pthread_body ()
No symbol table info available.
And today's CVS:
crash
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 20343 thread 0x2e03]
0x0165ab34 in wxSocketBase::_Wait(long, long, int) ()
bt
bt
#0 0x0165ab34 in wxSocketBase::_Wait(long, long, int) ()
#1 0x000121bc in CSocketGlobalThread::Entry() (this=0x4d1a608) at ListenSocket.cpp:2583
#2 0x02269ef4 in wxThreadInternal::MacThreadStart(void*) ()
#3 0x902c6d88 in PrivateMPEntryPoint ()
#4 0x900246e8 in _pthread_body ()
bt full
bt full
#0 0x0165ab34 in wxSocketBase::_Wait(long, long, int) ()
No symbol table info available.
#1 0x000121bc in CSocketGlobalThread::Entry() (this=0x4d1a608) at ListenSocket.cpp:2583
cur_sock = (CClientReqSocket *) 0x4de36c0
it =
locker = {
m_isOk = true,
m_mutex = @0x240e894
}
#2 0x02269ef4 in wxThreadInternal::MacThreadStart(void*) ()
No symbol table info available.
#3 0x902c6d88 in PrivateMPEntryPoint ()
No symbol table info available.
#4 0x900246e8 in _pthread_body ()
No symbol table info available.
Hope that helps you guys keep up the great work...
Let me know if I can do more...
-
Hi HipHopPunkSuperStar,
Thanks for your reports. Unfortunately, I'm not seeing what's going wrong. Not your fault, just not enough info yet.
Could you rebuild wxMac with these additional configure flags: --enable-debug --enable-debug_gdb
Then, repeat the configure and make steps for aMule.
Hopefully, with these changes your backtraces will have a little extra info that will help us debug this.
Thanks.
-
Okay, I downloaded wxMac again and rebuilt everything, but I'm not sure if I'm supposed to somehow report on wxMac; if so, tell me how...
So this first crash happened after clicking on the "Files" link in the webserver, after just a few seconds of running amuled:
Crash:
Program received signal SIGPIPE, Broken pipe.
0x90048768 in mach_wait_until ()
bt:
(gdb) bt
#0 0x90048768 in mach_wait_until ()
#1 0x902e4ff4 in MPDelayUntil ()
#2 0x02afafd4 in wxThread::Sleep(unsigned long) (milliseconds=100) at ../src/mac/carbon/thread.cpp:1247
#3 0x000a98c0 in CamuleDaemonApp::OnRun() (this=0x240e7b0) at amuled.cpp:150
#4 0x02ab1b14 in wxEntry(int&, char**) (argc=@0xbffffcb8, argv=0xbffffd60) at ../src/common/init.cpp:411
#5 0x000a9588 in main (argc=1, argv=0xbffffd60) at amuled.cpp:128
bt full:
#0 0x90048768 in mach_wait_until ()
No symbol table info available.
#1 0x902e4ff4 in MPDelayUntil ()
No symbol table info available.
#2 0x02afafd4 in wxThread::Sleep(unsigned long) (milliseconds=100) at ../src/mac/carbon/thread.cpp:1247
wakeup = {
hi = 612,
lo = 2052917596
}
#3 0x000a98c0 in CamuleDaemonApp::OnRun() (this=0x240e7b0) at amuled.cpp:150
msRun = 0
locker = {
m_isOk = true,
m_mutex = @0x240e8e0
}
uLoop = 100
msWait = 100
#4 0x02ab1b14 in wxEntry(int&, char**) (argc=@0xbffffcb8, argv=0xbffffd60) at ../src/common/init.cpp:411
callOnExit = {}
cleanupOnExit = {}
#5 0x000a9588 in main (argc=1, argv=0xbffffd60) at amuled.cpp:128
No locals.
This one happened after about 30 seconds to 1 minute of running amuled, but I'm not aware of doing anything that might have provoked it:
crash:
Program received signal SIGPIPE, Broken pipe.
0x90010808 in sendto ()
bt:
(gdb) bt
#0 0x90010808 in sendto ()
#1 0x02219860 in GSocket::Send_Stream(char const*, int) (this=0x2793e50, buffer=0x2788070 "?#", size=40) at ../src/unix/gsocket.cpp:1312
#2 0x0221886c in GSocket::Write(char const*, int) (this=0x2793e50, buffer=0x2788070 "?#", size=40) at ../src/unix/gsocket.cpp:865
#3 0x02213928 in wxSocketBase::_Write(void const*, unsigned) (this=0x2793570, buffer=0x2788070, nbytes=40) at ../src/common/socket.cpp:541
#4 0x02213808 in wxSocketBase::Write(void const*, unsigned) (this=0x2793570, buffer=0x2788070, nbytes=40) at ../src/common/socket.cpp:509
#5 0x00018dfc in CEMSocket::Send(char*, int, int) (this=0x2793570, lpBuf=0x2788070 "?#", nBufLen=40) at EMSocket.cpp:396
#6 0x00018838 in CEMSocket::SendPacket(Packet*, bool, bool) (this=0x2793570, packet=0x279c310, delpacket=true, controlpacket=true) at EMSocket.cpp:320
#7 0x00025a70 in CUpDownClient::SendPacket(Packet*, bool, bool) (this=0x2796c50, packet=0x279c310, delpacket=true, controlpacket=true) at BaseClient.cpp:2075
#8 0x00035604 in CUpDownClient::SendFileRequest() (this=0x2796c50) at DownloadClient.cpp:246
#9 0x000217bc in CUpDownClient::ConnectionEstablished() (this=0x2796c50) at BaseClient.cpp:1387
#10 0x00021178 in CUpDownClient::TryToConnect(bool) (this=0x2796c50, bIgnoreMaxCon=false) at BaseClient.cpp:1303
#11 0x00035044 in CUpDownClient::AskForDownload() (this=0x2796c50) at DownloadClient.cpp:151
#12 0x0006e450 in CPartFile::Process(unsigned, unsigned char) (this=0x383fa00, reducedownload=0, m_icounter=10 '\n') at PartFile.cpp:1707
#13 0x0003ca80 in CDownloadQueue::Process() (this=0x272c900) at DownloadQueue.cpp:378
#14 0x000a7e9c in CamuleApp::OnCoreTimer(wxEvent&) (this=0x240e7b0) at amule.cpp:1378
#15 0x000a98f4 in CamuleDaemonApp::OnRun() (this=0x240e7b0) at amuled.cpp:156
#16 0x02ab1b14 in wxEntry(int&, char**) (argc=@0xbffffcb8, argv=0xbffffd60) at ../src/common/init.cpp:411
#17 0x000a9588 in main (argc=1, argv=0xbffffd60) at amuled.cpp:128
bt full:
(gdb) bt full
#0 0x90010808 in sendto ()
No symbol table info available.
#1 0x02219860 in GSocket::Send_Stream(char const*, int) (this=0x2793e50, buffer=0x2788070 "?#", size=40) at ../src/unix/gsocket.cpp:1312
old_handler = (void (*)(void)) 0x1
ret = 41512672
#2 0x0221886c in GSocket::Write(char const*, int) (this=0x2793e50, buffer=0x2788070 "?#", size=40) at ../src/unix/gsocket.cpp:865
ret = -1073745408
#3 0x02213928 in wxSocketBase::_Write(void const*, unsigned) (this=0x2793570, buffer=0x2788070, nbytes=40) at ../src/common/socket.cpp:541
total = 0
ret = 41451632
#4 0x02213808 in wxSocketBase::Write(void const*, unsigned) (this=0x2793570, buffer=0x2788070, nbytes=40) at ../src/common/socket.cpp:509
No locals.
#5 0x00018dfc in CEMSocket::Send(char*, int, int) (this=0x2793570, lpBuf=0x2788070 "?#", nBufLen=40) at EMSocket.cpp:396
tosend = 40
result = 41451632
#6 0x00018838 in CEMSocket::SendPacket(Packet*, bool, bool) (this=0x2793570, packet=0x279c310, delpacket=true, controlpacket=true) at EMSocket.cpp:320
bCheckControlQueue = false
#7 0x00025a70 in CUpDownClient::SendPacket(Packet*, bool, bool) (this=0x2796c50, packet=0x279c310, delpacket=true, controlpacket=true) at BaseClient.cpp:2075
No locals.
#8 0x00035604 in CUpDownClient::SendFileRequest() (this=0x2796c50) at DownloadClient.cpp:246
packet = (Packet *) 0x279c310
dataFileReq = {
= {
= {
_vptr$CFile = 0x272f68,
m_fd = -1,
m_error = false,
fFilePath = {
= {
static npos = 4294967295,
m_pchData = 0x2b74df4 ""
}, }
},
members of CMemFile:
m_GrowBytes = 32,
m_position = 0,
m_BufferSize = 0,
m_FileSize = 0,
m_delete = false,
m_buffer = 0x0
},
= {
_vptr$CFileDataIO = 0x272fdc
}, }
#9 0x000217bc in CUpDownClient::ConnectionEstablished() (this=0x2796c50) at BaseClient.cpp:1387
e = {
= {
= {
_vptr$wxObject = 0xbffff5b0,
static ms_classInfo = {
m_className = 0x2b1a2c4 "wxObject",
m_objectSize = 8,
m_objectConstructor = 0,
m_baseInfo1 = 0x0,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a98,
static sm_classTable = 0x240c9d0
},
m_refData = 0x273c4c0
},
members of wxEvent:
m_eventObject = 0xbffff670,
m_eventType = 0,
m_timeStamp = 10000,
m_id = 800,
m_callbackUserData = 0x6,
m_propagationLevel = 914745753,
m_skipped = 3221222832,
m_isCommandEvent = false,
static ms_classInfo = {
m_className = 0x2b1d080 "wxEvent",
m_objectSize = 40,
m_objectConstructor = 0,
m_baseInfo1 = 0x2b93acc,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93cd0,
static sm_classTable = 0x240c9d0
}
},
members of GUIEvent:
ID = 3221222752,
byte_value = 0 '\0',
short_value = 5220,
long_value = 3221222832,
longlong_value = 1234776,
string_value = {
= {
static npos = 4294967295,
m_pchData = 0x12e1e0 "\177?\002?\220~"
}, },
ptr_value = 0xbffff5d0,
ptr_aux_value = 0x0
}
packet = (Packet *) 0xbffff620
packet = (Packet *) 0xbffff620
#10 0x00021178 in CUpDownClient::TryToConnect(bool) (this=0x2796c50, bIgnoreMaxCon=false) at BaseClient.cpp:1303
data = {
= {
= {
_vptr$CFile = 0x0,
m_fd = -1,
m_error = false,
fFilePath = {
= {
static npos = 4294967295,
m_pchData = 0x0
}, }
},
members of CMemFile:
m_GrowBytes = 0,
m_position = 10,
m_BufferSize = 0,
m_FileSize = 0,
m_delete = 45567476,
m_buffer = 0x2796c50 "\002y?@"
},
= {
_vptr$CFileDataIO = 0x0
}, }
packet = (Packet *) 0x2b885b8
tmp = {
= {
= {
= {
= {
_vptr$wxObject = 0x2b885b8,
static ms_classInfo = {
m_className = 0x2b1a2c4 "wxObject",
m_objectSize = 8,
m_objectConstructor = 0,
m_baseInfo1 = 0x0,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a98,
static sm_classTable = 0x240c9d0
},
m_refData = 0x0
},
members of wxSockAddress:
static ms_classInfo = {
m_className = 0x2222b2c "wxSockAddress",
m_objectSize = 12,
m_objectConstructor = 0,
m_baseInfo1 = 0x2b93acc,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x223a034,
static sm_classTable = 0x240c9d0
},
m_address = 0x0
},
members of wxIPaddress:
static ms_classInfo = {
m_className = 0x2222b3c "wxIPaddress",
m_objectSize = 12,
m_objectConstructor = 0,
m_baseInfo1 = 0x223a04c,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x223a04c,
static sm_classTable = 0x240c9d0
}
},
members of wxIPV4address:
static ms_classInfo = {
m_className = 0x2222b48 "wxIPV4address",
m_objectSize = 16,
m_objectConstructor = 0x220db50 ,
m_baseInfo1 = 0x223a064,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x223a064,
static sm_classTable = 0x240c9d0
},
m_origHostname = {
= {
static npos = 4294967295,
m_pchData = 0x2710 "}\210\002?=\214"
}, }
}, }
#11 0x00035044 in CUpDownClient::AskForDownload() (this=0x2796c50) at DownloadClient.cpp:151
No locals.
#12 0x0006e450 in CPartFile::Process(unsigned, unsigned char) (this=0x383fa00, reducedownload=0, m_icounter=10 '\n') at PartFile.cpp:1707
download_state = 11 '\v'
it =
cur_src = (CUpDownClient *) 0x2796c50
old_trans = 0
dwCurTick = 2526484452
#13 0x0003ca80 in CDownloadQueue::Process() (this=0x272c900) at DownloadQueue.cpp:378
cur_file = (CPartFile *) 0x383fa00
i = 3
size = 7
downspeed = 0
#14 0x000a7e9c in CamuleApp::OnCoreTimer(wxEvent&) (this=0x240e7b0) at amule.cpp:1378
msPrev1 = 97269
msPrev5 = 95957
msPrevSave = 60003
msCur = 97561
#15 0x000a98f4 in CamuleDaemonApp::OnRun() (this=0x240e7b0) at amuled.cpp:156
msRun = 2526484396
locker = {
m_isOk = true,
m_mutex = @0x240e8e0
}
uLoop = 100
msWait = 100
#16 0x02ab1b14 in wxEntry(int&, char**) (argc=@0xbffffcb8, argv=0xbffffd60) at ../src/common/init.cpp:411
callOnExit = {}
cleanupOnExit = {}
#17 0x000a9588 in main (argc=1, argv=0xbffffd60) at amuled.cpp:128
No locals.
Ah, this all happens with rc8, as today's CVS won't compile on my Mac...
-
Originally posted by HipHopPunkSuperStar
Okay, I downloaded wxMac again and rebuilt everything, but I'm not sure if I'm supposed to somehow report on wxMac; if so, tell me how...
OK, thanks for doing all that. You didn't necessarily need to download it again. I meant to just have you rebuild it with additional configure flags. But you didn't harm anything.
Also, I didn't mean to suggest that the bug is in wxMac, just that running amuled with a version of wxMac that has debugging information enabled could help us locate the bug in amuled. So, no reason yet to report anything to wxMac devs.
So this first crash happened after clicking on the "Files" link in the webserver, after just a few seconds of running amuled:
Crash:
Program received signal SIGPIPE, Broken pipe.
0x90048768 in mach_wait_until ()
[other backtraces snipped]
These aren't actually crashes. By default, gdb stops the execution of the program whenever the program receives a signal. Some signals (e.g. EXC_BAD_ACCESS or SIGSEGV) are the sort of thing which would crash amuled if gdb had allowed the signal to pass. Others (e.g. SIGPIPE) are benign and amuled can cope with them if gdb allows them to pass. The way to tell gdb to let this benign signal pass through to the program is to issue this command to gdb before running the program:
handle SIGPIPE nostop noprint pass
Since you'll want gdb to execute this command every time you use it, you can put it into the file ~/.gdbinit and it will be executed automatically whenever you run gdb.
(Just so you know, if you had told gdb to continue after getting these "crashes", you would have seen that amuled would happily go about its business.)
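To illustrate why SIGPIPE is benign when a program is prepared for it, here is a minimal C++ sketch. It is not aMule or wxWidgets code, and the function name errno_after_write_to_closed_pipe is made up for the example. It assumes a POSIX system: with SIGPIPE ignored, a write to a pipe with no reader fails with EPIPE instead of killing the process. (The old_handler local in the GSocket::Send_Stream frame of the backtrace suggests wx changes the SIGPIPE handler around its send call, but that is only an inference from the backtrace.)

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper: writes one byte to a pipe whose read end is already
// closed and returns the errno reported by write(). With SIGPIPE ignored the
// process survives and sees EPIPE; with the default disposition the signal
// would terminate it.
int errno_after_write_to_closed_pipe() {
    // Ignore SIGPIPE for this demonstration and remember the old handler.
    void (*old_handler)(int) = std::signal(SIGPIPE, SIG_IGN);

    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    (void)rc;
    close(fds[0]);  // nobody will ever read from this pipe

    const char byte = 'x';
    errno = 0;
    ssize_t written = write(fds[1], &byte, 1);  // fails instead of killing us
    int saved_errno = (written == -1) ? errno : 0;

    close(fds[1]);
    std::signal(SIGPIPE, old_handler);  // restore the previous disposition
    return saved_errno;
}
```

A debugger still sees the signal before the program's own handling applies, which is why gdb stops on it unless told otherwise with the handle command above.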
-
I have another suggestion to try, as well. I'm pretty sure from your previous backtraces that the crash is the result of an object being freed and then being used after it is gone. Here's an additional command to issue to gdb before running amuled:
set env MallocScribble 1
This will cause freed memory to be overwritten with 0x55555555, so that if something continues to use it after it's freed, it will likely fail immediately (as opposed to stumbling along seemingly successfully for a while out of sheer dumb luck).
Unlike my previous post, this command is not something that should go into ~/.gdbinit. Just manually enter it each time you run amuled from gdb, until we figure out the cause of your crashes.
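As a rough picture of what that 0x55 fill looks like from the program's side, here is a small C++ sketch. It only imitates the bit pattern; the real scribbling happens inside the system allocator, and the helper names scribble and read_scribbled_word are made up for this example. The point is that a 32-bit field read from scribbled memory shows up as 0x55555555, which gdb prints as 1431655765 for an integer, so that number in a dump is a hint that the memory was freed at some point (it can also be an uninitialised member sitting in recycled memory).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for what MallocScribble does to a freed block:
// overwrite every byte with 0x55.
void scribble(void* block, std::size_t size) {
    assert(block != nullptr);
    std::memset(block, 0x55, size);
}

// Reads a 32-bit value out of a scribbled buffer, the way stale code
// holding a dangling pointer might read an integer member.
std::uint32_t read_scribbled_word() {
    unsigned char storage[sizeof(std::uint32_t)];
    scribble(storage, sizeof storage);
    std::uint32_t word;
    std::memcpy(&word, storage, sizeof word);
    return word;
}
```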
-
Okay, seems like I forgot to create the .gdbinit file when I changed computers...
Now here comes a new crash with the lines from the wiki and the one above in the .gdbinit.
For the last hint, I used the "set env MallocScribble 1" command from within gdb, after starting it with the "gdb /Path/to/amuled" command. I hope this was correct.
Now here we go with the crash:
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 1448 thread 0x3003]
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
bt
(gdb) bt
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
#1 0x02215328 in wxSocketClient::WaitOnConnect(long, long) (this=0x2760ed0, seconds=0, milliseconds=0) at ../src/common/socket.cpp:1252
#2 0x00011d60 in CSocketGlobalThread::Entry() (this=0x273e0b8) at ListenSocket.cpp:2559
#3 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x273e0b8) at ../src/mac/carbon/thread.cpp:1042
#4 0x902c6d88 in PrivateMPEntryPoint ()
#5 0x900246e8 in _pthread_body ()
bt full
bt full
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
result = Cannot access memory at address 0x40
-
And another one:
Crash:
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 1473 thread 0x3103]
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
bt:
(gdb) bt
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
#1 0x02215328 in wxSocketClient::WaitOnConnect(long, long) (this=0x275ae40, seconds=0, milliseconds=0) at ../src/common/socket.cpp:1252
#2 0x00011d60 in CSocketGlobalThread::Entry() (this=0x27407b8) at ListenSocket.cpp:2559
#3 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x27407b8) at ../src/mac/carbon/thread.cpp:1042
#4 0x902c6d88 in PrivateMPEntryPoint ()
#5 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
result = Cannot access memory at address 0x40
Seems to be the same...
-
How many max connections do you have set in prefs?
Also, at least under Linux there is a bug in the UDP socket, so you could try disabling it in prefs and see if it runs better then.
With max connections < 1024 and the UDP socket disabled, it ran fine here for about 22h until I shut it down.
So maybe you can try that.
stefanero
-
Well, HipHopPunkSuperStar, thanks to your persistence (Thanks!), we now have complete information on where this crash is happening, but I'm afraid I still don't know why. Hopefully, one of the other developers will have an insight.
If stefanero's suggestions help, let us know. Of course, we would like to figure out the reason for this crash.
You're having this problem with amuled. Does amule itself (aMule.app) work for you?
Also, do you have a dual-processor Mac? I don't know if it matters, just stabbing in the dark.
-
I'll try stefanero's tip tomorrow and let you know (hopefully by then bittorrent will be done and I'll have my PowerBook running Debian; right now I can't check because bt and the Debian installation are taking up all my bandwidth...)
aMule.app works for me, but it's SLOW (I know about the "limitations" of the ED2K network, the credits and stuff). It seemed to me that Mac users especially are having this slowness problem, but no one had tried amuled, so I thought maybe I could solve the problem using amuled. It actually *seems* to be faster, although I can't confirm it because it crashes after 30 minutes at most. While it's running it *seems* to download faster than aMule.app; at least it stays above 1.8 K for more than a couple of seconds, which never happens with aMule.app ...
As for the system I'm running, I probably should have told you before: the first post I made to this thread, with its crashes and backtraces, was from a G4 TiBook 667MHz; all the others are from my new 2x2GHz G5 :-)
(As I said before, I'm currently installing Debian on my TiBook, so I might be able to give you some info on the differences of aMule & amuled between Linux & Mac as well. I would only need someone to tell me where this eMule file, which I suppose stores the credits, is located on my Mac, as I still haven't found it to copy it between the two systems.)
-
Originally posted by HipHopPunkSuperStar
Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand.
[Switching to process 7676 thread 0x2ddf]
0x00001994 in ?? ()
Oh-oh, I would say that either this is a corrupt executable or a failing RAM module. :(
If the problem is just with aMule, I would stick to the first option.
Originally posted by HipHopPunkSuperStar
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 1448 thread 0x3003]
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
(gdb) bt full
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
result = Cannot access memory at address 0x40
Now, this one seems consistent and reproducible, could be a wxBUG. ;) But I don't think so. Looks like m_socket == 0x40, and that indicates either memory corruption or a previously deleted socket. More probable is memory corruption, because since the times of RSB (random socket bug) we always set the socket to NULL after deletion, so we would probably not get to this point of the code. Either way, we won't find the error on these lines.
maybe try:
(gdb) frame 1
(gdb) p *this
So that we can have a look at this :P
I think it is strange that this has not been reported on other platforms.
Any new information, please report. Thanks for your help!
-
In amuled, the limit on the number of connections must be set to 1000. If not, segfaults in the wx socket code are guaranteed. We should include this in preferences.
-
phoenix, I agree, it presents very much as memory corruption, except that it's remarkably consistent in where it crashes.
Actually, now that I think about it, I bet stefanero and lfroen are correct. HipHop may have a m_socket->m_fd greater than FD_SETSIZE (1024). So, the call to Select uses FD_SET which is writing out of bounds, overwriting the stack and garbling the backtrace, as seen. If it's not the FD_SET, then the select(m_fd + 1, &readfds, &writefds, &exceptfds, &tv) will memset readfds, writefds, and exceptfds out to the location where m_fd would be if they were large enough.
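The out-of-bounds FD_SET theory can be made concrete with a small guard. This is only a sketch in C++ under POSIX assumptions, not the wx or aMule code; fits_in_fd_set and wait_readable are hypothetical names. It refuses any descriptor that does not fit in an fd_set instead of letting FD_SET write past the end of the structure. On a system where FD_SETSIZE is 1024, a descriptor such as 1345 would be rejected.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// True if fd can be stored in an fd_set without writing out of bounds.
bool fits_in_fd_set(int fd) {
    return fd >= 0 && fd < FD_SETSIZE;
}

// Hypothetical guarded wait: returns 1 if fd is readable, 0 on timeout,
// and -1 on error or if fd is out of range for select().
int wait_readable(int fd, long seconds, long microseconds) {
    assert(seconds >= 0 && microseconds >= 0);
    if (!fits_in_fd_set(fd)) {
        return -1;  // refuse rather than let FD_SET corrupt memory
    }
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = microseconds;

    int ready = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    if (ready < 0) {
        return -1;
    }
    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

A check like this turns silent memory corruption into an ordinary error return, at the cost of not being able to wait on high-numbered descriptors at all.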
HipHop, doing what phoenix suggested (frame 1, print *this) when amuled crashes will confirm this theory. (By the way, I'm envious of your G5. :) )
The two places that aMule data is stored on Mac OS X are ~/.aMule directory (just like other platforms) and "~/Library/Preferences/eMule Preferences" which is the same as ~/.eMule file for other platforms. However, this file ("eMule Preferences" AKA .eMule) doesn't store your credits. Your credits are stored on the computers of the clients with whom you've earned them. To "redeem" them, you need to keep your userhash which is in your .aMule directory (preferences.dat and cryptkey.dat). The credits that others have earned with you are in the clients.met, also in .aMule. Anyway, copying ~/.aMule and the prefs file to another system is how you move an aMule configuration. However, don't run two aMule clients with the same userhash simultaneously. I think you can get banned that way.
-
Originally posted by ken
phoenix, I agree, it presents very much as memory corruption, except that it's remarkably consistent in where it crashes.
Yeah, I agree too.
Originally posted by ken
Actually, now that I think about it, I bet stefanero and lfroen are correct. HipHop may have a m_socket->m_fd greater than FD_SETSIZE (1024). So, the call to Select uses FD_SET which is writing out of bounds, overwriting the stack and garbling the backtrace, as seen. If it's not the FD_SET, then the select(m_fd + 1, &readfds, &writefds, &exceptfds, &tv) will memset readfds, writefds, and exceptfds out to the location where m_fd would be if they were large enough.
Then we need to do:
(gdb) p *m_socket
The message about 0x40 is from gdb, my guess is that m_socket is 0x40, but who knows... If that is the case, then it is most likely memory corruption.
Cheers!
-
Okay, I disabled the UDP port; Max Connections is set to 500 (I think I never touched this...)
Anyway, I got two crashes pretty soon after launching amuled, so here we go:
Crash
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 531 thread 0x2e03]
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
bt
(gdb) bt
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
#1 0x02214084 in wxSocketBase::WaitForLost(long, long) (this=0x27bc9b0, seconds=0, milliseconds=0) at ../src/common/socket.cpp:784
#2 0x00011db8 in CSocketGlobalThread::Entry() (this=0x2741b38) at ListenSocket.cpp:2567
#3 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x2741b38) at ../src/mac/carbon/thread.cpp:1042
#4 0x902c6d88 in PrivateMPEntryPoint ()
#5 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
result = Cannot access memory at address 0x40
frame 1
(gdb) frame 1
#1 0x02214084 in wxSocketBase::WaitForLost(long, long) (this=0x27bc9b0, seconds=0, milliseconds=0) at ../src/common/socket.cpp:784
784 return _Wait(seconds, milliseconds, GSOCK_LOST_FLAG);
p *this
(gdb) p *this
$1 = {
= {
_vptr$wxObject = 0x26fa98,
static ms_classInfo = {
m_className = 0x2b1a2c4 "wxObject",
m_objectSize = 8,
m_objectConstructor = 0,
m_baseInfo1 = 0x0,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a98,
static sm_classTable = 0x240c9d0
},
m_refData = 0x0
},
members of wxSocketBase:
static ms_classInfo = {
m_className = 0x2222e1c "wxSocketBase",
m_objectSize = 120,
m_objectConstructor = 0,
m_baseInfo1 = 0x2b93acc,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x223a18c,
static sm_classTable = 0x240c9d0
},
m_socket = 0x27505d0,
m_type = wxSOCKET_BASE,
m_flags = 1,
m_connected = true,
m_establishing = false,
m_reading = false,
m_writing = false,
m_error = false,
m_lasterror = 1431655765,
m_lcount = 0,
m_timeout = 600,
m_states = {
= {
= {
= {
_vptr$wxObject = 0x2b88380,
static ms_classInfo = {
m_className = 0x2b1a2c4 "wxObject",
m_objectSize = 8,
m_objectConstructor = 0,
m_baseInfo1 = 0x0,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a98,
static sm_classTable = 0x240c9d0
},
m_refData = 0x0
},
members of wxListBase:
m_count = 0,
m_destroy = false,
m_nodeFirst = 0x0,
m_nodeLast = 0x0,
m_keyType = wxKEY_NONE
}, },
members of wxList:
static ms_classInfo = {
m_className = 0x2b19d9c "wxList",
m_objectSize = 28,
m_objectConstructor = 0x2abe254 ,
m_baseInfo1 = 0x2b93acc,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a20,
static sm_classTable = 0x240c9d0
}
},
m_interrupt = false,
m_beingDeleted = false,
m_unread = 0x0,
m_unrd_size = 0,
m_unrd_cur = 0,
m_id = -1,
m_handler = 0x0,
m_clientData = 0x0,
m_notify = false,
m_eventmask = 0,
static m_countInit = 1
}
p *m_socket
(gdb) p *m_socket
$2 = {
_vptr$GSocket = 0x2239660,
m_ok = true,
m_fd = 1344,
m_local = 0x0,
m_peer = 0x27aa5c0,
m_error = GSOCK_NOERROR,
m_non_blocking = false,
m_server = false,
m_stream = true,
m_establishing = false,
m_reusable = false,
m_timeout = 0,
m_detected = 8,
m_cbacks = {0x2214408 , 0x2214408 , 0x2214408 , 0x2214408 },
m_data = {0x27bc9b0 "", 0x27bc9b0 "", 0x27bc9b0 "", 0x27bc9b0 ""},
m_gui_dependent = 0x0
}
And another crash, although to me the crash looks pretty much the same, but what do I know about this ?(
crash
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 557 thread 0x2f03]
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
bt
(gdb) bt
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
#1 0x02215328 in wxSocketClient::WaitOnConnect(long, long) (this=0x27591e0, seconds=0, milliseconds=0) at ../src/common/socket.cpp:1252
#2 0x00011d60 in CSocketGlobalThread::Entry() (this=0x2741c38) at ListenSocket.cpp:2559
#3 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x2741c38) at ../src/mac/carbon/thread.cpp:1042
#4 0x902c6d88 in PrivateMPEntryPoint ()
#5 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
result = Cannot access memory at address 0x40
frame 1
(gdb) frame 1
#1 0x02215328 in wxSocketClient::WaitOnConnect(long, long) (this=0x27591e0, seconds=0, milliseconds=0) at ../src/common/socket.cpp:1252
1252 return _Wait(seconds, milliseconds, GSOCK_CONNECTION_FLAG |
p *this
(gdb) p *this
$1 = {
= {
= {
_vptr$wxObject = 0x26fa98,
static ms_classInfo = {
m_className = 0x2b1a2c4 "wxObject",
m_objectSize = 8,
m_objectConstructor = 0,
m_baseInfo1 = 0x0,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a98,
static sm_classTable = 0x240c9d0
},
m_refData = 0x0
},
members of wxSocketBase:
static ms_classInfo = {
m_className = 0x2222e1c "wxSocketBase",
m_objectSize = 120,
m_objectConstructor = 0,
m_baseInfo1 = 0x2b93acc,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x223a18c,
static sm_classTable = 0x240c9d0
},
m_socket = 0x27883a0,
m_type = wxSOCKET_CLIENT,
m_flags = 1,
m_connected = false,
m_establishing = true,
m_reading = false,
m_writing = false,
m_error = false,
m_lasterror = 1431655765,
m_lcount = 0,
m_timeout = 600,
m_states = {
= {
= {
= {
_vptr$wxObject = 0x2b88380,
static ms_classInfo = {
m_className = 0x2b1a2c4 "wxObject",
m_objectSize = 8,
m_objectConstructor = 0,
m_baseInfo1 = 0x0,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a98,
static sm_classTable = 0x240c9d0
},
m_refData = 0x0
},
members of wxListBase:
m_count = 0,
m_destroy = false,
m_nodeFirst = 0x0,
m_nodeLast = 0x0,
m_keyType = wxKEY_NONE
}, },
members of wxList:
static ms_classInfo = {
m_className = 0x2b19d9c "wxList",
m_objectSize = 28,
m_objectConstructor = 0x2abe254 ,
m_baseInfo1 = 0x2b93acc,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x2b93a20,
static sm_classTable = 0x240c9d0
}
},
m_interrupt = false,
m_beingDeleted = false,
m_unread = 0x0,
m_unrd_size = 0,
m_unrd_cur = 0,
m_id = -1,
m_handler = 0x0,
m_clientData = 0x0,
m_notify = false,
m_eventmask = 0,
static m_countInit = 1
},
members of wxSocketClient:
static ms_classInfo = {
m_className = 0x2222e3c "wxSocketClient",
m_objectSize = 120,
m_objectConstructor = 0,
m_baseInfo1 = 0x223a1a4,
m_baseInfo2 = 0x0,
static sm_first = 0x223a264,
m_next = 0x223a1bc,
static sm_classTable = 0x240c9d0
}
}
p *m_socket
(gdb) p *m_socket
$2 = {
_vptr$GSocket = 0x2239660,
m_ok = true,
m_fd = 1345,
m_local = 0x0,
m_peer = 0x27589e0,
m_error = GSOCK_WOULDBLOCK,
m_non_blocking = false,
m_server = false,
m_stream = true,
m_establishing = false,
m_reusable = false,
m_timeout = 0,
m_detected = 8,
m_cbacks = {0x2214408 , 0x2214408 , 0x2214408 , 0x2214408 },
m_data = {0x27591e0 "", 0x27591e0 "", 0x27591e0 "", 0x27591e0 ""},
m_gui_dependent = 0x0
}
Hope this helps you somehow...
-
(gdb) p *m_socket
$2 = {
_vptr$GSocket = 0x2239660,
m_ok = true,
m_fd = 1345, <<<------*****
m_local = 0x0,
m_peer = 0x27589e0,
m_error = GSOCK_WOULDBLOCK,
m_non_blocking = false,
m_server = false,
m_stream = true,
m_establishing = false,
m_reusable = false,
m_timeout = 0,
m_detected = 8,
m_cbacks = {0x2214408 , 0x2214408 , 0x2214408 , 0x2214408 },
m_data = {0x27591e0 "", 0x27591e0 "", 0x27591e0 "", 0x27591e0 ""},
m_gui_dependent = 0x0
}
Ken, looks like you were right. m_fd is bigger than 1000. Who is the guilty one? I can't look at wx code now, will do that at home.
-
Who is the guilty one?
The guilty one is amule (assuming the socket limit is set correctly), which doesn't close sockets in time. As a result, amule creates more than 1024 fds while only some of them are in use. So when a socket with fd > 1024 enters one of the Wait functions: segfault.
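To illustrate the failure mode described here: an fd_set is a fixed-size bitmap with FD_SETSIZE bits, so FD_SET on a descriptor such as 1345 touches memory outside of it. What follows is only a minimal sketch of a range check before select(), not the actual wxWidgets or aMule code. The helper names fd_fits_in_fd_set and poll_readable are invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Returns 1 if fd can legally be stored in an fd_set, 0 otherwise. */
int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/*
 * Hypothetical helper: poll one descriptor for readability without blocking.
 * Returns 1 if readable, 0 if not, and -1 on error. A descriptor that does
 * not fit in an fd_set is rejected with errno = EBADF instead of being
 * passed to FD_SET, which would write out of bounds.
 */
int poll_readable(int fd)
{
    fd_set readfds;
    struct timeval tv = { 0, 0 };   /* zero timeout: return immediately */
    int n;

    if (!fd_fits_in_fd_set(fd)) {
        errno = EBADF;
        return -1;
    }
    assert(fd < FD_SETSIZE);        /* FD_SET below stays inside the bitmap */

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

With a check like this, a socket whose descriptor is 1345 would produce an error return rather than a crash. It would not fix the underlying question of why the descriptor numbers get that high.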
-
1345! 8o
Selected quotes from man select:
FD_SETSIZE [...] is normally at least equal to the maximum number of descriptors supported by the system
BUGS
The default size FD_SETSIZE (currently 1024) is somewhat smaller than the current kernel limit to the number of open files. However, in order to accommodate programs which might potentially use a larger number of open files with select, it is possible to increase this size within a program by providing a larger definition of FD_SETSIZE before the inclusion of <sys/types.h>.
Supposedly, ulimit -n (under bash) or limit descriptors (under tcsh) displays the maximum number of open file descriptors. I get 256. However, these may only apply to the shell and the programs it starts; they may not apply to programs started from the Finder or from gdb.
On the other hand, I can do this:
(gdb) print (int) getdtablesize() # get descriptor table size
$1 = 10240
This seems to correspond to:
% sysctl kern.maxfilesperproc
kern.maxfilesperproc = 10240
Here's another line of inquiry:
(gdb) call (rlimit*)malloc(sizeof(rlimit))
$1 = (rlimit *) 0x2b0e100
(gdb) call (int) getrlimit(8, $1) # RLIMIT_NOFILE == 8
$2 = 0
(gdb) print *$1
$3 = {
rlim_cur = 10240,
rlim_max = 10240
}
(gdb) call (void)free($1)
I'm not entirely sure whose responsibility it is to avoid this problem. I suppose wxWidgets could do:
struct rlimit foo;
getrlimit(RLIMIT_NOFILE, &foo);
if (foo.rlim_max > FD_SETSIZE)
{
    foo.rlim_cur = FD_SETSIZE;
    foo.rlim_max = FD_SETSIZE;
    setrlimit(RLIMIT_NOFILE, &foo);
}
at initialization to make sure that socket descriptors don't exceed FD_SETSIZE.
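For anyone who wants to try the idea outside of gdb, here is a self-contained sketch of the same approach with error checking added. It is an illustration only: the function name cap_open_files_to_fd_setsize is made up, and this variant lowers only the soft limit (rlim_cur), which is a deliberate difference from the snippet above, because an unprivileged process cannot raise the hard limit again once it has been lowered.

```c
#include <assert.h>
#include <sys/resource.h>
#include <sys/select.h>

/*
 * Hypothetical helper: make sure the process cannot be handed a descriptor
 * numbered FD_SETSIZE or higher by lowering the soft RLIMIT_NOFILE.
 * Returns 0 on success (including when nothing needed changing), -1 on error.
 */
int cap_open_files_to_fd_setsize(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    /* RLIM_INFINITY also compares greater than FD_SETSIZE here. */
    if (lim.rlim_cur > (rlim_t)FD_SETSIZE) {
        lim.rlim_cur = (rlim_t)FD_SETSIZE;   /* hard limit left untouched */
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
            return -1;
    }

    assert(lim.rlim_cur <= (rlim_t)FD_SETSIZE);
    return 0;
}
```

It would have to be called early, before any sockets are created. Descriptors that are already open above the new limit stay open, so calling it late does not help.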
HipHop, here's something for you to test:
$ gdb /Path/to/amuled
(gdb) break main
(gdb) run
(gdb) call (rlimit*)malloc(sizeof(rlimit))
(gdb) call (int) getrlimit(8, $1) # RLIMIT_NOFILE == 8
(gdb) print *$1 # I'm curious if the output from this matches what I got
(gdb) set *$1 = { 1024, 1024 }
(gdb) call (int) setrlimit(8, $1)
(gdb) call (void) free($1)
(gdb) cont
If I'm right, then amuled will run just fine without crashing. Of course, this is a test by absence of evidence, but your crashes are so reliable that I think we can consider the mystery solved if amuled runs a good long while.
-
Just saw lfroen's post. (I really should learn not to multi-task while I'm composing posts. :) )
To see if amuled is failing to close connections, you might try:
$ netstat -nptcp
from another Terminal window while amuled is running. To count the number of connections, use:
$ netstat -nptcp | wc -l
You can do these repeatedly over time to track how many connections are in use.
-
Originally posted by ken
HipHop, here's something for you to test:
$ gdb /Path/to/amuled
(gdb) break main
(gdb) run
(gdb) call (rlimit*)malloc(sizeof(rlimit))
(gdb) call (int) getrlimit(8, $1) # RLIMIT_NOFILE == 8
(gdb) print *$1 # I'm curious if the output from this matches what I got
(gdb) set *$1 = { 1024, 1024 }
(gdb) call (int) setrlimit(8, $1)
(gdb) call (void) free($1)
(gdb) cont
If I'm right, then amuled will run just fine without crashing. Of course, this is a test by absence of evidence, but your crashes are so reliable that I think we can consider the mystery solved if amuled runs a good long while.
Well, let's see if I got this right, everything after
(gdb) run
comes after the crash, right?
so here we go:
(gdb) break main
Breakpoint 1 at 0xa957c: file ../include/wx/mac/carbon/private.h, line 199.
(gdb) run
...
Crash
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 1147 thread 0x2d43]
0x022186ec in GSocket::Read(char*, int) (this=0x0, buffer=0x0, size=0) at ../src/unix/gsocket.cpp:821
821 if (m_stream)
(gdb) call (rlimit*)malloc(sizeof(rlimit))
CClientReqSocketHandler: thread 59447296 for 0x2744370 exited
$1 = (rlimit *) 0x27a9500
(gdb) call (int) getrlimit(8, $1)
$2 = 0
(gdb) print *$1
$3 = {
rlim_cur = 10240,
rlim_max = 10240
}
(gdb) set *$1 = { 1024, 1024 } #didn't give any output... (?)
(gdb) call (int) setrlimit(8, $1)
$4 = 0
(gdb) call (void) free($1) # didn't give any output... (?)
(gdb) cont
Continuing.
Program received signal EXC_BAD_ACCESS, Could not access memory.
0x022186ec in GSocket::Read(char*, int) (this=0x0, buffer=0x0, size=0) at ../src/unix/gsocket.cpp:821
821 if (m_stream)
And for the sake of completeness:
bt
(gdb) bt
#0 0x022186ec in GSocket::Read(char*, int) (this=0x0, buffer=0x0, size=0) at ../src/unix/gsocket.cpp:821
#1 0x022132d0 in wxSocketBase::_Read(void*, unsigned) (this=0x27a7040, buffer=0xf0407c58, nbytes=1) at ../src/common/socket.cpp:365
#2 0x022130d4 in wxSocketBase::Read(void*, unsigned) (this=0x27a7040, buffer=0xf0407c58, nbytes=1) at ../src/common/socket.cpp:308
#3 0x00013770 in ECSocket::ReadBuffer(wxSocketBase*, void*, unsigned) (this=0x240e930, sock=0x27a7040, buffer=0xf0407c58, len=1) at ECSocket.cpp:136
#4 0x00013628 in ECSocket::ReadNumber(wxSocketBase*, void*, unsigned) (this=0x240e930, sock=0x27a7040, buffer=0xf0407c58, len=1) at ECSocket.cpp:92
#5 0x00013b14 in ECSocket::ReadFlags(wxSocketBase*) (this=0x240e930, sock=0x27a7040) at ECSocket.cpp:212
#6 0x00013bb8 in ECSocket::ReadPacket(wxSocketBase*) (this=0x240e930, sock=0x27a7040) at ECSocket.cpp:234
#7 0x0004fb4c in ExternalConnClientThread::Entry() (this=0x2795550) at ExternalConn.cpp:1755
#8 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x2795550) at ../src/mac/carbon/thread.cpp:1042
#9 0x902c6d88 in PrivateMPEntryPoint ()
#10 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x022186ec in GSocket::Read(char*, int) (this=0x0, buffer=0x0, size=0) at ../src/unix/gsocket.cpp:821
ret = 0
#1 0x022132d0 in wxSocketBase::_Read(void*, unsigned) (this=0x27a7040, buffer=0xf0407c58, nbytes=1) at ../src/common/socket.cpp:365
more = true
total = 0
ret = 1
#2 0x022130d4 in wxSocketBase::Read(void*, unsigned) (this=0x27a7040, buffer=0xf0407c58, nbytes=1) at ../src/common/socket.cpp:308
No locals.
#3 0x00013770 in ECSocket::ReadBuffer(wxSocketBase*, void*, unsigned) (this=0x240e930, sock=0x27a7040, buffer=0xf0407c58, len=1) at ECSocket.cpp:136
msgRemain = 1
LastIO = 0
iobuf = 0xf0407c58 ""
error = false
#4 0x00013628 in ECSocket::ReadNumber(wxSocketBase*, void*, unsigned) (this=0x240e930, sock=0x27a7040, buffer=0xf0407c58, len=1) at ECSocket.cpp:92
No locals.
#5 0x00013b14 in ECSocket::ReadFlags(wxSocketBase*) (this=0x240e930, sock=0x27a7040) at ECSocket.cpp:212
i = 0
flags = 0
b = 0 '\0'
#6 0x00013bb8 in ECSocket::ReadPacket(wxSocketBase*) (this=0x240e930, sock=0x27a7040) at ECSocket.cpp:234
flags = 1819239275
p = (CECPacket *) 0x274ea20
#7 0x0004fb4c in ExternalConnClientThread::Entry() (this=0x2795550) at ExternalConn.cpp:1755
request = (CECPacket *) 0xf0407d70
response = (CECPacket *) 0x0
#8 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x2795550) at ../src/mac/carbon/thread.cpp:1042
thread = (wxThread *) 0x2795550
pthread = (wxThreadInternal *) 0x27966c0
dontRunAtAll = false
#9 0x902c6d88 in PrivateMPEntryPoint ()
No symbol table info available.
#10 0x900246e8 in _pthread_body ()
No symbol table info available.
I didn't check the netstat part, though, because I had BitTorrent running at the same time. I'll check on that later...
-
Question:
When I try again now,
(gdb) break
gives me:
No default breakpoint address now.
What's that about?
Should I keep running amuled with all the gdb stuff anyway or do I have to do something else to get back to "normal"?
-
"cont"
-
Crash
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 942 thread 0x2f03]
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
call (rlimit*)malloc(sizeof(rlimit))
(gdb) call (rlimit*)malloc(sizeof(rlimit))
$1 = (rlimit *) 0x2759ee0
call (int) getrlimit(8, $1)
(gdb) call (int) getrlimit(8, $1)
$2 = 0
print *$1
(gdb) print *$1
$3 = {
rlim_cur = 10240,
rlim_max = 10240
}
set *$1 = { 1024, 1024 }
call (int) setrlimit(8, $1)
(gdb) call (int) setrlimit(8, $1)
$4 = 0
(gdb) call (void) free($1)
cont
(gdb) cont
Continuing.
Program received signal EXC_BAD_ACCESS, Could not access memory.
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
bt
(gdb) bt
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
#1 0x02214084 in wxSocketBase::WaitForLost(long, long) (this=0x27cc030, seconds=0, milliseconds=0) at ../src/common/socket.cpp:784
#2 0x00011db8 in CSocketGlobalThread::Entry() (this=0x2753d58) at ListenSocket.cpp:2567
#3 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x2753d58) at ../src/mac/carbon/thread.cpp:1042
#4 0x902c6d88 in PrivateMPEntryPoint ()
#5 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
result = Cannot access memory at address 0x40
$ netstat -nptcp | wc -l
was always between 223 and 260.
... ?(
-
I'm providing a new precompiled one very soon.
-
Don't know if this is of any help, but checking:
$ netstat -nptcp | wc -l
on aMule a couple of times only gives me values between 120 and 150, which is about 100 less than the always-crashing amuled... ?(
-
Originally posted by HipHopPunkSuperStar
Well, let's see if I got this right, everything after
(gdb) run
comes after the crash, right?
Well, not quite. My intent was for the "break main" to cause gdb to stop the program almost immediately after it began so that we could change a setting which controls how it will run (disallow more than 1024 open file descriptors). ("break main" sets a breakpoint on function "main", which is normally the entry point of a program. gdb should stop at the breakpoint and let us issue some gdb commands. We can tell gdb to let the program proceed with the "cont" command.)
Unfortunately, this didn't work quite as I intended...
(gdb) break main
Breakpoint 1 at 0xa957c: file ../include/wx/mac/carbon/private.h, line 199.
This isn't where I expected that breakpoint to be set. Apparently, since amuled ran until a crash instead of hitting the breakpoint, it isn't the right breakpoint to stop the program just as it is starting. Try, in addition to "break main", "break wxEntry" before the run. When gdb stops at one of these breakpoints the first time, issue the rest of the commands. If it also stops at the other breakpoint, just use "cont" to continue -- no need to issue all those commands twice in one session. If it runs until a crash, then don't bother issuing the rest of the commands -- by that point, it's too late for them to be useful or interesting.
Unfortunately, this means that the experiment you performed didn't test what I was hoping to test. Sorry for the hassle.
In the meantime, I had a thought. The socket file descriptors could have very high numbers even if you don't have very many of them open if you have a lot of disk file descriptors open as well. So, that leads to some questions:
- How many part.met files do you have? Do "ls ~/.aMule/Temp/*.part.met | wc -l" to find out.
- What is the highest-numbered part.met file you have? Do "(cd ~/.aMule/Temp/; ls *.part.met | tail -1)" to find out.
- How many files are you sharing? This is displayed in aMule, in the Shared Files Window, above the list.
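The reasoning behind these questions is that the kernel hands out the lowest unused descriptor number, whether the descriptor is for a disk file or a socket, so every open file pushes later socket numbers upward. The small sketch below demonstrates this under some assumptions: /dev/null stands in for part files and shared files, and the function name socket_fd_after_opening_files is invented for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_HELD 64

/*
 * Hypothetical demo: open n_files descriptors on /dev/null, then create a
 * TCP socket and report which descriptor number it was given. Everything
 * is closed again before returning. Returns -1 on any failure.
 */
int socket_fd_after_opening_files(int n_files)
{
    int held[MAX_HELD];
    int count = 0;
    int sock_no = -1;

    if (n_files < 0 || n_files > MAX_HELD)
        return -1;

    for (; count < n_files; count++) {
        held[count] = open("/dev/null", O_RDONLY);
        if (held[count] < 0)
            break;                      /* count = number opened so far */
    }
    assert(count <= MAX_HELD);

    if (count == n_files) {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (s >= 0) {
            sock_no = s;                /* remember the number, then release */
            close(s);
        }
    }

    while (count > 0)
        close(held[--count]);
    return sock_no;
}
```

Calling it with 0 and then with 20 should show the socket's descriptor number rising by at least 20 in the second case, even though only one socket is open at a time.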
Another thing to try is the "lsof" (list open files) command on amuled while it is running. To use it, first find out the process ID (pid) of amuled:
ps -xcopid,command | grep amuled
The number in the left-hand column of the output is amuled's pid. Then run:
lsof -p amuled_pid >amuled_lsof.txt
to list the files that amuled has open and write the result into a new file called "amuled_lsof.txt". If there's anything private in there, you might want to edit it out. Then make a posting here and attach the file.
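If lsof is inconvenient, roughly the same numbers could be gathered from inside the process. The sketch below is not part of aMule; scan_open_fds and struct fd_usage are hypothetical names. It probes each descriptor number with fcntl(F_GETFD) and reports both how many are open and the highest number in use, since it is the highest number, not the count, that matters for FD_SETSIZE.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

struct fd_usage {
    int open_count;   /* how many descriptors are currently open */
    int highest_fd;   /* largest open descriptor number, -1 if none */
};

/* Hypothetical helper: scan the descriptor table of the calling process. */
struct fd_usage scan_open_fds(void)
{
    struct fd_usage usage = { 0, -1 };
    struct rlimit lim;
    long limit = 65536;   /* arbitrary cap so the scan stays cheap */
    long fd;

    /* Descriptors are normally numbered below the soft limit. */
    if (getrlimit(RLIMIT_NOFILE, &lim) == 0 && lim.rlim_cur < (rlim_t)limit)
        limit = (long)lim.rlim_cur;

    for (fd = 0; fd < limit; fd++) {
        /* F_GETFD fails with EBADF when fd is not an open descriptor. */
        if (fcntl((int)fd, F_GETFD) != -1) {
            usage.open_count++;
            usage.highest_fd = (int)fd;
        }
    }

    /* open_count distinct non-negative numbers imply this lower bound. */
    assert(usage.highest_fd >= usage.open_count - 1);
    return usage;
}
```

A highest_fd well above open_count would suggest gaps in the table; a highest_fd near 1024 with a small open_count would be worth reporting here.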
-
Okay so I understand that when gdb stops amuled I would recognize this somehow or at least it wouldn't show me the usual Crash error like this:
Program received signal EXC_BAD_ACCESS, Could not access memory.
[Switching to process 1936 thread 0x2f03]
0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
719 result = m_socket->Select(flags | GSOCK_LOST_FLAG);
(gdb)
and that it wouldn't give you much info if I ran all the commands and posted their output after it has already crashed, so I won't.
It crashes before anything happens that I would recognize as something other than a crash...
For the sake of completeness, I'm posting the bt and bt full anyway:
bt
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
#1 0x02214084 in wxSocketBase::WaitForLost(long, long) (this=0x27cc4d0, seconds=0, milliseconds=0) at ../src/common/socket.cpp:784
#2 0x00011db8 in CSocketGlobalThread::Entry() (this=0x276a0f8) at ListenSocket.cpp:2567
#3 0x02afa7f4 in wxThreadInternal::MacThreadStart(void*) (parameter=0x276a0f8) at ../src/mac/carbon/thread.cpp:1042
#4 0x902c6d88 in PrivateMPEntryPoint ()
#5 0x900246e8 in _pthread_body ()
bt full
(gdb) bt full
#0 0x02213e14 in wxSocketBase::_Wait(long, long, int) (this=???, seconds=???, milliseconds=???, flags=???) at ../src/common/socket.cpp:719
result = Cannot access memory at address 0x40
I have 7 *.part.met files, the highest being 007.part.met.
I'm sharing 35 files, all in my Incoming dir except those 7 I'm downloading right now.
There seems to be nothing private in "amuled_lsof.txt", not even my own public IP, so I can attach it here. But since it lists the IPs/DNS names of the people I'm connecting to, I would appreciate it if someone could remove it from the post once all the interested developers have it, or tell me that I can remove it, although I'm not quite sure anyone could find out anything with these IPs..
-
Hmm. I'm surprised that the breakpoint at main and wxEntry isn't working (it works here for me). From your backtraces, I see that the CSocketGlobalThread is running. I've attempted to trace through the code to see how that thread gets started, and I'm pretty certain that it gets started ultimately from CamuleApp::OnInit. So, in theory, a breakpoint there must be hit before we can get to the crash you're experiencing. (Right? [plaintive]Right?[/plaintive])
So, use this instead of (or in addition to) "break main" and "break wxEntry":
break CamuleApp::OnInit
By the way, several of the commands I asked you to run previously were just to check that things on your system were similar to mine. They really only needed to be run once for me to see the results. So, I can reduce the routine a bit, to:
$ gdb /Path/to/amuled
(gdb) break CamuleApp::OnInit
(gdb) run
#
# gdb should stop at a breakpoint here, not on a signal such as EXC_BAD_ACCESS.
#
(gdb) call (rlimit*)malloc(sizeof(rlimit))
(gdb) set *$1 = { 1024, 1024 }
(gdb) call (int) setrlimit(8, $1)
(gdb) call (void) free($1)
(gdb) cont
If you again get to a crash before stopping at this breakpoint, then we should just put the damn code directly into the program. To do that, download the attached patch file. In your aMule source directory (the one from which you "make"), issue the command:
patch -p0
and, then:
cd src
make amuled
to rebuild amuled (this sequence avoids rebuilding the graphical amule app). Run the newly built executable (./amuled) under gdb. This time, no need to issue any special commands to gdb. Just "run".
-
Okay, now the unpatched version stopped again with the EXC_BAD_ACCESS, but the patched one gave something more...
So I got this message:
Program received signal SIGTRAP, Trace/breakpoint trap.
0x90016f48 in semaphore_wait_signal_trap ()
so I issued: (gdb) call (rlimit*)malloc(sizeof(rlimit)) which gave me:
(gdb) call (rlimit*)malloc(sizeof(rlimit))
15:20:18: Debug: ClientCredits.cpp(526): assert "false" failed. [in child thread]
ClientCredits.cpp(526): assert "false" failed. [in child thread]
BaseClient.cpp(1827): assert "false" failed. [in child thread]
15:20:18: Debug: BaseClient.cpp(1827): assert "false" failed. [in child thread]
BaseClient.cpp(1827): assert "false" failed. [in child thread]
Program received signal SIGTRAP, Trace/breakpoint trap.
0x90001e00 in malloc ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (malloc) will be abandoned.
set *$1 = { 1024, 1024 }
(gdb) set *$1 = { 1024, 1024 }
History has not yet reached $1.
call (int) setrlimit(8, $1)
(gdb) call (int) setrlimit(8, $1)
History has not yet reached $1.
call (void) free($1)
again:
(gdb) call (void) free($1)
History has not yet reached $1.
and cont just gave me:
(gdb) cont
Continuing.
so it seems that there is not much of valuable Info in this...
anyhow here is the bt
(gdb) bt
#0 0x90016f48 in semaphore_wait_signal_trap ()
#1 0x9000e790 in _pthread_cond_wait ()
#2 0x90283d68 in MPEnterCriticalRegion ()
#3 0x02af9d78 in wxMutexInternal::Lock() (this=???) at ../src/mac/carbon/thread.cpp:461
#4 0x02afcbb8 in wxMutex::Lock() (this=0x240e8e0) at ../include/wx/thrimpl.cpp:44
#5 0x0012fcc8 in wxMutexLocker::wxMutexLocker(wxMutex&) (this=0xbffffba0, mutex=@0x240e8e0) at /usr/local/include/wx-2.5/wx/thread.h:181
#6 0x0012bd28 in wxMutexLocker::wxMutexLocker(wxMutex&) (this=0xbffffba0, mutex=@0x240e8e0) at /usr/local/include/wx-2.5/wx/thread.h:181
#7 0x000a98ac in CamuleDaemonApp::OnRun() (this=0x240e7b0) at amuled.cpp:154
#8 0x02ab1b14 in wxEntry(int&, char**) (argc=@0xbffffcb8, argv=0xbffffd54) at ../src/common/init.cpp:411
#9 0x000a954c in main (argc=1, argv=0xbffffd54) at amuled.cpp:128
and the bt full
(gdb) bt full
#0 0x90016f48 in semaphore_wait_signal_trap ()
No symbol table info available.
#1 0x9000e790 in _pthread_cond_wait ()
No symbol table info available.
#2 0x90283d68 in MPEnterCriticalRegion ()
No symbol table info available.
#3 0x02af9d78 in wxMutexInternal::Lock() (this=???) at ../src/mac/carbon/thread.cpp:461
err = Cannot access memory at address 0x40
No idea if this helps you in any way...
BTW: I installed Ubuntu Linux on my TiBook and am running aMule from there now, which gives me a LOT better speeds. The same files I tried downloading with aMule and amuled on Mac OS X, which came down with a max. overall speed of, say, 2kB/s, now come down at 20-30kB/s (all the other prefs are the same, I just copied my .aMule folder from one machine to the other...)
I'm willing to keep testing amuled in Mac OS X though... :))
-
I think that the SIGTRAP signal is one of those benign signals that gdb is stopping on, but which aMule would cope with if gdb passed it along. So, add this to your .gdbinit like the others:
handle SIGTRAP nostop noprint pass
Also, since you're running the patched aMule, you don't need to issue the commands for setting the new limit on file descriptors (that's what the patch does). So, if it crashes, just the backtraces are needed -- none of the other stuff (calling malloc, setting $1, calling setrlimit, calling free, or continuing).
Thanks for your willingness to help!
-
Originally posted by ken
I think that the SIGTRAP signal is one of those benign signals that gdb is stopping on, but which aMule would cope with if gdb passed it along.
Seems like your thinking was wrong:
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
(gdb) bt
No stack.
(gdb) bt full
No stack.
(gdb)
?(
What now?
-
Indeed, it does seem I was wrong. Sorry about that. :) You should remove that line from your .gdbinit. The reason I thought it was a benign (or at least, expected) signal was the presence of "semaphore_wait_signal_trap" in the backtrace, which I assumed meant the program was specifically waiting for a SIGTRAP before it would proceed.
At this point, I'm at a bit of a loss as to how to proceed. To recap: the main problem we've identified is that amuled is opening enough file descriptors so that eventually you get a socket file descriptor higher than 1024, which is the limit for some data structures and system calls to handle properly. You don't seem to have excessive numbers of files open, and your Max. Connections is within reason, so the combined total should remain under 1024. So, we don't know why amuled seems to be exceeding that. We suspect that amuled is failing to close some of the connections it has opened, but neither netstat nor lsof has so far corroborated that suspicion. We would expect each of those to produce output with about 1000 lines if amuled was opening many, many connections and failing to close them.
In the meantime, peripheral to the main problem:
It is unclear why setting breakpoints doesn't seem to work for you.
Also, the patch I provided to prevent file descriptors from exceeding 1024 seems to have had weird, unexpected effects. In particular, you are seeing SIGTRAP signals that you weren't seeing before. We don't know whether or not the patch addresses the original problem.
We also don't know why you seem to be the only person having this problem. Perhaps few Mac users are using amuled. Perhaps they tried, got this error, but didn't report it.
More generally, we don't know why amuled has this problem while aMule.app doesn't. We don't know why amuled keeps about 100 more connections open at a time than aMule.app (according to netstat). Both of these are probably related to the fact that amuled seems to download faster than aMule.app.
Lastly, you have installed aMule under Ubuntu on your PowerBook and amuled works much better there. It neither has this crashing problem nor does it exhibit the slow downloads that are apparently endemic to Mac aMule. Since this remains a big-endian platform, we can eliminate endianness issues as the cause of either of those two problems.