aMule Forum
English => en_Bugs => Topic started by: gav616 on June 07, 2009, 12:09:01 AM
-
I've been using amuled for ages now and it normally never goes above 30-50MB (under heavy load), but ATM I'm only uploading, and when I logged back into my nix box it was at nearly 350MB with only 600 on queue.
For my config, this is not normal.
-
Here only uploading, 1 file, 620 clients on queue...and about 50 MB...
...i use amule (not amuled!)...
-
gav616, are you one of the blessed ones with insane upload rates (> 1 MB/s) ?
-
nope... 30KiB here (laugh)
-
Can you try to watch memory usage through a script and match it against the logfile to see if it rises continuously, or when certain actions happen?
Did you compile with mmap or without?
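A minimal sketch of such a monitoring script, assuming a Linux box where /proc/<pid>/status exposes VmSize and VmRSS. The function names, the 60-second interval and the output file name amuled_mem.log are illustrative choices, not anything that ships with aMule. Each sample is timestamped so it can be lined up against the aMule logfile.

```python
import time
from datetime import datetime


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # The value part looks like "   103424 kB"
            fields[key] = int(value.split()[0])
    return fields


def watch(pid, interval_s=60, samples=None, out_path="amuled_mem.log"):
    """Append one timestamped line per sample.

    samples=None keeps sampling until the script is interrupted
    (or the process exits and /proc/<pid>/status disappears).
    """
    taken = 0
    while samples is None or taken < samples:
        with open(f"/proc/{pid}/status") as f:
            vm = parse_vm_fields(f.read())
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        with open(out_path, "a") as out:
            out.write(
                f"{stamp} VmSize={vm.get('VmSize')}kB VmRSS={vm.get('VmRSS')}kB\n"
            )
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_s)
```

Calling watch(pid) with the PID of the running amuled appends one line per minute until interrupted; the timestamps can then be compared with the entries in the aMule logfile.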
-
Yeah, strangely enough this is the first time I noticed and enabled 'mmap' at build time. I will disable it, as I now think this was the leak.
-
build: r9661
build parameters:
--disable-monolithic \
--enable-amule-daemon \
--disable-amulecmd \
--disable-webserver \
--enable-amule-gui \
--disable-cas \
--disable-wxcas \
--disable-ed2k \
--disable-alc \
--disable-alcc \
--disable-upnp \
--disable-xas \
--enable-geoip \
--disable-mmap \
--disable-nls \
--disable-debug \
--enable-optimize
connection: KAD only
uptime: 2 D 11 H 32 M
downloads: 7
clients on queue: 320
peak connections: 73
average connections: 16
max new connections / 5 secs: 20 (default)
file buffer size: 240000 bytes (default)
upload queue size: 5000 clients (default)
current 'amuled' usage: 101 MB ....and slowly rising...
Still seems way too big for my liking...
-
gav616,
I see that when running amuled for a long time (>1-2D), the memory usage tends to reach a steady state.
So, against what I said previously, I currently don't think there is a 'leakage'.
Absolute memory footprint is another story, and it depends on several factors: number of connections, downloads, shared files, clients, IP filter enabled or not, and so on.
Looking at your last post, I'd say that in your running conditions 100Mbyte footprint is appropriate.
ah, ok, thanks for info.
-
Can one of you please try aMule instead of amuled to see if it has the same memory usage?
And please also try amuled with optimize and without debug.
Of course memory use is supposed to rise over time (with KAD tables building), but I think amuled is using way too much memory here.
-
What about amuled 2.2.5 ? How much memory does it take ?
-
Thank you very much for making these tests for us. So it's probably nothing introduced recently.
Some other things to test (individually, just with SVN and either with or without SVN as you like):
- remove your known*.met (back it up and restore it later)
- disable Kad
-
known.met: I thought a BIG load of known AICH hashes might be part of the problem. Well, seems not to be.
I can't say what Kad does when it gets disconnected (maybe it doesn't free its internal tables). Please try to disable Kad and then restart the client and see how memory behaves then.
-
I wonder what hogs all that memory. ???
-
KAD is supposed to store data about other nodes, who has which file and which keyword results in which hash. And this data grows with time as more information is being published, reaching some maximum. No idea how much memory it's supposed to take.
I'm rather surprised how much memory it takes without Kad. I think I should do an analysis how much the different data structures take, and how big they grow, but that takes some time.
Freddy77 did a lot of memory analysis, but I can't say if he's still around.
-
Could you possibly run amuled with the Valgrind heap profiler (http://valgrind.org/docs/manual/ms-manual.html)?
-
So haven't I (except for a few test runs), but it's quite simple. Your machine should be able to run aMule without being at full load however, because Valgrind takes some extra CPU. That's probably the fastest way to get some clues about what's happening.
-
Thank you for your help, but I'm afraid it wasn't useful. :(
n1: 3631264 0x81F2D55: (within /usr/bin/amuled)
n1: 3631264 0x81FB92B: (within /usr/bin/amuled)
n2: 3631264 0x81CFCC7: (within /usr/bin/amuled)
n1: 3043028 0x80D01BF: (within /usr/bin/amuled)
n1: 3043028 0x8161A2A: (within /usr/bin/amuled)
n1: 3043028 0x80D05C7: (within /usr/bin/amuled)
You should have run amuled with debug info so valgrind can print a useful backtrace about where the memory was spent.
So could you please try it again?
-
Interesting.
9709 will save 20MB by not storing IP filter comments (unless IP filter debug messages are turned on).
The bulk are KAD strings obviously:
heap_tree=detailed
n13: 138849065 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n2: 82543656 0xCAFC93: wxStringBase::Alloc(unsigned int) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n3: 82439436 0xCB1A8D: wxStringBase::ConcatSelf(unsigned int, wchar_t const*, unsigned int) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n2: 61465104 0xCB2633: wxString::wxString(char const*, wxMBConv const&, unsigned int) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n2: 61409636 0x82AD8DF: CFileDataIO::ReadOnlyString(bool, unsigned short) const (SafeFile.cpp:254)
n3: 61409636 0x82AC6ED: CFileDataIO::ReadString(bool, unsigned char, bool) const (SafeFile.cpp:226)
n3: 31806456 0x82AD075: CFileDataIO::ReadTag(bool) const (SafeFile.cpp:434)
n1: 26515424 0x81F38DA: Kademlia::CKademliaUDPListener::Process2PublishKeyRequest(unsigned char const*, unsigned int, unsigned int, unsigned short, Kademlia::CKadUDPKey const&) (KademliaUDPListener.cpp:1588)
n1: 26515424 0x81FC3DB: Kademlia::CKademliaUDPListener::ProcessPacket(unsigned char const*, unsigned int, unsigned int, unsigned short, bool, Kademlia::CKadUDPKey const&) (KademliaUDPListener.cpp:354)
n2: 26515424 0x81D0777: Kademlia::CKademlia::ProcessPacket(unsigned char const*, unsigned int, unsigned int, unsigned short, bool, Kademlia::CKadUDPKey const&) (Kademlia.cpp:292)
n1: 25030796 0x80D023F: CClientUDPSocket::OnPacketReceived(unsigned int, unsigned short, unsigned char*, unsigned int) (ClientUDPSocket.cpp:117)
n1: 25030796 0x81624DA: CMuleUDPSocket::OnReceive(int) (MuleUDPSocket.cpp:183)
n1: 25030796 0x80D0647: CClientUDPSocket::OnReceive(int) (ClientUDPSocket.cpp:69)
n1: 25030796 0xC50F5D: wxAppConsole::HandleEvent(wxEvtHandler*, void (wxEvtHandler::*)(wxEvent&), wxEvent&) const (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 25030796 0xCF2527: wxEvtHandler::ProcessEventIfMatches(wxEventTableEntryBase const&, wxEvtHandler*, wxEvent&) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 25030796 0xCF35F2: wxEventHashTable::HandleEvent(wxEvent&, wxEvtHandler*) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 25030796 0xCF36F9: wxEvtHandler::ProcessEvent(wxEvent&) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 25030796 0xCF29D7: wxEvtHandler::ProcessPendingEvents() (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 25030796 0xC512A7: wxAppConsole::ProcessPendingEvents() (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 25030796 0x806E9E7: CamuleDaemonApp::OnRun() (amuled.cpp:661)
n1: 25030796 0xC8C188: wxEntry(int&, wchar_t**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 25030796 0xC8C385: wxEntry(int&, char**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n0: 25030796 0x806EB69: main (amuled.cpp:170)
n0: 1484628 in 1 place, below massif's threshold (01.00%)
n1: 4563328 0x81C359A: Kademlia::CIndexed::ReadFile() (Indexed.cpp:131)
n1: 4563328 0x81C394E: Kademlia::CIndexed::CIndexed() (Indexed.cpp:77)
n1: 4563328 0x81D195A: Kademlia::CKademlia::Start(Kademlia::CPrefs*) (Kademlia.cpp:117)
n1: 4563328 0x8081693: Kademlia::CKademlia::Start() (Kademlia.h:64)
n1: 4563328 0x80747E3: CamuleApp::StartKad() (amule.cpp:2245)
n1: 4563328 0x807DC84: CamuleApp::OnInit() (amule.cpp:819)
n1: 4563328 0x806D78C: CamuleDaemonApp::OnInit() (amuled.cpp:684)
n1: 4563328 0xC8C15E: wxEntry(int&, wchar_t**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n1: 4563328 0xC8C385: wxEntry(int&, char**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
n0: 4563328 0x806EB69: main (amuled.cpp:170)
n0: 727704 in 5 places, all below massif's threshold (01.00%)
n3: 29602748 0x82ACF51: CFileDataIO::ReadTag(bool) const (SafeFile.cpp:404)
n1: 22231524 0x81F38DA: Kademlia::CKademliaUDPListener::Process2PublishKeyRequest(unsigned char const*, unsigned int, unsigned int, unsigned short, Kademlia::CKadUDPKey const&) (KademliaUDPListener.cpp:1588)
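For picking the big consumers out of a dump like the one above, Valgrind's own ms_print is the usual tool. As a rough alternative, the sketch below pulls the byte counts out of the heap-tree lines and lists those that mention a given function. It assumes the "nCHILDREN: BYTES description" line shape seen above; the function names are made up for illustration. Because the tree is nested, a function that appears more than once in the same call chain would be listed repeatedly, so the results need a sanity check before adding them up.

```python
import re

# One heap-tree line: "n<children>: <bytes> <description>"
LINE_RE = re.compile(r"^\s*n(\d+): (\d+) (.*)$")


def parse_massif_lines(text):
    """Return (children, bytes, description) for every heap-tree line in text."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            entries.append((int(m.group(1)), int(m.group(2)), m.group(3)))
    return entries


def matching_sizes(entries, needle):
    """List (bytes, description) for entries whose description contains needle."""
    return [(size, desc) for _, size, desc in entries if needle in desc]


def to_mib(size_bytes):
    """Convert a byte count to MiB for easier reading."""
    return size_bytes / (1024 * 1024)
```

Run over the dump above with the needle "Process2PublishKeyRequest", this should list the 26515424-byte and 22231524-byte entries, which sit under two different ReadTag allocation sites.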
I'm afraid it simply takes that much. I don't know enough about its internals to judge that. :(
Kad disabled and 1 safe server added - oh my, I never used ed2k servers... - in this situation, after a 1-day run, memory usage doesn't rise above 130 MB.
Looks a bit much I'd say - maybe turning off KAD doesn't fully work. A massif trace of that config would be interesting too (an hour should do).
-
I had the same memory problem on amuled.
My system is an Ubuntu-9.04-Server with dev tools, all updated, amuled-2.2.5 compiled on this system. PIII, 512 MB. Amuled with kad on and ed2k on is started with the files "known.met", "known2_64.met", "key_index.dat", "src_index.dat" and "load_index.dat" deleted and a good "nodes.dat".
Checking the virtual memory size (vsize) of the process, I detected a steady rise of vsize at a rate of 9MB per hour a month ago, and 3MB/h over the last few days. The vsize grows until it reaches a stable value between 100MB and 200MB. But every 1 or 2 days there is a sudden rise of about 100MB, until the system memory is exhausted. This jump coincides with the line "Escritos 192 contactos Kad" ("Wrote 192 Kad contacts") in the logfile (sorry, my amuled speaks Spanish); that is when the files key_index.dat, src_index.dat, load_index.dat and nodes.dat are written again to the .aMule dir.
After days of working on the problem, it seemed to be in the KAD part. I saw 3 different issues:
(1) Memory leaks, (2) Big memory fragmentation and (3) Steady memory rise.
1- MEMORY LEAKS
I looked for memory leaks in the KAD code. Since memory reaches a stable value after a while, any leak would have to be in the sudden rises, when that logfile line is written. Looking at the source code, at that point KAD is restarted with StopKad (the KAD objects are deleted) and StartKad (the KAD objects are created). Searching for objects not released at StopKad, I only found that in the CIndexed class the maps m_Load_map, m_Sources_map and m_Keyword_map are not cleared inside ~CIndexed; surprisingly, m_Notes_map is cleared. I don't think this is a true memory leak (someone can check), because these maps must be cleared and deleted anyway when the CIndexed object is destroyed. As a first try, I added code to clear the maps; even if it is not a memory leak, this could reduce memory fragmentation, issue (2). Testing the new code, the sudden rises occurred less often, but the network was different at that time, so there is no final conclusion.
2- BIG MEMORY FRAGMENTATION
The size of the sudden jump is related to the size of the file key_index.dat (about 7 times its size in my tests); this is the biggest of the files stored. It looked like the memory released on StopKad is not reused on StartKad. Putting a trace based on mallinfo in the code, I saw that the freed memory is reported as free, but the newly allocated memory is not accounted for. This could be a corrupted memory heap (again, someone can check) or, more probably, because of the heavy memory fragmentation the process is working in a new arena that mallinfo does not account for.
The conclusion is that it is not normal to free and allocate this much memory (>100MB) so abruptly on a machine with few megabytes.
3- STEADY MEMORY RISE
At this point the conclusion is that most of the memory used by amuled is due to the contents of the file key_index.dat being expanded seven times or more in memory; this may be the toll of using C++ if the data is too scattered.
The contents of key_index.dat are the sources Kad has indexed during the steady memory rise. Looking inside that file I can see that a lot of sources (66% in my tests) are from a contiguous range of 16 IP addresses, and many of the sources are nonsense. For example, the same file is published with more than 500 different IDs.
I don't know if this is a bug in some new code or a network attack. I won't name these addresses because I don't know this forum's policy on attacks, but I found them inside public ipfilter.dat files.
Putting these IPs in ipfilter.dat unfortunately does nothing, because amuled does not filter IPs when Kad acquires sources; only Kad contacts are filtered (this could be addressed in a future amuled version).
Finally I put them in an iptables filter. Then, running amuled with the index files removed and only KAD enabled, after 8 hours the vsize is 64M and almost stable.
After disconnecting Kad, the size of the created key_index.dat is 1162527 bytes and the memory freed is 12M (mallinfo trace; no memory released to the system).
Counting the number of distinct IPs, IDs and filenames in key_index.dat:
New key_index.dat(size 1162527) with IP filtering:
7214 IPs, 4396 IDs, 5658 Filenames
Last key_index.dat(size 11130892) without IP filtering:
22785 IPs, 48828 IDs, 15436 Filenames
The new one can be OK. But the old one has a lot more source IDs than Filenames.
I'll check the process during some days.
-
Welcome to the forum, EloiBosc !
Now this is a very impressive first post. Thank you for your thorough analysis!
This should be looked at for sure.
The key_index is for storing the published keywords your client stores for others to search for. A file name can contain several keywords ("Ubuntu_Jaunty_Server.iso" has 3 for example), 500 is way too much of course.
Maybe a check could be added:
- sensible size of file name
- re-tokenize the file name and check if one of its words matches the published keyword id. If not, kick it (and maybe blacklist the publisher).
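The two checks above could be sketched roughly as follows. This is an illustration, not aMule's actual tokenizer: the separator set, the 3-character minimum word length, dropping the file extension and the 255-character name limit are all assumptions, and hash_fn stands in for whatever function maps a keyword to its Kad keyword id.

```python
import re

MIN_KEYWORD_LEN = 3   # assumed minimum length of an indexable word
MAX_NAME_LEN = 255    # assumed "sensible" upper bound for a file name


def tokenize(file_name):
    """Split a file name into lowercase keywords, ignoring the extension."""
    stem = file_name.rsplit(".", 1)[0] if "." in file_name else file_name
    # Treat every non-alphanumeric character (and "_") as a separator.
    words = re.split(r"[\W_]+", stem.lower())
    return [w for w in words if len(w) >= MIN_KEYWORD_LEN]


def publish_is_plausible(file_name, keyword_id, hash_fn):
    """Accept a publish only if one word of the name hashes to keyword_id.

    hash_fn is supplied by the caller (hypothetical: keyword -> keyword id).
    """
    if not file_name or len(file_name) > MAX_NAME_LEN:
        return False
    return any(hash_fn(word) == keyword_id for word in tokenize(file_name))
```

With these assumptions, "Ubuntu_Jaunty_Server.iso" tokenizes into ubuntu, jaunty and server, which matches the three keywords mentioned above. A publish that fails the check would be dropped; whether to also blacklist the publisher is a separate policy decision.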
Kad doesn't honor the IP filter by design, I think. Blocking a Kad contributor accidentally could cause great harm to the network. Validity of Kad sources is ensured in other ways I think (sigh, I don't know much about Kad yet :().
-
Well, after learning a bit about Kad I'll try to explain in more detail what is inside the strange "key_index.dat" file. I think in the previous post it was not clear.
This file has a list of search keys (keyID), and after each key a list of source IDs (sourceID); every sourceID has a list of filenames, IPs, sizes and other tags. I'm using the names from the source code, and after some checks I concluded that sourceID is the hash of the file, not the ID of a peer. In this case the sourceID must identify one and only one file.
Taking a "key_index.dat" created on 2009-07-24 without IP filtering, size 11130892, a representative sample is the file name
"03 - 24 Hours.mp3".
This name appears always under the same keyID, not bad. But it has 830 different sourceIDs and 30 different sizes. All published by 29 IPs of the same range (A.B.C.136-248).
Focusing on the IP A.B.C.243. For this filename it has published 77 sourceIDs and 29 sizes.
If I'm not wrong about the meaning of "sourceID", how can one host have the same file name for 77 different files?
I'm really confused.
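The per-IP counts quoted above can be reproduced with a simple grouping once the index has been decoded. The sketch below assumes the entries are already available as (ip, file_name, source_id, size) tuples; reading key_index.dat itself is not shown, and the function name and the threshold of 5 IDs per name are arbitrary illustrative choices.

```python
from collections import defaultdict


def suspicious_publishers(records, max_ids_per_name=5):
    """Find (ip, file_name) pairs published under many different source IDs.

    records: iterable of (ip, file_name, source_id, size) tuples
             (hypothetical, already decoded from the index).
    Returns {(ip, file_name): (distinct_source_ids, distinct_sizes)}.
    """
    ids = defaultdict(set)
    sizes = defaultdict(set)
    for ip, file_name, source_id, size in records:
        ids[(ip, file_name)].add(source_id)
        sizes[(ip, file_name)].add(size)
    return {
        key: (len(id_set), len(sizes[key]))
        for key, id_set in ids.items()
        if len(id_set) > max_ids_per_name
    }
```

An honest host sharing a file normally contributes one source ID per file name, so a pair like the one described above (77 source IDs and 29 sizes for one name from one IP) would stand out immediately.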
-
Focusing on the IP A.B.C.243. For this filename it has published 77 sourceIDs and 29 sizes.
If I'm not wrong about the meaning of "sourceID", how can one host have the same file name for 77 different files?
By having them in 77 directories each with different ID3 tags, for example. Pretty unlikely, though.
-
No. This is definitely an attack on the Kad network by flooding the key/source indexes.
(I can't hold back from saying it any longer)
-
It's pretty obvious this is cache poisoning.
-
EloiBosc, can you send me your big key_index (Rapidshare or something) ? I'd like to take a look at it myself. Maybe I can set up some filter rules to weed this out.
-
Sorry Stu. I'll try, but this month I'm away on holiday.
-
nearly 7 day uptime stats,
'screen/gdb/amuled'
(http://img3.imageshack.us/img3/2193/muledtime.th.png) (http://img3.imageshack.us/img3/2193/muledtime.png/)
built '9758' with;
unset CFLAGS
unset CXXFLAGS
./configure --prefix=/usr \
--disable-monolithic \
--enable-amule-daemon \
--disable-amulecmd \
--disable-webserver \
--enable-amule-gui \
--disable-cas \
--disable-wxcas \
--disable-ed2k \
--disable-alc \
--disable-alcc \
--disable-upnp \
--disable-xas \
--enable-geoip \
--enable-mmap \
--disable-nls \
--enable-debug \
--disable-optimize
-
Attached a short one using Kad network only
Nothing exciting in this one I'm afraid. I'd like to investigate the Kad poisoning issue first before trying again with massif. I've asked EloiBosc, but he's on vacation. If yours is suspiciously big too I'd like to take a look.
-
amuled does not filter IPs when Kad acquires sources; only Kad contacts are filtered (this could be addressed in a future amuled version).
I'll take a look.
-
Got the suspicious key_index from EloiBosc. 39000 of 49000 entries are from 129.47.136.* . The entries themselves look valid, nothing obviously fake. Probably originating from real files with the hashes randomized and probably the sizes too (slightly). Most are mp3s.
izzobz has 9419 of 24335 from that range.
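A quick way to get such per-range shares from a list of publisher IPs, sketched with Python's standard ipaddress module. The function name and the /24 grouping are illustrative assumptions; the input is expected to be the already extracted IPv4 addresses, one per index entry.

```python
import ipaddress
from collections import Counter


def share_by_subnet(ips, prefix_len=24):
    """Count entries per subnet and return (subnet, count, share), biggest first."""
    counts = Counter(
        # strict=False lets a host address stand in for its enclosing network
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in ips
    )
    total = sum(counts.values())
    return [(str(net), n, n / total) for net, n in counts.most_common()]
```

For the first index mentioned above, 39000 of 49000 entries from one /24 is a share of roughly 80%, which is the kind of skew this listing makes visible at a glance.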
There is a "trust value" calculated in Entry.cpp CKeyEntry::ReCalculateTrustValue() which should sort out exactly that kind of behavior. However it's used only to suppress these as search results as long as there are others. Maybe it could be reused to create a blocking mechanism. The goal should be to identify such rogue clients and completely block them from Kad.
-
I think it's time to talk to the Powers That Be, Stu.
But wait for me to do so, don't do it yourself for now.
-
Great, you're under way. The "trust value" can be a smart solution: weeks ago I manually blocked these addresses and since then amuled has been stable.
In my July 26 post I mentioned sudden memory rises due to a spontaneous "Stop/Start Kad". This is triggered when no Kad traffic is detected on the UDP port: the port is not receiving, so it is turned off and on again to unblock it. On my amuled this happens 1 or 2 times per day.
I checked the incoming packets looking for anything unusual (high UDP traffic, malformed packets, ...) but apparently everything was normal when the UDP port got blocked. I'd like to know which part is responsible (OS, wxWidgets, aMule, ...) and whether there is a solution; since the code guards against it, somebody may know. I didn't find any references, only some about wxWidgets (I installed the latest version: no change).
-
I'm convinced '979*' have fixed the massive mem leak.......?
..couple of days up-time ATM, and its still under 90mb (same number onqueue, downloads, KAD-only, build flags)
It looks 'better' to me anyway.
-
forget my last post, today mule is taking nearly 400mb with 1 download and 400 on queue. (9871)
-
If you have a big key_index.dat causing high memory usage, then while waiting for a solution based on the "trust value", try filtering these networks (with iptables or another firewall):
sudo iptables -A INPUT -s 129.47.136.0/24 -j DROP
sudo iptables -A INPUT -s 129.47.160.0/24 -j DROP
sudo iptables -A INPUT -s 129.47.161.0/24 -j DROP
sudo iptables -A INPUT -s 129.47.147.0/24 -j DROP
-
I have a similar problem with amuled 2.2.6.
If KAD is disabled amuled VmSize is about 12-13MB.
If KAD is enabled, memory usage slowly and continuously grows -
amuled starts with VmSize 18MB and 10 hours later VmSize reaches 30MB,
size of key_index.dat grows from 1MB to 2MB.
But in my case Moblock doesn't detect any Whittaker IPs, although I know for sure these ip ranges are in my ipfilter.p2p
When I was running amuled 2.2.3 with KAD enabled VmSize never went over 20MB.
-
Sounds perfectly normal to me. Information gets posted slowly and so memory usage grows slowly. Does it still increase further after 24 hours?
When I was running amuled 2.2.3 with KAD enabled VmSize never went over 20MB.
How long ago was that? And same conditions? You know, just a different KAD-ID might get you to a "hotter" spot in the network with more data to handle.
-
Sounds perfectly normal to me. Information gets posted slowly and so memory usage grows slowly. Does it still increase further after 24 hours?
...
Yes, perhaps it's normal, but I have little RAM - 32MB.
I don't run amuled for 24h.
...
When I was running amuled 2.2.3 with KAD enabled VmSize never went over 20MB.
How long ago was that? And same conditions?
...
It was two months ago; conditions haven't changed, except that the number of Shared Files increased a little.
... You know, just a different KAD-ID might get you to a "hotter" spot in the network with more data to handle.
Yes, perhaps that is true.
Today amuled 2.2.6 (with KAD enabled) starts with VmSize 14MB, and 6 hours later VmSize reaches 20MB. Size of key_index.dat grows from 70kB to 380kB.
-
You will be having trouble running KAD with 32MB. That's a fact.
Try the SVN version with the mmap configure option - it has a bunch of optimizations for low memory platforms.
-
Stu Redman
aMuled SVN with mmap enabled doesn't reduce or limit VmSize.
-
Well, at least it should cut some peaks, like at hashing.
-
This is definitely fixed for me with 'mmap'.
wxgtk 2.8.10
--with-gtk=2 --enable-gui --enable-graphics_ctx --enable-unicode \
--disable-all-features --without-odbc --without-expat --without-libmspack --without-sdl
amule-svn
--disable-monolithic \
--enable-amule-daemon \
--disable-amulecmd \
--disable-webserver \
--enable-amule-gui \
--disable-cas \
--disable-wxcas \
--disable-ed2k \
--disable-alc \
--disable-alcc \
--disable-upnp \
--disable-xas \
--enable-geoip \
--enable-mmap \
--disable-nls \
--disable-debug
Obviously this is for a minimal remote GUI setup only, but still, it's fixed.
-
Ha, just checked my "productive" environment. Windows had increased the swap file. aMule was at 140MB memory, 60001 entries in the keyword index (13.8MB saved), 46939 from 129.47.136.* >:(
That's 129.47.136.215: US, United States, CA, California, Simi Valley, 93063, 34.3021, -118.7208, Whittaker Corporation, Whittaker Corporation ???
Kry, any news? I'm starting to think a hard coded Kad firewall of that IP range would not be such a bad idea.
-
Look into a way to solve the problem, not work around it. Check what we can see that would help detect it behaviourally, and feel free to talk about it in less public places.
In case whoever is doing that is reading this: Poisoning a distribution network is not the way to fight anything, no matter what your reasons are. You are intentionally disrupting a networking service, and that's very morally, if not legally, wrong. I'm all for fighting for compliance with each country's law, but you're disrupting a content delivery service that hosts plenty of legal files, and most of them outside your country.