aMule Forum


Pages: 1 [2] 3

Author Topic: 9649 mem leak?  (Read 25446 times)

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: 9649 mem leak?
« Reply #15 on: July 11, 2009, 04:05:01 PM »

Neither have I (except for a few test runs), but it's quite simple. However, your machine should be able to run aMule without being at full load, because Valgrind takes some extra CPU. That's probably the fastest way to get some clues about what's happening.
Logged
The image of mother goddess, lying dormant in the eyes of the dead, the sheaf of the corn is broken, end the harvest, throw the dead on the pyre -- Iron Maiden, Isle of Avalon

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: 9649 mem leak?
« Reply #16 on: July 15, 2009, 11:53:33 PM »

Thank you for your help, but I'm afraid it wasn't useful.  :(
Code: [Select]
n1: 3631264 0x81F2D55: (within /usr/bin/amuled)
  n1: 3631264 0x81FB92B: (within /usr/bin/amuled)
   n2: 3631264 0x81CFCC7: (within /usr/bin/amuled)
    n1: 3043028 0x80D01BF: (within /usr/bin/amuled)
     n1: 3043028 0x8161A2A: (within /usr/bin/amuled)
      n1: 3043028 0x80D05C7: (within /usr/bin/amuled)
You need to run an amuled that was built with debug info, so that Valgrind can print a useful backtrace showing where the memory was allocated.

So could you please try it again?
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: 9649 mem leak?
« Reply #17 on: July 19, 2009, 09:32:48 PM »

Interesting.

9709 will save 20MB by not storing IP filter comments (unless IP filter debug messages are turned on).
The bulk are KAD strings obviously:
Code: [Select]
heap_tree=detailed
n13: 138849065 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 n2: 82543656 0xCAFC93: wxStringBase::Alloc(unsigned int) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
  n3: 82439436 0xCB1A8D: wxStringBase::ConcatSelf(unsigned int, wchar_t const*, unsigned int) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
   n2: 61465104 0xCB2633: wxString::wxString(char const*, wxMBConv const&, unsigned int) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
    n2: 61409636 0x82AD8DF: CFileDataIO::ReadOnlyString(bool, unsigned short) const (SafeFile.cpp:254)
     n3: 61409636 0x82AC6ED: CFileDataIO::ReadString(bool, unsigned char, bool) const (SafeFile.cpp:226)
      n3: 31806456 0x82AD075: CFileDataIO::ReadTag(bool) const (SafeFile.cpp:434)
       n1: 26515424 0x81F38DA: Kademlia::CKademliaUDPListener::Process2PublishKeyRequest(unsigned char const*, unsigned int, unsigned int, unsigned short, Kademlia::CKadUDPKey const&) (KademliaUDPListener.cpp:1588)
        n1: 26515424 0x81FC3DB: Kademlia::CKademliaUDPListener::ProcessPacket(unsigned char const*, unsigned int, unsigned int, unsigned short, bool, Kademlia::CKadUDPKey const&) (KademliaUDPListener.cpp:354)
         n2: 26515424 0x81D0777: Kademlia::CKademlia::ProcessPacket(unsigned char const*, unsigned int, unsigned int, unsigned short, bool, Kademlia::CKadUDPKey const&) (Kademlia.cpp:292)
          n1: 25030796 0x80D023F: CClientUDPSocket::OnPacketReceived(unsigned int, unsigned short, unsigned char*, unsigned int) (ClientUDPSocket.cpp:117)
           n1: 25030796 0x81624DA: CMuleUDPSocket::OnReceive(int) (MuleUDPSocket.cpp:183)
            n1: 25030796 0x80D0647: CClientUDPSocket::OnReceive(int) (ClientUDPSocket.cpp:69)
             n1: 25030796 0xC50F5D: wxAppConsole::HandleEvent(wxEvtHandler*, void (wxEvtHandler::*)(wxEvent&), wxEvent&) const (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
              n1: 25030796 0xCF2527: wxEvtHandler::ProcessEventIfMatches(wxEventTableEntryBase const&, wxEvtHandler*, wxEvent&) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
               n1: 25030796 0xCF35F2: wxEventHashTable::HandleEvent(wxEvent&, wxEvtHandler*) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
                n1: 25030796 0xCF36F9: wxEvtHandler::ProcessEvent(wxEvent&) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
                 n1: 25030796 0xCF29D7: wxEvtHandler::ProcessPendingEvents() (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
                  n1: 25030796 0xC512A7: wxAppConsole::ProcessPendingEvents() (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
                   n1: 25030796 0x806E9E7: CamuleDaemonApp::OnRun() (amuled.cpp:661)
                    n1: 25030796 0xC8C188: wxEntry(int&, wchar_t**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
                     n1: 25030796 0xC8C385: wxEntry(int&, char**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
                      n0: 25030796 0x806EB69: main (amuled.cpp:170)
          n0: 1484628 in 1 place, below massif's threshold (01.00%)
       n1: 4563328 0x81C359A: Kademlia::CIndexed::ReadFile() (Indexed.cpp:131)
        n1: 4563328 0x81C394E: Kademlia::CIndexed::CIndexed() (Indexed.cpp:77)
         n1: 4563328 0x81D195A: Kademlia::CKademlia::Start(Kademlia::CPrefs*) (Kademlia.cpp:117)
          n1: 4563328 0x8081693: Kademlia::CKademlia::Start() (Kademlia.h:64)
           n1: 4563328 0x80747E3: CamuleApp::StartKad() (amule.cpp:2245)
            n1: 4563328 0x807DC84: CamuleApp::OnInit() (amule.cpp:819)
             n1: 4563328 0x806D78C: CamuleDaemonApp::OnInit() (amuled.cpp:684)
              n1: 4563328 0xC8C15E: wxEntry(int&, wchar_t**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
               n1: 4563328 0xC8C385: wxEntry(int&, char**) (in /usr/lib/libwx_baseu-2.8.so.0.6.0)
                n0: 4563328 0x806EB69: main (amuled.cpp:170)
       n0: 727704 in 5 places, all below massif's threshold (01.00%)
      n3: 29602748 0x82ACF51: CFileDataIO::ReadTag(bool) const (SafeFile.cpp:404)
       n1: 22231524 0x81F38DA: Kademlia::CKademliaUDPListener::Process2PublishKeyRequest(unsigned char const*, unsigned int, unsigned int, unsigned short, Kademlia::CKadUDPKey const&) (KademliaUDPListener.cpp:1588)

I'm afraid it simply takes that much. I don't know enough about its internals to judge that.  :(

With Kad disabled and 1 safe server added (oh my, I never used ed2k servers...), after a 1-day run memory usage doesn't rise above 130 MB.
That looks like a bit much, I'd say. Maybe turning off Kad doesn't fully work. A massif trace of that config would be interesting too (an hour should do).
Logged

EloiBosc

  • Approved Newbie
  • *
  • Karma: 3
  • Offline Offline
  • Posts: 6
Re: 9649 mem leak?
« Reply #18 on: July 26, 2009, 09:16:46 PM »

I had the same memory problem on amuled.

My system is an Ubuntu 9.04 Server with dev tools, fully updated, with amuled 2.2.5 compiled on the same system (PIII, 512 MB). amuled is started with both Kad and ed2k on, with the files "known.met", "known2_64.met", "key_index.dat", "src_index.dat" and "load_index.dat" deleted and a good "nodes.dat" in place.

Checking the virtual memory size (vsize) of the process, I see a steady rise: 9 MB per hour a month ago, and 3 MB/h in the last few days. The vsize grows until it reaches a stable value between 100 MB and 200 MB. But every 1 or 2 days there is a sudden rise of about 100 MB, until the system memory is exhausted. This jump coincides with the line "Escritos 192 contactos Kad" ("Wrote 192 Kad contacts") in the logfile (sorry, my amuled speaks Spanish), which is when the files key_index.dat, src_index.dat, load_index.dat and nodes.dat are written again to the .aMule dir.

After days of working on the problem, which seemed to be in the Kad part, I saw 3 different issues:

(1) Memory leaks, (2) Big memory fragmentation and (3) Steady memory rise.

1- MEMORY LEAKS
 
I looked for memory leaks in the Kad code. Since memory settles at a stable value after a while, any leak would have to be in the sudden rises, at the time that logfile line is written. Looking at the source code, at that point Kad is restarted with StopKad (the Kad objects are deleted) and StartKad (the Kad objects are created). Searching for objects not released in StopKad, I only found that in the CIndexed class the maps m_Load_map, m_Sources_map and m_Keyword_map are not cleared inside ~CIndexed, while, surprisingly, m_Notes_map is. I don't think this is a true memory leak (someone can check), because these maps should be cleared and deleted anyway when the CIndexed object is destroyed. As a first try I added code to clear the maps; even if it is not a memory leak, this could reduce memory fragmentation, issue (2). With the new code the sudden rises happened less often, but the network was different at that time, so no final conclusion.
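
To make the idea concrete, here is a minimal sketch of what "clearing the maps in the destructor" amounts to. This is not the aMule code: `IndexSketch`, `Entry`, `EntryMap` and `ClearOwningMap` are made-up names, and only the three member names are borrowed from the post. It assumes the maps hold raw pointers that the index owns; if they held plain values, the maps' own destructors would already release everything.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an indexed entry. It counts live instances so
// the effect of the cleanup can be observed.
struct Entry {
    static int live;
    std::string name;
    explicit Entry(const std::string& n) : name(n) { ++live; }
    ~Entry() { --live; }
};
int Entry::live = 0;

typedef std::map<std::string, Entry*> EntryMap;

// Delete every pointee, then empty the map. Only valid when the map owns
// the objects it points to (an assumption of this sketch).
void ClearOwningMap(EntryMap& m)
{
    for (EntryMap::iterator it = m.begin(); it != m.end(); ++it) {
        delete it->second;
    }
    m.clear();
}

// Hypothetical index; the member names mirror the ones mentioned in the post.
class IndexSketch {
public:
    IndexSketch() {}
    ~IndexSketch()
    {
        ClearOwningMap(m_Keyword_map);
        ClearOwningMap(m_Sources_map);
        ClearOwningMap(m_Load_map);
    }

    void AddKeyword(const std::string& key) { Add(m_Keyword_map, key); }
    void AddSource(const std::string& key)  { Add(m_Sources_map, key); }
    void AddLoad(const std::string& key)    { Add(m_Load_map, key); }

private:
    // Not copyable: a copy would delete the same entries twice.
    IndexSketch(const IndexSketch&);
    IndexSketch& operator=(const IndexSketch&);

    // Insert a new entry unless the key is already present.
    static void Add(EntryMap& m, const std::string& key)
    {
        Entry*& slot = m[key];  // value-initialized to a null pointer if new
        if (!slot) {
            slot = new Entry(key);
        }
    }

    EntryMap m_Keyword_map;
    EntryMap m_Sources_map;
    EntryMap m_Load_map;
};
```

With pointer-holding maps, leaving out the destructor body would leak every entry on each StopKad/StartKad cycle, which is the kind of thing a massif run with debug info should show.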

2- BIG MEMORY FRAGMENTATION

The size of the sudden rise is related to the size of the file key_index.dat, the biggest of the stored files: in my tests the rise was about 7 times the file size. It looks as if the memory released in StopKad is not reused in StartKad. After adding a trace based on mallinfo to the code, I saw that the freed memory is reported as free, but the newly allocated memory is not accounted for. This could be a corrupted heap (again, someone can check) or, more probably, due to the heavy fragmentation the process is working in a new arena that mallinfo does not account for.
The conclusion is that it is not normal to free and then briefly take this much memory (>100 MB) on a machine with so little memory.
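
For anyone who wants to repeat the measurement, a mallinfo-based trace could look roughly like the sketch below. It is glibc-specific and is not the trace used above: `HeapSnapshot`, `TakeHeapSnapshot` and `FormatHeapSnapshot` are hypothetical names. It uses `mallinfo2()` where glibc provides it (2.33 and later) and falls back to `mallinfo()` otherwise. As far as I know, the mallinfo(3) man page notes that information is returned only for the main arena, which would fit the "new arena" explanation.

```cpp
#include <malloc.h>   // glibc-specific: mallinfo() / mallinfo2()
#include <cstdio>
#include <string>

// Hypothetical snapshot of the allocator counters of interest.
struct HeapSnapshot {
    unsigned long long arenaBytes;  // non-mmapped space obtained from the system
    unsigned long long inUseBytes;  // space in allocated chunks
    unsigned long long freeBytes;   // space in free chunks kept by the allocator
    unsigned long long mmapBytes;   // space in mmapped regions
};

HeapSnapshot TakeHeapSnapshot()
{
    HeapSnapshot s;
#if defined(__GLIBC__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 33))
    const struct mallinfo2 mi = mallinfo2();
    s.arenaBytes = mi.arena;
    s.inUseBytes = mi.uordblks;
    s.freeBytes  = mi.fordblks;
    s.mmapBytes  = mi.hblkhd;
#else
    // The old interface uses int fields, which can wrap on large heaps.
    const struct mallinfo mi = mallinfo();
    s.arenaBytes = static_cast<unsigned int>(mi.arena);
    s.inUseBytes = static_cast<unsigned int>(mi.uordblks);
    s.freeBytes  = static_cast<unsigned int>(mi.fordblks);
    s.mmapBytes  = static_cast<unsigned int>(mi.hblkhd);
#endif
    return s;
}

// Render one trace line, e.g. for the log before and after StopKad.
std::string FormatHeapSnapshot(const char* label, const HeapSnapshot& s)
{
    char buf[200];
    std::snprintf(buf, sizeof(buf),
                  "%s: arena=%llu in_use=%llu free=%llu mmap=%llu",
                  label, s.arenaBytes, s.inUseBytes, s.freeBytes, s.mmapBytes);
    return std::string(buf);
}
```

Logging one line before StopKad, one after it, and one after StartKad shows whether freed space goes up in `free` and whether the new allocations show up in `in_use` at all.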

3- STEADY MEMORY RISE

At this point the conclusion is that most of the memory used by amuled comes from the contents of key_index.dat being expanded seven times or more in memory; this may be the price of using C++ if the data is too scattered.
The contents of key_index.dat are the sources Kad has indexed during the steady memory rise. Looking inside that file I can see that a lot of sources (66% in my tests) come from a contiguous range of 16 IP addresses, and many of them are nonsense: for example, the same file is published with more than 500 different IDs.
I don't know if this is a bug in some new client code or a network attack. I won't post these addresses because I don't know the forum's policy on attacks, but I found them in public ipfilter.dat files.

Unfortunately, putting these IPs in ipfilter.dat does nothing, because amuled does not filter IPs when Kad acquires sources; only Kad contacts are filtered (this could be addressed in a future amuled version).
In the end I put them in an iptables filter. Then, running amuled with the index files removed and Kad only, after 8 hours the vsize is 64 MB and almost stable.
After disconnecting Kad, the created key_index.dat is 1162527 bytes and the memory freed is 12 MB (per the mallinfo trace; no memory is released to the system).

Counting the distinct IPs, IDs and filenames in key_index.dat:

New key_index.dat(size 1162527) with IP filtering:
7214 IPs, 4396 IDs, 5658 Filenames

Last key_index.dat(size 11130892) without IP filtering:
22785 IPs, 48828 IDs, 15436 Filenames

The new one may be OK. But the old one has a lot more source IDs than filenames.
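
The counting itself is simple once the file is parsed. The sketch below is not a key_index.dat reader; it assumes the records have already been extracted into a flat list, and `IndexRecord`, `DistinctCounts`, `CountDistinct` and `IdsPerFileName` are hypothetical names.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical flat record, one per published entry in the index.
struct IndexRecord {
    std::string ip;
    std::string sourceId;
    std::string fileName;
};

struct DistinctCounts {
    std::size_t ips;
    std::size_t ids;
    std::size_t fileNames;
};

// Count how many different IPs, source IDs and file names occur.
DistinctCounts CountDistinct(const std::vector<IndexRecord>& records)
{
    std::set<std::string> ips, ids, names;
    for (std::size_t i = 0; i < records.size(); ++i) {
        ips.insert(records[i].ip);
        ids.insert(records[i].sourceId);
        names.insert(records[i].fileName);
    }
    DistinctCounts c;
    c.ips = ips.size();
    c.ids = ids.size();
    c.fileNames = names.size();
    return c;
}

// Source IDs per file name; returns 0 when there are no file names.
double IdsPerFileName(const DistinctCounts& c)
{
    return c.fileNames == 0
        ? 0.0
        : static_cast<double>(c.ids) / static_cast<double>(c.fileNames);
}
```

Applied to the figures above, the filtered index gives about 0.78 IDs per filename (4396 / 5658), while the unfiltered one gives about 3.16 (48828 / 15436).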

I'll keep watching the process over the next few days.
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: 9649 mem leak?
« Reply #19 on: July 27, 2009, 11:38:59 PM »

Welcome to the forum, EloiBosc !
Now this is a very impressive first post. Thank you for your thorough analysis!
This should be looked at for sure.

The key_index stores the published keywords that your client holds for others to search for. A file name can contain several keywords ("Ubuntu_Jaunty_Server.iso" has 3, for example), but 500 is way too many of course.

Maybe a check could be added:
- sensible size of file name
- re-tokenize the file name and check if one of its words matches the published keyword id. If not, kick it (and maybe blacklist the publisher).
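
A rough sketch of the two checks, under stated assumptions: this is not the tokenizer Kad actually uses, it handles ASCII only, and the rules (split on non-alphanumeric characters, lowercase, drop the extension, ignore words shorter than 3 characters) are picked so that "Ubuntu_Jaunty_Server.iso" yields 3 keywords as in the example. All names are hypothetical, and the function that turns a keyword into its id is passed in by the caller rather than implemented here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumed minimum keyword length; shorter words are ignored.
const std::size_t kMinKeywordLength = 3;

// Check 1: the file name has a sensible size.
bool IsSensibleFileName(const std::string& fileName, std::size_t maxLength)
{
    return !fileName.empty() && fileName.size() <= maxLength;
}

// Split a file name into lowercase keywords (ASCII only).
std::vector<std::string> TokenizeFileName(const std::string& fileName)
{
    // Drop the extension: everything from the last '.' onwards.
    std::string stem = fileName;
    const std::string::size_type dot = stem.rfind('.');
    if (dot != std::string::npos) {
        stem.erase(dot);
    }

    std::vector<std::string> words;
    std::string current;
    for (std::string::size_type i = 0; i <= stem.size(); ++i) {
        const bool inWord = i < stem.size()
            && std::isalnum(static_cast<unsigned char>(stem[i]));
        if (inWord) {
            current += static_cast<char>(
                std::tolower(static_cast<unsigned char>(stem[i])));
        } else {
            // A separator or the end of the name closes the current word.
            if (current.size() >= kMinKeywordLength) {
                words.push_back(current);
            }
            current.clear();
        }
    }
    return words;
}

// Check 2: at least one keyword of the file name maps to the published id.
// `toId` is supplied by the caller (in Kad it would be the keyword hash).
template <typename Id, typename ToId>
bool MatchesKeywordId(const std::string& fileName, const Id& keywordId, ToId toId)
{
    const std::vector<std::string> words = TokenizeFileName(fileName);
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (toId(words[i]) == keywordId) {
            return true;
        }
    }
    return false;
}
```

An entry failing either check would be dropped; whether the publisher should also be blacklisted is a separate policy decision.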

Kad doesn't honor the IP filter by design, I think. Accidentally blocking a Kad contributor could cause great harm to the network. Validity of Kad sources is ensured in other ways, I think (sigh, I don't know much about Kad yet  :().
Logged

EloiBosc

  • Approved Newbie
  • *
  • Karma: 3
  • Offline Offline
  • Posts: 6
Re: 9649 mem leak?
« Reply #20 on: July 28, 2009, 05:05:48 PM »

Well, after learning a bit about Kad I'll try to explain in more detail what is inside the strange "key_index.dat" file. I think it was not clear in the previous post.

This file holds a list of search keys (keyID), and after each key a list of source IDs (sourceID); every sourceID has a list of filenames, IPs, sizes and other tags. I'm using the names from the source code, and after some checks I assume that sourceID is the hash of the file, not the ID of a peer. In that case a sourceID must identify one and only one file.
 
Taking a "key_index.dat" created on 2009-07-24 without IP filtering (size 11130892), a representative sample is the file name
"03 - 24 Hours.mp3".

This name always appears under the same keyID, which is fine. But it has 830 different sourceIDs and 30 different sizes, all published by 29 IPs in the same range (A.B.C.136-248).

Focusing on the IP A.B.C.243: for this filename it has published 77 sourceIDs and 29 sizes.

If I'm not wrong about the meaning of "sourceID", how can one host have the same file name for 77 different files?
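
The per-host figures can be reproduced with a small grouping pass like the sketch below. It assumes the entries have already been read into memory and that sourceID really is the file hash; `PublishedEntry`, `IpAndName`, `CountSourceIdsPerIpAndName` and `SuspiciousPublishers` are hypothetical names, and the limit is an arbitrary parameter.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flattened entry: who published which name under which hash.
struct PublishedEntry {
    std::string publisherIp;
    std::string fileName;
    std::string sourceId;
};

typedef std::pair<std::string, std::string> IpAndName;

// For every (publisher IP, file name) pair, count the distinct source IDs.
std::map<IpAndName, std::size_t>
CountSourceIdsPerIpAndName(const std::vector<PublishedEntry>& entries)
{
    std::map<IpAndName, std::set<std::string> > seen;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        const PublishedEntry& e = entries[i];
        seen[IpAndName(e.publisherIp, e.fileName)].insert(e.sourceId);
    }

    std::map<IpAndName, std::size_t> counts;
    std::map<IpAndName, std::set<std::string> >::const_iterator it;
    for (it = seen.begin(); it != seen.end(); ++it) {
        counts[it->first] = it->second.size();
    }
    return counts;
}

// Pairs where one host publishes the same name under more than
// `maxIdsPerName` different hashes.
std::vector<IpAndName>
SuspiciousPublishers(const std::vector<PublishedEntry>& entries,
                     std::size_t maxIdsPerName)
{
    const std::map<IpAndName, std::size_t> counts =
        CountSourceIdsPerIpAndName(entries);
    std::vector<IpAndName> result;
    std::map<IpAndName, std::size_t>::const_iterator it;
    for (it = counts.begin(); it != counts.end(); ++it) {
        if (it->second > maxIdsPerName) {
            result.push_back(it->first);
        }
    }
    return result;
}
```

A host that really shares a few differently tagged copies of one song would stay under a small limit; 77 hashes for one name from one IP would not.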

I'm really confused.
Logged

GonoszTopi

  • The current man in charge of most things.
  • Administrator
  • Hero Member
  • *****
  • Karma: 169
  • Offline Offline
  • Posts: 2685
Re: 9649 mem leak?
« Reply #21 on: July 29, 2009, 11:47:15 PM »

Focusing on the IP A.B.C.243: for this filename it has published 77 sourceIDs and 29 sizes.

If I'm not wrong about the meaning of "sourceID", how can one host have the same file name for 77 different files?

By having them in 77 directories each with different ID3 tags, for example. Pretty unlikely, though.
Logged
concordia cum veritate

EloiBosc

  • Approved Newbie
  • *
  • Karma: 3
  • Offline Offline
  • Posts: 6
Re: 9649 mem leak?
« Reply #22 on: July 30, 2009, 07:02:10 PM »

No. This is definitely an attack on the Kad network, flooding the key/source indexes.

(I couldn't hold back from saying it any longer)
Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: 9649 mem leak?
« Reply #23 on: July 31, 2009, 01:40:17 AM »

It's pretty obvious this is cache poisoning.
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: 9649 mem leak?
« Reply #24 on: August 02, 2009, 03:39:02 PM »

EloiBosc, can you send me your big key_index (Rapidshare or something) ? I'd like to take a look at it myself. Maybe I can set up some filter rules to weed this out.
Logged

EloiBosc

  • Approved Newbie
  • *
  • Karma: 3
  • Offline Offline
  • Posts: 6
Re: 9649 mem leak?
« Reply #25 on: August 07, 2009, 06:30:04 PM »

Sorry Stu. I'll try, but I'm away on holiday this month.

Logged

gav616

  • Guest
Re: 9649 mem leak?
« Reply #26 on: August 17, 2009, 10:09:54 PM »

nearly 7 day uptime stats,

'screen/gdb/amuled'



built '9758' with;
Code: [Select]
unset CFLAGS
unset CXXFLAGS
  
 ./configure --prefix=/usr \
--disable-monolithic \
--enable-amule-daemon \
--disable-amulecmd \
--disable-webserver \
--enable-amule-gui \
--disable-cas \
--disable-wxcas \
--disable-ed2k \
--disable-alc \
--disable-alcc \
--disable-upnp \
--disable-xas \
--enable-geoip \
--enable-mmap \
--disable-nls \
--enable-debug \
--disable-optimize
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: 9649 mem leak?
« Reply #27 on: September 04, 2009, 09:22:57 PM »

Attached a short one using Kad network only
Nothing exciting in this one I'm afraid. I'd like to investigate the Kad poisoning issue first before trying again with massif. I've asked EloiBosc, but he's on vacation. If yours is suspiciously big too I'd like to take a look.
Logged

GonoszTopi

  • The current man in charge of most things.
  • Administrator
  • Hero Member
  • *****
  • Karma: 169
  • Offline Offline
  • Posts: 2685
Re: 9649 mem leak?
« Reply #28 on: September 05, 2009, 01:16:19 PM »

amuled does not filter IPs when Kad acquires sources; only Kad contacts are filtered (this could be addressed in a future amuled version).
I'll take a look.
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: 9649 mem leak?
« Reply #29 on: September 13, 2009, 11:56:49 PM »

Got the suspicious key_index from EloiBosc. 39000 of 49000 entries are from 129.47.136.* . The entries themselves look valid, nothing obviously fake. Probably originating from real files with the hashes randomized and probably the sizes too (slightly). Most are mp3s.

izzobz has 9419 of 24335 from that range.

There is a "trust value" calculated in Entry.cpp CKeyEntry::ReCalculateTrustValue() which should sort out exactly that kind of behavior. However it's used only to suppress these as search results as long as there are others. Maybe it could be reused to create a blocking mechanism. The goal should be to identify such rogue clients and completely block them from Kad.
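
One possible starting point for spotting such publishers, sketched under assumptions: this is not what CKeyEntry::ReCalculateTrustValue() does (I'm not reproducing that here), just a simple share-per-/24 count. `MakeIp`, `Subnet24` and `DominantSubnets` are hypothetical names, addresses are IPv4 in host byte order, and both thresholds are arbitrary parameters.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Build an IPv4 address in host byte order from its four octets.
inline std::uint32_t MakeIp(std::uint32_t a, std::uint32_t b,
                            std::uint32_t c, std::uint32_t d)
{
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// Keep the upper 24 bits, i.e. the a.b.c.* range of the address.
inline std::uint32_t Subnet24(std::uint32_t ip)
{
    return ip & 0xFFFFFF00u;
}

// Return the /24 ranges that published more than `maxShare` of all entries.
// Indexes with fewer than `minEntries` entries are never judged, since a
// nearly empty index is trivially dominated by whoever published first.
std::vector<std::uint32_t>
DominantSubnets(const std::vector<std::uint32_t>& publisherIps,
                double maxShare, std::size_t minEntries)
{
    std::vector<std::uint32_t> result;
    if (publisherIps.size() < minEntries) {
        return result;
    }

    // Count the entries published from each /24 range.
    std::map<std::uint32_t, std::size_t> perSubnet;
    for (std::size_t i = 0; i < publisherIps.size(); ++i) {
        ++perSubnet[Subnet24(publisherIps[i])];
    }

    const double limit = maxShare * static_cast<double>(publisherIps.size());
    std::map<std::uint32_t, std::size_t>::const_iterator it;
    for (it = perSubnet.begin(); it != perSubnet.end(); ++it) {
        if (static_cast<double>(it->second) > limit) {
            result.push_back(it->first);
        }
    }
    return result;
}
```

With the numbers above (39000 of 49000, roughly 80%), a `maxShare` of 0.5 would flag the range immediately. A real blocking mechanism would still need care not to shut out a legitimate heavy contributor.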

Logged