aMule Forum

Pages: [1] 2 3 4

Author Topic: known_files a bit fragile....  (Read 24027 times)

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
known_files a bit fragile....
« on: August 07, 2011, 03:46:46 AM »

SVN 10594

If amule is killed for any reason, the known_met (or known64_met) seems to lose a lot of AICH/MD4 entries it previously had (but usually not all)...

This results in a lot of unnecessary rehashing. Not good if there are a lot of files on the shared side of things.
Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: known_files a bit fragile....
« Reply #1 on: August 07, 2011, 06:35:40 AM »

Funny that, if you throw your harddrive out of a window, it loses them all.
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: known_files a bit fragile....
« Reply #2 on: August 07, 2011, 12:36:03 PM »

That's why we should store that kind of stuff in a database instead of a met.
Logged
The image of mother goddess, lying dormant in the eyes of the dead, the sheaf of the corn is broken, end the harvest, throw the dead on the pyre -- Iron Maiden, Isle of Avalon

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #3 on: August 07, 2011, 05:41:42 PM »

Quote
That's why we should store that kind of stuff in a database instead of a met.

Alternatively, create new mets and copy them in

Even just shutting down amule seems to result in a few entries being trashed.

Currently known2 is 733Mb while known is about 200k
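
The "create new mets and copy them in" idea can be sketched roughly as follows. This is a minimal illustration and not aMule code: `SaveAtomically` and the `.new` suffix are hypothetical names. It assumes the whole file content is available in memory and relies on `std::filesystem::rename` replacing an existing target.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from aMule): write `data` to "<target>.new" and
// then rename it over `target`. If the process is killed while writing, only
// the temporary file is incomplete; the previous target is left untouched.
bool SaveAtomically(const fs::path& target, const std::string& data)
{
    assert(!target.empty());

    fs::path tmp = target;
    tmp += ".new";

    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) {
            return false;
        }
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        out.flush();
        if (!out) {
            // Writing failed: drop the partial temp file, keep the old target.
            out.close();
            std::error_code ignored;
            fs::remove(tmp, ignored);
            return false;
        }
    }   // the stream is closed here, before the rename

    std::error_code ec;
    fs::rename(tmp, target, ec);   // replaces an existing target file
    if (ec) {
        std::error_code ignored;
        fs::remove(tmp, ignored);
        return false;
    }
    return true;
}
```

Note that this sketch does not force the data to disk (no fsync), so it guards against a killed process rather than against power loss.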
Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: known_files a bit fragile....
« Reply #4 on: August 07, 2011, 07:00:46 PM »

Quote
That's why we should store that kind of stuff in a database instead of a met.

You do realize that the .met format is a database?
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: known_files a bit fragile....
« Reply #5 on: August 07, 2011, 10:23:52 PM »

Kind of. Technically any file with data in it can be considered to be a "database". I had something more robust and sophisticated in mind, like SQLite. With an index. So you don't have to go sequentially through the whole thing every time in CAICHHashSet::SaveHashSet() . And transactions.

And I don't like that CAICHSyncTask::Entry() opens the met for read/write. It should read only. Writing is only for
a) creating the header if it's empty (useless, already happens in CAICHHashSet::SaveHashSet()  )
b) truncating on error - I'd rather close and reopen it in that special case!

Maybe this is causing the problem?

Anyway, if known2 is 733Mb it's gone way beyond what it was intended for.  I'd delete it and start all over from scratch. Let it rehash everything shared for once and be done.
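
To illustrate the two points above (an index instead of a sequential pass, and read-only access except when truncating), here is a minimal sketch. The record layout is an assumption made up for the example and is not taken from aMule's source: a 20-byte key, a 32-bit little-endian count, then count * 20 bytes of payload. `BuildIndex` and `TruncateDamagedTail` are hypothetical names.

```cpp
#include <array>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>

using Key = std::array<unsigned char, 20>;

// Assumed record layout (illustration only):
//   [20-byte key][uint32 little-endian count][count * 20 bytes]
// Scans the file once, read-only, and returns key -> offset of the record.
// `validBytes` receives the length of the intact prefix of the file.
std::map<Key, std::uint64_t> BuildIndex(const std::string& path,
                                        std::uint64_t& validBytes)
{
    std::map<Key, std::uint64_t> index;
    validBytes = 0;

    std::ifstream in(path, std::ios::binary);   // opened for reading only
    if (!in) {
        return index;
    }
    in.seekg(0, std::ios::end);
    const std::uint64_t fileSize = static_cast<std::uint64_t>(in.tellg());
    in.seekg(0, std::ios::beg);

    while (validBytes + 24 <= fileSize) {
        Key key;
        unsigned char raw[4];
        in.read(reinterpret_cast<char*>(key.data()), key.size());
        in.read(reinterpret_cast<char*>(raw), 4);
        if (!in) {
            break;
        }
        const std::uint32_t count =
            static_cast<std::uint32_t>(raw[0]) |
            (static_cast<std::uint32_t>(raw[1]) << 8) |
            (static_cast<std::uint32_t>(raw[2]) << 16) |
            (static_cast<std::uint32_t>(raw[3]) << 24);
        const std::uint64_t recordSize = 24 + std::uint64_t(count) * 20;
        if (validBytes + recordSize > fileSize) {
            break;                               // truncated tail, stop here
        }
        index[key] = validBytes;
        validBytes += recordSize;
        in.seekg(static_cast<std::streamoff>(validBytes), std::ios::beg);
    }
    return index;
}

// Called only when the scan found a damaged tail. The read-only stream is
// already closed by then, so the file is modified in this one case only.
void TruncateDamagedTail(const std::string& path, std::uint64_t validBytes)
{
    std::filesystem::resize_file(path, validBytes);
}
```

With such an index, a lookup before saving becomes a map search rather than a pass over the whole file, and the truncation path is the only place where the file is changed.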
Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #6 on: August 07, 2011, 11:29:45 PM »

Quote
Anyway, if known2 is 733Mb it's gone way beyond what it was intended for.  I'd delete it and start all over from scratch. Let it rehash everything shared for once and be done.

I suspected you might suggest that.

How big was it intended to grow?

There are only a couple hundred files exported at the moment - I've cut things right back but the exports keep getting trashed.
Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #7 on: August 08, 2011, 02:59:12 AM »

Quote
Even just shutting down amule seems to result in a few entries being trashed.

Specifically: Shutting down Amuled (either with HUP or a shutdown command via amulecmd) results in corruption of hashes of files that were downloading at the instant of the shutdown, resulting in them having to be rehashed before downloads can resume.

On the side of hashing of files being shared, I noticed that known2's mtime seems to be updated only every 60 minutes. It looks like the file needs to be closed more cleanly at shutdown and, as Stu suggested, opened read/write only when absolutely needed.

Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: known_files a bit fragile....
« Reply #8 on: August 08, 2011, 09:09:50 PM »

I don't think rehashing of downloading files is related to known.met at all.

Look at the logfile after shutdown. Is there "aMule shutdown completed." at the end?
What about the timestamps? The .part.met files must all be the same as or newer than the .part files.
Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #9 on: August 09, 2011, 01:23:12 AM »

Quote
Look at the logfile after shutdown. Is there "aMule shutdown completed." at the end?

Yes.

Quote
What about the timestamps? The .part.met files must all be the same as or newer than the .part files.

I'll have to check this next time I shutdown.

Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #10 on: August 09, 2011, 10:35:42 PM »

Quote
What about the timestamps? The .part.met files must all be the same as or newer than the .part files.

Yes, they all are.

Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #11 on: August 24, 2011, 03:03:49 AM »


fwiw after a few days it's back up to ~260Mb and back to the same problem.

Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #12 on: August 27, 2011, 07:35:22 PM »

known2 eventually grew to around 550MB, then amuled sat there spinning its wheels at 100% cpu for several hours, with known.met growing to 5-7Mb then shrinking back to zero.

Everything apparently exited cleanly on a HUP; however, on restart the following message was logged:

$ grep -i known logfile
!2011-08-27 18:24:31: WARNING: Known file list corrupted, contains invalid header.
 2011-08-27 18:24:31: Creditfile loaded, 7986 clients are known
 2011-08-27 18:26:41: Found 384 known shared files, 18066 unknown

Amuled is again sitting at 100% with known.met cycling in size while known2 stays static.

Yes I really do have that many files on share (distros and suchlike)

As Stu says, it appears the met format can't cope with stuff this big. Perhaps some sort of run/compile time alternative such as a mysql interface needs to be considered.

Opinions?

Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline Offline
  • Posts: 318
Re: known_files a bit fragile....
« Reply #13 on: August 27, 2011, 07:53:06 PM »

After a little more digging...

It may not be changing known2, but it _IS_ recalculating all the hashes on shared files.

Is this expected behaviour if it thinks the header is corrupt?

Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: known_files a bit fragile....
« Reply #14 on: August 27, 2011, 10:39:44 PM »

You know, sharing 18000 files makes no sense. But what happens here shouldn't happen of course.

I just shared two wxWidgets folders with 18000 files. Yes, the behavior is silly. Hashing is supposed to be a background task, but every 30 files known.met is rewritten. Completely. That is why its size appears to keep growing and shrinking. And the writing is done in the foreground, which makes the whole hashing-in-background pointless. The GUI is completely blocked.

10601 fixes the problem for me (by updating known.met only after 300Mb and not after 30 files anymore). Please try it.

(And I see I'm tired - I also slipped in my patch to open known2_64.met in read only mode for reading, which was supposed to be an extra revision.  :-[ )
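
The change described above (save after a given amount of hashed data rather than after a fixed number of files) can be sketched like this. It is an illustration only and not the code from revision 10601: `KnownListSaveTrigger` is a hypothetical class, and the threshold is assumed to be counted in bytes hashed since the last save.

```cpp
#include <cstdint>

// Hypothetical helper: decides when the known-file list should be rewritten,
// based on the number of bytes hashed since the last save.
class KnownListSaveTrigger {
public:
    explicit KnownListSaveTrigger(std::uint64_t thresholdBytes)
        : m_threshold(thresholdBytes) {}

    // Call after each file has been hashed.
    // Returns true when the caller should save the list now.
    bool OnFileHashed(std::uint64_t fileSize)
    {
        m_pending += fileSize;
        if (m_pending < m_threshold) {
            return false;
        }
        m_pending = 0;
        return true;
    }

    // True if files were hashed since the last save; the caller should do
    // one final save at shutdown in that case.
    bool HasUnsaved() const { return m_pending > 0; }

private:
    std::uint64_t m_threshold;
    std::uint64_t m_pending = 0;
};
```

With a 300 MB threshold, 18000 small files trigger very few rewrites, whereas a rule of one rewrite per 30 files would cause 600 of them.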
Logged