aMule Forum

English => Feature requests => Topic started by: m2kio on April 07, 2005, 01:42:18 PM

Title: quick rehashing suggestion
Post by: m2kio on April 07, 2005, 01:42:18 PM
when i rename a file and want to keep sharing it, amule hashes it again. on the other hand, it remembers files by name and does not rehash them as long as the name has not changed.

so i assume amule already has some code to remember and recognize a file.

i suggest that a file is recognized not only by name but also by file size and/or modification date and/or inode. these attributes are harder to change than the file name. in case of doubt amule could compare the md4 hash of the first chunk to decide whether the file is known or new.
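A minimal sketch of the metadata idea, in Python rather than aMule's own code. The helper name `stat_fingerprint` is hypothetical; it only shows that size, modification time and inode can be read without touching the file name. Whether the inode is stable or even meaningful depends on the platform and filesystem, so treat that field as optional.

```python
import os

def stat_fingerprint(path):
    """Return (size, mtime in ns, inode) for path.

    Hypothetical helper: none of these values are derived from the file name,
    so a plain rename on the same filesystem normally leaves them unchanged.
    """
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns, st.st_ino)
```

Comparing the fingerprint taken before a rename with the one taken after it would be a cheap first test before deciding to hash anything.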

hashing takes quite a long time, so it would be good to reduce the work here.

... m2kio !
Title: Re: quick rehashing suggestion
Post by: Kry on April 07, 2005, 02:11:31 PM
A file is identified on the filesystem by name. We can't change that.
Title: Re: quick rehashing suggestion
Post by: Xaignar on April 07, 2005, 03:55:49 PM
The problem with using such things as the inode to identify the files is that it isn't portable.
Title: Re: quick rehashing suggestion
Post by: m2kio on April 07, 2005, 04:14:58 PM
Quote
Originally posted by Xaignar
The problem with using such things as the inode to identify the files is that it isn't portable.

i'm aware of this. the simplest idea is the file length. large files only very occasionally share the same length (ok, maybe except iso images), and the file length is even part of the ed2k identifier. so something like the following in the hasher should do it:

Code:
if filename is unknown
  if file length matches a known file
    // compare against the hash already stored for the known file
    if md4(first chunk of file) == stored md4(first chunk of known file)
      change name in data base
      return // don't hash again
    endif
  endif
endif
hash(file)  // unknown
the file name is changed more easily than anything else about a file,
so checking file length + md4(first chunk) is more reliable than looking up the file name.
everything else was just additional ideas that could optionally be enabled per platform.
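
The lookup above can be sketched in runnable form. This is an illustration under stated assumptions, not aMule's implementation: the `known` list stands in for the known.met database, the record fields and function names are hypothetical, and SHA-1 stands in for MD4 because Python's `hashlib` does not guarantee MD4 support. The 9,728,000-byte constant is the eD2k part size.

```python
import hashlib
import os

ED2K_CHUNK_SIZE = 9_728_000  # eD2k part size in bytes

def first_chunk_digest(path, chunk_size=ED2K_CHUNK_SIZE):
    """Digest of the first chunk only (SHA-1 as a stand-in for MD4)."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read(chunk_size)).hexdigest()

def full_hash(path):
    """Stand-in for the expensive full-file hashing pass."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def identify(path, known, chunk_size=ED2K_CHUNK_SIZE):
    """Return (record, rehashed) for path.

    `known` is a hypothetical list of dicts with the keys
    name, size, first_chunk and hash.
    """
    name = os.path.basename(path)
    size = os.path.getsize(path)
    # Known name and size: nothing to do.
    for rec in known:
        if rec["name"] == name and rec["size"] == size:
            return rec, False
    # Unknown name: only read the first chunk if some known file has this size.
    candidates = [rec for rec in known if rec["size"] == size]
    if candidates:
        digest = first_chunk_digest(path, chunk_size)
        for rec in candidates:
            if rec["first_chunk"] == digest:
                rec["name"] = name  # treat as a rename, skip the full hash
                return rec, False
    # Genuinely unknown file: hash everything and remember it.
    rec = {
        "name": name,
        "size": size,
        "first_chunk": first_chunk_digest(path, chunk_size),
        "hash": full_hash(path),
    }
    known.append(rec)
    return rec, True
```

Note the trade-off in this sketch: two files with the same length and the same first chunk that differ later on would be treated as one file, so the shortcut exchanges a small risk of a false match for skipping the full hash.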

... m2kio !
Title: Re: quick rehashing suggestion
Post by: ken on April 07, 2005, 09:47:00 PM
A different solution to the same problem is to allow aMule to rename files in the Shared Files screen.  When it does this, it will of course update the known*.met entries for that file so that it won't have to rehash.
Title: Re: quick rehashing suggestion
Post by: Xaignar on April 07, 2005, 09:50:01 PM
Ken: Yes, but it's not very pleasant to have to go through aMule to rename shares, and you can bet that most people won't bother anyway. :P
Title: Re: quick rehashing suggestion
Post by: Vollstrecker on April 07, 2005, 10:33:41 PM
Why does the filename have to be used? Once a file is finished, I don't think it changes anymore. So the file could be recognised by an md5sum or something similar instead of by its filename. I think computing an MD5 over the whole file is much faster than rehashing, and it would allow the file to be renamed.
Title: Re: quick rehashing suggestion
Post by: Xaignar on April 07, 2005, 10:39:48 PM
That won't be all that much faster than normal hashing, and any difference would only be there because the hashing thread also calculates the AICH hashset.
Title: Re: quick rehashing suggestion
Post by: m2kio on April 07, 2005, 10:42:14 PM
hashing is exactly what i wanted to _avoid_!

Greetings to Hesse  8)