aMule Forum

Author Topic: quick rehashing suggestion  (Read 3504 times)

m2kio

  • Full Member
  • ***
  • Karma: 0
  • Offline Offline
  • Posts: 152
    • http://little-bat.de
quick rehashing suggestion
« on: April 07, 2005, 01:42:18 PM »

when i have renamed a file and want to keep on sharing it, amule hashes it again. on the other hand it remembers files by name and does not rehash them if the name has not changed.

so i assume amule already has some code to remember and recognize a file.

i suggest that a file is recognized not only by name but also by file size and/or modification date and/or inode. these attributes are harder to change than the file name. in case of doubt amule could compare the md4 sum of the first chunk to decide between known and new.

hashing is a pretty lengthy task and it would be good to reduce the work here.
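
a rough sketch of the identity idea above, in python rather than amule's own code. the names `StatKey` and `stat_key` are made up for illustration, and the inode part is optional because not every platform provides a meaningful one:

```python
import os
from typing import NamedTuple, Optional


class StatKey(NamedTuple):
    """Name-independent attributes of a file (hypothetical helper type)."""
    size: int
    mtime_ns: int
    inode: Optional[int]  # None when inodes are disabled or reported as 0


def stat_key(path: str, use_inode: bool = True) -> StatKey:
    """Build a lookup key from attributes that survive a plain rename."""
    st = os.stat(path)
    inode = st.st_ino if use_inode and st.st_ino != 0 else None
    return StatKey(st.st_size, st.st_mtime_ns, inode)
```

a rename within the same filesystem leaves size, modification time and inode untouched, so the key computed before and after the rename should compare equal. a copy to another disk would change the inode, which is why that field can be switched off.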

... m2kio !
Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: quick rehashing suggestion
« Reply #1 on: April 07, 2005, 02:11:31 PM »

A file is identified on the filesystem by name. We can't change that.
Logged

Xaignar

  • Admin and Code Junky
  • Hero Member
  • *****
  • Karma: 19
  • Offline Offline
  • Posts: 1103
Re: quick rehashing suggestion
« Reply #2 on: April 07, 2005, 03:55:49 PM »

The problem with using such things as the inode to identify the files is that it isn't portable.
Logged

m2kio

  • Full Member
  • ***
  • Karma: 0
  • Offline Offline
  • Posts: 152
    • http://little-bat.de
Re: quick rehashing suggestion
« Reply #3 on: April 07, 2005, 04:14:58 PM »

Quote
Originally posted by Xaignar
The problem with using such things as the inode to identify the files is that it isn't portable.

i'm aware of this. the simplest idea is the file length. long files only very occasionally have the same file length (ok, maybe except iso images), and the file length is even part of the ed2k identifier. so something like the following in the hasher should do it:

Code:
if filename is unknown
  if file length is known
    if md4(first chunk of file) == stored md4(first chunk of known file)
      change name in database
      return // don't hash again
    endif
  endif
endif
hash(file)  // unknown
the file name is more easily changed than anything else about a file.
so a check of file length + md4(1st block) is more accurate than looking up the file name.
everything else was just additional ideas which could be optionally enabled per platform.
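
the pseudocode above could look roughly like this in python. this is only a sketch under assumptions: `known` is a plain dict standing in for amule's known-files list, `classify` and `first_chunk_digest` are made-up names, and sha1 is used as a stand-in digest because md4 is not available in every python/openssl build. the chunk size is the ed2k part size of 9,728,000 bytes.

```python
import hashlib
import os
from typing import Dict, Tuple

CHUNK_SIZE = 9_728_000  # ed2k part size in bytes

# file name -> (file size, digest of the first chunk); stand-in for known.met
Known = Dict[str, Tuple[int, bytes]]


def first_chunk_digest(path: str, digest=hashlib.sha1) -> bytes:
    """Digest of at most the first CHUNK_SIZE bytes (sha1 is a stand-in for md4)."""
    h = digest()
    remaining = CHUNK_SIZE
    with open(path, "rb") as f:
        while remaining > 0:
            block = f.read(min(1 << 20, remaining))
            if not block:
                break
            h.update(block)
            remaining -= len(block)
    return h.digest()


def classify(path: str, known: Known) -> str:
    """Return 'known', 'renamed' or 'new'; a renamed entry is re-keyed in place."""
    name = os.path.basename(path)
    if name in known:
        return "known"
    size = os.path.getsize(path)
    # Only read the file if some known entry has the same length.
    candidates = [n for n, (s, _) in known.items() if s == size]
    if candidates:
        digest = first_chunk_digest(path)
        for old in candidates:
            if known[old][1] == digest:
                known[name] = known.pop(old)  # change name in database
                return "renamed"
    return "new"  # caller does the full hash
```

the first chunk is only read when the length already matches, so unrelated new files cost a single size lookup. two different files with the same length and an identical first chunk would be mistaken for each other, so this is a trade-off and not a proof of identity.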

... m2kio !
Logged

ken

  • Hero Member
  • *****
  • Karma: 4
  • Offline Offline
  • Posts: 825
Re: quick rehashing suggestion
« Reply #4 on: April 07, 2005, 09:47:00 PM »

A different solution to the same problem is to allow aMule to rename files in the Shared Files screen.  When it does this, it will of course update the known*.met entries for that file so that it won't have to rehash.
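
A minimal Python sketch of that approach, not aMule code: the dict `known` is a hypothetical stand-in for the known*.met entries (the real file format is not modelled here), and `rename_shared` is an invented name.

```python
import os
from typing import Dict


def rename_shared(directory: str, old_name: str, new_name: str,
                  known: Dict[str, bytes]) -> None:
    """Rename a shared file on disk and move its stored hash to the new name."""
    if old_name not in known:
        raise KeyError(old_name)
    if new_name in known:
        raise FileExistsError(new_name)
    os.rename(os.path.join(directory, old_name),
              os.path.join(directory, new_name))
    # Re-key the entry so the next scan finds the file by name and skips hashing.
    known[new_name] = known.pop(old_name)
```

Because the client performs the rename itself, it knows the old and new names at the same moment and never has to guess which file is which.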
Logged

Xaignar

  • Admin and Code Junky
  • Hero Member
  • *****
  • Karma: 19
  • Offline Offline
  • Posts: 1103
Re: quick rehashing suggestion
« Reply #5 on: April 07, 2005, 09:50:01 PM »

Ken: Yes, but it's not very pleasant to have to go through aMule to rename shares, and you can bet that most people won't bother anyway. :P
Logged

Vollstrecker

  • Administrator
  • Hero Member
  • *****
  • Karma: 67
  • Online Online
  • Posts: 1549
  • Unofficial Debian Packager
    • http://vollstreckernet.de
Re: quick rehashing suggestion
« Reply #6 on: April 07, 2005, 10:33:41 PM »

Why does the filename have to be used? Once a file is finished, I think it doesn't change anymore. So the file could be recognised by an md5sum or similar instead of the filename. I think doing an md5 on the whole file is much faster than rehashing and would allow renaming it.
Logged
Homefucking is killing prostitution

Xaignar

  • Admin and Code Junky
  • Hero Member
  • *****
  • Karma: 19
  • Offline Offline
  • Posts: 1103
Re: quick rehashing suggestion
« Reply #7 on: April 07, 2005, 10:39:48 PM »

That won't be much faster than normal hashing, and any gain would come only from skipping the AICH hashset, which the hashing thread also calculates.
Logged

m2kio

  • Full Member
  • ***
  • Karma: 0
  • Offline Offline
  • Posts: 152
    • http://little-bat.de
Re: quick rehashing suggestion
« Reply #8 on: April 07, 2005, 10:42:14 PM »

it's just the hashing which i wanted to _avoid_ !

Greetings to Hesse  8)
Logged