aMule Forum

English => Feature requests => Topic started by: mk89 on July 13, 2010, 12:20:13 PM

Title: IpFilter implementation
Post by: mk89 on July 13, 2010, 12:20:13 PM
Hi guys!

As you know, when you run aMule and you've a big ipfilter file, as the current (~ 18 MB), you have to wait the program that writes every line in its memory (as a map).  This not only wastes time, but also uses  a lot of memory.
I've read CIPFilter is  used in ServerList.cpp, ClientList.cpp, when you do a search, and so on.

The idea is to use a sqlite database to store this ipfilter. In this way you don't have to store the entire ipfilter in memory, since db access is really fast, but you just need a .db file.
When you read the ipfilter for the first time, you can parse the file and create the .db. You can use this to query what you need.
For instance: if you need to check if an ip is in a range (defined in the ipfilter file), you just have to search if this IP is between ip-start and ip-end.
In this way you can cut the part of inserting IPs in the RangeMap (using a O(n) algorithm).

Thank you.
Title: Re: IpFilter implementation
Post by: GonoszTopi on July 13, 2010, 12:28:43 PM
The idea is to use a sqlite database to store this ipfilter.
I'd really like to see a working implementation or some real-life measurements to prove it being better than our current way.

using a O(n) algorithm
O(n) is unaffordable. There also exists an O(log2(n)) algorithm, and even that isn't the best.
Title: Re: IpFilter implementation
Post by: mk89 on July 13, 2010, 12:49:08 PM
First of all, thank you for your answer.

O(n) is unaffordable. There also exists an O(log2(n)) algorithm, and even that isn't the best.

I meant that inserting in RangeMap requires O(n), since you need to scroll the whole iterator.

I'd really like to see a working implementation or some real-life measurements to prove it being better than our current way.

Parsing the ipfilter is done only once, since you don't usually change your ipfilter every days.
However, I can provide a simple script to convert it to .db. Having that, we can do some benchmark to see if lookup is faster.
Memory usage should be strongly reduced, so we just need to implement a faster lookup.
The idea is to save, in the sqlite .db, ranges as integers, not as strings (as "10.0.0.1" ).
In this way - I think - the lookup should be more efficient.

I'll try to provide some benchmark.
Title: Re: IpFilter implementation
Post by: Kry on July 13, 2010, 06:06:12 PM
And we add another extra dependency
Title: Re: IpFilter implementation
Post by: mk89 on July 13, 2010, 11:19:24 PM
And we add another extra dependency
Of course, but it's just an idea you're free to not include in aMule. (btw, on Ubuntu it's just libsqlite3-0 :) )
I've already tested the db creation using a Python script and it takes a few seconds to build a 5MB .db file of the current IPFilter.
However, I'll try to reimplement this function and I'll bring you up to date :)
Title: Re: IpFilter implementation
Post by: Stu Redman on July 14, 2010, 10:34:03 PM
I'd like to see how you check if a certain address is filtered or not with a SQLite table.

I've been musing about SQLite too, but for different reasons.

And did you check the IP filter code of the current SVN version?
Title: Re: IpFilter implementation
Post by: mk89 on July 15, 2010, 01:11:22 AM
I'd like to see how you check if a certain address is filtered or not with a SQLite table.

I've been musing about SQLite too, but for different reasons.

And did you check the IP filter code of the current SVN version?

My idea, at the moment, was this: store IPs as integers, so for example:
start   -    end
0                   10
11        -         19

If you want to check if an IP has to be filtered, you just convert it to an int and check if it's included in start and end by using a SELECT query. I've tested it using a Python script and it works well. I just have to do some benchmarks.

Quote
And did you check the IP filter code of the current SVN version?

No, I'll do it :)