aMule Forum
English => en_Bugs => Topic started by: stoatwblr on June 06, 2009, 08:34:11 PM
-
I'm finding that if there is a system crash, there is a high probability that amule.conf will be truncated to 0 bytes. I'm assuming that's because the conf file is rewritten in place periodically as the DL/UL stats are updated.
NOTE: This is always a risk if a file is open and being written at the time there's a crash, even on a journalled filesystem.
The safest way of rewriting any file is to do it beside the existing one and then to rename it into place.
-
*Bump*
Guys, has this been noted and will the code be tweaked?
-
Yes and not really any time soon. It's not as simple as you see it.
-
It's a serious issue. The current way of updating is a race condition which has repeatedly resulted in corrupted/truncated copies of amule.conf
With journal mode data=ordered (the default for ext3/4, NTFS and BSD journalling), there's a better than 50% chance the file will get trashed if a machine is front-panel reset or the power goes off while amule(d) is running.
-
Which is not aMule's fault, I'm sure you see.
Incidentally, amule.conf also gets corrupted if you take your harddrive out and throw it down from the 20th floor of a building, if you submerge your computer in acid, or if you nuke the country your computer is in. There is a limit for paranoid security settings, and hardware failures (unexpected power off is a hardware failure) fall on the far side of that thin red line.
The filesystem should ensure the consistency of the data, and recover from failures gracefully. A lazy approach like the one these filesystems take causes data loss, of course.
-
It's a serious issue. The current way of updating is a race condition which has repeatedly resulted in corrupted/truncated copies of amule.conf
What is "it"? Where is this "race condition"?
With journal mode data=ordered (the default for ext3/4, NTFS and BSD journalling), there's a better than 50% chance the file will get trashed if a machine is front-panel reset or the power goes off while amule(d) is running.
So, hmm - don't press that button, please? Corrupted amule.conf may be least of your problem if you do.
-
It's a serious issue. The current way of updating is a race condition which has repeatedly resulted in corrupted/truncated copies of amule.conf
What is "it"? Where is this "race condition"?
The race condition is caused by overwriting the configuration file in place - when you do this the file is truncated to zero bytes, then filled.
Most of the time you can get away with doing things like this, but it leaves a window of vulnerability and it's bad practice. The safe method is to write a new file and then rename it over the old file. (Yes the inode changes, but you're not locking the file open permanently, are you?)
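For anyone wondering what that safe method looks like in practice, here is a minimal sketch in C using plain POSIX calls. The helper name and path handling are made up for illustration; this is not aMule's actual code, just the general pattern being described in this thread:

/* Minimal sketch of an atomic config rewrite, assuming POSIX.
   The helper name and paths are hypothetical, for illustration only. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

static int save_config_atomically(const char *path, const char *contents)
{
    char tmp[4096];
    snprintf(tmp, sizeof(tmp), "%s.new", path);   /* write beside the original */

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(contents);
    /* fsync() before the rename, so the new data is on disk before
       the new version becomes visible under the real name */
    if (write(fd, contents, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    close(fd);

    /* rename() over the old file is atomic on the same filesystem:
       a reader sees either the complete old config or the complete
       new one, never a 0-byte file */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}

The window described above only exists because truncate-then-refill exposes the intermediate empty state under the real file name; the rename approach never does.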
As an example of why rewriting in place is bad practice: about a decade ago I spent more than a week tracking down why an ISP sendmail installation was apparently randomly rejecting email to valid users. It turned out that a homegrown alias update script was taking about 1 second to complete, and during that time the alias.db and virtusers.db files were truncated to 0 bytes. In the window between the files being 0 bytes and being valid dbm tables, all inbound mail was being bounced.
Multiply that by the server in question being an ISP machine handling 40-100 pieces of email per second and you start seeing why race conditions can be very bad news.
-
Which is not aMule's fault, I'm sure you see.
The exact words used by several professional programmers when I described how amule.conf was being rewritten were:
"####ing kids in bedrooms!"
Which might give you an idea of how they regard the practice.
As for throwing the drive down the stairs - no need when the neighbourhood power supply keeps dropping out unexpectedly thanks to copper thieves attacking distribution stations.
-
A bunch of eMule mods have an automatic backup feature for config files (maintains a copy). This may be useful in case of disk failure or power failure.
This is not a very important feature but it makes sense in some circumstances. Naturally the motivation behind this feature is a bit different than "..in case I push the power button" ;)
Edit: remove "..in case I push the power button", and add "copper thieves", which are quite a different topic. (We wrote our messages at the same time.)
P.S.: Have you considered buying a UPS or a generator?
P.S.: About "The safe method is to write a new file and then rename it over the old file"; I'm a bit of a noob but this makes sense.
-
Which is not aMule's fault, I'm sure you see.
The exact words used by several professional programmers when I described how amule.conf was being rewritten were:
"####ing kids in bedrooms!"
Which might give you an idea of how they regard the practice.
That's cool. As you have proxied their opinion to me, relay my opinion to them: "Go fuck yourself, you ignorant piece of shit that talks about things he doesn't know shit about".
You can apply it to you too if that's your opinion. I already told you that it's more complicated than it seems. For example, we're not fucking rewriting the file, we're using wxWidgets' wxFileConfig facility to access, read and modify the configuration file. If it is your opinion that the file should be flush()ed in any special way, take it up with them. See, now it's showing that you're blaming us and flaming us without having any fucking clue what's going on, and looking like a complete cunt.
Now, if you think that for every write access to the config file we're going to a) create a new wxFileConfig object with a new path, b) copy all configuration data to the new file, c) modify the contents of the new file, d) overwrite the old file with the new file, you're, as they say, S.O.L.
Hint: Filesystem integrity is at kernel/driver abstraction level, not application level. Applications shouldn't have to deal with that shit, because they're on a different abstraction level. Seriously, if you're having the problem that your filesystem is left in a fucked up state because of a power failure on a regular basis, put your own driver/script measures to protect sensitive content in place.
As for throwing the drive down the stairs - no need when the neighbourhood power supply keeps dropping out unexpectedly thanks to copper thieves attacking distribution stations.
So don't worry, we will modify our code, make it overcomplicated and redundant and full of unnecessary overhead and everything so the copper thieves can continue acting without you losing your aMule configuration!
CALL THE FUCKING POLICE, MAN. YOU HAVE ISSUES WITH TRYING TO SOLVE SYMPTOMS, NOT PROBLEMS
-
Oh I forgot to say the magic words, "no offense meant"
-
Except to those so-called "professional programmers", those can go fuck themselves.
-
Except to those so-called "professional programmers", those can go fuck themselves.
Given that they write code for spacecraft control systems and banking systems I think I know whose opinion is more valid.
Here's a free clue: Google. Bugtraq, "race condition"
-
Here's a free clue: I know that shit. It doesn't apply to a userspace non-critical program.
Their opinion is worth shit because they don't know what they're having an opinion about. Would I go out of my way to do this in a sensitive system? Hell yes. As a matter of fact I do in my job. Do I think it makes any sense to change it in aMule? Hell no. I already explained to you why.
This is not about whose opinion is more valid. This is about them insulting us without knowing the facts, which is mainly your fault anyway. They can still go fuck themselves, tho.
-
Given that they write code for spacecraft control systems and banking systems I think I know whose opinion is more valid.
Now you are starting to piss me off too. Why don't you use a spacecraft control system or banking system to download your stuff?
And read before posting:
For example, we're not fucking rewriting the file, we're using wxWidgets' wxFileConfig facility to access, read and modify the configuration file. If it is your opinion that the file should be flush()ed in any special way, take it up with them.
Go to the right place with your complaint (wx tracker) before Kry feels the need to write in 100pt letters.
-
If what you say is true then there's a bug in wxWidgets' handling of files which you need to work around.
The fact that your programming model is broken is not mitigated by the factor of outside problems. There are any number of reasons a machine may be unexpectedly powered down or rebooted, and software which can't cope with that is by definition BROKEN.
The point remains that the file is momentarily being truncated to 0 bytes and there's a window of vulnerability to the file being left that way - which is not good if amuled is set to startup upon reboot - the first thing it does is to write a bogus new config file and then barfs because autoconnection is off by default.
By the way: If you show the same kind of anger management problems in your dayjob when bugs are pointed out as you do here then it's a wonder you manage to stay employed. I do note that aMule (edit: I put eMule previously, that was a typo) has had a spectacularly large number of appearances on the Bugtraq list and I have to wonder if your $dayjob coding has as many issues.
-
If what you say is true then there's a bug in wxWidgets' handling of files which you need to work around.
No, the library has to fix it if it's their error.
The fact that your programming model is broken
Which it isn't; we update the config file every time the values change, using the interface from the framework.
is not mitigated by the factor of outside problems.
There is nothing to mitigate, as there is no bug in our code.
There are any number of reasons a machine may be unexpectedly powered down or rebooted
No, not really.
and software which can't cope with that is by definition BROKEN.
Guess the Linux kernel is broken then. Or your filesystem. Or libc. Or the framework you are using under wx. Or wx itself. My bet is on #2 or #3.
The point remains that the file is momentarily being truncated to 0 bytes
... for certain filesystems and configurations ...
and there's a window of vulnerability to the file being left that way
Only in case of a catastrophic event.
- which is not good if amuled is set to startup upon reboot - the first thing it does is to write a bogus
Not bogus at all, it's the default configuration. A perfectly valid file.
new config file and then barfs because autoconnection is off by default.
Which is a sensible default.
By the way: If you show the same kind of anger management problems
I'm not angry, therefore I don't need to manage anything.
in your dayjob when bugs are pointed out as you do here then it's a wonder you manage to stay employed.
In my dayjob, where I'm paid to be nice to the occasional brickheaded moron, I am perfectly hypocritically nice to them. However, in my job I don't deal with customer support, so it's a very rare occurrence.
Of course, you have to keep in mind that this is not an aMule bug. Just because the symptom shows in aMule and can be worked around in aMule doesn't mean it is an aMule bug.
I do note that Emule has had a spectacularly large number of appearances on the Bugtraq list and I have to wonder if your $dayjob coding has as many issues.
This is not eMule, remember? As for my dayjob coding, it's superb. It has to be for such a sensitive area.
Based on the complete stupidity, brickheadness and absolute lack of coherence of your posts, I have to wonder if your $dayjob is test monkey in a chemical treatment lab where they test the effect of throwing bricks at the heads of retarded monkeys to see the IQ decline.
-
There are not many situations that can come up.
1. Power fails before you save your changes -> Nothing happens, your changes are lost.
2. Power fails after you saved your changes and the file is written -> Nothing happens, your changes are saved.
3. Power fails after you saved your changes and file is not written -> Nothing happens, your changes are lost.
4. Power fails after you saved your config while writing the file -> When powering up again, journal recovers your old config and your changes are lost, or journal recovers your changes and your changes are saved.
All those 4 points can happen when you write to a new file and rename it to the desired name. If you get any other situation than these 4, you're just using the wrong fs.
All of this applies to automatically written files as well.
-
On a hilarious note, I get this bug with my aMule when my kernel "oops"es, along with Firefox and KDE and Evolution and Pidgin and Xchat and other applications getting the same config-reset or corruption problem. I don't blame the applications because I'm not stupid. I blame XFS for missing the recovery, and the kernel for freezing.
My computer never shuts down or restarts cold, because it has a built-in UPS: it's a laptop. So kernel freezes are the only problems I get that cause XFS to fuck me over, and as rare as they are, I can deal with it. When I think I can't, I'll investigate it and propose a filesystem driver patch or change to another FS with better handling of errors. But for now, I'm happy with XFS's speed.
-
There are not many situations that can come up.
1. Power fails before you save your changes -> Nothing happens, your changes are lost.
2. Power fails after you saved your changes and the file is written -> Nothing happens, your changes are saved.
3. Power fails after you saved your changes and file is not written -> Nothing happens, your changes are lost.
4. Power fails after you saved your config while writing the file -> When powering up again, journal recovers your old config and your changes are lost, or journal recovers your changes and your changes are saved.
Look one posting down - Kry is seeing it himself with XFS and yet he thinks it's ok.
The problem is the level of journalling used in the filesystem. Only full journalling can guarantee file integrity and it slows disk activity down by between 25-50%. Journalled filesystems as they are deployed today are intended to protect the filesystem - not individual files. There are specific warnings in Ext3 and XFS documentation that the default (ordered) mode won't save a file which is being written at the time the FS goes down.
My FS is ext4 and I was seeing the truncations with journalling=ordered (the Linux default). So far there have been no truncations with full journalling when the kernel oopses (truncation was happening on ext3/ordered too).
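For reference, full data journalling on ext3/ext4 is selected with the data=journal mount option, at a noticeable cost in write throughput. A hypothetical fstab entry (the UUID and mount point are placeholders) would look something like this:

# hypothetical /etc/fstab line: data=journal journals file data as well as metadata
UUID=replace-with-your-uuid   /home   ext4   defaults,data=journal   0   2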
NTFS uses ordered journalling (not adjustable) and there are a huge number of reasons why a Windows box can go crunch (There's the old chestnut of "unplugged a USB drive" too).
As for his other comment which boils down to "Other programs behave like this, so why are you picking on me?":
Back in the days when I ran ORBS, I used to get a lot of whining like that ("XYZ is running an open relay, why are you picking on me?"), to which the canned response was "Thank you for the pointer. These are now being investigated and will be listed if found to have security vulnerabilities"
Needless to say I'll be raising bugs against the ones he's mentioned and I've already notified a few people in appropriate places that there's a general issue which needs addressing in a number of OS packages.
The bottom line is that rewriting any file (let alone a config file) in place is "A very bad idea" and not making a backup is "just plain nuts". Just because lemmings jump off a cliff doesn't mean you should too.
-
You just can't read. I don't think it is ok for XFS to do this to my files. The point is that I know who to blame for it: XFS. That's the reason they warn you about it, because it's their decision not to change it. If you care about file contents this much and often have hard locks or power-offs of the system, don't use a filesystem that will destroy your data in those cases. It's that simple. So far I can deal with it (also because I keep my own script backing up sensitive files from time to time, thank you very much); if I decide it's not worth it, I'll change it.
But I'll never be as retarded as to ask people to workaround it on userspace.
Now, you mention that "rewriting any file (let alone a config file) in place is 'A very bad idea' and not making a backup is 'just plain nuts'". We happen to agree with you! That's why we do backups of every sensitive file with a safe function before writing new data to it. Of course, given that we use wxFileConfig for abstraction of the config file handling, it's not aMule's business to change the flush() behaviour. LEARN TO READ ALREADY.
Also, again vulnerabilities. You keep using that word. I don't think it means what you think it means.
-
Incidentally I look a lot like Inigo Montoya
-
Look one posting down - Kry is seeing it himself with XFS and yet he thinks it's ok.
Why don't you just do what you tell me. Read this:
I don't blame the applications because I'm not stupid. I blame XFS for missing the recovery
You see: You're wrong.
The problem is the level of journalling used in the filesystem. Only full journalling can guarantee file integrity and it slows disk activity down by between 25-50%. Journalled filesystems as they are deployed today are intended to protect the filesystem - not individual files.
So you think it's intended that you have an fs with 100% file integrity and 0 usable files? File integrity means that the file and the filesystem are in a defined state. Written or not written. Something in between is not defined.
There are specific warnings in Ext3 and XFS documentation that the default (ordered) mode won't save a file which is being written at the time the FS goes down.
I can't see any warning in Documentation/filesystems/ext3.txt, but I see:
/ The Journaling Block Device layer (JBD) isn't ext3 specific. It was designed
/ to add journaling capabilities to a block device. The ext3 filesystem code
/ will inform the JBD of modifications it is performing (called a transaction).
/ The journal supports the transactions start and stop, and in case of a crash,
/ the journal can replay the transactions to quickly put the partition back into
/ a consistent state.
and
/ * ordered mode
/ In data=ordered mode, ext3 only officially journals metadata, but it logically
/ groups metadata and data blocks into a single unit called a transaction. When
/ it's time to write the new metadata out to disk, the associated data blocks
/ are written first.
Which means that a journalling filesystem should do exactly what I stated.
My FS is ext4 and I was seeing the truncations with journalling=ordered (the Linux default). So far there have been no truncations with full journalling when the kernel oopses (truncation was happening on ext3/ordered too).
There have been no truncations, but they happened on other filesystems, too?
NTFS uses ordered journalling (not adjustable) and there are a huge number of reasons why a Windows box can go crunch
You can fuck up a Windows box in uncountable ways. What's your point?
(There's the old chestnut of "unplugged a USB drive" too).
Besides the fact that USB write actions are buffered and you can pull the drive out before any action is started, it is still a problem of the fs. This time multiplied by the fact that the user can influence the participating parts and misbehaves, because he doesn't eject the drive the right way.
As for his other comment which boils down to "Other programs behave like this, so why are you picking on me?":
Back in the days when I ran ORBS, I used to get a lot of whining like that ("XYZ is running an open relay, why are you picking on me?"), to which the canned response was "Thank you for the pointer. These are now being investigated and will be listed if found to have security vulnerabilities"
Which in this case is not a problem with the software; the admin just didn't configure it right. What the hell does this have to do with our "it's the problem of another piece of software that is being used" discussion?
Needless to say I'll be raising bugs against the ones he's mentioned and I've already notified a few people in appropriate places that there's a general issue which needs addressing in a number of OS packages.
Needless to say that the maintainers will forward the bugs upstream, because it's a general problem of the software and not of the packaging.
The bottom line is that rewriting any file (let alone a config file) in place is "A very bad idea" and not making a backup is "just plain nuts".
So your bottom line is that we should rewrite major parts of a toolkit we use, which we use precisely because it provides the functionality we need and we don't want to rewrite it all, plus we should reimplement the fs drivers which fuck up their job. You should know that, first, there are not just 2 filesystems out there, second, not all are free, and third, this will multiply the size of the binaries, and you will be the first to whine about the huge memory usage, and your "expert programmers" will tell you to tell us that there are shared libs and frameworks that can be used to deal with such stuff.
Just because lemmings jump off a cliff doesn't mean you should too.
When you direct the first lemming over a bridge, the others will follow, too.
Go die.
-
I do note that aMule (edit: I put eMule previously, that was a typo) has had a spectacularly large number of appearances on the Bugtraq list and I have to wonder if your $dayjob coding has as many issues.
I just saw your ninjaedit. You're right, aMule has a spectacularly large number of appearances on bugtraq: four (http://search.securityfocus.com/swsearch?sbm=%2F&metaname=alldoc&query=amule&x=0&y=0).
One of them is a security problem with wxExecute that could cause a problem if the user previews a file that has text like "'; rm -rf /" at the end (an extreme example, it can be any shell command). It was fixed in 2.2.5.
Another one, from three years ago, is related to the fact that webserver images could be guessed without logging in, and it was published after we had already released a fix.
A third one isn't in aMule but in aStats, and it's from 5 years ago!
And the last one is some dude asking how to find viruses in p2p networks and someone mentioning using aMule to search for them.
So yeah, a SPECTACULAR amount of appearances. I'm humbled by your words, good sir!
-
My FS is ext4
No offence, but here's your problem. I've had plenty of weird data losses (fortunately nothing serious) with ext4 too, and we're not the only ones, if you search the internet.
It's a file system still in development and just not ready yet. (I'm speaking of my own experience with 2.6.29 ext4, I know that for instance the Fedora devs think differently.)
-
The main problem is the ext4 "delayed allocation" bug. I hope it is fixed in 2.6.30 kernels. An automatic configuration backup feature can be requested for Windows systems; if the devs agree, it will be added to aMule. But on Linux there is an easy way:
cp $HOME/.aMule/amule.conf $HOME/.aMule/amule.conf.backup && amule
P.S.: Kry, please calm down man. Any of aMule users won't want you to get a heart attack. :D
-
That's a common misconception. I am pretty calm when writing those posts :D
-
And so the thread gets wrapped with a single comic strip about how aMule is coded.
(http://gunshowcomic.com/comics/20090114.png)
-
This explains the filesystem issue, and WHY it's primarily an application problem.
https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
This isn't just aMule; a lot of programs wrongly assume that what's written to the disk is automatically committed immediately, and just because other people make bad assumptions and write code which is vulnerable to config file trashing it doesn't follow that you have to follow suit - trying to have the FS second-guess what needs immediate committal will probably result in the same kind of performance issues that happened when Firefox 3.0.0 used fsync() for every single write.
The problem is particularly noticeable with aMule because it writes to disk so much.
Short answer is to use fdatasync() to commit to disk, or fsync() if that's not available.
Long answer is: write newconfig; fdatasync() it; mv config config~; mv newconfig config.
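One step that recipe leaves implicit (a sketch under POSIX assumptions, not a claim about what any particular program does): the rename itself is directory metadata, so to make the replacement durable right away you also sync the directory that contains the file. The helper and path below are illustrative:

#include <fcntl.h>
#include <unistd.h>

/* Flush the directory entry so the rename itself reaches the disk.
   Call it with the directory holding the config, e.g. ~/.aMule. */
static void sync_directory(const char *dirpath)
{
    int dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
    if (dirfd >= 0) {
        fsync(dirfd);
        close(dirfd);
    }
}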
Just because wxWidgets gives you a tool to blow your leg off (file rewriting in place) doesn't mean you should use it, and the fact that the file is showing up truncated shows that while you may think it's editing the file in place, it's clearly not doing that at FS level.
-
May I ask why your computer is crashing so often in the first place? Do you live in a zone with an unstable power supply?
Anyway, we make backups of other important files, so why not backup the configuration file, too? I have to go now, but I'll implement it later. (Unless I forget to...)
-
Back it up on shutdown, not on any other occasion.
-
Done.
BTW: Yay for code reuse!
-
That's what it's for!
-
May I ask why your computer is crashing so often in the first place? Do you live in a zone with an unstable power supply?
If it was that easy I'd have fixed it (Several UPSes on hand).
It turns out the Intel Desktop Motherboard model D865GLC(*) has hardware issues with RAM addressing. It's 100% reproducible across the 20+ systems I've tested. (Anyone with one of these boards can trigger the fault by using memtest86+ and getting it to probe memory. The system will lock up halfway through the first test pass. It manifests in Linux as a system lockup during heavy I/O activity, and I've been unable to trigger it at all in Windows XP.)
Anyway, we make backups of other important files, so why not backup the configuration file, too? I have to go now, but I'll implement it later. (Unless I forget to...)
Thanks. I see you've implemented this, but it's not quite enough to solve the problem:
(I've had another couple of truncation incidents recently and noticed the bak file was anything up to 14 days old.)
There's no point in only making a backup at exit, because if there's a crash it will never be written.
There should be a backup made each time a new config file is written, or at the very least each time anything other than a statistics change occurs.
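A minimal sketch of what "back up each time the config is written" could mean in code, assuming POSIX; the helper name and the .bak suffix are invented for illustration and this is not the actual aMule implementation:

#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: snapshot the current config before a new one is written.
   A hard link is cheap; where hard links aren't available, a plain
   byte-for-byte copy would do the same job. */
static int backup_config(const char *path)
{
    char bak[4096];
    snprintf(bak, sizeof(bak), "%s.bak", path);
    unlink(bak);              /* drop the previous backup, if any */
    return link(path, bak);   /* new name, same inode as the current file */
}

Note that the hard link only works as a backup if the subsequent rewrite creates a new inode (the write-new-file-and-rename pattern discussed earlier); if the file were truncated in place, the backup link would see the same truncation.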
Statistics in config files:
I'll second the comment made in another thread that writing out statistics into the config file is a Really Bad Idea.
This whole thread would have been a non-issue if it wasn't for that being done. Config files should be as static as possible.
(*) It's surprising that Intel can produce unstable motherboards! We ended up replacing 22 desktop machines 2 years ahead of schedule at $orkplace as a result of finding this problem, as it explained why some researchers were getting random lockups. I don't have the budget to change the home box out just yet.
-
Well, the topic goes some weird way... but may I return to the subject?
I've had this issue several times (my power supply isn't the best, and I can't afford a UPS now), and the workaround was easy: edit an init script so it checks the .conf file and replaces it with the backup if its size is 0.
But I was wondering about the cause of this. amule(d) edits the conf file regularly to update stats, but it's a conf file - a file for aMule configuration, not for logs or stats or whatever. Isn't that somewhat the wrong place for those? I always thought that a conf file should contain only the data that is pivotal for the program's proper working, so editing of this file would be rare. What's the idea of putting the stats in the conf?
PS: I've lost my conf file because I installed XFS on my server, being impressed by Kry's story. So I won't blame XFS, I will blame Kry! Agrrrrrrr!
-
Go ahead and blame me for your inability to choose a proper filesystem for your situation.
-
Hey guys, what's with your sense of humour? Are you talking too much with simple-minded complainers?
But anyway, I'm really curious about what stats are doing in the conf file. Maybe I don't understand something?
-
Hey guys, what's with your sense of humour? Are you talking too much with simple-minded complainers?
Where's yours? I've been told I'm way too deadpan when poking fun at someone, and it seems it works online as well!