aMule Forum


Pages: 1 [2]

Author Topic: Site downtime explanation, and lot of extra text.  (Read 28336 times)

MsZ

  • Approved Newbie
  • *
  • Karma: 1
  • Offline
  • Posts: 8
Re: Site downtime explanation, and lot of extra text.
« Reply #15 on: March 04, 2009, 08:21:30 PM »

Well, I never used Reiser seriously, just "tried it" a few times. I knew it had some annoying issues, so I never trusted it. I mainly use ext3 with some XFS archives, plus a RAID1 array of two XFS disks holding some important data, and I do DVD backups once in a while, y'know, things like that.

I also had a major data loss once, though. But it wasn't the filesystem's fault, it was just... well, a faulty cable. The filesystem (ext3) was damn solid. But it was no use; the metadata were unreadable.
I lost 151 gigs of very precious stuff. I had a hard time getting everything back, but in the end I came out much more careful about data storage. I learnt mdadm, I studied RAID structures and I tested many filesystems. Now I happily live with my ext3 and XFS HDDs. P-e-r-f-e-c-t-l-y. ;D
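
For anyone curious, the mdadm part really is only a handful of commands. A minimal sketch with hypothetical device names (these need root, and --create wipes the member disks, so don't paste them blindly):

```
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mkfs.ext3 /dev/md0                        # same thing as mke2fs -j /dev/md0
mdadm --detail --scan >> /etc/mdadm.conf  # remember the array across reboots
cat /proc/mdstat                          # watch the initial mirror resync
```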

Later I investigated and discovered that the faulty cable may have garbled some signals, so the HDD heads may have overwritten some filesystem metadata and/or superblocks. No way to recover it; all that was left was to light a candle and say some prayers for the dearly departed.
And then give it a good dd if=/dev/zero of=/dev/hdd bs=4096 and a mke2fs -j /dev/hdd soon thereafter. :-\

I was naive, too; I learnt three precious lessons:
1. data aren't eternal;
2. data aren't everything;
3. backups save the day.
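
Lesson 3 is easy to automate, too. A minimal sketch of a dated backup-and-verify script (the paths here are throwaway temp directories purely for illustration; in real use SRC would be your data and DEST a separate disk or remote mount):

```shell
#!/bin/sh
set -eu

SRC=$(mktemp -d)    # stand-in for the precious data directory
DEST=$(mktemp -d)   # stand-in for the backup destination
echo "precious" > "$SRC/file.txt"

STAMP=$(date +%Y%m%d-%H%M%S)
ARCHIVE="$DEST/backup-$STAMP.tar.gz"

# Archive the whole directory, keeping its top-level name in the tarball.
tar -czf "$ARCHIVE" -C "$(dirname "$SRC")" "$(basename "$SRC")"

# A backup you never verify is a wish, not a backup.
tar -tzf "$ARCHIVE" > /dev/null
echo "backup ok: $ARCHIVE"
```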

Don't worry. Now I am sure I will never, ever be involved with anything related to ReiserFS again. What little faith I had in Reiser is now all gone.
Logged

stoatwblr

  • Sr. Member
  • ****
  • Karma: 12
  • Offline
  • Posts: 318
Reiser and backups.
« Reply #16 on: March 22, 2009, 02:13:57 AM »


There is NO excuse for not making backups.

On a server, there is NO excuse for not running some degree of RAID (disks are cheap, RAID1 is easy, RAID5 overhead doesn't even register on a post-2004 CPU; hardware RAID is better, but only if you use a decent hardware controller, otherwise software RAID is _faster_).

I work on multi-terabyte systems every day. If things break I have 200 staff at my door in 5 minutes and 40,000 people making worried queries within 6 hours.

We've used Ext3, Reiser, GFS and others and every single one of them has barfed and lost its marbles at some point.

Reiser is the only one which was actually fully recoverable in all cases without resorting to pulling backup tapes.

The Reiser repair tools are _incredibly_ powerful, and just as easily capable of blowing your leg off if you use them incorrectly, but when some (literal) rocket scientist puts critical files on a non-backed-up area of the FS, then you have to use them - and we DID recover all of the data.

That said (I've used Reiser since 2000, when it was beta-quality at best):

ReiserFS is _very_ sensitive to shitty hardware, and it drives buses/controllers/disks hard. One of the most common early failures I encountered on hobby-grade systems while testing was marginal integrated Intel Neptune chipset controllers which Linux would put in non-DMA mode, but which the distro would swap back to DMA mode during bootup. 99.9% of the time that would be fine, but every so often... BOOM! - and 28 hours running a block-level filesystem rebuild, because it was my home box and it was faster to do that than restore from all them fiddly backup CDs. That was my first lesson that if Linux turns off DMA for a disk, LEAVE IT OFF! (Intel IDE controllers were dodgy right through to ICH6, and a lot of cheap/nasty SCSI controllers _ARE_ nasty.)

(I had a bunch of arguments with Hans about this, and also with Andre Hedrick, before accepting that Andre was right about forcing conservativeness into hardware support defaults, and Hans was right about assuming some klutz (me) hadn't tried to "fix" things to get greater performance.)

There are other issues with ReiserFS at the moment. Hans's personal life has intruded far too much into development, and he does like to rub people up the wrong way - Linux kernel inclusions are as much political as based on technical merit. I'm adopting a wait-and-see philosophy, but it's already painfully clear to me that GFS is nowhere near as good as it's cracked up to be in a multi-path, multi-server 8Gb SAN environment (ie, it breaks often enough to be painful, and we pay high five-figure support fees each year) - which is where Reiser development should have been by now if the roadmap hadn't been derailed.

Repeating the first line of my post - there is no excuse for not making backups.

Backups are your recovery plan when everything's dead or the hoster turns the server off. Everything else is aimed at keeping the thing running and avoiding having to use the backups.

Flame away. I've run hobby sites and had the problem of them becoming monsters. If you need some help setting up Bacula for remote access, then ask away (it's one of the best options: it means your backups are physically well separated from the server if things go REALLY wrong, and it stops someone breaking in and going "rm -rf /path/to/backups/; mt -f /dev/nst0 erase" - this HAS happened - it put a mid-size ISP in New Zealand out of action in 1999.)
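
To give a flavour of what that setup involves, most of it is director-side configuration. An illustrative fragment only - names like webserver-fd, "ServerFiles" and "WeeklyCycle" are placeholders, not a drop-in config:

```
FileSet {
  Name = "ServerFiles"
  Include {
    Options { signature = MD5; compression = GZIP }
    File = /etc
    File = /var/www
  }
}

Job {
  Name = "RemoteServerBackup"
  Type = Backup
  Level = Incremental
  Client = webserver-fd        # file daemon on the machine being backed up
  FileSet = "ServerFiles"
  Schedule = "WeeklyCycle"
  Storage = File
  Pool = Default
  Messages = Standard
}
```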



Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline
  • Posts: 5795
Re: Site downtime explanation, and lot of extra text.
« Reply #17 on: March 22, 2009, 02:29:00 AM »

We are certainly not going to flame you for saying backups are necessary, or that ReiserFS is such an inconsistent piece of shit that low-to-medium-end hardware makes a difference in whether it destroys your data.

There is no excuse for not making backups, but there's no excuse for being a filesystem that makes backups necessary by force.

No one is going to flame you for that. The fact you think anyone is going to do so is because you sound like a sanctimonious prick, and you know it. Maybe you should work on improving that.

Logged

fabtar

  • Approved Newbie
  • *
  • Karma: 1
  • Offline
  • Posts: 24
Re: Site downtime explanation, and lot of extra text.
« Reply #18 on: March 24, 2009, 02:21:11 PM »


On a server, there is NO excuse for not running some degree of RAID (disks are cheap, RAID1 is easy, RAID5 overhead doesn't even register on a post-2004 CPU; hardware RAID is better, but only if you use a decent hardware controller, otherwise software RAID is _faster_).


I think RAID is not going to save you from every kind of filesystem failure.
IMHO trusting RAID alone doesn't suffice; you have to do backups on an independent filesystem (automatic & periodic, on another disk with its own filesystem, and better still located on a remote server) in order to feel safe.
Regards

P.S.: you have already pointed out that backups are crucial, but I just wanted to add this consideration so that readers don't see RAID as an alternative solution or a panacea.
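
As one concrete shape of "automatic & periodic on a remote server", a nightly rsync driven by cron is a common minimal setup (the host and paths below are made up for illustration):

```
# crontab entry: every night at 03:00, mirror /srv/data to an off-site box
0 3 * * * rsync -a --delete /srv/data/ backup@offsite.example.net:/backups/data/
```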
« Last Edit: March 24, 2009, 02:28:35 PM by fabtar »
Logged
Mldonker looking around

fabtar

  • Approved Newbie
  • *
  • Karma: 1
  • Offline
  • Posts: 24
Re: Site downtime explanation, and lot of extra text.
« Reply #19 on: June 07, 2009, 08:02:56 PM »


RAID is not going to save anyone from filesystem failures, wrong actions, etc. at all. RAID can only (sometimes) save you from hard drive failures (and even there, sometimes RAID can actually cause total data destruction in such an event). In fact, RAID only allows some kind of rapid crash recovery without extensive offline time, if you're lucky. So RAID never replaces backups. It's just yet another countermeasure against trouble.

You are right, RAID doesn't prevent any filesystem failure; RAID may only save you from hardware failures (which could damage the filesystem). I was thinking of both things while writing, and my assertion became a totally wrong mix.
« Last Edit: June 07, 2009, 08:05:17 PM by fabtar »
Logged
Mldonker looking around