Why Don’t I/We Use RHEL Red Hat Enterprise Linux Instead of Centos?

This a question one of our good client’s asked me one day and I have to admit I wasn’t prepared for this one, it’s something we’ve never put active thought in but was rather a matter of instinct. While we do use various OS’s for different platforms including our own in-house Linux, Cloud, Hosting Control Panel, applications and clients including BSD based OS’s such as FreeBSD, this is something we’ve never been asked.

RHEL (Red Hat Enterprise Linux) has been a clear leader since the early days of providing a standardized, mission critical platform for business applications in the Linux/Unix environment. It was actually my first Linux install back when I was in highschool and I’ve personally maintained Unix/Linux systems for over 16 years now. In that time I have found the strengths and weaknesses of RHEL in terms of our business and clients. However, the Centos project is a legal clone of RHEL with the only difference being artwork and the name Centos it functions identically as RHEL and is completely Open Source.

Since our team of experts does everything in-house we don’t really on Vendor support, it means when something is going on with a server we can solve the problem on our own very quickly with our own team. We don’t have the need to call a third party and ask how-to fix the problem and in fact it’s quicker for us to just do it ourselves whereas I’ve learned many organizations heavily rely on this type of third-party support. The goal with my ventures has always been that our teams should be self-sufficient for both security and efficiency.

Centos being Open Source is a huge advantage for us, we can customize and redistribute OS’s and deploy them on servers without having to touch a button or connect to monitor or KVM (yes I do realize RHEL can be installed headless/kickstart but not in the way we deploy our custom OS images-possibly for another blog). The only con with Centos is that there is a small delay in updates since it depends on the upstream source which is RHEL but this is a minor issue and all major updates are done to Centos almost instantly.

To conclude, my hats off to Linus Torvalds for inventing the Linux Kernel, the RHEL team and especially the CentOS team and I hope this explains why CentOS for my company’s is the best fit for our needs at the moment.

Dedicated Server Uptime Samples

I just logged into two random dedicated servers and I am always happy about the time uptimes we have:

13:05:37 up 960 days, 21 min,  1 user,  load average: 0.00, 0.01, 0.05

14:11:14 up 835 days, 18:01,  6 users,  load average: 0.09, 0.02, 0.01

In the case of both servers they have never been down, they were literally installed on a rack from the time shown above.

The reason our uptime is always fantastic is not only because our facilities being out of the core disaster areas.  We never overload or oversell our servers.  We are not a budget provider, but still offer excellent value in my opinion.  We’ve had a lot of clients switch to us from other hosts primarily based on the reasoning “no amount of features or gimmicks in the world matter if you have an unreliable service”.

 

Rebooting a Linux Dedicated server with no hard drives from the shell

I just thought I would finally test this so I simulated a complete RAID array failure by pulling all of the drives at once.

This results in an input/output error when trying to do anything so the question is can you still reboot in this situation?

[root@testserver /]# reboot
-bash: /sbin/reboot: Input/output error
[root@testserver /]# shutdown -rn now
-bash: /sbin/shutdown: Input/output error
[root@testserver /]# shutdown
-bash: /sbin/shutdown: Input/output error
[root@testserver /]# uptime
13:47:10 up 41 min,  1 user,  load average: 0.00, 0.00, 0.00

Reboot by sending commands directly to /proc

[root@testserver /]# echo 1 > /proc/sys/kernel/sysrq
[root@testserver /]# echo b > /proc/sysrq-trigger

And sure enough the server rebooted, it could be handy if someone has a remote server without remote hands or remote-reboot (in this case we have both on-site so there was no risk and this was a test server).

What dmesg looks like when the drives are removed and arrays degraded:
[  559.302943] ata3: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
[  559.302988] ata3: SError: { PHYRdyChg LinkSeq TrStaTrns }
[  559.303048] ata3: hard resetting link
[  559.303054] ata3: nv: skipping hardreset on occupied port
[  560.024048] ata3: SATA link down (SStatus 0 SControl 300)
[  565.024048] ata3: hard resetting link
[  565.024054] ata3: nv: skipping hardreset on occupied port
[  565.327053] ata3: SATA link down (SStatus 0 SControl 300)
[  565.327064] ata3: limiting SATA link speed to 1.5 Gbps
[  570.327045] ata3: hard resetting link
[  570.327050] ata3: nv: skipping hardreset on occupied port
[  570.630048] ata3: SATA link down (SStatus 0 SControl 300)
[  570.630059] ata3.00: disabled
[  570.630078] ata3: EH complete
[  570.630087] sd 2:0:0:0: rejecting I/O to offline device
[  570.630104] ata3.00: detaching (SCSI 2:0:0:0)
[  570.630125] sd 2:0:0:0: [sda] killing request
[  570.630153] md: super_written gets error=-5, uptodate=0
[  570.630159] md/raid10:md2: Disk failure on sda2, disabling device.
[  570.630162] md/raid10:md2: Operation continuing on 1 devices.
[  570.630257] end_request: I/O error, dev sda, sector 58605128
[  570.630291] md: super_written gets error=-5, uptodate=0
[  570.633517] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[  570.633651] sd 2:0:0:0: [sda]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  570.633659] sd 2:0:0:0: [sda] Stopping disk
[  570.633680] sd 2:0:0:0: [sda] START_STOP FAILED
[  570.633684] sd 2:0:0:0: [sda]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  570.655206] RAID10 conf printout:
[  570.655210]  — wd:1 rd:2
[  570.655214]  disk 0, wo:0, o:1, dev:sdb2
[  570.655217]  disk 1, wo:1, o:0, dev:sda2
[  570.659025] RAID10 conf printout:
[  570.659029]  — wd:1 rd:2
[  570.659032]  disk 0, wo:0, o:1, dev:sdb2
[  570.659313] md: md1 still in use.
[  570.738752] md: md2 still in use.
[  570.739106] md/raid1:md1: Disk failure on sda3, disabling device.
[  570.739109] md/raid1:md1: Operation continuing on 1 devices.
[  570.739380] md/raid10:md0: Disk failure on sda1, disabling device.
[  570.739382] md/raid10:md0: Operation continuing on 1 devices.
[  570.739412] md: unbind<sda2>
[  570.747449] md: export_rdev(sda2)
[  570.868144] RAID1 conf printout:
[  570.868148]  — wd:1 rd:2
[  570.868168]  disk 0, wo:0, o:1, dev:sdb3
[  570.868175]  disk 1, wo:1, o:0, dev:sda3
[  570.873025] RAID1 conf printout:
[  570.873029]  — wd:1 rd:2
[  570.873032]  disk 0, wo:0, o:1, dev:sdb3
[  570.999292] md: unbind<sda3>
[  571.007119] md: export_rdev(sda3)
[  573.633246] ata4: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
[  573.633292] ata4: SError: { PHYRdyChg LinkSeq TrStaTrns }
[  573.633331] ata4: hard resetting link
[  573.633335] ata4: nv: skipping hardreset on occupied port
[  574.354052] ata4: SATA link down (SStatus 0 SControl 300)
[  579.354032] ata4: hard resetting link
[  579.354037] ata4: nv: skipping hardreset on occupied port
[  579.657041] ata4: SATA link down (SStatus 0 SControl 300)
[  579.657052] ata4: limiting SATA link speed to 1.5 Gbps
[  584.657032] ata4: hard resetting link
[  584.657038] ata4: nv: skipping hardreset on occupied port
[  584.960047] ata4: SATA link down (SStatus 0 SControl 300)
[  584.960058] ata4.00: disabled
[  584.960076] ata4: EH complete
[  584.960086] sd 3:0:0:0: rejecting I/O to offline device
[  584.960094] ata4.00: detaching (SCSI 3:0:0:0)
[  584.960124] sd 3:0:0:0: [sdb] killing request
[  584.960148] md: super_written gets error=-5, uptodate=0
[  584.960220] end_request: I/O error, dev sdb, sector 58605120
[  584.960265] md: super_written gets error=-5, uptodate=0
[  584.960322] end_request: I/O error, dev sdb, sector 58605128
[  584.960357] md: super_written gets error=-5, uptodate=0
[  584.960393] end_request: I/O error, dev sdb, sector 58605128
[  584.960428] md: super_written gets error=-5, uptodate=0
[  584.962495] sd 3:0:0:0: [sdb] Synchronizing SCSI cache
[  584.962765] sd 3:0:0:0: [sdb]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  584.962772] sd 3:0:0:0: [sdb] Stopping disk
[  584.962786] Buffer I/O error on device md2, logical block 524292
[  584.962805] sd 3:0:0:0: [sdb] START_STOP FAILED
[  584.962810] sd 3:0:0:0: [sdb]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  584.962824] lost page write due to I/O error on md2
[  584.962841] end_request: I/O error, dev sdb, sector 58605128
[  584.962877] md: super_written gets error=-5, uptodate=0
[  584.962921] md: md1 still in use.
[  584.962931] Buffer I/O error on device md2, logical block 524293
[  584.963007] lost page write due to I/O error on md2
[  584.963020] Buffer I/O error on device md2, logical block 1048646
[  584.963095] lost page write due to I/O error on md2
[  584.963104] Buffer I/O error on device md2, logical block 1048647
[  584.963179] lost page write due to I/O error on md2
[  584.963188] Buffer I/O error on device md2, logical block 1048648
[  584.963274] lost page write due to I/O error on md2
[  584.963280] md: md2 still in use.
[  584.963299] Buffer I/O error on device md2, logical block 1048694
[  584.963381] lost page write due to I/O error on md2
[  584.963391] Buffer I/O error on device md2, logical block 1056863
[  584.963468] lost page write due to I/O error on md2
[  584.963478] Buffer I/O error on device md2, logical block 1056864
[  584.963553] lost page write due to I/O error on md2
[  584.963562] Buffer I/O error on device md2, logical block 6299690
[  584.963635] lost page write due to I/O error on md2
[  584.963800] Aborting journal on device md2-8.
[  584.963836] EXT4-fs error (device md2) in ext4_delete_inode: Readonly filesystem
[  584.963868] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 987 pages, ino 28972747; err -30
[  584.963877] md: super_written gets error=-19, uptodate=0
[  584.963883]
[  584.963888] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 7896 pages, ino 28972690; err -30
[  584.963893]
[  584.963953] EXT4-fs warning (device md2): ext4_end_bio: I/O error writing to inode 28972747 (size 36864 starting block 689771)
[  584.964303] JBD2: I/O error detected when updating journal superblock for md2-8.
[  584.964309] EXT4-fs error (device md2): ext4_journal_start_sb: Detected aborted journal
[  584.964316] EXT4-fs (md2): Remounting filesystem read-only
[  584.972785] md0: detected capacity change from 30005002240 to 0
[  584.972794] md: md0 stopped.
[  584.972810] md: unbind<sdb1>
[  584.979298] md: export_rdev(sdb1)
[  584.979344] md: unbind<sda1>
[  584.987280] md: export_rdev(sda1)
[  585.165084] md: super_written gets error=-19, uptodate=0
[  585.165102] md: super_written gets error=-19, uptodate=0
[  589.309845] EXT4-fs error (device md2): ext4_find_entry: reading directory #262476 offset 0
[  589.963162] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 1024 pages, ino 263495; err -30
[  589.963314]
[  599.310238] EXT4-fs error (device md2): ext4_find_entry: reading directory #262476 offset 0
[  604.963046] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 1024 pages, ino 262464; err -30
[  604.963153]
[  609.310592] EXT4-fs error (device md2): ext4_find_entry: reading directory #262476 offset 0
[  614.963071] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 1024 pages, ino 28186168; err -30
[  614.963176]
[  614.963181] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 8192 pages, ino 28186171; err -30
[  614.963298]
[  614.963301] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 8192 pages, ino 28972747; err -30
[  614.963405]
[  614.963408] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 8192 pages, ino 28972690; err -30
[  614.963507]
[  619.310906] EXT4-fs error (device md2): ext4_find_entry: reading directory #262476 offset 0
[  619.963133] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 1024 pages, ino 263495; err -30
[  619.963244]
[  629.311267] EXT4-fs error (device md2): ext4_find_entry: reading directory #262476 offset 0
[  634.963038] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 1024 pages, ino 262464; err -30
[  634.963144]
[  639.311561] EXT4-fs error (device md2): ext4_find_entry: reading directory #262476 offset 0
[  644.963069] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 1024 pages, ino 28186168; err -30
[  644.963172]
[  644.963176] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 8192 pages, ino 28186171; err -30
[  644.963288]
[  644.963291] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 8192 pages, ino 28972747; err -30
[  644.963395]
[  644.963397] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 8192 pages, ino 28972690; err -30
[  644.963499]
[  649.311846] EXT4-fs error (device md2): ext4_find_entry: reading directory #262476 offset 0
[  649.963202] EXT4-fs (md2): ext4_da_writepages: jbd2_start: 1024 pages, ino 263495; err -30
[  649.963319]
[  653.202216] ata3: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
[  653.202317] ata3: SError: { PHYRdyChg CommWake }
[  653.202379] ata3: hard resetting link