RAID in 2018

Still Not Quite Obsolete

I’ve talked to a lot of professionals in the IT industry, and some surprisingly don’t even know what RAID is!  Others think it is unnecessary, while some still think RAID is a replacement for backups (a misconception admins and hardware techs have been harping on for decades now).  First, I’ll give a quick introduction to what RAID is, what it isn’t, and its applications in the real world.

RAID stands for Redundant Array of Independent Disks.  I think the term is a little dated in today’s world, but let’s break it down.

First of all, we are talking about an array of connected, separate drives.  These could be 2.5″, 3.5″, SAS, SATA, or SSD; as far as our implementation and the OS are concerned, they are all essentially the same to the computer they are connected to.

There are five commonly used levels of RAID, as follows (a short capacity sketch in Python follows the list):

  1. RAID 0, AKA striping (two drives required).  This takes two identical drives and combines their performance and capacity into what appears to be a single drive.  Performance with RAID 0 is excellent, but the disadvantage is that the failure of any single disk results in data loss and the array going offline.  There is no recovery except from backups.  I never recommend RAID 0.
  2. RAID 1, AKA mirroring (two drives required).  It is called mirroring because both drives contain an identical copy of the data.  Read performance is enhanced because data can be read from both drives simultaneously, roughly doubling throughput.  There is a performance penalty on writes, since the data must be written to both drives at once (however, this is usually not an issue, as most server workloads are read-intensive on average).
  3. RAID 5 (3+ drives required).  RAID 5 has long been one of the most common RAID levels, as it provides enhanced performance and some redundancy, but it is very prone to faults and failures during its slow rebuilds.  It uses one drive’s worth of parity spread across all the drives, and calculating that parity often degrades performance unless a hardware RAID card is used.  It can withstand a single drive failure but NOT two.  Read performance is good, but the parity calculations slow down writes.
  4. RAID 6 (4+ drives required).  Similar to RAID 5, but two drives’ worth of parity are used, so it can survive two drives failing and is more fault tolerant.  Rebuilds take even longer on RAID 6 than on RAID 5.  Read performance is good, but the parity calculations slow down writes.
  5. RAID 10, AKA 1+0 (4 or more drives required).  It combines two or more RAID 1 mirrors, striped together as a RAID 0.  It delivers excellent performance and is fault tolerant (one drive from each RAID 1 mirror could die without any ill effect aside from some performance reduction).  Rebuild times are similar to RAID 1 and much faster than RAID 5 or 6.
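
To make those numbers concrete, here is a short Python sketch (my own illustration, independent of any particular RAID implementation) that computes usable capacity and worst-case fault tolerance for each level:

```python
# Usable capacity and worst-case fault tolerance per RAID level,
# for N identical drives of a given size.

def raid_summary(level, drives, size_tb):
    """Return (usable TB, drive failures survivable in the worst case)."""
    if level == "0":                       # striping: all space, no redundancy
        return drives * size_tb, 0
    if level == "1":                       # mirroring: one drive's worth of space
        return size_tb, drives - 1
    if level == "5":                       # one drive's worth of parity
        return (drives - 1) * size_tb, 1
    if level == "6":                       # two drives' worth of parity
        return (drives - 2) * size_tb, 2
    if level == "10":                      # striped mirror pairs; worst case is
        return (drives // 2) * size_tb, 1  # losing both halves of one mirror
    raise ValueError(f"unknown RAID level: {level}")

for level, n in [("0", 2), ("1", 2), ("5", 3), ("6", 4), ("10", 4)]:
    usable, tolerance = raid_summary(level, n, 4.0)  # hypothetical 4 TB drives
    print(f"RAID {level:>2}, {n} x 4 TB: {usable:.0f} TB usable, "
          f"survives {tolerance} worst-case failure(s)")
```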

Rather than overcomplicating this issue, I will try to give a practical 2018 take on what RAID means.  Some have said RAID is obsolete, but usually they are referring to the nearly impossible resync/rebuild times on large multi-terabyte RAID 5/6 arrays.  I would agree there, as I’ve never liked RAID 5 or 6, and whether you like them or not they are very impractical to use.

So what is the best way to go?

RAID 1: If you only have two drives, then I think RAID 1 is an excellent tradeoff.  It is quick and easy to resync/rebuild, a single drive can die without any data loss, and while both drives are active you get a performance boost on reads.

RAID 10: If you have four drives, you gain extra performance in a RAID 10 configuration, with fault tolerance such that a single drive in each RAID 1 mirror could die without data loss.

The main disadvantage is that with RAID 1 and RAID 10 you are essentially losing 50% of your storage space, but since drives are relatively cheap, I think it’s a worthy tradeoff.
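
To show why RAID 10 tolerates one failure per mirror, here is a toy Python sketch of a RAID 10 layout (my own illustration; real controllers stripe in chunks and lay data out differently, and the drive numbers are hypothetical):

```python
# Toy RAID 10 layout: logical blocks are striped across mirror pairs
# (the RAID 0 layer), and each block lives on both drives of its pair
# (the RAID 1 layer).

def raid10_placement(block, drives=4):
    """Return the two physical drives holding a logical block."""
    pairs = drives // 2
    pair = block % pairs            # stripe across the mirror pairs
    return 2 * pair, 2 * pair + 1   # both members of the chosen pair

for block in range(6):
    a, b = raid10_placement(block)
    print(f"logical block {block} -> drives {a} and {b}")

# Drives 0/1 form one mirror and drives 2/3 the other, so losing
# drive 0 and drive 2 (one from each pair) still leaves every block
# with a surviving copy; losing drives 0 and 1 together would not.
```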

There are some people who claim that “drives are more reliable today” and “you don’t need RAID anymore,” but I hardly find this true.  I’d actually argue that SSDs may be more unreliable, or at least more unpredictable, than mechanical hard drives.  One thing we can all agree on is that the most likely component to fail in a server is a hard disk, and that’s not likely to change any time soon, as much as we like to believe flash-based storage is more reliable.  I’d also ask anyone who thinks running on a single drive (even with backups) is acceptable: aren’t the performance benefit and redundancy worth running RAID?  I’m sure most datacenter techs and server admins would agree that it is much better to hotswap a disk than to deal with downtime and restoring from backups.

Now for the warnings.  RAID “protection” is NOT a replacement for backups, even if no drive ever dies.  The key is understanding how misleading the term RAID “protection,” which some in the industry use, really is.  It is true in the sense that you are protected from data loss if a single drive fails (or possibly two at some RAID levels).  However, this doesn’t take into account natural disasters, theft, or accidental or willful deletion or destruction of data.

I’d say, as it stands in 2018 and beyond, that everyone should be using at least RAID 1 or RAID 10 if possible, in nearly every use case.  There are a few possible exceptions to this rule, but they are rare, and even then you should aim for as much redundancy as possible.

In conclusion: use RAID 1 if you can, or preferably RAID 10.  If you don’t know RAID, learn it and use it anyway.


My Take On WannaCry

Reading media coverage of the WannaCry ransomware attack has been excruciatingly frustrating, because little to no information was offered on how infection happens and how to protect yourself.

If I, as an IT professional and user, couldn’t find the right answers, something is seriously wrong.  The important information wasn’t in any of the mainstream articles, so a novice or amateur user would certainly have no chance of protecting themselves.

How Did WannaCry Infect and Spread?

The long version is available from Malwarebytes.

One of the key infection vectors is still the oldest “phishing” trick in the book: e-mail, through which many users are tricked into opening infected attachments.  This was not readily mentioned in media coverage, and this simple warning could have prevented a lot of new infections.  I believe it is a key factor that has not been discussed: since many networks sit behind NAT with external SMB blocked, getting users on the LAN to install the worm is an easy way to get inside and spread the infection to networks that are hardened on the outside.

The more technical explanation is an exploit called “ETERNALBLUE,” a hacking tool leaked from the NSA that exploits a weakness in Microsoft’s implementation of SMB (Server Message Block, the Windows file-sharing protocol).  This has been widely reported, but the simple way to prevent automatic infection through this method has not.

Once a machine is infected, the worm essentially scans your LAN and then the internet to spread the infection further, which quickly multiplied the damage and scope of this attack.
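
To give a rough sense of what that scan is looking for, here is a minimal Python sketch (my own illustration, not WannaCry’s actual code) that checks whether a host answers on TCP port 445, the port SMB listens on.  The address shown is hypothetical:

```python
import socket

def smb_port_open(host, timeout=1.0):
    """Return True if the host accepts connections on TCP 445 (SMB)."""
    try:
        with socket.create_connection((host, 445), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical LAN address; this only tests reachability of the SMB
# port, it says nothing about whether the host is actually patched.
host = "192.168.1.10"
print(f"{host}: port 445 is {'open' if smb_port_open(host) else 'closed/filtered'}")
```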

How to protect yourself?

  1. First and foremost, update Microsoft Windows regardless of version (whether you run XP, Vista, 7, 8, 10, or any Server release), because all Windows versions are apparently impacted by the vulnerability patched in MS17-010 (ETERNALBLUE/WannaCry).
  2. Disable SMB/file sharing in Windows, and if that is not possible, at least use firewall settings to block SMB/file sharing/CIFS (a rough firewall sketch follows this list).
  3. If the above is not possible, you should physically unplug any impacted machines from the network (it could be as simple as disabling their ports on your switch, or even unplugging entire switches if necessary).
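
For step 2, here is a minimal Python sketch of blocking the SMB ports with Windows Firewall rules.  It assumes Windows Vista or later, an elevated (administrator) prompt, and the stock netsh tool; the rule names are hypothetical:

```python
import subprocess

# Block inbound SMB: TCP 139 (NetBIOS session service) and TCP 445
# (direct-hosted SMB).  Run from an elevated prompt on Vista or later;
# older systems use the legacy "netsh firewall" syntax instead.
SMB_PORTS = [139, 445]

for port in SMB_PORTS:
    subprocess.run(
        ["netsh", "advfirewall", "firewall", "add", "rule",
         f"name=BlockSMB{port}", "dir=in", "action=block",
         "protocol=TCP", f"localport={port}"],
        check=True,  # raise if the rule could not be added
    )
print("Inbound SMB ports 139 and 445 are now blocked by Windows Firewall.")
```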

Who is to blame?

There is plenty of blame to go around, but currently a lot of it is coming from Microsoft, which is blaming users for not patching and the NSA for hoarding these exploits and not notifying Microsoft or users beforehand.

In all fairness, Microsoft did issue the MS17-010 patch on March 14th, 2017, and even released emergency patches for unsupported OSes like XP once the attack began.

Many have mused that the NSA should have at least notified Microsoft the moment it realized its hacking tools had been leaked.

At the end of the day, the question is: how could Microsoft have left such a serious vulnerability open for so long?  Was it an intentional backdoor, born of collaboration between Microsoft and the NSA or other third parties?

Some Can’t Patch

Some systems run on isolated internal networks with no internet access, so they never get patched, yet they were still infected.  To make matters worse, the chances are that these are the more critical data and infrastructure systems.

Other machines are not managed properly or remotely, yet are deployed with internet access, making them sitting ducks for these types of attacks.

There are also some who just don’t patch because the risk of impacting existing services is too great, although I would argue that the risk of not patching, and of not upgrading or migrating your applications to a more secure platform, is much higher if you get hit with ransomware like this.

These Issues Are Nothing New

Ever since the Snowden revelations, many have worried that US tech companies forced to provide backdoor access to the NSA would be vulnerable should other hackers discover the vulnerabilities or intentional backdoors on their own, or, as in this case, when the tools and exploits are somehow leaked.

In the wider scope of things, Microsoft has seen worms of this scale in the past; it’s nothing new.  There are no worldwide protocols for notifying users or defending against such worms, and this will certainly become an increasing problem as more and more devices come online, especially with IoT and the many connected devices we don’t think about, which don’t get patched or may not have an easy or automatic way of updating.