Is RAID Dead?
The Redundant Array of Inexpensive Disks or RAID has been a part of the IT vernacular for nearly 30 years.
The goal was simple - to increase performance, reliability, availability or capacity by combining multiple physical disks. From RAID 0 and 1 through RAID 4, 5 or 6, and combinations like RAID 10 or 0+1, you can provide any of those capabilities to your application.
RAID was first introduced way back in 1987, and most early implementations were done in software - especially on Unix. This was typically applied to Just a Bunch of Disks, or JBOD. One of the early ISVs in that space was Veritas, with its VxFS file system and eventually what became Storage Foundation. But one argument for some time was: why would I want to take CPU cycles away from my application to manage the RAID work?
Eventually RAID capability moved into hardware via ASICs, in what we commonly call today a RAID Controller. It took over all of the block manipulation, replication, parity calculations, etc. Now when an application needed to write data, all it did was pass that block to the RAID Controller. Once the block was received, the controller would tell the application "ok I've got it" and then perform the various actions to properly place the data on whatever RAID Array configuration had been defined.
Also in those early days, spinning drives were less reliable than they are today, so the data protection capabilities of RAID were a huge selling point. If you experienced a drive failure, you could hot swap the drive and the controller would rebuild your RAID set. There was even the concept of the Hot-Spare: a drive installed in the array and running, but not configured in the RAID set, so it could be quickly added to a RAID set when a drive failed.
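To make the rebuild idea concrete, here is a minimal sketch of the XOR parity scheme used by single-parity levels such as RAID 5. It is an illustration only, not how any particular controller is implemented; the block contents and the helper name xor_blocks are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks striped across three drives (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block would live on a fourth drive.
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild its block from the
# surviving data blocks plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Because XOR is its own inverse, the same function both computes parity and reconstructs a missing block. The catch is that a rebuild has to read every surviving drive in full.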
As performance needs increased, manufacturers started adding "cache" to the controllers. This additional RAM would allow the controller to either A) satisfy a read request for data already in the cache or B) accept a large amount of write data, then process it separately after telling the app ok.
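The two cache behaviors can be sketched in a few lines. This is a toy model under simple assumptions: the class name WriteBackCache is hypothetical, the "disk" is just a dictionary, and real controllers add eviction, ordering and power-loss protection that are left out here.

```python
class WriteBackCache:
    """Toy model of a controller cache sitting in front of a slow disk."""

    def __init__(self, disk):
        self.disk = disk    # dict: block number -> bytes (stands in for the drives)
        self.cache = {}     # blocks currently held in controller RAM
        self.dirty = set()  # blocks acknowledged but not yet written to disk

    def read(self, block):
        # A) serve the read from cache when the block is already there
        if block in self.cache:
            return self.cache[block]
        data = self.disk[block]
        self.cache[block] = data
        return data

    def write(self, block, data):
        # B) accept the write into RAM and acknowledge immediately
        self.cache[block] = data
        self.dirty.add(block)
        return "ok"

    def flush(self):
        # Later, write the acknowledged blocks out to the drives
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
        self.dirty.clear()


disk = {0: b"old"}
controller = WriteBackCache(disk)
ack = controller.write(0, b"new")  # the app hears "ok" right away
on_disk_before_flush = disk[0]     # the drive still holds the old block
controller.flush()
on_disk_after_flush = disk[0]
```

The gap between the acknowledgement and the flush is exactly the window in which a power loss would drop data that the application believes is safe.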
This approach was further reinforced with multiple controllers, battery backup, or both, to ensure data availability in case of a failure.
With large disk arrays from vendors like EMC - a storage admin could configure multiple RAID sets as required by the application and have them all running simultaneously on the same array. Many vendors also had replication options that would allow data to be replicated to another array.
Eventually RAID became ubiquitous - you didn't buy a server or storage without it. And all of the various server and storage vendors kept upping the size of the cache or the speed at which it could process reads and writes.
We also started to see increases in both drive RPM and size. RPM provided additional performance, or IOPS, and size obviously provided additional capacity with fewer "spindles".
But then we started to see some problems. As disk sizes rose, so did the rebuild time for a recovery. With modern 8TB drives it can take days or even weeks to rebuild a RAID 5 or 6 set. RAID could also be sensitive to changes in firmware or drivers and not work as expected.
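A rough back-of-the-envelope calculation shows where those rebuild times come from. The rebuild rates used here are assumptions picked for illustration, not measurements: roughly 100 MB/s for an otherwise idle array and 10 MB/s for one that is throttled because it is still serving production I/O.

```python
def rebuild_hours(capacity_tb, rate_mb_per_s):
    """Hours needed to rewrite one whole drive at a sustained rebuild rate."""
    total_bytes = capacity_tb * 1e12
    bytes_per_second = rate_mb_per_s * 1e6
    return total_bytes / bytes_per_second / 3600

idle = rebuild_hours(8, 100)  # assumed rate on an idle array
busy = rebuild_hours(8, 10)   # assumed rate while serving production load

print(f"8TB at 100 MB/s: {idle:.1f} hours")     # about 22 hours
print(f"8TB at 10 MB/s:  {busy / 24:.1f} days")  # about 9.3 days
```

The rebuild time scales linearly with drive size, so every jump in capacity stretches the window in which a second failure can hurt you.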
And often during the rebuild, performance would take a hit as the controller worked to read the surviving data and parity and rebuild the missing blocks. And if you had a controller failure, sometimes you couldn't recover at all.
In the early 2000s, two factors started to change things.
First was the open source movement. With Linux and Apache projects, the early systems were simple - often based on desktop machines that didn't have RAID. Typically the storage was what is called JBOD, or just a bunch of disks.
Second was the rise of the now-hyperscale companies like Google, Yahoo, etc. They were all about the concept of "cheap fast". To them it was easier to have their software provide all of the performance and availability than it was to spend on expensive and complex hardware. And as they scaled to levels never before seen in corporate IT, RAID as we know it wasn't really being used.
Most often, the approach to providing data redundancy and protection was replication - replicating a block of data across multiple nodes and maintaining the metadata about those blocks for rapid recovery. Two of the most well known implementations of this approach were the Google File System and then the Hadoop Distributed File System (HDFS).
These servers did not have RAID Controllers - just SATA or SAS Host Bus Adapters (HBA) that would present the disks to the OS. So back to JBOD.
And while this approach of using 2 or 3 copies of every data block provided good recovery performance, it also ate up a lot of disk. For example, if you had 3 servers with 6TB each and kept 2 copies of every block, each server could actually only support 3TB of new data, because the rest of its disk held copies of data from the other nodes. With 3 copies, that dropped to 2TB each.
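The capacity arithmetic is simple enough to write down. This sketch assumes identical nodes and ignores metadata and free-space reserves; the function name usable_tb is just for illustration.

```python
def usable_tb(nodes, tb_per_node, copies):
    """Usable capacity of a cluster that stores `copies` replicas of every block."""
    if copies > nodes:
        raise ValueError("each copy needs its own node")
    return nodes * tb_per_node / copies

two_copies = usable_tb(3, 6, 2)    # 18TB raw holds 9TB of data, 3TB per node
three_copies = usable_tb(3, 6, 3)  # 18TB raw holds 6TB of data, 2TB per node

print(two_copies / 3, three_copies / 3)
```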
But what this approach could allow was scale - massive scale. Think tens of thousands of nodes. And it was cheap. If a node failed - who cares, just replace the node. The app had already recovered using one of the 2 other copies of the data and would then rebuild the third.
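Here is a minimal sketch of that recover-and-re-replicate behavior. It is not how GFS or HDFS actually place blocks: the round-robin placement, the function names and the node names are all assumptions made to keep the example small.

```python
def place(block_ids, nodes, copies):
    """Assign each block to `copies` distinct nodes, round-robin (toy policy)."""
    return {
        block: [nodes[(i + r) % len(nodes)] for r in range(copies)]
        for i, block in enumerate(block_ids)
    }

def handle_failure(placement, failed, nodes):
    """Drop the failed node and re-replicate its blocks onto surviving nodes."""
    live = [n for n in nodes if n != failed]
    for holders in placement.values():
        if failed in holders:
            holders.remove(failed)
            # Pick a surviving node that does not already hold this block.
            spare = next((n for n in live if n not in holders), None)
            if spare is not None:
                holders.append(spare)
    return placement

nodes = ["n1", "n2", "n3", "n4"]
placement = place(["b0", "b1", "b2", "b3"], nodes, copies=3)
handle_failure(placement, "n2", nodes)

for block, holders in placement.items():
    print(block, holders)
```

The placement dictionary plays the role of the metadata mentioned earlier. Reads keep working from the surviving copies while the missing replica is recreated in the background.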
The replication approach was initially taken by the modern hyperconverged solutions like Nutanix and VMware's vSAN. You would create an initial cluster with a minimum of 3 servers and the software would pool the disk resources and allow you to present volumes to the hypervisor for consumption. As data was written it would be replicated across the 3 nodes. With this model, availability and performance actually increased as the number of nodes increased. And just like with HDFS, for example, there are no RAID configurations - just JBOD, and the software manages the rest.
But as I stated, you effectively had to configure each node with double the storage you thought you needed. While many customers appreciated the building block approach to hyperconverged along with the associated scalability, reliability and performance, the servers often didn't support large disk volumes - and when you had to divide that in half to get your actual usable space, it seemed wasteful.
So the industry started to implement a very old technology known as erasure coding to reduce the replication overhead. The algorithms that modern erasure coding is based on, called Reed-Solomon codes, date all the way back to 1960.
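To see why this matters, compare the raw disk needed per unit of usable data. The k+m layouts used here (k data fragments plus m parity fragments) are hypothetical examples, not the settings of any particular product.

```python
def replication_overhead(copies):
    """Raw capacity consumed per unit of usable data with full copies."""
    return float(copies)

def erasure_overhead(k, m):
    """Raw capacity per unit of usable data with k data + m parity fragments."""
    return (k + m) / k

print("2 copies:", replication_overhead(2))  # 2.0x raw per usable
print("3 copies:", replication_overhead(3))  # 3.0x raw per usable
print("4+1 EC:  ", erasure_overhead(4, 1))   # 1.25x, survives 1 lost fragment
print("4+2 EC:  ", erasure_overhead(4, 2))   # 1.5x, survives 2 lost fragments
```

A 4+2 layout tolerates two lost fragments, the same number of failures as three full copies, at half the raw capacity. The price is extra computation on writes and on rebuilds.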
Erasure coding is used in all sorts of applications, from DVD and Blu-ray to data transmission. In fact, some RAID-6 implementations use Reed-Solomon error correction codes as part of the dual check data computations required for RAID-6. Erasure coding is based on some very fancy math involving univariate polynomials. It allows vendors using replication techniques like RF2 (2 copies) and RF3 (3 copies) to reduce the disk overhead by calculating some parity and storing it on disk instead of additional full copies of the data.
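The polynomial idea can be shown with a toy code in the spirit of Reed-Solomon. This is a simplified sketch, not a production codec: it works over the integers modulo the prime 257 rather than the GF(2^8) field real implementations typically use, so a parity value can be 256 and not fit in a byte. The function names are made up for the example. The k data values are treated as points on a polynomial of degree less than k, the parity values are extra points on the same polynomial, and any k surviving points are enough to get the data back.

```python
P = 257  # a prime just above the byte range, so every byte is a field element

def interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat's little theorem)
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, parity_count):
    """Return the data values followed by `parity_count` parity values."""
    k = len(data)
    points = list(enumerate(data))
    parity = [interpolate(points, x) for x in range(k, k + parity_count)]
    return list(data) + parity

def recover(shares, k):
    """Rebuild the k data values from any k surviving {position: value} shares."""
    points = list(shares.items())[:k]
    return [interpolate(points, x) for x in range(k)]

data = [72, 105, 33, 0]   # four data fragments (hypothetical values)
coded = encode(data, 2)   # a 4+2 layout: six fragments in total

# Lose any two fragments, here positions 1 and 4, and recover from the rest.
surviving = {i: v for i, v in enumerate(coded) if i not in (1, 4)}
restored = recover(surviving, k=4)
print(restored == data)
```

Unlike simple XOR parity, this approach extends to any number of parity fragments, which is what lets a cluster trade a little computation for a lot less disk.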
So in some ways it's back to the future. We started with JBOD, then added software RAID and then hardware RAID. But as disk technologies matured with higher speeds and much higher capacity, traditional RAID constructs like RAID 5 started becoming a constraint. And as the storage demand grew to petabyte and exabyte scale, the problem became even more daunting. The solutions that came from Google and others to address those challenges are now coming back to mainstream computing.
So now we are back at JBOD and viewing large collections of disk spread locally across many nodes and using a combination of replication and erasure coding to meet those initial goals of performance, availability, reliability and capacity.
So back to the title of the post - Is RAID Dead?
Not entirely, but it's coming. Today enterprises have huge investments in RAID, along with all of the associated data, resources, training, etc., so it is not going away anytime soon.
But as companies begin to implement new solutions like Hadoop, object stores, hyperconverged solutions, etc., the use of traditional RAID will begin to diminish. There is a huge movement back towards local server storage from arrays due to increased performance, reduced complexity and lower cost. I don't see that trend slowing down.
And while one could academically argue that Replication + Erasure Coding is in its own right a form of RAID, these solutions use JBOD as a base - and so there will be less demand for ordering systems with RAID Controllers. In fact, I would think we will start to see more embedded disk HBA configurations.
All of the cloud providers use this approach to support the massive amounts of data they need to store. And now it's coming to the enterprise in a big way.