One of the most interesting things I get to deal with at Unidesk is the integration and testing of different storage solutions for VDI: local SAS, local SSD, FusionIO, iSCSI SAS, NFS, and so on. It’s been a ton of fun simply because there are so many disk configuration options out there and lots to learn about each one. But what’s been most interesting to me lately is the idea of hybrid storage arrays that mix traditional rotating disk and SSD.
Recently I have been involved with a couple of projects using these hybrid systems and through these was able to learn a little about how they work and what you should ask about when buying one. I figured this quick note would be useful to others just starting to look at the hybrid arrays.
First off, a little about the VDI workload (and you should note I am only thinking about the desktop workload at this point, nothing else). VDI desktops have been killing arrays since the first batches were brought on-line. They typically will have higher disk IO requirements than virtualized servers, they will often have more WRITES in the read:write ratio than virtualized servers will, they tend to boot and be most active in large groups, and finally there are just more desktops than there are servers, thus more IO!
Hybrid arrays can be found almost everywhere now, and info on how they work is available (here's some on EMC's FAST). EMC FAST is one offering out there (image below), along with products like the XVS system from EqualLogic.
Understanding that desktop IO and disk requirements are different from those of servers is your first step in buying an array to support your VDI project. A lot has been written about disk requirements in the last 2 years, but my favorite is still Ruben Spruijt’s (@rspruijt on Twitter) article on how storage design impacts your VDI environment. If you haven’t heard of this problem, start there… then come back :-)
Now, with a basic understanding of VDI storage needs, we need to take a quick peek at the arrays themselves. Typically the arrays are a mix of SAS or SATA drives and some small percentage of SSD. From a straight drive-count perspective you may even see configs where the number of SSDs equals the number of SAS or SATA drives. But from a writable storage perspective (due to the size of the drives) the SSD may only be 10 or 20% of the total usable space. In larger arrays this is even configurable by purchasing more trays of SSD. (Cha-CHING $$$$)
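To see how equal drive counts can still leave SSD as a small slice of the space, here is a quick Python sketch. The drive counts and sizes are made-up numbers for illustration, not taken from any vendor's configuration, and the math ignores RAID and hot-spare overhead.

```python
def ssd_share(ssd_count, ssd_gb, hdd_count, hdd_gb):
    """Return the fraction of raw capacity that lives on SSD."""
    ssd_total = ssd_count * ssd_gb
    hdd_total = hdd_count * hdd_gb
    return ssd_total / (ssd_total + hdd_total)


# Hypothetical tray: same number of drives, very different drive sizes.
share = ssd_share(ssd_count=8, ssd_gb=100, hdd_count=8, hdd_gb=600)
print(f"SSD share of raw capacity: {share:.0%}")  # 14% with these assumed sizes
```

Plug in the real drive sizes from a quote and you will know how much room the "hot" data actually has to live in.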
But the pure mixing of SSDs and SAS drives isn’t what makes a Hybrid array useful. It’s the movement of “hot” data to handle IO load that makes them special. And this is where the key questions come in…
How are large numbers of writes (even random writes) handled?
This is important because it has nothing to do with hot data! As noted before, VDI is often heavier on writes than virtualized servers are. And when a whole bunch of users log in and start pounding the keyboard at 9 am, the number of writes generated will be larger than on your typical server cluster. So how will the array handle that? Does it simply let the writes land on whatever disk they were destined for? Is there a cache for writes? If so, how does that work and how many random writes can it handle? All this SSD is useless for serving up IO if every morning the write IO kills the rotating media and grinds the array to a halt.
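A back-of-the-envelope sketch in Python shows why this question matters. Everything here is an assumption for illustration: the per-desktop IOPS, the write fraction, the per-spindle IOPS, and the RAID write penalty (RAID 5 is commonly quoted at 4 back-end IOs per front-end write). It also assumes the writes land straight on rotating disk with no write cache in front, which is exactly what you want the vendor to confirm or deny.

```python
import math


def backend_write_iops(users, iops_per_user, write_fraction, raid_write_penalty):
    """Estimate back-end write IOPS hitting the disks during a login storm."""
    frontend_writes = users * iops_per_user * write_fraction
    return frontend_writes * raid_write_penalty


def spindles_needed(backend_iops, iops_per_spindle):
    """Rotating disks needed to absorb that load with no write cache."""
    return math.ceil(backend_iops / iops_per_spindle)


# Assumed workload: 500 desktops, 10 IOPS each, 70% writes, RAID 5 penalty of 4.
load = backend_write_iops(users=500, iops_per_user=10,
                          write_fraction=0.7, raid_write_penalty=4)
# Assumed 150 IOPS per rotating disk.
disks = spindles_needed(load, iops_per_spindle=150)
print(f"~{load:.0f} back-end write IOPS, roughly {disks} spindles")
```

If the rotating tier in the array you are looking at is nowhere near that spindle count, the write handling in front of it has to make up the difference.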
What is “Hot” data in their array?
This is an interesting one, because the first thing to understand is that hot data is often NOT write traffic. Remember, for an array to be aware of whether data is hot or not, it has to have the data already and have historical info regarding the IO. Newly written data will often not have that info. Even if you are changing existing “hot” data, in the world of transient desktops and streaming apps you are more than likely writing to a different location than yesterday (this is why the previous question is important). So what is “hot” data?
Hot data is often READ data. So how does the system define hot data? Number of IOs against the data? Are there tiers to it? Does it require that the data live on the array for a specific amount of time before being considered? If so, how long is that wait? Understand what the array will see as hot and you can begin to see if it fits your needs or not.
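To make those questions concrete, here is a toy Python sketch of one way an array could decide what is hot. This is not how any particular vendor does it; the `BlockStats` fields and the thresholds are hypothetical, chosen only to show how an IO-count rule and a minimum-age rule interact.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    reads_in_window: int   # reads seen during the tracking window (assumed metric)
    age_minutes: float     # how long the block has lived on the array


def is_hot(stats, min_reads, min_age_minutes):
    """A block is hot only if it is read often AND has enough history."""
    return (stats.reads_in_window >= min_reads
            and stats.age_minutes >= min_age_minutes)


# Hypothetical blocks: a day-old shared image block and a freshly written one.
shared_image = BlockStats(reads_in_window=500, age_minutes=24 * 60)
fresh_write = BlockStats(reads_in_window=600, age_minutes=5)

print(is_hot(shared_image, min_reads=100, min_age_minutes=60))  # True
print(is_hot(fresh_write, min_reads=100, min_age_minutes=60))   # False
```

Notice that the freshly written block is busier but still loses out because it has no history yet, which is the situation new desktop writes are in every morning.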
How does the data move between the SSD and the rotating disks?
Here is where it gets fun. Once data is marked hot, what makes it move to the SSD drives? And if data is moving because it is “hot”, how much of it is moved at a time? Can you move small amounts of “hot” data, like 10 or 20 or 50MB read by every desktop? Or can it only move 512MB, 1GB or 2GB at a time? If it only moves in large chunks (think 1GB here), what determines that? I mean, I may have 100MB of hot data for boots in that 1GB space. Is that enough to move the entire 1GB? And do we want to?
How much it moves, and what it moves, becomes important when you think about how much SSD there is in the system. You may only have 10 or 20% of your usable disk space in SSD, and getting the truly hot pages (the shared images or, in Unidesk land, Shared Layers) into SSD is really key.
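The chunk-size question is easy to put numbers on. This Python sketch takes the 100MB-in-a-1GB-chunk example and shows how much SSD gets used versus how much of it is actually hot. It assumes the hot data sits contiguously from the start of a chunk (the best case) and the function name is mine, not anything from a real array.

```python
import math


def promotion_cost(hot_mb, chunk_mb):
    """Return (MB moved to SSD, fraction of that which is actually hot)."""
    chunks = math.ceil(hot_mb / chunk_mb)
    moved_mb = chunks * chunk_mb
    return moved_mb, hot_mb / moved_mb


# 100MB of hot boot data, moved in 1GB chunks versus 10MB chunks.
for chunk_mb in (1024, 10):
    moved, useful = promotion_cost(hot_mb=100, chunk_mb=chunk_mb)
    print(f"{chunk_mb}MB chunks: {moved}MB moved, {useful:.0%} of it hot")
```

With 1GB chunks, about 90% of the SSD space consumed by that move holds data nobody asked for. If the hot data is scattered across several chunks, it gets worse.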
The whole trick with Hybrid Arrays is to leverage the more expensive SSD to serve up IO, while still keeping costs in check by using SAS or SATA to store the data that is not getting pounded. For this to work the storage array you pick should match the needs of your VDI environment and be able to do the following:
- Handle the writes! If the writes don’t benefit from the SSD, you are missing half its value.
- Mark data as “hot” appropriately and in a timely manner, so it has an impact on the changing/spiky workloads of desktops (i.e. no users at 7 am, 500 users log in at 8 am…).
- Move that hot data to SSD efficiently (and quickly). This should be measured in minutes, not hours.
Hope this helps. Happy shopping!