Hello, Claus here again. This time we’ll take a look at the work we’ve done to support systems with three tiers of storage (NVMe + SSD + HDD) in Storage Spaces Direct (S2D).
If you’re familiar with Storage Spaces in Windows Server 2012/R2, it’s important to understand that both caching and storage tiering work very differently in Storage Spaces Direct. The cache is independent of the storage pool and virtual disks (volumes), and the system manages it automatically; you don’t specify a cache when creating a volume. Storage tiering is now real-time and governs both the way data is written (all storage configurations) and the media the data is written to (NVMe + SSD + HDD systems only).
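If you want to see the cache configuration the system is managing for you, you can inspect it with PowerShell. This is just a minimal sketch, assuming the Get-ClusterStorageSpacesDirect cmdlet (surfaced as Get-ClusterS2D in some preview builds) is available on your build:

# Show the cluster-wide S2D settings the system manages automatically,
# including cache-related properties such as CacheState and CacheModeHDD/CacheModeSSD.
Get-ClusterStorageSpacesDirect | Format-List *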
SSD + HDD
To help explain how Storage Spaces Direct works in an NVMe + SSD + HDD storage configuration, let’s first take a look at the common SSD + HDD storage configuration. In this storage configuration, Storage Spaces Direct automatically uses the SSD devices for cache and the HDD devices for capacity (see diagram below).
Figure 1 – Storage Spaces Direct with SSD and HDD
In this storage configuration, Storage Spaces Direct automatically creates two storage tiers, named performance and capacity respectively. The difference between them is the way S2D writes data: the performance tier is optimized for IO performance (hot data) and the capacity tier is optimized for storage efficiency (cold data).
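Because the SSDs are consumed by the cache in this configuration, both tiers end up on the same media type. Purely as an illustration (not captured from a real system), a tier listing on an SSD + HDD system would look something like this:

Get-StorageTier | FT FriendlyName, ResiliencySettingName, MediaType -AutoSize

FriendlyName ResiliencySettingName MediaType
------------ --------------------- ---------
Capacity     Parity                HDD
Performance  Mirror                HDD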
NVMe + SSD + HDD
In an NVMe + SSD + HDD storage configuration, Storage Spaces Direct automatically uses the NVMe devices for cache and both the SSD and HDD devices for capacity (see diagram below).
Figure 2 – Storage Spaces Direct with NVMe, SSD and HDD
In this storage configuration, Storage Spaces Direct also automatically creates two storage tiers, named performance and capacity respectively. However, in this configuration the difference is twofold: a) the way data is written, and b) the media type the data is stored on (SSD or HDD). The performance tier is optimized for hot data and stored on SSD devices. The capacity tier is optimized for cold data and stored on HDD devices.
Exploring an NVMe + SSD + HDD system
Here I have four servers running Windows Server 2016 Technical Preview 5, with Storage Spaces Direct already enabled (see this blog post for more detail).
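For reference, enabling Storage Spaces Direct on an existing cluster is a single cmdlet; this is only a sketch of that step (see the linked post for the full setup):

# Enable S2D on the cluster; it claims the eligible devices, builds the pool,
# and configures the cache automatically.
Enable-ClusterStorageSpacesDirect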
Let’s find out what disk devices, by bus type, I have in this cluster:
Get-StorageSubSystem Clu* | Get-PhysicalDisk | Group-Object BusType | FT Count, Name

Count Name
----- ----
   64 SAS
    8 NVMe
The above shows that I have 8 NVMe devices (2 per node), so Storage Spaces Direct automatically uses these devices for cache. What about the remaining disks?
Get-StorageSubSystem Clu* | Get-PhysicalDisk | ? BusType -eq "SAS" | Group-Object MediaType | FT Count, Name

Count Name
----- ----
   48 HDD
   16 SSD
The above shows that I have 16 SSD devices and 48 HDD devices, and Storage Spaces Direct automatically uses these devices for capacity.
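If you want to confirm which devices Storage Spaces Direct claimed for cache versus capacity, the Usage property on each physical disk is a quick check. As a sketch: cache devices typically report Journal, while capacity devices report Auto-Select:

# Cache devices are claimed with Usage = Journal; capacity devices show Auto-Select.
Get-StorageSubSystem Clu* | Get-PhysicalDisk | Group-Object Usage | FT Count, Name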
All in all, I have a system with NVMe devices used for cache, and a combination of SSD and HDD devices used for data. So what storage tiers did Storage Spaces Direct automatically create on this configuration?
Get-StorageTier | FT FriendlyName, ResiliencySettingName, MediaType, PhysicalDiskRedundancy -AutoSize

FriendlyName ResiliencySettingName MediaType PhysicalDiskRedundancy
------------ --------------------- --------- ----------------------
Capacity     Parity                HDD                            2
Performance  Mirror                SSD                            2
I have two tiers: a performance tier with mirror resiliency (hot data) on SSD devices, and a capacity tier with parity resiliency (cold data) on HDD devices. Both tiers will tolerate up to two failures (disk or node).
This provides much more flexibility. You can create volumes consisting only of the performance tier, which provide the highest IO performance and are backed by fast flash storage (cheapest IOPS). You can create volumes consisting only of the capacity tier, which provide the best storage efficiency and are backed by hard drives (cheapest capacity). And you can create volumes consisting of both performance and capacity tiers, which automatically keep the hottest data on flash storage and the coldest data on hard drive storage.
Volume types
Let’s go ahead and create a volume using just the performance tier, a volume using just the capacity tier, and a volume using both the performance and capacity tiers. I will create each volume with 1,000GB of usable capacity:
New-Volume -StoragePoolFriendlyName S2D* -FriendlyName SQLServer -FileSystem CSVFS_REFS -StorageTierFriendlyNames Performance -StorageTierSizes 1000GB

New-Volume -StoragePoolFriendlyName S2D* -FriendlyName Archive -FileSystem CSVFS_REFS -StorageTierFriendlyNames Capacity -StorageTierSizes 1000GB

New-Volume -StoragePoolFriendlyName S2D* -FriendlyName VM -FileSystem CSVFS_REFS -StorageTierFriendlyNames Performance, Capacity -StorageTierSizes 100GB, 900GB
Let’s take a look at how much raw storage each of these volumes consumed:
Get-VirtualDisk | FT FriendlyName, Size, FootprintOnPool -AutoSize

FriendlyName          Size FootprintOnPool
------------          ---- ---------------
SQLServer    1073741824000   3221225472000
Archive      1073741824000   2148288954368
VM           1073741824000   2255663136768
Even though they’re all the same size, they consume different amounts of raw storage. Let’s calculate the storage efficiency for each of the volumes:
Get-VirtualDisk | ForEach {$_.FriendlyName + " " + [Math]::Round(($_.Size/$_.FootprintOnPool),2)}

SQLServer 0.33
Archive 0.5
VM 0.48
The storage efficiency for the “SQLServer” volume is 33%, which makes sense since it is created from the performance tier, which is 3-copy mirror on SSD. The storage efficiency for the “Archive” volume is 50%, which makes sense since it is created from the capacity tier, which is LRC erasure coding on HDD, tolerant to two failures. I will dig further into LRC erasure coding in a future blog post, including an explanation of storage efficiency with various layouts. Finally, the storage efficiency for the “VM” volume is 48%, which is the blended efficiency of 100GB of performance tier (33% efficiency) and 900GB of capacity tier (50% efficiency).
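As a quick sanity check, you can reproduce the 48% number for the “VM” volume from the per-tier efficiencies; this is just arithmetic, not a measurement:

# 100GB on the mirror tier consumes 3x raw; 900GB on the parity tier consumes 2x raw
$footprint = (100GB * 3) + (900GB * 2)     # 300GB + 1800GB = 2100GB of raw capacity
[Math]::Round(1000GB / $footprint, 2)      # 0.48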
Finally, let’s take a look at the storage tiers that make up each of the volumes:
Get-VirtualDisk | Get-StorageTier | FT FriendlyName, ResiliencySettingName, MediaType, Size -AutoSize

FriendlyName          ResiliencySettingName MediaType          Size
------------          --------------------- ---------          ----
Archive_Capacity      Parity                HDD       1073741824000
SQLServer_Performance Mirror                SSD       1073741824000
VM_Performance        Mirror                SSD        107374182400
VM_Capacity           Parity                HDD        966367641600
You can see that the FriendlyName for each tier is a combination of the volume’s friendly name and the friendly name of the storage tier that contributed storage.
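One practical reason these per-volume tier names matter: if you later want to grow just one tier of a volume, you can target it by name. This is only a sketch, assuming you want to grow the capacity tier of the “VM” volume (the partition and file system would still need to be extended afterwards):

# Grow only the capacity (parity) portion of the VM volume from 900GB to 1.5TB
Get-StorageTier -FriendlyName VM_Capacity | Resize-StorageTier -Size 1.5TB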
Choosing volume types
You may wonder why there are different volume types and when to use which type. This table should help:
|                | Mirror          | Parity           | Multi-Resilient                      |
|----------------|-----------------|------------------|--------------------------------------|
| Optimized for  | Performance     | Efficiency       | Balanced performance and efficiency  |
| Use case       | All data is hot | All data is cold | Mix of hot and cold data             |
| Efficiency     | Least (33%)     | Most (50+%)      | Medium (~50%)                        |
| File system    | ReFS or NTFS    | ReFS or NTFS     | ReFS only                            |
| Minimum nodes  | 3+              | 4+               | 4+                                   |
You should use mirror volumes for the absolute best storage performance and when all the data on the volume is hot, e.g. SQL Server OLTP. You should use parity volumes for the absolute best storage efficiency and when all the data on the volume is cold, e.g. backup. You should use multi-resilient volumes for balanced performance and efficiency and when you have a mix of hot and cold data on the same volume, e.g. virtual machines. ReFS Real-Time Tiering will automatically tier data inline between the mirror and parity portions of a multi-resilient volume, giving the best read/write performance for the hot data and the best storage efficiency for the cold data.
I am excited about the work we’ve done to support NVMe + SSD + HDD storage configurations. Let me know what you think.
– Claus Joergensen