Channel: Storage at Microsoft

Deep Dive: Volumes in Storage Spaces Direct

Cosmos Darwin

This kid has been at Microsoft for one year!

Hi there! I’m Cosmos. I joined the High Availability & Storage PM team one year ago this week, so I thought it was about time I posted my first blog. You can follow me on Twitter @cosmosdarwin.

Introduction

In Storage Spaces Direct, volumes derive their fault tolerance from mirroring, parity encoding, or both. We’ve taken to calling this last option mixed resiliency, or multi-resiliency, or sometimes “hybrid” resiliency, and it’s very exciting.

Briefly…

  • Mirroring is similar to distributed, software-defined RAID-1. It provides the fastest possible reads/writes, but isn’t very capacity efficient, because you’re effectively keeping full extra copies of everything. It’s best for actively written data, so-called “hot” data.
  • Parity is similar to distributed, software-defined RAID-5 or RAID-6. Our implementation includes several breakthrough advancements developed by Microsoft Research. Parity can achieve far greater capacity efficiency, but at the expense of computational work for each write. It’s best for infrequently written, so-called “cold” data.
  • Beginning in Windows Server 2016, one volume can be part mirror, part parity, and ReFS will automagically move data back and forth between these “tiers” in real-time depending on what’s hot and what’s not. This mixed resiliency gives the best of both – fast, cheap writes of hot data, and better efficiency for cooler data. As Claus says, what’s not to like!

Storage Efficiency

If that was too much, too fast, stay tuned for an upcoming blog post from Claus on this very subject.

Let’s See It

So, that’s the concept. How can you see all this in Windows? The Storage Management API is the answer, but unfortunately it’s not quite as straightforward as you might think. This blog aims to untangle the many objects and their properties, so we can get one comprehensive view, like this:

Screenshot

Volumes, their capacities, how they’re filling up, resiliency, footprints, efficiency, all in one easy view.

The first thing to understand is that in Storage Spaces Direct, every “volume” is really a little hamburger-like stack of objects. The Volume sits on a Partition; the Partition sits on a Disk; that Disk is a Virtual Disk, also commonly called a Storage Space.

Storage Management API Stack

Classes and their relationships in the Windows Storage Management API.

Let’s grab properties from several of these, to assemble the picture we want.

We can get the volumes in our system by launching PowerShell as Administrator and running Get-Volume. The key properties are the FileSystemLabel, which is how the volume shows up mounted in Windows (literally – the name of the folder), the FileSystemType, which shows us whether the volume is ReFS or NTFS, and the Size.

Get-Volume | Select FileSystemLabel, FileSystemType, Size

Given any volume, we can follow associations down the hamburger. For example, try this:

Get-Volume -FileSystemLabel <Choose One> | Get-Partition

Neat! Now, the Partition isn’t very interesting, and frankly, neither is the Disk, but following these associations is the safest way to get to the underlying VirtualDisk (the Storage Space!), which has many key properties we want.

$Volume = Get-Volume -FileSystemLabel <Choose One>
$Partition = $Volume | Get-Partition
$Disk = $Partition | Get-Disk
$VirtualDisk = $Disk | Get-VirtualDisk

Voila! (We speak Français in Canada.) Now we have the VirtualDisk underneath our chosen Volume, saved as $VirtualDisk. You could shortcut this whole process and just run Get-VirtualDisk, but theoretically you can’t be sure which one is under which Volume.

We now get to deal with two cases.

Case One: No Tiers

If the VirtualDisk is not tiered, which is to say it uses mirror or parity, but not both, and it was created without referencing any StorageTier (more on these later), then it has several key properties.

  • First, its ResiliencySettingName will be either Mirror or Parity.
  • Next, its PhysicalDiskRedundancy will either be 1 or 2. This lets us distinguish between what we call “two-way mirror” versus “three-way mirror”, or “single parity” versus “dual parity” (erasure coding).
  • Finally, its FootprintOnPool tells us how much physical capacity is occupied by this Space, once the resiliency is accounted for. The VirtualDisk also has its own Size property, but this will be identical to that of the Volume, plus or minus some modest metadata.
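To make the naming concrete, here is a minimal sketch that turns the first two properties into the familiar names. It assumes only the values described above (Mirror or Parity, 1 or 2). The function name Get-ResiliencyName is hypothetical and not part of the Storage module.

```powershell
# Hypothetical helper (not a built-in cmdlet): maps the two properties to a friendly name.
function Get-ResiliencyName {
    param(
        [string]$ResiliencySettingName,
        [int]$PhysicalDiskRedundancy
    )

    # Combine both values into one key so a single switch covers all four cases.
    switch ("$ResiliencySettingName/$PhysicalDiskRedundancy") {
        'Mirror/1' { 'Two-way mirror' }
        'Mirror/2' { 'Three-way mirror' }
        'Parity/1' { 'Single parity' }
        'Parity/2' { 'Dual parity' }
        default    { 'Unknown' }   # anything this post does not cover
    }
}

# Sample values; on a real system pass the properties of a VirtualDisk instead.
Get-ResiliencyName -ResiliencySettingName 'Mirror' -PhysicalDiskRedundancy 2
```

With the sample values this returns Three-way mirror. Anything outside the four combinations falls through to Unknown rather than guessing.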

Check it out!

$VirtualDisk | Select FriendlyName, ResiliencySettingName, PhysicalDiskRedundancy, Size, FootprintOnPool

If we divide the Size by the FootprintOnPool, we obtain the storage efficiency. For example, if some Volume is 100 GB and uses three-way mirror, its VirtualDisk FootprintOnPool should be about 300 GB, for 33.3% efficiency.
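The arithmetic is simple enough to check by hand. The sketch below uses the 100 GB three-way mirror example with hard-coded sample values; on a real system, the two numbers would come from $VirtualDisk.Size and $VirtualDisk.FootprintOnPool.

```powershell
# Sample values standing in for $VirtualDisk.Size and $VirtualDisk.FootprintOnPool.
$Size            = 100GB   # the volume from the example above
$FootprintOnPool = 300GB   # three full copies under three-way mirror

# Efficiency = Size / FootprintOnPool, expressed as a percentage with one decimal place.
$Efficiency = [Math]::Round(100 * $Size / $FootprintOnPool, 1)
"{0}%" -f $Efficiency
```

The result matches the 33.3% quoted above.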

Case Two: Tiers

Ok, that wasn’t so bad. Now, what if the VirtualDisk is tiered? Actually, what is tiering?

For our purposes, tiering is when multiple sets of these properties coexist in one VirtualDisk, because it is effectively part mirror, part parity. You can tell this is happening if its ResiliencySettingName and PhysicalDiskRedundancy properties are completely blank. (Helpful! Thanks!)
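As a rough sketch of that check, the snippet below filters for disks whose ResiliencySettingName is blank. The two objects are hypothetical stand-ins for what Get-VirtualDisk returns, and treating a blank value as "tiered" is an assumption based on the behavior described above.

```powershell
# Hypothetical stand-ins for two virtual disks; on a real system these come from Get-VirtualDisk.
$Disks = @(
    [PSCustomObject]@{ FriendlyName = 'Untiered'; ResiliencySettingName = 'Mirror' }
    [PSCustomObject]@{ FriendlyName = 'Tiered';   ResiliencySettingName = $null }
)

# Assumption: a blank ResiliencySettingName is the sign of a tiered disk.
$Tiered = $Disks | Where-Object { [string]::IsNullOrWhiteSpace($_.ResiliencySettingName) }
$Tiered.FriendlyName
```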

The secret is: an extra layer in our stack – the StorageTier objects.

Storage Management API Stack

Sometimes, volumes stash some properties on their StorageTier(s).

Let’s grab these, because it’s their properties we need. As before, we can follow associations.

$Tiers = $VirtualDisk | Get-StorageTier

Typically, we expect to get two, one called something like “Performance” (mirror), the other something like “Capacity” (parity). Unlike in Windows Server 2012 or 2012 R2, these tiers are specific to one VirtualDisk. Each has all the same key properties we got before from the VirtualDisk itself – namely ResiliencySettingName, PhysicalDiskRedundancy, Size, and FootprintOnPool.

Check it out!

$Tiers | Select FriendlyName, ResiliencySettingName, PhysicalDiskRedundancy, Size, FootprintOnPool

For each tier, if we divide the Size by the FootprintOnPool, we can obtain its storage efficiency.

Moreover, if we divide the sum of the sizes by the sum of the footprints, we obtain the overall efficiency of the mixed resiliency or “multi-resilient” Volume (or… VirtualDisk…? Whatever!).
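Here is a small worked example of both calculations. The two tier objects are hypothetical sample data, not output from a real system: a 100 GB three-way mirror tier and a 900 GB parity tier assumed to run at 50% efficiency. On a real system, $Tiers would come from $VirtualDisk | Get-StorageTier.

```powershell
# Hypothetical sample tiers; the sizes and footprints are made up for illustration.
$Tiers = @(
    [PSCustomObject]@{ FriendlyName = 'Performance'; Size = 100GB; FootprintOnPool = 300GB }   # three-way mirror
    [PSCustomObject]@{ FriendlyName = 'Capacity';    Size = 900GB; FootprintOnPool = 1800GB }  # parity, assumed 50%
)

# Per-tier efficiency: Size / FootprintOnPool.
foreach ($Tier in $Tiers) {
    $Pct = [Math]::Round(100 * $Tier.Size / $Tier.FootprintOnPool, 1)
    "{0}: {1}%" -f $Tier.FriendlyName, $Pct
}

# Overall efficiency: sum of the sizes / sum of the footprints.
$TotalSize      = ($Tiers | Measure-Object -Property Size -Sum).Sum
$TotalFootprint = ($Tiers | Measure-Object -Property FootprintOnPool -Sum).Sum
$Overall        = [Math]::Round(100 * $TotalSize / $TotalFootprint, 1)
"Overall: {0}%" -f $Overall
```

With these sample numbers, 1000 GB of usable size sits on 2100 GB of footprint, so the overall figure is about 47.6%. It lands between the two per-tier values and is weighted toward the larger tier.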

U Can Haz Script

This script puts it all together, along with some formatting/prettifying magic, to produce this view. You can easily see your volumes, their capacity, how they’re filling up, how much physical capacity they occupy (and why), and the implied storage efficiency, in one easy table.

Let me know what you think!

Screenshot

Volumes, their capacities, how they’re filling up, resiliency, footprints, efficiency, all in one easy view.

Notes:

  1. This screenshot was taken on a 4-node system. At 16 nodes, Dual Parity can reach up to 80.0% efficiency.
  2. Because it queries so many objects and associations in SM-API, the script can take up to several minutes to run.
  3. You can download the script here, to spare yourself the 200-line copy/paste: http://cosmosdarwin.com/Show-PrettyVolume.ps1

# Written by Cosmos Darwin, PM
# Copyright (C) 2016 Microsoft Corporation
# MIT License
# 8/2016

Function ConvertTo-PrettyCapacity {

    Param (
        [Parameter(
            Mandatory=$True,
            ValueFromPipeline=$True
            )
        ]
    [Int64]$Bytes,
    [Int64]$RoundTo = 0 # Default
    )

    If ($Bytes -Gt 0) {
        $Base = 1024 # To Match PowerShell
        $Labels = ("bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB") # Blame Snover
        $Order = [Math]::Floor( [Math]::Log($Bytes, $Base) )
        $Rounded = [Math]::Round($Bytes/( [Math]::Pow($Base, $Order) ), $RoundTo)
        [String]($Rounded) + $Labels[$Order]
    }
    Else {
        0
    }
    Return
}


Function ConvertTo-PrettyPercentage {

    Param (
        [Parameter(Mandatory=$True)]
            [Int64]$Numerator,
        [Parameter(Mandatory=$True)]
            [Int64]$Denominator,
        [Int64]$RoundTo = 0 # Default
    )

    If ($Denominator -Ne 0) { # Cannot Divide by Zero
        $Fraction = $Numerator/$Denominator
        $Percentage = $Fraction * 100
        $Rounded = [Math]::Round($Percentage, $RoundTo)
        [String]($Rounded) + "%"
    }
    Else {
        0
    }
    Return
}

### SCRIPT... ###

$Output = @()

# Query Cluster Shared Volumes
$Volumes = Get-StorageSubSystem Cluster* | Get-Volume | ? FileSystem -Eq "CSVFS"

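ForEach ($Volume in $Volumes) {

    # Reset per-volume values so nothing carries over from the previous iteration
    $Filesystem = $Resiliency = $SizeMirror = $SizeParity = $Null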

    # Get MSFT_Volume Properties
    $Label = $Volume.FileSystemLabel
    $Capacity = $Volume.Size | ConvertTo-PrettyCapacity
    $Used = ConvertTo-PrettyPercentage ($Volume.Size - $Volume.SizeRemaining) $Volume.Size

    If ($Volume.FileSystemType -Like "*ReFS") {
        $Filesystem = "ReFS"
    }
    ElseIf ($Volume.FileSystemType -Like "*NTFS") {
        $Filesystem = "NTFS"
    }

    # Follow Associations
    $Partition   = $Volume    | Get-Partition
    $Disk        = $Partition | Get-Disk
    $VirtualDisk = $Disk      | Get-VirtualDisk

    # Get MSFT_VirtualDisk Properties
    $Footprint = $VirtualDisk.FootprintOnPool | ConvertTo-PrettyCapacity
    $Efficiency = ConvertTo-PrettyPercentage $VirtualDisk.Size $VirtualDisk.FootprintOnPool

    # Follow Associations (wrap in @() so zero or one tier still yields an array)
    $Tiers = @($VirtualDisk | Get-StorageTier)

    # Get MSFT_VirtualDisk or MSFT_StorageTier Properties...

    If ($Tiers.Length -Lt 2) {

        If ($Tiers.Length -Eq 0) {
            $ReadFrom = $VirtualDisk # No Tiers
        }
        Else {
            $ReadFrom = $Tiers[0] # First/Only Tier
        }

        If ($ReadFrom.ResiliencySettingName -Eq "Mirror") {
            # Mirror
            If ($ReadFrom.PhysicalDiskRedundancy -Eq 1) { $Resiliency = "2-Way Mirror" }
            If ($ReadFrom.PhysicalDiskRedundancy -Eq 2) { $Resiliency = "3-Way Mirror" }
            $SizeMirror = $ReadFrom.Size | ConvertTo-PrettyCapacity
            $SizeParity = [string](0)
        }
        ElseIf ($ReadFrom.ResiliencySettingName -Eq "Parity") {
            # Parity
            If ($ReadFrom.PhysicalDiskRedundancy -Eq 1) { $Resiliency = "Single Parity" }
            If ($ReadFrom.PhysicalDiskRedundancy -Eq 2) { $Resiliency = "Dual Parity" }
            $SizeParity = $ReadFrom.Size | ConvertTo-PrettyCapacity
            $SizeMirror = [string](0)
        }
        Else {
            Write-Host -ForegroundColor Red "What have you done?!"
        }
    }

    ElseIf ($Tiers.Length -Eq 2) { # Two Tiers

        # Mixed / Multi- / Hybrid
        $Resiliency = "Mix"

        ForEach ($Tier in $Tiers) {
            If ($Tier.ResiliencySettingName -Eq "Mirror") {
                # Mirror Tier
                $SizeMirror = $Tier.Size | ConvertTo-PrettyCapacity
                If ($Tier.PhysicalDiskRedundancy -Eq 1) { $Resiliency += " (2-Way" }
                If ($Tier.PhysicalDiskRedundancy -Eq 2) { $Resiliency += " (3-Way" }
            }
        }
        ForEach ($Tier in $Tiers) {
            If ($Tier.ResiliencySettingName -Eq "Parity") {
                # Parity Tier
                $SizeParity = $Tier.Size | ConvertTo-PrettyCapacity
                If ($Tier.PhysicalDiskRedundancy -Eq 1) { $Resiliency += " + Single)" }
                If ($Tier.PhysicalDiskRedundancy -Eq 2) { $Resiliency += " + Dual)" }
            }
        }
    }

    Else {
        Write-Host -ForegroundColor Red "What have you done?!"
    }

    # Pack

    $Output += [PSCustomObject]@{
        "Volume" = $Label
        "Filesystem" = $Filesystem
        "Capacity" = $Capacity
        "Used" = $Used
        "Resiliency" = $Resiliency
        "Size (Mirror)" = $SizeMirror
        "Size (Parity)" = $SizeParity
        "Footprint" = $Footprint
        "Efficiency" = $Efficiency
    }
}

$Output | Sort Efficiency, Volume | FT
