Deploying Data Deduplication for VDI storage in Windows Server 2012 R2

This post is a part of the nine-part “What’s New in Windows Server & System Center 2012 R2” series that is featured on Brad Anderson’s In the Cloud blog. Today’s blog post covers Data Deduplication and how it applies to the larger topic of “Transform the Datacenter.” To read that post and see the other technologies discussed, read today’s post: “Delivering Infrastructure as a Service (IAAS).”

With the Windows Server 2012 R2 Preview, Data Deduplication is extended to the remote storage of the VDI workload:

CSV Volume support

Faster deduplication of data

Deduplication of open (in use) files

Faster read/write performance of deduplicated files

See http://blogs.technet.com/b/filecab/archive/2013/07/31/extending-data-deduplication-to-new-workloads-in-windows-server-2012-r2.aspx for more details.

Why do I want to use Data Deduplication with VDI?

To start with: You will save space! Deduplication rates for VDI deployments can range as high as 95% savings. Of course that number will vary depending on the amount of user data, etc and it will also change over the course of any one day.

Data Deduplication optimizes files as a post processing operation. That means, as data is added over the course of a day, it will not be optimized immediately and take up extra space on disk. Instead, the new data will be processed by a background deduplication job. As a result, the optimization ratio of a VDI deployment will fluctuate a bit over the course of a day, depending on home much new data is added. By the time next optimization is done, savings will be high again.

Saving space is great on its own, but it has an interesting side effect. Volumes that were always too small, but had other advantages are suddenly viable. One such example are SSD volumes. Traditionally, you had to deploy very many of these drives to reach volume sizes that were viable for a VDI deployment. This was of course expensive for the disks, but also considering the increased needs for JBODs, power, cooling, etc. With Data Deduplication in the picture SSD based volumes can suddenly hold vastly more data and we can finally utilize more of their IO capabilities without incurring additional infrastructure costs.

On the other hand, due to the fact that Data Deduplication consolidates files, more efficient caching mechanisms are possible. This results in improving the IO characteristics of the storage subsystem for some types of operations.

As a result of these, we can often stretch the VM capacity of the storage subsystem without buying additional hardware or infrastructure.

How do I deploy VDI with Data Deduplication in Windows Server 2012 R2 Preview then?

This turns out to be relatively straight forward, assuming you know how to setup VDI, of course. The generic VDI setup will not be covered here, but rather we will cover how Data Deduplication changes things. Let’s go through the steps:

1. Machine deployment

First and foremost, to deploy Data Deduplication with VDI, the storage and compute responsibilities must be provided by separate machines.

The good news is that the Hyper-V and VDI infrastructure can remain as it is today. The setup and configuration of both is pretty much unaltered. The exception is that all VHD files for the VMs must be stored on a file server running Windows Server 2012 R2 Preview. The storage on that file server may be directly attached disks or provided by a SAN/iSCSI.

In the interest of ensuring that storage stays available, the file server should ideally be clustered with CSV volumes providing the storage locations for the VHD files.

2. Configuring the File Server

Create a new CSV volume on the File Server Cluster using your favorite tool (we would suggest System Center Virtual Machine Manager). Then enable Data Deduplication on that volume. This is very easy to do in PowerShell:

Enable-DedupVolume C:\ClusterStorage\Volume1 –UsageType HyperV

This is basically the same way Data Deduplication is enabled for a general file share, however it ensures that various advanced settings (such as whether open files should be optimized) are configured for the VDI workload.

In the Windows Server 2012 R2 Preview one additional step has to be done that will not be required in the future. The default policy for Data Deduplication is now to only optimize files that are older than 3 days. This of course does not work for open VHD files since they are constantly being updated. In the future, Data Deduplication will address this by enabling “Partial File Optimization” mode, in which it optimizes parts of the file that are older than 3 days. To enable this mode in the Preview, run the following command

Set-DedupVolume C:\ClusterStorage\Volume1 –OptimizePartialFiles

3. VDI deployment

Deploy VDI VMs as normal using the new share as the storage location for VHDs.

With one caveat.

If you made a volume smaller than the amount of data you are about to deploy on it, you need some special handling. Data Deduplication runs as a post-processing operation.

Let us say we want to deploy 120GB of VHD files (6 VHD files of 20GB each) onto a 60 GB volume with Data Deduplication enabled.

To do this, deploy VMs onto the volume as they will fit leaving at least 10GB of space available. In this case, we would deploy 2 VMs (20GB + 20GB + 10GB < 60GB). Then run a manual deduplication optimization job:

Start-DedupJob C:\ClusterStorage\Volume1 –Type Optimization

Once this completes, deploy more VMs. Most likely, after the first optimization, there will be around 10GB of space used. That leaves room for another 2 VMs. Deploy these 2 VMs and repeat the optimization run.

Repeat this procedure until all VMs are deployed. After this the default background deduplication job will handle future changes.

4. Ongoing management of Data Deduplication

Once everything is deployed, managing Data Deduplication for VDI is no different than managing it for a general file server. For example, to get optimization savings and status:

Get-DedupVolume | fl
 Get-DedupStatus | fl
 Get-DedupJob

It may at times occur that a lot of new data is added to the volume and the standard background task is not able to keep up (since it stops when the server gets busy). In that case you can start a “throughput” optimization job that will simply keep going until the work is done:

Start-DedupJob D: –Type Optimization

Wrap-up

Overall, deploying Data Deduplication for VDI is relatively simple operation, though it may require some additional planning along the way.

To see all of the posts in this series, check out the What’s New in Windows Server & System Center 2012 R2 archive.

Deploying Data Deduplication for VDI storage in Windows Server 2012 R2

Why do I want to use Data Deduplication with VDI?

How do I deploy VDI with Data Deduplication in Windows Server 2012 R2 Preview then?

1. Machine deployment

2. Configuring the File Server

3. VDI deployment

4. Ongoing management of Data Deduplication

Wrap-up

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112