Hi folks, Ned here again. By now, you know that DFS Replication has some major new features in Windows Server 2012 R2. Today I talk about one of the most radical: DFSR database cloning.
Prepare for a long post, this has a walkthrough…
The old ways are not always the best
DFSR, or any proper file replication technology, spends a great deal of time validating that servers have the same knowledge. This is critical to safe and reliable replication; if a server doesn’t know everything about a file, it can’t tell its partner about that file. In DFSR, we often refer to this “initial build” and “initial sync” processing as “initial replication”. DFSR needs to grovel files and folders, record their information in a database on that volume, exchange that information between nodes, stage files and create hashes, then transmit that data over the network. Even if you preseed the files on each server before configuring replication, the metadata transmissions are still necessary. Each server performs this initial build process locally, and then the non-authoritative server checks his work against an authoritative copy and reconciles the differences.
This process is necessarily very expensive. Heaps of local IO, oodles of network conversation, tons of serialized exchanges based on directory structures. As you add bigger and more complex datasets, initial replication gets slower. A replicated folder that contains tens of millions of preseeded files can take weeks to synchronize the databases, even with preseeding stopping the need to send the actual files.
Furthermore, there are times when you need to recreate replication of previously synchronized data, such as when:
1. Upgrading operating systems
2. Replacing computer hardware
3. Recovering from a disaster
4. Redesigning the replication topology
Any one of these requires re-running initial replication on at least one node. This has been the way of DFSR since Microsoft introduced it in Windows Server 2003 R2.
Cutting out the middle man
DFSR database cloning is an optional alternative to so-called classic initial replication. By providing each downstream server with an exported copy of the upstream server’s database and preseeded files, DFSR reduces or eliminates the need for over-the-wire metadata exchange. DFSR database cloning also provides multiple file validation levels to ensure reconciliation of files added, modified, or deleted after the database export but before the database import. After file validation, initial sync is now instantaneous if there are no differences. If there are differences, DFSR only has to synchronize the delta of changes as part of a shortened initial sync process.
We are talking about fundamental, state-of-the-art performance improvements here, folks. To steal from my previous post, let’s compare a test run with ~10 terabytes of data in a single volume comprising 14,000,000 preseeded files:
“Classic” initial sync | Time to convergence |
Preseeded | ~24 days |
Now, with DB cloning:
Validation Level | Time to export | Time to import | Improvement % |
2 – Full | 9 days, 0 hours | 5 days, 10 hours | 40% |
1 – Basic | 2 hours, 48 minutes | 9 hours, 37 minutes | 98% |
0 – None | 1 hour, 13 minutes | 6 hours, 8 minutes | 99% |
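The improvement column can be reproduced with a little arithmetic. This is a minimal sketch, assuming each percentage compares the combined export and import time against the ~24 day classic preseeded run; the variable names are only for illustration.

```powershell
# Baseline from the table above: classic preseeded initial sync took ~24 days.
$classic = New-TimeSpan -Days 24

# Export and import times per validation level, copied from the table above.
$runs = @(
    [pscustomobject]@{ Level = '2 - Full';  Export = New-TimeSpan -Days 9;              Import = New-TimeSpan -Days 5 -Hours 10 }
    [pscustomobject]@{ Level = '1 - Basic'; Export = New-TimeSpan -Hours 2 -Minutes 48; Import = New-TimeSpan -Hours 9 -Minutes 37 }
    [pscustomobject]@{ Level = '0 - None';  Export = New-TimeSpan -Hours 1 -Minutes 13; Import = New-TimeSpan -Hours 6 -Minutes 8 }
)

# Assumption: improvement = 1 - (export + import) / classic, rounded to a whole percent.
$results = foreach ($run in $runs) {
    $total = $run.Export + $run.Import
    [pscustomobject]@{
        Level              = $run.Level
        Total              = $total
        ImprovementPercent = [math]::Round(100 * (1 - $total.TotalHours / $classic.TotalHours))
    }
}

$results | Format-Table -AutoSize
```

Under that assumption the three rows come out at 40, 98, and 99 percent, matching the table.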
I think we can actually do better than this – we found out recently that we’re having some CPU underperformance in our test hardware. I may be able to re-post even better numbers someday.
For instance, here I created exactly one million files and cloned that volume, using VMs running on a 3 year old test server.
The export:
The import:
That’s just over 12 minutes total. It’s awesome, like a grizzly bear that shoots lasers from its eyeballs. Yes, I own one of these shirts and I am not ashamed.
At a high level
Let’s examine the mainline case of creating a new replication topology using DB cloning:
1. You create a replication group and a replicated folder, then add a server as a member of that topology (but no partners, yet). This will be the “upstream” (source) server.
2. You let initial build complete
3. You export the cloned database from the upstream server
4. You preseed the files to the downstream (destination) server and copy in the exported clone DB files
5. You import the cloned database on the downstream server
6. You add the downstream server to the replication group and RF membership, just like classic DFSR
7. You let the initial sync validation complete
If you did everything right, step 7 is done instantly, and the server is now replicating normally for all further file additions, modifications, and deletions. It’s straightforward stuff, with only a handful of steps.
Walkthrough
Let’s get some hands-on with DB cloning. Below is a walkthrough using the new DFSR Windows PowerShell module and the mainstream “setting up a new replication topology” scenario.
Requirements and sample setup
- Active Directory domain with at least one domain controller (does not need to run Windows Server 2012 R2)
- AD schema updated to at least Windows Server 2008 R2 (there are no forest or domain functional level requirements)
- Two file servers running Windows Server 2012 R2 and joined to the domain (Windows Server 2012 and earlier file servers cannot participate in cloning scenarios, but do support replication with Windows Server 2012 R2)
You can use virtualized DFSR servers or physical ones; it makes no difference. This walkthrough uses the following domain environment as an example:
- One domain controller
- Two member servers, named SRV01 and SRV02
Configure DFSR
To configure the DFSR role on SRV01 and SRV02 using Windows PowerShell, run the following command on each server:
Install-WindowsFeature -Name FS-DFS-Replication -IncludeManagementTools
Alternatively, to configure the DFSR role using Server Manager:
1. Start Server Manager.
2. Click Manage, and then click Add Roles and Features.
3. Proceed to the Server Roles page, then select DFS Replication, leave the default option to install the Remote Server Administration Tools selected, and continue to the end.
Configure volumes
On SRV01 and SRV02, configure an F:, G:, and H: drive with NTFS. Each drive should be at least 2GB in size. If your test servers do not already have these drives configured or don’t have additional disks, you can shrink the existing C: volume with Resize-Partition, DiskMgmt.Msc, or Diskpart.exe, and then format the new volumes. Multiple drives allow you to test cloning multiple times without starting over too often – remember, DFSR databases are per-volume, and therefore cloning is as well.
For example, using Windows PowerShell with a virtual machine that has one 40GB disk and C: volume:
Get-Partition | Format-Table -auto
Resize-Partition -DiskNumber 0 -PartitionNumber 1 -Size 33GB
New-Partition -DiskNumber 0 -Size 2GB -DriveLetter f | Format-Volume
New-Partition -DiskNumber 0 -Size 2GB -DriveLetter g | Format-Volume
New-Partition -DiskNumber 0 -Size 2GB -DriveLetter h | Format-Volume
Clone a DFSR database
1. On the upstream server SRV01 only, create H:\RF01 and create or copy in some test files (such as by copying the 2,000 largest files located directly in the C:\Windows\SysWow64 folder).
Important: Windows Server 2012 R2 Preview contains a bug that restricts cloning to under 3,100 files and folders – if you add more files, cloning export never completes. Ehhh, sorry: we fixed this issue before Preview shipped, but even then it was too late due to the build’s age. Do not attempt to clone more than 3,100 files with Basic validation while using the Preview version of Windows Server 2012 R2. If you want to use more files, use -Validation None. The RTM version of DFSR DB cloning will not have this limitation.
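One way to gather that test data is sketched below. Copy-LargestFile is a hypothetical helper written for this walkthrough, not part of the DFSR module; it only looks at files directly inside the source folder and creates the destination folder if needed.

```powershell
function Copy-LargestFile {
    param(
        [Parameter(Mandatory)] [string] $Source,
        [Parameter(Mandatory)] [string] $Destination,
        [int] $Count = 2000
    )
    # Create the destination folder if it does not exist yet.
    New-Item -Path $Destination -ItemType Directory -Force | Out-Null

    # -File skips subfolders, so only files directly inside $Source are considered.
    Get-ChildItem -Path $Source -File |
        Sort-Object -Property Length -Descending |
        Select-Object -First $Count |
        Copy-Item -Destination $Destination
}

# Example for this walkthrough (run on SRV01):
# Copy-LargestFile -Source 'C:\Windows\SysWOW64' -Destination 'H:\RF01' -Count 2000
```

Keeping the count at 2,000 stays under the Preview build's 3,100 file limit described above.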
Use the New-DfsReplicationGroup, New-DfsReplicatedFolder, Add-DfsrMember, and Set-DfsrMembership cmdlets to create a replicated folder and membership for SRV01 only, using only the H:\RF01 directory as the replicated folder content path. You must specify PrimaryMember as $True for this membership, so that the server performs initial build with no need for partners. You can run these commands on any server.
Note: Do not add SRV02 as a member nor create a connection between the servers in this new RG. We don’t want that server starting classic replication.
Sample:
New-DfsReplicationGroup -GroupName "RG01"
New-DfsReplicatedFolder -GroupName "RG01" -FolderName "RF01"
Add-DfsrMember -GroupName "RG01" -ComputerName SRV01
Set-DfsrMembership -GroupName "RG01" -FolderName "RF01" -ContentPath "H:\RF01" -ComputerName SRV01 -PrimaryMember $True
Update-DfsrConfigurationFromAD -ComputerName SRV01
Note the sample output below and how I used the built-in –Verbose parameter to see more AD polling details:
2. Wait for a DFS Replication Event 4112 in the DFS Replication Event Log, which indicates that the replicated folder initialized successfully as primary.
Note below in the sample output how I have a 6020 event; in a cloning scenario, this event is expected and supported, despite what its message text implies.
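If you would rather not refresh the event log by hand, a polling loop can do the waiting. This is a rough sketch: Wait-DfsrEvent is a hypothetical helper, and the default timeout, poll interval, and one hour look-back window are arbitrary assumptions. It relies only on Get-WinEvent with a filter hashtable against the "DFS Replication" log.

```powershell
function Wait-DfsrEvent {
    param(
        [Parameter(Mandatory)] [int] $Id,
        [string] $ComputerName = $env:COMPUTERNAME,
        [datetime] $After = (Get-Date).AddHours(-1),
        [int] $TimeoutSeconds = 600,
        [int] $PollSeconds = 10
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    $filter = @{ LogName = 'DFS Replication'; Id = $Id; StartTime = $After }
    do {
        # Get-WinEvent raises an error when nothing matches yet, so suppress it and retry.
        $found = Get-WinEvent -ComputerName $ComputerName -FilterHashtable $filter -MaxEvents 1 -ErrorAction SilentlyContinue
        if ($found) { return $found }
        Start-Sleep -Seconds $PollSeconds
    } while ((Get-Date) -lt $deadline)
    throw "Timed out waiting for DFS Replication event $Id on $ComputerName."
}

# Example: wait for the initial build to finish on the upstream server.
# Wait-DfsrEvent -Id 4112 -ComputerName SRV01
```

The same function should work for any other event ID you need to wait on; only the -Id value changes.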
3. Export the cloned database and volume config XML for the H: drive. Export requires that the output folder for the database and configuration XML file already exists. It also requires that no replicated folders on that volume be in an initial build or initial sync phase of processing.
Sample:
New-Item -Path "H:\Dfsrclone" -Type Directory
Export-DfsrClone -Volume H: -Path "H:\Dfsrclone"
Note the use of the -Validation parameter in the sample output below. Cloning provides three levels of file validation during the export and import processing. These ensure that if you are allowing users to alter data on the upstream server while cloning is occurring, files are later reconciled on the downstream.
- None - No validation of files on source or destination server. Fastest and most optimistic. Requires that you preseed data perfectly. Any modification of data during the clone processing on the servers will not be detected or replicated until it is later modified after cloning.
- Basic - (Default behavior, Microsoft recommended). Each file’s existing database record is updated with a hash of the ACL, the file size, and the last modified time. Good mix of fidelity and performance. This is the recommended validation level, and the maximum one you should use if you are replicating more than 10TB of data. Yes, we are going to support much more than 10TB and 11M files in WS2012 R2 as long as you use cloning; we’ll give you an official number at RTM.
- Full - Same hashing mechanism used by DFSR during normal operations. Hash stored in database record for each file. Slowest but highest fidelity. If you exceed 10TB, we do not recommend using this value due to the comparatively poor performance.
We recommend that you do not allow users to add, modify, or delete files on the source server as this makes cloning less effective, but we realize you live in the real world. Hence, the validation code.
Important: You should not let users modify or access files on the downstream (destination) server until cloning completes end-to-end and replication is working. This is no different from our normal “classic” initial sync replication best practice for the past 8 years of DFSR, as there is a high likelihood that users will lose their changes through conflict resolution or movement to the preexisting files store. If users need to access these files and make changes, only allow them to access the original source server from which you exported.
Note the hint outputs above. The export cmdlet shows a suggested copy command for the database export folder. It also suggests preseeding hints for any replicated folders on that volume that will clone. All you have to do is fill in your destination server name and RF path.
4. Wait for a DFS Replication Event 2402 in the DFS Replication Event Log, which indicates that the export completed successfully. As you can see from the sample outputs, there are four event IDs of note when exporting: 2406, 2410 (there may be many of these, they are progress indicators), 2402, and finally 2002 (which brings the volume back online for normal replication).
As you can see from my example, I cloned more than 3,100 files. I told you we fixed it already!
5. Preseed the file and folder data from the source computer to the new downstream computer that will clone the DFS Replication database.
Important: There should be no existing replicated folder content (folders, files, or database) on the downstream server's volume that will perform cloning – let the preseeding fill it all in in this mainstream scenario. Microsoft recommends that you do not create network shares to the data until completion of cloning and do not allow users to add, modify, or delete files on the downstream server until post-initial replication is operational.
Sample preseeding command hint:
Robocopy.exe "H:\RF01" "\\SRV02\H$\RF01" /E /B /COPYALL /R:6 /W:5 /MT:64 /XD DfsrPrivate /TEE /LOG+:preseed.log
Important: Do not use the robocopy /MIR option on the root of a volume, do not manually create the replicated folder on the downstream server, and do not re-run robocopy over files that you already copied previously (i.e. if you have to start over, delete the destination folder and file structure and really start over). Let robocopy create all folders and copy all contents to the downstream server, via the /E /B /COPYALL options, every time you run it. Otherwise, you are very likely to end up with hash mismatches.
Robocopy can be a bit… finicky.
6. Copy the contents of the exported folder, both the database and xml, to the downstream server and save them in a temporary folder on the volume that contains the populated file data.
Sample database file copy command:
Robocopy.exe "H:\Dfsrclone" "\\SRV02\h$\dfsrclone" /B
7. On the downstream server SRV02, ensure that you correctly performed preseeding by using the Get-DfsrFileHash cmdlet to spot-check folders and files, and then compare to the upstream copies.
This sample shows hashes for all the files beginning with “pri”:
PS C:\> Get-DfsrFileHash "\\SRV01\H$\RF01\pri*"
PS C:\> Get-DfsrFileHash "\\SRV02\H$\RF01\pri*"
Sample output showing an easy “eyeball comparison”
I recommend you run this on multiple small file subsets and at a few subfolder levels. There are many other examples of using the new Get-DfsrFileHash cmdlet here on TechNet already, including using the Compare-Object cmdlet to get fancy-schmancy.
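Here is a sketch of the fancier comparison. It assumes the objects returned by Get-DfsrFileHash expose Path and FileHash properties; Compare-PreseedHash is a hypothetical helper. Because the UNC prefixes differ per server, it matches files by name only, so use it on one folder level at a time.

```powershell
function Compare-PreseedHash {
    param(
        [Parameter(Mandatory)] [object[]] $Upstream,
        [Parameter(Mandatory)] [object[]] $Downstream
    )
    # Reduce each record to its file name plus hash so the server name is ignored.
    $toRecord = {
        param($item)
        [pscustomobject]@{
            Name     = ($item.Path -split '[\\/]')[-1]
            FileHash = $item.FileHash
        }
    }
    $up   = $Upstream   | ForEach-Object { & $toRecord $_ }
    $down = $Downstream | ForEach-Object { & $toRecord $_ }

    # No output means every name and hash pair matched on both sides.
    Compare-Object -ReferenceObject @($up) -DifferenceObject @($down) -Property Name, FileHash
}

# Example for this walkthrough:
# Compare-PreseedHash -Upstream (Get-DfsrFileHash '\\SRV01\H$\RF01\pri*') `
#                     -Downstream (Get-DfsrFileHash '\\SRV02\H$\RF01\pri*')
```

Any rows that come back carry a SideIndicator column showing which server holds the odd hash, which is usually enough to decide whether to redo the preseeding.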
8. Ensure that the System Volume Information\DFSR folder does not exist on this downstream SRV02 server H: drive.
Important: Naturally, this server must not already be participating in replication on that volume and if it is, you cannot clone into it.
Sample (note: you may need to stop the DFSR service, run this, and then start the DFSR service):
RD 'H:\System Volume Information\DFSR' -Recurse -Force
Note: When re-using existing files that were previously replicated, you are likely to run into some benign errors when running this command due to the MAX_PATH limitations of RD, where some of the Staging folder contents will be too long to delete. You can ignore those warnings, or if you want to clean out the folder completely, you can use this workaround:
A. Create an empty folder on H: called "H:\empty"
B. Run the following command:
robocopy h:\empty "h:\system volume information\dfsr" /MIR
C. Delete the now empty “system volume information\DFSR” folder after the robocopy command completes.
9. Import the cloned database on SRV02. For example:
Import-DfsrClone -Volume H: -Path "H:\Dfsrclone"
10. Wait for a DFSR Informational Event 2404 in the DFS Replication Event Log, which indicates that the import completed successfully. As you can see from the sample outputs, there are four event IDs of note when importing: 2412, 2416 (there may be many of these, they are progress indicators), 2418, and finally 2404.
11. Add the downstream SRV02 server as a member of the replication group using Add-DfsrMember, set its membership using Set-DfsrMembership for the -ContentPath matching H:\rf01, and create bi-directional replication connections between the upstream and downstream servers using Add-DfsrConnection.
Add-DfsrMember -GroupName "rg01" -ComputerName srv02
Add-DfsrConnection -GroupName "rg01" -SourceComputerName srv01 -DestinationComputerName srv02
Set-DfsrMembership -GroupName "rg01" -FolderName "rf01" -ComputerName srv02 -ContentPath "h:\rf01"
Update-DfsrConfigurationFromAD srv01,srv02
Note in the sample output how I use Get-DfsrMember in a pipeline to force AD polling operations on all members in the RG01 replication group, instead of having to run it for each server. Imagine how much easier this will make administering environments with dozens or hundreds of DFSR nodes.
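That pipeline form looks roughly like this. The wrapper function name is hypothetical and exists only to make the one-liner reusable; it assumes Update-DfsrConfigurationFromAD picks up each member's computer name from the pipeline.

```powershell
function Update-DfsrGroupConfiguration {
    param(
        [Parameter(Mandatory)] [string] $GroupName
    )
    # Every member of the replication group polls AD; no server names to type.
    Get-DfsrMember -GroupName $GroupName | Update-DfsrConfigurationFromAD
}

# Example for this walkthrough:
# Update-DfsrGroupConfiguration -GroupName "RG01"
```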
12. Wait for the DFSR informational event 4104, which indicates that the server is now normally replicating files. Unlike your previous experience, there will not be a preceding 4102 event when enabling replication of a cloned volume. If there are any changed files on the upstream server since you performed export cloning, those files will replicate inbound to the downstream server authoritatively and you will see 4412 conflict events. If you allowed users to modify data on the downstream server – and again, you shouldn’t - while cloning operations were ongoing, those files will conflict (and lose) or move to the preexisting folder, and any files the user had deleted will replicate back in again from the upstream. This is identical to classic initial sync behavior.
Cheat sheet
Now that you have tried out the controlled scenario once, here is a cut down “quick steps” version you can use for further testing with those F: and G: drives on your own; once you use those up, you will need to remove the server from replication for those volumes in order to try some more experimentation with things like a 3rd server or cloning from an existing replicated folder.
In this case, I am using the F: drive with its RF02 replicated folder in the RG02 replication group. Keep in mind – you don’t have to keep creating new RGs and we support cloning multiple custom writable RFs on a volume. These are just simplified walkthroughs, after all.
On the upstream SRV01 server:
New-DfsReplicationGroup "RG02" | New-DfsReplicatedFolder -FolderName "RF02" | Add-DfsrMember -ComputerName SRV01
Set-DfsrMembership -GroupName "RG02" -ComputerName SRV01 -ContentPath F:\Rf02 -PrimaryMember $True -FolderName "RF02"
Update-DfsrConfigurationFromAD
Get-WinEvent "DFS Replication" -MaxEvents 10 | fl
New-Item -Path "f:\Dfsrclone" -Type Directory
Export-DfsrClone -Volume f: -Path "f:\Dfsrclone"
Robocopy.exe "F:\RF02" "\\SRV02\F$\RF02" /E /B /COPYALL /R:6 /W:5 /MT:64 /XD DfsrPrivate /TEE /LOG+:preseed.log
Robocopy.exe f:\Dfsrclone \\srv02\f$\Dfsrclone
On the downstream SRV02 server (note: you may need to stop the DFSR service to perform the first step; be sure to start it up again so that you can run the import)
RD "F:\System Volume Information\DFSR" -Force -Recurse
Import-DfsrClone -Volume F: -Path "f:\Dfsrclone"
Get-WinEvent "DFS Replication" -MaxEvents 10 | fl
Add-DfsrMember -GroupName "RG02" -ComputerName "SRV02" | Set-DfsrMembership -FolderName "RF02" -ContentPath "f:\Rf02"
Add-DfsrConnection -GroupName "RG02" -SourceComputerName "SRV01" -DestinationComputerName "SRV02"
Update-DfsrConfigurationFromAD SRV01,SRV02
Get-WinEvent "DFS Replication" -MaxEvents 10 | fl
Some simple troubleshooting
While the future TechNet content on DB cloning contains a complete troubleshooting section, here are some common issues seen by first-time users of this new feature:
Symptom | Export-DfsrClone does not show RootFolderPath or PreseedingHint output for SYSVOL or read-only replicated folders. After running Import-DfsrClone, SYSVOL and read-only replicated folders are not imported. |
Cause | DFSR cloning does not support SYSVOL or read-only replicated folders in Windows Server 2012 R2. Those folders are skipped by cloning. This behavior is by design. |
Resolution | Configure replication of read-only replicated folders using classic initial sync. Configure SYSVOL by promoting domain controllers normally. |
Symptom | Export-DfsrClone does not show RootFolderPath and PreseedingHint output for one or more custom replicated folders. After running Import-DfsrClone, not all custom replicated folders are imported. |
Cause | DFSR cloning does not support replicated folders that are currently in initial sync or initial building. Those replicated folders are skipped by cloning. |
Resolution | Ensure that all replicated folders on a volume are in a normal, non-initial building, non-initial synchronizing state. Any replicated folders that did not get DFSR event 4112 (primary server) after initial build started, or event 4104 (non-primary server) after initial sync completed, are not capable of cloning yet. If your event logs have wrapped, you can use WMI to determine if a replicated folder is ready to clone:
PS C:\> Get-WmiObject -Namespace "root\Microsoft\Windows\DFSR" -Class msft_dfsrreplicatedfolderinfo -ComputerName <some server> | ft replicatedfoldername,state -auto -wrap |
Symptom | Import-DfsrClone fails with errors: “Import-DfsrClone : Could not import the database clone for the volume h: to "H:\dfsrclone". Confirm that you are running in an elevated Windows PowerShell session, the DFSR service is running, and that you are a member of the local Administrators group. Error code: 0x80131500. Details: The WS-Management service cannot process the request. The WMI service or the WMI provider returned an unknown error: HRESULT 0x80041001” |
Cause | You did not preseed the replicated folders onto the destination volume with the same name and relative path. |
Resolution | Ensure that you preseed the source replicated folders onto the destination volume using the same folder names and relative paths (i.e. if the source replicated folder was on “d:\dfsr\rf01”, the destination volume must contain “<volume>:\dfsr\rf01”). |
Symptom | DFSR event 2418 shows a significant mismatch count. Cloning takes as long as classic non-preseeded initial sync. |
Cause | Files were not preseeded onto the destination server correctly or at all. |
Resolution | Validate your preseeding technique and results. Reattempt the export and import process. |
Symptom | Export-DfsrClone never completes or returns any output when using -Validation Basic or not specifying -Validation. |
Cause | Code defect in Windows Server 2012 R2 Preview build only, when cloning more than 3,100 files on a volume. |
Resolution | This is a known issue in the Preview build. This was resolved in later builds. As a workaround, limit the number of files replicated with basic validation to under 3,100 per volume. If you wish to see the cloning performance with a larger dataset, use 3,100 much larger sample files (such as ISO, CAB, MSI, VHD, or VHDX files). Alternatively, use validation level none (0) instead of basic. |
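The State value returned by the WMI query in the table above is numeric. Below is a minimal sketch for turning it into something readable. It assumes the MSFT_DfsrReplicatedFolderInfo class uses the same state codes as the older DfsrReplicatedFolderInfo class (0 through 5, with 4 meaning Normal); both function names are hypothetical helpers.

```powershell
function ConvertTo-DfsrStateName {
    param([Parameter(Mandatory)] [int] $State)
    # Assumed mapping, carried over from the older DfsrReplicatedFolderInfo WMI class.
    $names = @{
        0 = 'Uninitialized'
        1 = 'Initialized'
        2 = 'Initial Sync'
        3 = 'Auto Recovery'
        4 = 'Normal'
        5 = 'In Error'
    }
    if ($names.ContainsKey($State)) { $names[$State] } else { "Unknown ($State)" }
}

function Test-DfsrReadyToClone {
    param([Parameter(Mandatory)] [int] $State)
    # Only a replicated folder in the Normal state is past initial build and initial sync.
    $State -eq 4
}

# Example:
# Get-WmiObject -Namespace "root\Microsoft\Windows\DFSR" -Class msft_dfsrreplicatedfolderinfo |
#     Select-Object replicatedfoldername, @{ n = 'StateName'; e = { ConvertTo-DfsrStateName $_.State } }
```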
Where you can learn more
We have a comprehensive set of cloning and preseeding TechNet content on the way, as well as updates to the DFSR FAQ. These include steps on cloning an existing replica, dealing with hub servers that have many unique replicated folders from branch offices, using cloning to recover a corrupted database, and replacing or upgrading servers. Not to mention the new supported DFSR size limits!
Once those are public, I will update this post with live links.
- Ned “hmmm, I didn’t make any Star Wars references after all” Pyle