Should I use a Lagged Copy in my DAG?

Customer asks:

I hope you remember the discussion we had back in October 2010 around the need for backups for Exchange.  Well, I feel like we are running in circles around Exchange 2010 backup and recovery.  I felt relatively comfortable with the idea until recently when we started making some further changes to the environment in response to the growth of the log files.  We are attempting to switch to circular logging because Avamar doesn’t reset the archive bit and thus the logs never get truncated and grow until the volume is full.  Allegedly having 3 copies of the db’s is supported/recommended using circular logging but it has raised some questions:
*   If we have 2 db servers running on vm’s in the Cincinnati data center, is it advisable to use the lag copy for the 3rd db server in Lexington or does this violate the “rule of 3”?
*   Is it better to have all 3 in the same DAG with no lag at all vs. having the lag copy?
*   In what scenario would having the lag copy be beneficial?
*   If we enable circular logging on the active db server, do all of the other servers mirror that setting?  Is this Circular Logging Continuous Replication (CLCR)?
*   If the passive copies don’t pick up the change in log file settings, is it possible/advisable to have an alternate log mechanism on the other copies?
*   Now that we have increased our capacity in Avamar, what would a reasonable DR backup plan look like?  Currently we are using weekly.
*   How do SAN snapshots fill in the gaps between backups and log files?  We use Dell/Equallogic storage today and while the auto-snapshot feature sounds nice, in reality the snapshots are huge and thus we can only keep one snapshot per db.  I’m sure you can comment on how much better EMC’s tools are and I actually welcome that discussion.

Ok… Here’s what I wrote back:

First, thank you for your message.  I have escalated your issue with the Avamar client.  Not truncating logs at the end of a successful backup is errant behavior — we need to find out why this is not working for you.
Second… I’ll take your questions in a Q&A format:
Q *   If we have 2 db servers running on vm’s in the Cincinnati data center, is it advisable to use the lag copy for the 3rd db server in Lexington or does this violate the “rule of 3”?
A *  A lagged copy violates the “rule of three” — the three copies act as block-for-block voting members when repairing torn pages (a likely scenario if you are running on SATA direct-attached/non-intelligent storage devices); a lagged copy will not be counted as a voting member.
Q  *   Is it better to have all 3 in the same DAG with no lag at all vs. having the lag copy?
A * If you are running on an intelligent array (like ALL of EMC’s arrays, which perform advanced CRC tracking and scanning), you will not need Exchange to vote on page repairs — there won’t be any! — so a lagged copy can be used to go backwards in time.  Technically, you would need two “near” copies and one “far” lagged copy to address your temporal repair issues.  More on this below.
Q  *   In what scenario would having the lag copy be beneficial?
A * A corruption event that affects the active copies of the database — you could use the lagged copy, with its unplayed logs, to isolate the corruption within an unplayed log and prevent it from infecting the only unaffected copy you have left…  This is why a backup becomes important…
Q  *   If we enable circular logging on the active db server, do all of the other servers mirror that setting?  Is this Circular Logging Continuous Replication (CLCR)?
A *   In a word, YES.  See http://technet.microsoft.com/en-us/library/dd876874.aspx#FleMaiPro
Q *   If the passive copies don’t pick up the change in log file settings, is it possible/advisable to have an alternate log mechanism on the other copies?
A *  No.
Q  *   Now that we have increased our capacity in Avamar, what would a reasonable DR backup plan look like?  Currently we are using weekly.
A *  Excellent question!!  Microsoft’s Exchange product group has begun to intermix the notions of HA, backup/recovery, and DR.  In a perfect world, you would have five copies of each database (sounds like SAP, eh?): two at the primary data center (PDC) and three at the backup data center (BDC), one of them a lagged copy.  In that architecture, you can use Exchange Native Backup, but that doesn’t answer your question.  Assuming that we can get Avamar to truncate your logs, a DR plan would look like this: use a two-site DAG with two local copies and one remote copy — all active copies.  Place a database activation block on the remote databases to prevent accidental routing of mail to them.  Now… install the Avamar client on the remote mailbox server and back it up every night (fulls only!!).  Keep the backups for 14 days.  Also, at the PDC, configure Reserve LUN Pool space for one of the mailbox servers — it does not matter whether that server is active or passive.  Configure Replication Manager to create VSS snapshots of all the databases on that server.  Execute snaps once per day.  Never back them up.  Expire the snapshots after eight days.  This will provide (a quick retention sketch follows the list):
* Local HA in real-time via DAG replication — the ability to perform patch management every week with no downtime.
* Local BC at less than a 24-hour RPO.  This will allow you to recover your database in the strange event that a rolling corruption event makes it to the remote data center.
* Remote failover for DR/BC and site maintenance.
* Remote BC/DR in the event of a smoking hole during a corruption event.
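Here’s that retention sketch, in quick Python; the schedule figures are simply the ones proposed above and would be tuned to your environment:

# Rough recovery-point math for the plan above: nightly Avamar fulls on the
# remote copy kept for 14 days, plus one Replication Manager snap per day at
# the PDC expired after 8 days.  Figures are illustrative, not prescriptive.
backup_retention_days = 14          # nightly fulls on the remote copy
snap_retention_days = 8             # daily array snaps at the PDC

backup_points = backup_retention_days   # one full per night
snap_points = snap_retention_days       # one snap per day

# With one snap and one full per day, the newest point-in-time copy is at most
# roughly a day old, which is where the "less than 24-hour RPO" claim comes from.
worst_case_rpo_hours = 24

print(f"Backup recovery points on hand  : {backup_points}")
print(f"Snapshot recovery points on hand: {snap_points}")
print(f"Worst-case point-in-time RPO    : ~{worst_case_rpo_hours} hours")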
Q *   How do SAN snapshots fill in the gaps between backups and log files?  We use Dell/Equallogic storage today and while the auto-snapshot feature sounds nice, in reality the snapshots are huge and thus we can only keep one snapshot per db.  I’m sure you can comment on how much better EMC’s tools are and I actually welcome that discussion.
A *  Hmmm… Yes.  EMC’s snapshots on CX4 and VMAX require only enough snapshot space to hold the deltas, not the entire LUN.  Array-based snapshots do not actually fill a gap between log files and backups; rather, they provide an additional layer of recoverability that closes the temporal gap associated with recovery from backup.  For example, I can recover a 1TB database from a snapshot in about 4 minutes.  I can recover a 1TB database from DISK (B2D) in about 14 hours!!
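For perspective, you can back into the effective throughput those recovery times imply. A back-of-the-envelope Python sketch, treating 1TB as one million MB and remembering that a snapshot restore is largely a pointer operation on the array rather than a bulk copy:

# Effective restore rates implied by the recovery times quoted above.
TB_IN_MB = 1_000_000  # treat 1TB as 10^6 MB for round numbers

def effective_rate_mb_s(size_mb, minutes):
    # Average MB/s needed to move size_mb in the given number of minutes.
    return size_mb / (minutes * 60)

snapshot_minutes = 4        # ~4-minute recovery from an array snapshot
b2d_minutes = 14 * 60       # ~14-hour recovery from backup-to-disk

print(f"Snapshot restore: ~{effective_rate_mb_s(TB_IN_MB, snapshot_minutes):,.0f} MB/s equivalent")
print(f"B2D restore     : ~{effective_rate_mb_s(TB_IN_MB, b2d_minutes):,.0f} MB/s equivalent")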
I hope this helps!!

Ok, but what are you DOING?!

Dell is buying 3PAR.  Oracle has Sun and Exadata.  EMC now has Greenplum.  Cisco sells telephony and servers.  IBM is selling SATA drives to the enterprise as XIV and is reselling NTAP, but its storage architect is out on the loose, again.  Who did I leave out?  Oh, HP…  The only one NOT making moves is Microsoft?!  Actually that’s not completely true… Microsoft continues to “innograte” — the act of innovating by integrating acquired technology into your existing products’ evolution.  DATAllegro, Opalis, and so on…

I keep thinking about the famous economic principle that states “you will ultimately be undone by your past”.  If EMC could just shed its dependency on Dell’s relationships with all those purchasing agents…  If Cisco could just lose the MDS and the Catalyst… if IBM could just forget about the XIV and buy NTAP already!…  And HP… Oh HP…

It has become painfully clear that the order of the components in the OSI model, in fact, serves as a roadmap.  Never forget that the Application is always at the top.  Whatever the application needs, the application gets.  If you don’t sell an application, don’t expect to tell anyone what to do.  Ever.  If your application is actually a utility for another application, don’t forget that fact.  Your “utility” is NOT the application.

Oracle has Java, yes.  Java is a development tool, not an application, but how about Siebel, PeopleSoft, JDEdwards?  Dell has … nothing.  HP has … nothing.  IBM has DB2 and Lotus.  Cisco has Unity — yes Voicemail is an application.  EMC has … hmmm VMware? VMware is actually an infrastructure tool — it’s like a server hardware manufacturer that lets you use whatever server vendor you want.  EMC also has Documentum — a utility that is configured as an application.  Microsoft, on the other hand, has all the applications you can shake a stick at.  If Microsoft says their application needs 100 spinning dancers to run, guess what you’re buying?

The way to true technology marketplace leadership is through applications that people actually use.  People love Microsoft Office.  They love their iPhones, their Androids, their Blackberries (often seen as tools… but they’re not — an iPhone is a collection of applications — remember… “there’s an APP for that!!!”).  People also love web-based applications like Facebook, Salesforce, gmail, and LinkedIn.  It seems to me… that HP, IBM, Dell, and EMC would do well to think about what people are using to “run their lives” and follow those markets.

Does it matter that Dell has 3PAR?  Does it matter that EMC has DataDomain?  Does it matter that Oracle has Sun and Virtual Iron and BEA, and Java? — I don’t think so — but Oracle DOES have Siebel, PeopleSoft, and JDEdwards — real APPLICATIONS.  So at the end of the day, I think it becomes pretty clear who will be pulling and who will be pushing…  Take a look at what people are DOING.  The truth will show you the way:

  • Email — Google, Microsoft, Yahoo, Apple (MobileMe).  All Cloud-based email systems (and Microsoft even has a version of email that runs inside your firewall <g>)
  • Banking — a scattered field with many banks offering their own applications, plus Quicken
  • Social Networking — Facebook is dominating, but LinkedIn, MySpace, etc. continue to stay afloat
  • Media Sharing — Flickr, SnapFish, PhanFare
  • Media Consumption — Netflix, Pandora, Amazon’s Kindle, iTunes — all retailers with massive followings
  • Spreadsheets and Documents — Microsoft OWNS this space with trickles from OpenOffice, and iWork

So, just some idle advice from the sidelines for Dell, HP, even EMC — look at what people are doing; and go DO that!

A Primer for Laptop SSD Buyers (draft)

This blog entry attempts to assist the mobile user as he/she begins to search for an SSD replacement for his/her existing rotational drive.

Components:

Every drive has its own controller, MLC NAND, and firmware features.  Every manufacturer puts each drive line together using these building blocks.  Each aspect of the controller, the NAND, and the firmware has profound effects on a drive’s performance, longevity, and interoperability with the operating system supporting it.

The controllers come from SandForce, Samsung, Indilinx (Barefoot & Amigos), JMicron, and others.  Each controller manufacturer may have several models of controllers to pick from.  The most popular consumer controllers at the moment are the Indilinx IDX110M00-FC “Barefoot”, Intel PC29AS21AA0, JMicron JMF612, Toshiba T6UG1XBG, Samsung S3C29RBB01-YK40, Marvell 88SS8014-BHP2, SandForce SF-1200/1500, and now the Marvell 88SS9174-BJP2 SATA-III SSD controller.  Samsung and Indilinx seem to have the best reputation for “stutter-free” performance.  Samsung’s controller is in its second version.  Indilinx has been around the longest and is generally described as stutter-free, with integrated cache and “smooth operation”.  SandForce has introduced “DuraClass Technology” in its controllers (the SF-1500 and the lower-cost, lower-performance SF-1200) and firmware set to offer an entirely new way to commit data into the NAND flash array (more on this below).  It is so revolutionary that it virtually eliminates the need for TRIM.

The Multi-Level Cell (MLC) NAND comes from JMicron, Samsung, and Intel.  Each NAND manufacturer, of course, has various MLC sets of varying speeds and densities.

The firmware can contain wear-leveling code in addition to support for TRIM (a standardized way for the OS to “tell” the drive which cells no longer hold valid data, so the drive knows which blocks it can reclaim; if the OS supports TRIM, the drive no longer has to guess at cell usage on its own).  Other features include garbage collection (the process of moving data from “used” cells into other cells to make incoming writes more efficient), power management, TRIM management, data-protection algorithms for reducing data loss in the event of power failure, data management to reduce maximum write latency, and “full drive” data-management features to maintain performance as the drive reaches capacity.

Examples of controllers and which drives use them:

Samsung S3C29RBB01-YK40 (second generation Samsung controller):
Samsung PB22-J
Corsair Performance Series (seems to actually BE the Samsung PB22-J)

SandForce SF-1200 or SF-1500:
A-Data S599 Series
Corsair Force Series
Mushkin Callisto Series
OCZ Vertex LE Series
OCZ Vertex 2 Series
OCZ Agility Series
RunCore Pro-V Series
Super Talent TeraDrive Series
Unigen
AMP SATAsphere Series
IBM
Viking

Indilinx Barefoot:
Nearly every “mainstream” second-generation drive uses the Barefoot
OCZ Vertex Series
Crucial M225
Corsair Nova (V) and Reactor

Here is a matrix of the OCZ drives, the speeds, and controllers they use: http://www.ocztechnology.com/res/manuals/OCZ_%20SSDs_quick_compare_5.pdf
 

What to Look for in a drive:

Drives should be of the appropriate size for only those things that you want to run quickly.  For example, you may not want to store your 90GB music files on an SSD, but you will want to store your Operating System and paging file on an SSD.  You may want to store your multi-megabyte jpeg and raw photo files on SSD depending on what you are doing with them, i.e. editing, tagging, etc.

Drives should have an algorithm to avoid premature aging.  Some drives have background garbage collection (BGC) routines.  While BGC does increase the overall write performance of your drive over time, it also shortens its life by over-using the cells as it relocates data.  TRIM is an OS-enabled partnership with the drive: the OS tells the drive which blocks no longer hold valid data, so the drive always knows which cells are “best” to write to at any moment.  TRIM promises to extend the life of drives by avoiding cell over-use and premature cell aging.  TRIM has its limits, however, and is really intended for workstations and laptops.  TRIM does nothing to solve the long-term performance and durability of drives installed in servers and under RAID controllers.  To solve the issues of long-term performance stability and drive durability, other controller manufacturers have approached the issue from a completely new direction — see Write Amplification below.
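To see why over-using cells matters, here’s a rough endurance estimate in Python; every figure in it (capacity, P/E cycles, write amplification, daily writes) is an assumption for illustration, not a spec for any particular drive:

# Rough SSD endurance estimate.  Every figure here is an assumption for illustration.
capacity_gb = 64              # assumed drive capacity
pe_cycles = 3_000             # assumed program/erase cycles an MLC cell can endure
write_amplification = 3.0     # assumed NAND bytes written per host byte (GC/relocation overhead)
host_writes_gb_per_day = 20   # assumed daily workload

total_nand_writes_gb = capacity_gb * pe_cycles            # total writes the cells can absorb
usable_host_writes_gb = total_nand_writes_gb / write_amplification
lifetime_years = usable_host_writes_gb / host_writes_gb_per_day / 365

print(f"Estimated lifetime: ~{lifetime_years:.1f} years at {host_writes_gb_per_day} GB/day")
print("Halving the write amplification roughly doubles that estimate.")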

Speed is certainly a concern for every SSD.  Controllers address performance in several ways: 1) throughput in IOs/s, 2) bandwidth in MB/s, and 3) sustained performance over time.  Every controller manufacturer has new controllers (since mid-2009) that bring read and write performance into the 200MB/s range.  A small number of controller and Single-Level Cell (SLC) combinations can bring performance over 300MB/s.
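The two measures are related by simple arithmetic (bandwidth is just IOPS times IO size), which is why a drive can advertise huge small-block IOPS yet land near 200MB/s sequentially. A quick Python sketch with example values only:

# Bandwidth implied by an IOPS figure at a given IO size.  Example values only.
def mb_per_s(iops, io_size_kb):
    return iops * io_size_kb / 1024

print(f"10,000 IOPS at 4KB  : ~{mb_per_s(10_000, 4):.0f} MB/s")   # small random IO
print(f" 2,000 IOPS at 128KB: ~{mb_per_s(2_000, 128):.0f} MB/s")  # large sequential IO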

Firmware support is crucial (no pun intended).  Firmware updates allow the manufacturers to add features like TRIM, power management, and ECC recovery algorithms.  Almost every drive manufacturer has allowed for firmware updates to their drives.  The downside is that nearly every vendor currently dictates a complete re-initialization of the drive — total loss of every bit of data on it — so a complete reload is necessary after the new firmware is installed.  Manufacturers such as Crucial are working to avoid this inconvenience in future firmware releases.

Price is always a concern.  SandForce, for example, has released a lower-cost companion to its amazing SF-1500 controller and firmware set in an attempt to compete with less expensive controllers from Indilinx (for example).  The new SF-1200 controller offers reduced performance compared to the SF-1500, but still delivers SandForce’s DuraClass technology at a substantially lower cost.

A look at Specific Drives:

Crucial, Corsair, Intel, and OCZ have all released their latest firmware sets; some drives were able to receive the new firmware, while other, previous-generation drives could not implement all of the new feature sets.

Below is an example of firmware updates from Crucial.  The Crucial M225 and RealSSD C300 got firmware updates this year (January and May, respectively).  Both drives have added support for TRIM.  You might notice that the M225 has also added/modified its “wear leveling algorithm”.  These algorithms go above and beyond what TRIM provides.

RealSSD C300 (Marvell SATA-III controller)

Release Date: 5/20/2010
Change Log:
Improved Power Consumption
Improved TRIM performance
Enabled the Drive Activity Pin (Pin 11)
Improved Robustness due to unexpected power loss
Improved data management to reduce maximum write latency
Improved Performance of SSD as it fills up with data
Improved Data Integrity

M225 (Indilinx Barefoot controller)
Release Date: 1/21/2010
Change Log:
Fixed issue that sometimes causes firmware download problem
Fixed issue that could cause 256GB to be corrupted
Eliminated performance degradation over time with Wiper with 1819 FW
Fixed issue where the power cycle count was incorrectly being reported with 1819 FW
Fixed issue where some SATA 1 hosts weren’t correctly identifying the hardware
Fixed issue found in simulation (not in the field) where the free block count was incorrectly being reported
Fixed issue with remaining life not being properly displayed on SMART information
Added support for additional NAND manufactures and capacities
Made further improvements to wear leveling algorithm

Corsair Performance Series (P256)
Has a Samsung controller and Samsung-provided firmware

The “Force” series of drives from Corsair are based on the SandForce SF-1200; they were introduced in May of 2010 so this technology is really new and revolutionary. http://www.sandforce.com/index.php?id=19&parentId=2
OCZ also uses the SandForce controllers (the Vertex Limited Edition line uses the higher-end SF-1500 controller, while the Vertex 2, and Agility 2 lines use the less expensive SF-1200 controller).

Corsair has a great blog entry describing “write amplification”: http://blog.corsair.com/?p=3044

The SandForce controller mentioned above brings DuraClass Technology to market… Here’s an excerpt from Kevin Conley’s blog at Corsair: “SandForce demonstrated that through its innovative DuraClass technology, Write Amplification factors below 1 could actually be achieved. Not only that but also without the use of a large (and expensive) external Data Cache. As noted in some other blogs, this data intelligence utilitizes data-dependent compression techniques coupled with other ‘secret sauce’ algorithms to reduce the amount of data to write in the first place, in some cases quite significantly. The SandForce SSD processor then manages the programming of data using very efficient Page Management algorithms that prevent the need for Garbage Collection down the road. The net result of this is a Write Amplification much lower than other SSD controllers achieve, and thus the screaming fast write performance demonstrated by the Corsair Force Series solid-state drives. This is an even more amazing feat when considering these SSDs use MLC memory but compete with enterprise-class solutions utilizing much faster SLC memory.”
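The arithmetic behind a write-amplification factor below 1 is simple; here’s a small Python sketch in which the compression ratios and the garbage-collection overhead factor are my own assumptions for illustration, not SandForce’s figures:

# Write amplification = NAND bytes written / host bytes written.
# If the controller compresses data before committing it, the ratio can drop below 1.
def write_amplification(host_bytes, compression_ratio, gc_overhead=1.1):
    # Compress the host data first, then pay some GC/relocation overhead (assumed 10%).
    nand_bytes = (host_bytes / compression_ratio) * gc_overhead
    return nand_bytes / host_bytes

for ratio in (1.0, 1.5, 2.0):
    print(f"{ratio:.1f}:1 compressible data -> write amplification ~{write_amplification(1_000_000, ratio):.2f}")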

Drives I would buy (based on price, speed, and durability/longevity):
OCZ Vertex 2 (SF-1500) – up to 50,000 4k IOPS
OCZ Vertex LE (SF-1500 LE)
OCZ Agility 2 (SF-1200) – up to 10,000 4k IOPS
Mushkin Callisto (SF-1200)
Corsair Force (SF-1200)



.vhd, direct iSCSI, and SCSI Passthrough

Many of my peers have debated the three basic storage device connectivity options for Hyper-V for many months. After much debate, I decided to jot down some ideas to directly address concerns regarding SCSI passthrough vs. iSCSI in-guest initiator access vs. VHD. I approach the issues from two vantage points, then make some broad generalizations, draw some conclusions, and offer my sage wisdom 😉
  1. Device management
  2. Capacity limitations
  3. Recommendations


Device management:

  • SCSI-passthrough devices are drives presented to the parent partition — they are assigned to a specific child VM; the child VM then “owns” the disk resource. The issues that come from this architecture have to do with the “protection” of the device. Because not ALL SCSI instructions are passed into the child (by default), array-based management techniques cannot be used. Along comes EMC Replication Manager.  Thanks to some vigilant work, the EMC RM team has discovered the Windows registry entry for filtering SCSI commands and provided instructions for turning SCSI filtering off for the LUNs you need to snap and clone.  This is big news because Windows Server 2008 used to break SAN-based tools; for example, prior to this update you could not snap/clone the array’s LUNs because the array could not effectively communicate with the child VM. Now, array-based replication technologies CAN still be used. In addition to clones and snaps, the SCSI-passthrough device can be failed over to a surviving Hyper-V node — either locally for High Availability or remotely for Disaster Recovery. Both RecoverPoint and MirrorView support cluster-enabled automated failover.
  • …and now the rest of the story — Both Fibre Channel and iSCSI arrays can present storage devices to a Hyper-V parent; however, differences in total bandwidth ultimately divide these two technologies. iSCSI is dependent on two techniques for increasing bandwidth past the 1Gbps (60MB/s) connection speed of a single pathway: 1) iSCSI Multiple Connections per Session (MCS) and 2) NIC teaming. Most iSCSI targets (arrays) are limited to 4 iSCSI pathways per controller. When MCS or NIC teaming is used, the maximum bandwidth the parent can bring to its child VMs is 240MB/s — a non-trivial amount, but 240MB/s is a “four NIC” total for the entire HV node — not just the HV child! On the other hand (not the Left Hand…), Fibre Channel arrays and HBAs are equipped with dual 8Gbps interfaces — each interface can produce a whopping 720MB/s of sustained bandwidth when copying large-block IO. In fact, 8Gbps interfaces can carry over 660MB/s when carrying 64KB IOs and slightly less as IO sizes drop to 8KB and below. When using Hyper-V with EMC CLARiiON arrays, EMC PowerPath software provides advanced pathway management and “fuses” the two 8Gbps links together — bringing more than 1,400MB/s to the parent and child VMs (the quick math is sketched after this list). In addition, because FC uses a purpose-built lossless network, there is never competition for the network, switch backplane, or CPU.
  • iSCSI in-guest initiator presents the “data” volume to child VMs via in-parent networking out to an external storage device — CLARiiON, Windows Storage Server, NAS device, etc. iSCSI in-guest device mapping is Hyper-V’s “expected” pathway for data volume presentation to virtual machines — it truly offers the richest “features” from a storage perspective — Array-based clones and snaps can be taken with ease, for example. With iSCSI devices, there are no management limitations for Replication Manager: snaps and clones can be directly managed by the RM server/array. Devices can be copied and/or mounted to backup VMs, presented to Test/Dev VMs, and replicated to DR sites for remote backup.
  • …and now, the rest of the story — an iSCSI in-guest initiator must use the CPU of the parent in order to packetize/depacketize the data from the IP stream (or use the dedicated resources of a physical TCP Offloading NIC placed in the HV host) — this additional overhead is usually not noticed, except when performing high IO operations such as backups/restores/data loads/data dumps — keep in mind that Jumbo frames must be passed from the storage array, through the network layer, into each guest. Furthermore, each guest/child must use 4 or more virtual NICs to obtain iSCSI bandwidth near the 240MB/s target. The CPU cycles an in-guest initiator can consume are often 3-10% of the child’s CPU usage — the more child VMs, the more parent CPU will be devoted to packetizing data.
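Here’s the quick math behind those bandwidth ceilings, as a small Python sketch; the per-link figures are the rough numbers used above, not lab measurements, and real-world results will vary with IO size and workload:

# Bandwidth ceilings per transport, using the rough per-link figures from the discussion above.
iscsi_mb_per_link = 60     # practical 1GbE iSCSI throughput figure used above
iscsi_links = 4            # typical MCS / NIC-teaming ceiling per controller

fc_mb_per_link = 720       # sustained large-block throughput of one 8Gbps FC port
fc_links = 2               # dual-port HBA, fused by multipathing software

print(f"iSCSI ceiling for the whole Hyper-V node: {iscsi_mb_per_link * iscsi_links} MB/s")
print(f"FC ceiling with both ports active       : {fc_mb_per_link * fc_links} MB/s")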

Capacity limitations:

  • VHDs have a well-known limit of 2TB; iSCSI and SCSI-passthrough devices are not limited to 2TB and can be formatted for 16TB or more, depending on the file system chosen. Beyond Hyper-V’s three basic VM connectivity types, there is the concept of the Cluster Shared Volume (CSV). Multiple CSVs can be deployed, but their primary goal in Hyper-V is to store virtual machines, not child VM data. CSVs can be formatted with GPT and allowed to grow to 16TB.
  • …and now, the rest of the story — Of course, in-guest iSCSI and SCSI passthrough are exclusive of CSVs. VHDs can sit on a CSV, but CSVs cannot present “block storage” to a child. Using a CSV implies that nothing on it will be more than 2TB in size. Furthermore… at more than 2TB, recovery becomes more important than the size of the volume. Recovering a >2TB device at 240MB/s, for example, will take as little as 2.9 hours and usually as much as 8.3 hours — depending greatly on the number of threads the restoration process can run (a rough restore-time sketch follows this list). >2TB restorations can take more than 24 hours if threading cannot be maximized. To address capacity issues related to file-serving environments, a Boston-based company called Sanbolic has released a file system alternative to Microsoft’s CSV called Melio 2010. Melio is purpose-built to address clustered storage presented to Hyper-V servers that serve files. Melio provides multi-node locking, QoS, and enterprise reporting. http://www.sanbolic.com/Hyper-V.htm Melio is amazing technology, but honestly does nothing to “fix” the 2TB limit of VHDs.
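And here’s the rough restore-time sketch, in Python; the per-thread rate and thread counts are assumptions chosen only to illustrate why threading matters:

# Rough restore-time estimator for a large volume.  The per-thread rate and
# the link ceiling are assumptions for illustration only.
def restore_hours(size_tb, threads, mb_per_s_per_thread=60, link_ceiling_mb_s=240):
    # Each restore thread sustains a fixed rate, capped by the link's total bandwidth.
    rate_mb_s = min(threads * mb_per_s_per_thread, link_ceiling_mb_s)
    return size_tb * 1_000_000 / rate_mb_s / 3600

for threads in (1, 2, 4):
    print(f"2TB restore with {threads} thread(s): ~{restore_hours(2, threads):.1f} hours")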


Conclusion/Recommendations

  • iSCSI in-guest initiators should be used where cloning and snapping of data volumes is paramount to the operations of the VM under consideration. SQL Server and Sharepoint are two primary examples.
  • FC-connected SCSI devices should be used when high bandwidth applications are being considered.
  • Discrete array-based LUNs should always be presented for all valuable application data. Array-based LUNs allow cluster failover of discrete VMs with their data as well as array-based replication options.
  • CSVs should be used for “general purpose” storage of Virtual Machine boot drives and configuration files.
  • sanbolic Melio FS 2010 should be considered for highly versatile clustered shared storage.

Virtual Machine Licensing

It is a widely known “fact” that the Datacenter Edition of Windows Server 2008 allows an “unlimited” number of Virtual Machines to be installed and run on the machine for which WS2k8DCE was licensed. When you buy Datacenter Edition, you pay for each physical processor (sockets filled with CPUs) you are currently running in the server hardware. For example, if you have a Dell 1950 III, you might have two processors with four cores on each processor. In this case, you need to pay for, of course, two processors. WS2k8DC has an MSRP of $2999/proc — so the OS will cost you $5998.
The good news is that you can now install any number of Windows Servers on that Dell 1950 III as virtual machines (running on the Hyper-V layer). So… if I were a clever admin, I’d put 2TB of PC2-5300 into the box and install a gajillion servers on it.
…and now the rest of the story:
First of all, I can’t put 2TB of RAM into the server. But let’s just say I could. If I want support from Microsoft, I can only place 384 child machines on a Datacenter server.
but wait, there’s more… if I like Live Migration and Failover Clustering, I can buy another 1950 and “join” it to my first DC in an MNS cluster (let’s forget about the network, the storage, etc. for now). I would need another license of WS2k8DC at $5998. My OS license cost is now $11,996 (list) and I can run as many machines as I can shake a stick at… or can I?!
The part of the story that you won’t hear too often is the support limitation regarding virtual machine densities for stand-alone servers versus clustered servers. Net/net: clustered servers are only supported with fewer than 65 virtual machines per cluster node. In a 16-server cluster, you can run 1024 machines — that’s it, period. So — let’s go back to the CFO to pay for this… 16 nodes of Datacenter on Dell 1950 IIIs will set me back $95,968. I can run 1024 servers — each server costs me $93.72. Amazing — really amazing. And let’s not forget — this INCLUDES the cost of the hypervisors and rudimentary management software. It’s really amazing. Singly licensed servers would have cost me $999 each — for a total of $1,022,976. To be fair, no one would buy servers that way — with Open, Select, AE, etc., those 1024 server licenses wouldn’t cost $999 each; it would be more like half that. However, it’s easy to see the difference between $93.72 and $500!
Let’s also consider the “flexibility” of the cost model: if I were willing to pay $500 for each server license, what is the minimum number of servers I can deploy and still justify the cost of WS2k8DC licenses in my 16-node cluster? Simple — we divide $95,968 by $500 and see that if we implement fewer than 192 virtual machines (as few as 12 per node), we need to start dropping nodes to make the math work.
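To lay that arithmetic out in one place, here is a quick Python sketch using the list prices and support limits quoted above (your actual licensing program will change the per-guest figure):

# Datacenter-edition math from the discussion above, using the quoted list prices.
dc_price_per_proc = 2999               # list price per processor quoted above
procs_per_node = 2
nodes = 16
vms_per_clustered_node = 64            # "fewer than 65" supported per clustered node
standalone_license = 999               # single-server list price used above

cluster_license_cost = dc_price_per_proc * procs_per_node * nodes
max_vms = vms_per_clustered_node * nodes
cost_per_vm = cluster_license_cost / max_vms

print(f"16-node Datacenter licensing       : ${cluster_license_cost:,}")
print(f"Max supported clustered VMs        : {max_vms}")
print(f"License cost per VM                : ${cost_per_vm:.2f}")
print(f"Same {max_vms} VMs licensed singly  : ${standalone_license * max_vms:,}")

# Break-even if a discounted single license runs ~$500 (the figure used above).
print(f"Break-even VM count at $500/license: ~{cluster_license_cost / 500:.0f}")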