Monday, March 7, 2011


Data Protection Ideas In A Virtual Environment

Virtualization is here to stay, and enterprises are adopting it across both server and storage environments. While still at a nascent stage, virtualization is shaping up to be one of the major trends influencing enterprise infrastructure: servers, storage, networks, applications, and desktops.

Server virtualization is the forerunner in the adoption of virtualization. Its obvious benefits include consolidation, reduced operating expenditure, higher utilization of computing resources, increased business flexibility, lower hardware costs, reduced space, power, and cooling requirements, and improved IT staff productivity.

Along with these benefits, organizations have to deal with a host of new challenges that virtualization brings for IT teams, one of the biggest being data protection. These challenges include increased risks to enterprise data and potentially higher costs of protecting it. Data protection can therefore no longer remain an afterthought; organizations need to put considerable effort into planning for it.

By following the data protection ideas outlined below for virtualized environments, the IT department can reap the rewards of data protection in the enterprise.

Back Up Virtual Machines

The 2010 Symantec Disaster Recovery survey found that only about half of the data within virtual systems is regularly backed up, and only 6 to 10% of virtual environments are protected by replication or failover technologies.

Failing to back up virtual machines is a risky endeavor. It is important to understand the reasons why many virtual machines have been left out of the backup strategy:

  • Virtual machine sprawl: Virtual machines proliferate rapidly, and IT often doesn't know about new ones. Even when it does, it may not know their Recovery Point Objective (RPO) and Recovery Time Objective (RTO) requirements.
  • Cost of backup agents: In the past, IT would have to buy individual backup agents for each new server. This could potentially cost thousands of dollars and quickly eliminate any real savings created with virtualization. 
  • Tools from virtualization vendors: Existing tools have contributed to the failure to back up by making backup too complicated or by promoting the virtualization vendor's own tools as the primary backup mechanism.
  • I/O & bandwidth impact: Another worry is dragging down the host machine and/or network by moving large volumes of data for backup. The whole idea of virtualization is to increase server, CPU, and network utilization, and the more successful you are, the less “slack in the system” remains to handle backup loads.
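The sprawl problem above is largely an inventory problem. As a minimal sketch (the VM names, record fields, and discovery source are hypothetical, not any vendor's API), a simple registry that records each known virtual machine with its RPO/RTO and compares it against what is actually running can flag both unregistered and unprotected machines:

```python
from dataclasses import dataclass

@dataclass
class VMRecord:
    name: str
    rpo_hours: float   # maximum tolerable data loss
    rto_hours: float   # maximum tolerable downtime
    backed_up: bool = False

def find_unprotected(inventory, discovered_vms):
    """Compare the VMs actually running (discovered_vms, e.g. from a
    hypervisor query) against the registered inventory; anything
    unknown or not backed up is at risk."""
    known = {r.name: r for r in inventory}
    unregistered = [v for v in discovered_vms if v not in known]
    unprotected = [r.name for r in inventory if not r.backed_up]
    return unregistered, unprotected

inventory = [
    VMRecord("mail-01", rpo_hours=1, rto_hours=4, backed_up=True),
    VMRecord("test-07", rpo_hours=24, rto_hours=48),
]
discovered = ["mail-01", "test-07", "dev-sandbox-3"]
print(find_unprotected(inventory, discovered))
# dev-sandbox-3 is unregistered; test-07 is registered but unprotected
```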

Backing up virtual machines is absolutely essential, and a number of tools are at an organization's disposal to help ensure no virtual machine is left unprotected.

Never Install Backup Agent on Every Guest

Virtualization has mirrored the growth of many disruptive technologies. Linux is a case in point: today it is simply part of the IT infrastructure in most organizations, but in its early days it was supported by specialized professionals and niche technologies.

A common mistake many IT organizations make is developing a divergent approach to backing up their virtual servers. Past limitations in backup support from virtualization vendors, the differing skill sets of backup and virtualization professionals, and historically poor support from major backup vendors are some of the main causes.

IT has consistently asked for a single vendor to manage both virtual and physical environments, yet some organizations have invested in two separate backup tools (one for physical servers and one for virtual servers). A divergent approach to backup leads to inconsistent data management, backup confusion, and conflict between IT teams. The issue can be resolved by bringing the virtualization and backup teams together and assigning ownership, authority, and resources for backing up both physical and virtual machines.

Installing a backup agent on every guest, however, results in significantly higher agent costs and unnecessary management complexity.

Today, virtualization vendors have improved their APIs to support centralized backup with granular restore. Many backup vendors can now back up at the hypervisor level, which makes it unnecessary to install a backup agent on every guest.
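As a rough sketch of that centralized model (not any vendor's actual API; the datastore layout and the plain file copy standing in for a hypervisor export are assumptions), a single backup job running outside the guests can sweep every virtual disk image in a datastore, replacing per-guest agents:

```python
import hashlib
import shutil
from pathlib import Path

def backup_all_guests(datastore: Path, backup_root: Path) -> list:
    """One central job copies each guest's (quiesced) disk image -- a
    stand-in for a hypervisor-level export API -- so no in-guest agent
    is needed. Returns the names of the images backed up."""
    backup_root.mkdir(parents=True, exist_ok=True)
    backed_up = []
    for image in sorted(datastore.glob("*.img")):
        dest = backup_root / image.name
        shutil.copy2(image, dest)
        # Verify the copy with a checksum before trusting the backup.
        assert (hashlib.sha256(dest.read_bytes()).hexdigest()
                == hashlib.sha256(image.read_bytes()).hexdigest())
        backed_up.append(image.name)
    return backed_up
```

In practice the copy step would be a snapshot-based export through the hypervisor's backup API so the guest stays consistent while running; the sketch only illustrates the single-job, agentless shape of the approach.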

Back Up Applications Once

Many IT departments also back up the same data twice in the virtual environment: once for full image recovery and a second time for more granular file and object recovery. The reason is that if only the virtual machine image has been backed up and IT needs to recover a single email or calendar item from Exchange, it must first restore the entire server and then recover the granular data. Backing up twice avoids that, but the operation takes twice as long, puts twice the load on the network, and consumes twice the storage capacity for the same data.

Today, however, new capabilities from virtualization vendors and backup application vendors make it possible for the IT department to take a single backup and still recover granular object items.
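One way a single backup can serve both purposes is to keep a catalog of where each granular object lives inside the image, so an individual item can be read back without restoring the whole server. A minimal sketch (the object names and the packing format are invented for illustration):

```python
import io

def build_image_with_catalog(objects):
    """Pack objects into one backup stream and record where each one
    lives, so a single item can later be restored without reading
    the whole image."""
    buf = io.BytesIO()
    catalog = {}
    for name, data in objects.items():
        catalog[name] = (buf.tell(), len(data))  # (offset, length)
        buf.write(data)
    return buf.getvalue(), catalog

def restore_object(image, catalog, name):
    """Granular restore: read only the bytes belonging to one object."""
    offset, length = catalog[name]
    return image[offset:offset + length]

mailbox = {"inbox/msg-001": b"quarterly report",
           "calendar/standup": b"daily 09:00"}
image, catalog = build_image_with_catalog(mailbox)
print(restore_object(image, catalog, "calendar/standup"))  # b'daily 09:00'
```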

Avoid Backing Up Redundant Data

There is a tremendous amount of duplicate data on virtual machines; consider the duplicate data in the OS, particularly if you use a standard image. Backing up all of that duplicate data is unwise: it congests the network, lengthens the backup window, and raises storage hardware costs, and it is completely avoidable. The IT department should consider the following strategies:

  • Leverage VMware’s new vSphere block level differential/incremental backup and restore
  • Implement block-level data deduplication during backup of virtual machines to identify and exclude redundant data

The amount of duplicate data in virtual environments is significantly higher than in physical environments: duplicate operating systems, cloned machines, test machines, and so on all increase the number of identical blocks in the infrastructure. Deduplication can therefore speed up backups significantly and deliver major space and time savings.
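A minimal sketch of block-level deduplication (fixed 4 KB blocks keyed by SHA-256; real products typically use variable-size chunking and more robust storage): identical blocks across cloned machines are stored once, and each machine keeps only a "recipe" of block digests.

```python
import hashlib

BLOCK_SIZE = 4096

def dedup_store(data, store=None):
    """Split data into fixed-size blocks; store each unique block once,
    keyed by its SHA-256 digest, and return the recipe of digests."""
    store = {} if store is None else store
    recipe = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # duplicate blocks cost nothing extra
        recipe.append(digest)
    return recipe, store

def rehydrate(recipe, store):
    """Rebuild the original data from its recipe of block digests."""
    return b"".join(store[d] for d in recipe)

# Two VMs cloned from the same OS image differ only in a little unique data:
vm1 = b"\x00" * BLOCK_SIZE * 100 + b"unique-to-vm1"
vm2 = b"\x00" * BLOCK_SIZE * 100 + b"unique-to-vm2"
store = {}
recipe1, store = dedup_store(vm1, store)
recipe2, store = dedup_store(vm2, store)
print(len(store))  # 3 unique blocks stored instead of 202
```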

Consider Restore

One of the most common failures in backup planning is neglecting to consider recovery. The issue is more acute in virtualized environments, since there are more restore options than in the purely physical world. The IT department needs to decide what it wants to restore and at what level of granularity, and where to restore to (physical or virtual, onsite or offsite, etc.). Engaging key stakeholders to plan objectives and the associated processes is crucial; a written plan agreed on by all stakeholders and tested regularly is ideal.

Finally, physical-to-virtual (P2V) restore is something virtualization vendors are great at helping with; however, virtual-to-physical (V2P) restores are a bit more difficult. There are good reasons to restore to both physical and virtual machines and the IT department should ensure it has the capability to go in both directions.


With strong controls on IT spending, virtualization, being more affordable, is gaining significant attention across organizations. However, the move to a virtualized IT infrastructure involves a number of changes and poses challenges for data protection, including the risk of unprotected data, increased storage consumption, and changes to traditional recovery processes. By carefully evaluating these common pitfalls and implementing measures to avoid them, the IT department can manage the risks of unprotected data and the costs of poor backup strategies.



About bench3 -

Haja Peer Mohamed H, Software Engineer by profession, Author, Founder and CEO of "bench3". You can connect with me on Twitter, Facebook and also on Google+.
