The Right Fit
This article appears courtesy of www.cioupdate.com
As the amount of data stored on our networks increases, it also takes more time to make backup copies of this data. When the time required for these backups extends beyond the overnight period, problems crop up.
One solution is to eliminate duplicate data from what is backed up. How much can you save? A lot. In some cases, the savings are more than 10-to-1, meaning that more than 90% of your data is duplicates. Eliminating these redundant files can go a long way towards speeding up the backup process. As the screen shot of Symantec's NetBackup PureDisk shows, more than 95% of the data files have been eliminated as a result of the de-duplication process, going from a backup of more than 3GB to about 150MB.
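The arithmetic behind those figures is easy to check. The short Python sketch below converts a reduction ratio or a before-and-after size into a percentage saved; the function names are illustrative and the sizes use decimal units (1GB = 1,000MB), which is an assumption about how the figures above were rounded.

```python
def ratio_to_savings_percent(ratio: float) -> float:
    """Convert an N-to-1 reduction ratio into the percentage of data removed."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return 100.0 * (1.0 - 1.0 / ratio)


def savings_percent(before_bytes: float, after_bytes: float) -> float:
    """Percentage of the original backup removed by de-duplication."""
    if before_bytes <= 0:
        raise ValueError("before_bytes must be positive")
    return 100.0 * (1.0 - after_bytes / before_bytes)


MB = 1_000_000          # decimal megabyte (assumption)
GB = 1_000 * MB         # decimal gigabyte (assumption)

# A 10-to-1 reduction leaves one tenth of the data behind.
print(f"10-to-1 ratio: {ratio_to_savings_percent(10):.1f}% saved")

# A 3GB backup that shrinks to 150MB.
print(f"3GB to 150MB: {savings_percent(3 * GB, 150 * MB):.1f}% saved")
```

Put another way, a 3GB backup that shrinks to 150MB corresponds to a reduction of roughly 20-to-1.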
De-duplication seems like a simple concept, but picking the right de-duplication product isn't. There are dozens of vendors, including:
- Acronis.com Backup and Recovery
- Atempo Time Navigator
- CA's Arcserve Backup
- Cofio.com AIMStor
- FalconStor File-interface Deduplication System
- DataDomain DDX
- Quantum DXi3500
- Sepaton DeltaStor
- Symantec NetBackup PureDisk
There are also a lot of technical issues that you need to understand before making a purchase decision on a de-duplication product. Here is a checklist and some suggestions as you navigate these waters:
First off, where is the software agent that controls the de-duplication process located? Some products put their agents at the source, meaning on each and every server that will be backed up, and others on the actual backup appliance. You need to put it someplace, and, depending on your particular set of servers and circumstances and IT policies, you may prefer one or the other method. Some of the products, like CA's Arcserve Backup, can now work with agents in both locations.
Second, how does the de-duplication appliance appear to the backup software app? Some de-duplication boxes appear like a network-attached storage device, while others appear like a storage area network drive. Depending on the backup software that you already have, one of these might be more appealing to your situation.
Does the de-duplication agent have any granularity with any particular apps or OSs? Some products can examine individual email messages, database records, or files that have changed on a particular virtual machine instance. As more and more shops make use of virtualisation technology, this factor becomes increasingly important, as the virtual disk images can be enormous, yet they contain mostly the same common files for the operating system and underlying applications. This granularity also makes these products more useful alongside the backup software when the need arises to restore particular files from inside the virtual images.
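To see why virtual disk images de-duplicate so well, consider a minimal Python sketch that splits each image into chunks, stores each distinct chunk once, and keeps a per-image "recipe" for restoring it. This is an illustration only: the fixed 4,096-byte chunk size, the use of SHA-256, and the names `dedupe` and `restore` are assumptions made for the example, not a description of how any of the products listed above work.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


def dedupe(images: dict, chunk_size: int = CHUNK_SIZE):
    """Store each distinct chunk once; return (chunk store, per-image recipes)."""
    store = {}    # chunk digest -> chunk bytes
    recipes = {}  # image name -> ordered list of chunk digests
    for name, data in images.items():
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # keep only the first copy
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes


def restore(name: str, store: dict, recipes: dict) -> bytes:
    """Rebuild one image from its recipe."""
    return b"".join(store[digest] for digest in recipes[name])


# Two toy "virtual disk images" sharing the same eight operating system chunks.
os_part = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))
images = {
    "vm_a": os_part + b"A" * CHUNK_SIZE,
    "vm_b": os_part + b"B" * CHUNK_SIZE,
}

store, recipes = dedupe(images)
raw_chunks = sum(len(recipe) for recipe in recipes.values())
print(f"{raw_chunks} chunks backed up, {len(store)} chunks actually stored")
```

The shared operating system chunks are stored once no matter how many images contain them, and `restore` can still rebuild each image in full.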
Do you need special hardware or does the de-duplication function come included as part of the backup software? Quite a few of the usual backup software vendors are moving towards integrating deduplication functionality in their products. For example, enabling data de-duplication functionality on both Symantec's NetBackup 7 and Backup Exec 2010 requires only a single check mark in a pop-up box in one of their control menus.
Is de-duplication happening during the live stream of backup data or does some post-processing occur? Post-processing means that the backup is first staged to a hard drive designed for this purpose, and the duplicates are removed later. If the latter is the case, do you have enough storage capacity to hold all of your backup files, and can you add more storage as your needs grow?
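A rough way to reason about the capacity question is sketched below in Python. It is a deliberately simplified model: it assumes a post-processing product must stage the entire backup before removing duplicates, that an inline product needs no staging space, and that the reduction ratio is known in advance. The function name `storage_plan` and its return fields are hypothetical.

```python
def storage_plan(backup_bytes: int, dedup_ratio: float, post_process: bool) -> dict:
    """Estimate staging and retained storage under a simplified model."""
    if backup_bytes < 0 or dedup_ratio < 1:
        raise ValueError("backup_bytes must be >= 0 and dedup_ratio >= 1")
    # Space that remains occupied once duplicates are removed.
    retained = backup_bytes / dedup_ratio
    # Assumption: post-processing lands the full backup on disk first.
    staging = backup_bytes if post_process else 0
    return {"staging_bytes": staging, "retained_bytes": retained}


backup = 3_000_000_000  # a 3GB backup, in decimal bytes

for label, post in (("inline", False), ("post-process", True)):
    plan = storage_plan(backup, dedup_ratio=20, post_process=post)
    print(f"{label}: staging {plan['staging_bytes']:,} bytes, "
          f"retained {plan['retained_bytes']:,.0f} bytes")
```

Under these assumptions both approaches retain the same amount of data, but only the post-processing approach needs temporary room for the full backup.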
Finally, how does the de-duplication product fit into your overall storage resource management picture? Can you examine file ageing reports that show which files haven't been accessed by your users for more than 90 days, for example? Or understand how your storage area networks are using their disk arrays, and perhaps reconfigure them for more optimal usage? Or drill down and see how your particular applications are using your overall storage resources?
These and other analyses are valuable if you are going to be able to more effectively manage your storage needs.
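As a small illustration of the file ageing report mentioned above, the Python sketch below walks a directory tree and lists files whose last-access time is more than 90 days old. The function name `stale_files` is hypothetical, and the approach assumes the file system records access times reliably; many systems are mounted with options that update them lazily or not at all, so treat the output as a rough guide.

```python
import os
import time

SECONDS_PER_DAY = 86400


def stale_files(root: str, days: int = 90, now: float = None) -> list:
    """Return (path, age_in_days) for files not accessed in more than `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.stat(path).st_atime  # last-access timestamp
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                stale.append((path, (now - atime) / SECONDS_PER_DAY))
    # Oldest files first.
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, age in stale_files("."):
        print(f"{age:7.0f} days  {path}")
```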
Dave Strom is a freelance writer living in St. Louis and the former editor-in-chief of Network Computing magazine, DigitalLanding.com and Tom's Hardware.com. He has written two books and numerous articles on networking, the Internet and IT security topics. He can be reached at david@strom.com and his blog can be found at strominator.com.
To see more articles regarding IT management best practices, please visit www.cioupdate.com.