Investigating NDMP performance

I spent some time on the Data Protector NDMP Integration, especially to compare it to the CIFS/NFS backups available with the regular Disk Agent. Luckily an older EMC Celerra NS20 was available, so I was able to start immediately. A quick look at the NDMP support matrix revealed no restrictions, since the unit was running T5.6.50.203.

To give you an idea of how the NS20 was configured, I put together a quick drawing and some tables. This NS20 comes with one DAE4P used for FC and one used for SATAII drives. The redundant X-Blades (Data Movers) at the bottom are configured as Active/Standby and provide the NAS capability of the unit. Each X-Blade is equipped with two so-called AUX ports used for tape connectivity. In my setup the first AUX port of each X-Blade was attached to the SAN infrastructure and zoned to an existing tape library, which is mainly used for object copies at the moment.



This is a small configuration with only a few spindles, but it gives us an idea of how NDMP performs. Since the performance of the SATA drives was so bad, even when configured as RAID 1/0, I decided to perform my tests only on the FC drives in RG0.

RG  | RAID      | Disk Type         | Disks | Total Space | LUNs
0   | RAID 5    | 300GB 10k FC      | 5     | 942 GB      | 0, 1, 2, 3, 4, 5, 10, 11
1   | RAID 1/0  | 750GB 7.2k SATAII | 8     | 2751 GB     | 6, 7, 8, 9
200 | Hot Spare | 300GB 10k 4Gb FC  | 1     | 268 GB      | 1023

I carved two LUNs from RG0 and mapped them to the X-Blades. The LUNs are pooled and bound together using AVM on the NS20. A single CIFS file share was then created and populated with sample data.

LUN  | LUN Name                           | Capacity | RG  | RAID      | Owner
0    | Celerra_NS20_0_root_disk (NAS/OS)  | 11 GB    | 0   | RAID 5    | SP A
1    | Celerra_NS20_1_root_ldisk (NAS/OS) | 11 GB    | 0   | RAID 5    | SP A
2    | Celerra_NS20_2_d3 (NAS/OS)         | 2 GB     | 0   | RAID 5    | SP A
3    | Celerra_NS20_3_d4 (NAS/OS)         | 2 GB     | 0   | RAID 5    | SP A
4    | Celerra_NS20_4_d5 (NAS/OS)         | 2 GB     | 0   | RAID 5    | SP A
5    | Celerra_NS20_5_d6 (NAS/OS)         | 2 GB     | 0   | RAID 5    | SP A
6    | NS20_BE_1_000                      | 250 GB   | 1   | RAID 1/0  | SP A
8    | NS20_BE_1_002                      | 250 GB   | 1   | RAID 1/0  | SP A
11   | NS20_BE_0_001                      | 250 GB   | 0   | RAID 5    | SP A
7    | NS20_BE_1_001                      | 250 GB   | 1   | RAID 1/0  | SP B
9    | NS20_BE_1_003                      | 250 GB   | 1   | RAID 1/0  | SP B
10   | NS20_BE_0_000                      | 250 GB   | 0   | RAID 5    | SP B
1023 | LUN 1023                           | 268 GB   | 200 | Hot Spare | SP A
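
For completeness, the Celerra-side provisioning roughly looked like the following sketch. The names and the size (file system ndmp_fs, share ndmpshare, 400G) are hypothetical, the pool name is the default AVM pool for RAID 5 FC storage, and a CIFS server is assumed to be already configured on server_2:

# list the available AVM storage pools
[nasadmin@NAS ~]# nas_pool -list
# create a file system from the FC RAID 5 pool, mount it on the active Data Mover
[nasadmin@NAS ~]# nas_fs -name ndmp_fs -create size=400G pool=clar_r5_performance
[nasadmin@NAS ~]# server_mountpoint server_2 -create /ndmp_fs
[nasadmin@NAS ~]# server_mount server_2 ndmp_fs /ndmp_fs
# export the file system as a CIFS share
[nasadmin@NAS ~]# server_export server_2 -Protocol cifs -name ndmpshare /ndmp_fs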

Then the NDMP Media Agent was installed on my Windows 2008 x64 SP2 Cell Manager, which already had access to the library. This does not hurt, as the existing MA is simply extended with some NDMP-specific features.

Creating sample data using HP Library and Tape Tools is quite simple, since HPCreateData and HPReadData are integrated into LTT. Each job pointed to a different directory on the CIFS share. In total, about 13.5 GB in 192956 files were created.

Folder | Comp. | File Size   | Depth | Breadth | Files/Dir | Files
DATA_L | 2:1   | 512KB-128MB | 2     | 2       | 72        | 216
DATA_M | 2:1   | 128KB-1MB   | 4     | 4       | 72        | 6120
DATA_S | 2:1   | 4KB-64KB    | 6     | 6       | 20        | 186620

Now it was time for some baseline benchmarks using LTT and Data Protector. Read/write performance can easily be determined with LTT, as can the CIFS performance using single and multiple readers. Null device backups were performed using Data Protector, with the OS null device for the CIFS part. The Celerra requires a configuration change to perform null device backups (see “Additional information” below).

Performance Baseline Benchmarks

As expected, NDMP null device and local performance of the LTO3 drive were much higher than native network (CIFS) data transfers. There are tweaks available to optimize CIFS/NFS performance; I am aware of them, but they have not been applied in this scenario. There is a single GbE link between a Windows 2008 host and the Celerra. There are only slight differences (up to 2%) in these tests regarding block and transfer size. The biggest benefit was seen for the LTO drive, especially with very large transfers (1MB).

Backup and Restore Benchmarks

A higher block size in general allows higher transfer rates. CIFS backup and especially restore performance are unacceptably slow. A CIFS backup requires a lot of tuning and testing and is less reliable. If you plan to use multiple Disk Agents on a share or a local disk, split the file system into similarly sized slices to parallelize operations properly; see the sketch below. This also allows good restore speeds using Data Protector's Parallel Restore afterwards.
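
To illustrate what “similarly sized slices” means, here is a minimal shell sketch (hypothetical mount point /mnt/share, three Disk Agents assumed) that greedily assigns the top-level directories of a share to three roughly equal slices by size; each slice would then become one backup object:

# distribute the top-level directories into 3 similarly sized slices (largest first)
du -sk /mnt/share/*/ | sort -rn | awk -v n=3 '
{
    # put this directory into the currently smallest slice
    min = 1
    for (i = 2; i <= n; i++) if (total[i] < total[min]) min = i
    total[min] += $1
    slice[min] = slice[min] "\n    " $2
}
END {
    # print one directory list per Disk Agent
    for (i = 1; i <= n; i++)
        printf "Slice %d (%d KB):%s\n\n", i, total[i], slice[i]
}'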

Nice to know: an NDMP overwrite restore on the Celerra takes much more time than a redirected restore into a different directory.

Summary

Since there is no native Data Protector Disk Agent available for the major NAS platforms, NDMP offers many benefits over a CIFS/NFS backup. Only the catalogue is sent to the Cell Manager, while the NAS head does the whole data movement directly to tape. In the case of the Celerra, SnapSure can take care of automatic replica creation as the source for NDMP dumps. Some restrictions apply: NDMP does not allow object copies or restores to non-NDMP clients, which could be an issue.
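
If you want to toggle those SnapSure-based dumps via a parameter as well, the switch reportedly lives in the NDMP facility; treat the parameter name below as an assumption and verify it against your DART release before using it:

# check, then enable SnapSure-based NDMP backups (parameter name to be verified)
[nasadmin@NAS ~]# server_param server_2 -facility NDMP -info snapsure
[nasadmin@NAS ~]# server_param server_2 -facility NDMP -modify snapsure -value 1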

Further optimization is possible, but it will take time depending on the configuration used. As the X-Blade/NAS head does the actual writing and reading of data to and from tape, it should be the first stop. For example, changing readWriteBlockSizeInKB for PAX and bufsz for NDMP to 256k yielded about 12% faster backups. A next step would be to investigate the parameters paxWriteBuff, nThread, nPrefetch, paxStatBuff and nFTSThreads.
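
For reference, the two changes mentioned above follow the same server_param syntax as the writeToTape example in “Additional information”. The values are assumed to be in KB to match the 256k setting, so verify the units for your DART release; -info shows the active value first:

# show the current value, then raise the PAX and NDMP buffer sizes to 256k
[nasadmin@NAS ~]# server_param server_2 -facility PAX -info readWriteBlockSizeInKB
[nasadmin@NAS ~]# server_param server_2 -facility PAX -modify readWriteBlockSizeInKB -value 256
[nasadmin@NAS ~]# server_param server_2 -facility NDMP -modify bufsz -value 256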

Multiple Disk Agents on the same share are a good option to keep the feed rate high. My configuration is not ideal to demonstrate this, because the three DAs were working on completely different folder structures: DATA_L, which contains only a small set of very large files; DATA_M, which contains a moderate amount of medium-sized files; and DATA_S, which contains a lot of very small files. The result is easy to predict: DATA_L and DATA_M were completed quickly, while DATA_S took most of the time. This also applies to restores. To make this clearer: without DATA_S in the restore job, I was able to achieve about 16 GB/hr for a CIFS restore session, compared to 4.2 GB/hr with it.

Additional information

If you are interested in null device backups on the Celerra, try the following command. The backup software will load the cartridge, but no data will be written to tape. This is the only way to verify file system performance during backup time. As no data is written to media, Data Protector will report such a session as failed.

[nasadmin@NAS ~]# server_param server_2 -facility PAX -modify writeToTape -value 0
server_2 : done

You can display the throughput during an NDMP backup session using server_pax server_2 -stats -verbose. The Get pool and Put pool values are good indicators of whether the system is performing well.
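
Once the null device tests are finished, do not forget to switch writing to tape back on; the value 1 below assumes this is the factory default:

[nasadmin@NAS ~]# server_param server_2 -facility PAX -modify writeToTape -value 1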
