Unable to append tape with MTWEOFI enabled

I’ve observed the following issue on Media Agents on Linux with MTWEOFI support turned on in omnirc (to optimize performance) when running recent versions of Data Protector such as A.09.09_116.

While the first write operation to a blank or expired medium succeeds, a subsequent append operation to the tape by the Linux Media Agent generates the following error. The medium in question is then marked as poor.

[Major] From: BMA@linux.syncer.de "MSL6480_D1" Time: 18.11.2017 01:56:54
[90:183]     Cannot backspace block. (Details unknown.)

[Major] From: BMA@linux.syncer.de "MSL6480_D1" Time: 18.11.2017 01:56:54
[90:65]     /dev/tape/by-id/scsi-350014380032e2efc-nst
    Cannot append to medium (Details unknown.)

An intermediate fix has been made available by Micro Focus support in QCCR2A77211_HF1, which includes a hot fix for the Linux BMA. This fix also requires the following options to be specified; note that OB2EODMETHOD=1 was previously not required.

# Enable MTWEOFI support on Linux Media Agent / Performance

Updated Information for Data Protector 10.x

Unfortunately, more recent Data Protector versions (A.10.00 to A.10.20) re-introduced the problem. A fix has been confirmed as working in the meantime. If you need this functionality, please request hot fix QCCR2A81967_HF3 from software support.

Resolve Media Changer issues

Recently I had to troubleshoot an issue where loading and unloading tape drives in a tape library was causing trouble. Parallel operations executed on the same library changer device caused a subset of the operations to fail leading to aborted sessions. A Linux Media Agent controlling the library changer reported the following error.

[Normal] From: MMA@linux.syncer.de "MSL6480_D3" Time: 25.09.2017 16:02:33
    => UMA@linux.syncer.de@/dev/tape/by-id/scsi-350014380032e2efb
    Unloading medium to slot 7 from device /dev/tape/by-id/scsi-350014380032e2eff-nst

[Major] From: MMA@linux.syncer.de "MSL6480_D3" Time: 25.09.2017 16:04:13
[90:64]     => UMA@linux.syncer.de@/dev/tape/by-id/scsi-350014380032e2efb
    Cannot unload exchanger medium (System error)

I’ve noticed a slightly different error message after changing the primary Media Agent for the tape library to a Windows host.

[Normal] From: MMA@linux.syncer.de "MSL6480_D6" Time: 02.10.2017 11:28:42
    => UMA@windows.syncer.de@Changer2147483643:0:0:1
    Loading medium from slot 14 to device /dev/tape/by-id/scsi-350014380032e2eff-nst

[Major] From: MMA@linux.syncer.de "MSL6480_D6" Time: 02.10.2017 11:29:40
[90:63]     => UMA@windows.syncer.de@Changer2147483643:0:0:1
    Cannot load exchanger medium ([121] The semaphore timeout period has expired. )

The Windows error message looked familiar. I tried a modification of the omnirc on the Media Agent hosts responsible for the robot control, and it resolved the issue.

# Cannot load exchanger medium (System error) / The semaphore timeout period has expired

Source devices in Object Copy/Replication

When using Object Copy or Replication in Data Protector it is important to ensure the right Media Agent is reading the source data. This ensures the data stream is either kept locally on a specific Media Agent or is using the right path through the network.

You could either use a static re-mapping of Source devices within each Copy Spec or use Device Policies globally.

Static re-mapping of Source devices

This could be a very time consuming task in a configuration with dozens of devices and multiple Copy specs. This is commonly seen in larger StoreOnce Catalyst, Data Domain Boost or File Library implementations.

You need to select every possible Source device that should not be used during the operation and Change it to a different Source device. In this example only the device JETDCM01_B2D_jetdcm01 should replace all other devices.

Please note: B2D Source side devices should never be used during Object Copy/Replication. They must be replaced accordingly. Replacing a Source device with a device from a different library will cause Mount Requests.

Use Device Policies to control Source devices

The benefit of using Device Policies is that this is a one-time operation. All devices within a specific device (B2D, File Library, Tape Library) use the same Device Tag. Then assign the Copy Policy to control which Media Agent should be used during Object Copy/Replication. This is possible as a group operation on multiple Devices. Even B2D Source side devices are supported for this operation.

Please note: B2D Source side devices should never be used during Object Copy/Replication. They must be replaced accordingly. The configuration options for Device Policies on Source side devices have been removed in Data Protector A.10.01 (QCCR2A75017). This should be fixed in QCCR2A75018 soon.

If the Device Policies are configured no further manual modification in the Copy spec is required.

Related global options

A few recommended global options relevant for Object Copy using either Device Policies and/or Static re-mapping.


HPE Data Protector A.09.00: Patch release history

The content of this post has been obsoleted/removed. I’ve included the information in the continuously updated page https://www.syncer.de/?page_id=944. I think you’ll find it very useful.

I strongly recommend bookmarking this page and checking back regularly.

Resolving FilterListenerService installation issue

You may run into this non-critical issue during an upgrade to Data Protector A.09.09 on Windows Cell Managers if the installation was done to a non-default path, e.g. O:\OmniBack, and the VEPA integration is also installed.

  • Error: could not locate INF file ‘C:\Program Files\Omniback\bin\HPEDpHsm.inf’ is displayed during the patch bundle setup
  • High CPU load caused by Filter Listener, FilterListenerService (Data Protector Filter Listener Service) on Windows client with VEPA after upgrade

This issue can be resolved in three simple steps:

1. Stop the FilterListenerService

2. Install the HPEDpHsm.inf from the context menu

3. Execute the following commands from an elevated command prompt and check if the service is running smoothly.

O:\OmniBack\bin>fltmc unload HpeDpHsm
O:\OmniBack\bin>fltmc load HpeDpHsm

Tape drive performance with MTWEOFI on Linux

Physical tape drives are a precious resource in today’s fast growing data centers. If you’re running backups or copies to physical tape drives, performance is key. I did some research on this topic based on a customer request. They reported that backups on Windows clients were much faster than on Linux clients when writing directly to tape (via a local Media Agent).

When checking the device configuration in Data Protector I noticed that the drives were using the default segment size for LTO drives (2000). After changing the segment size from 2000 to 20000 (my own default), the performance issue was gone!

The default value 2000 is meant for LTO1 drives, and there is a close relationship between cartridge capacity and the size/number of segments that are created during a write operation. If you’re using modern drives such as LTO5, LTO6 or even LTO7, you can store many times the data that fits on an LTO1 cartridge (6000 GB vs. 100 GB native).
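To put the relationship between cartridge capacity and segment size into numbers, here is a small sketch. It assumes that the segment size is given in MB and that capacities are counted in decimal units (1 GB = 1000 MB); the function name is made up for this illustration.

```python
import math


def segments_per_cartridge(capacity_gb, segment_size_mb):
    """Segments needed to fill a cartridge (assumes 1 GB = 1000 MB)."""
    return math.ceil(capacity_gb * 1000 / segment_size_mb)


# Native capacities as quoted in the text above (GB).
CARTRIDGES = {"LTO1": 100, "LTO7": 6000}

for name, capacity_gb in CARTRIDGES.items():
    for segment_size_mb in (2000, 20000):
        count = segments_per_cartridge(capacity_gb, segment_size_mb)
        print(f"{name} ({capacity_gb} GB), segment size {segment_size_mb}: "
              f"{count} segments")
```

Under these assumptions a full LTO7 cartridge written with segment size 2000 ends up with 3000 segments, compared to 50 on an LTO1 cartridge, which is why the LTO1-era default does not scale well.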

While the performance issue was resolved it was still not clear why a Windows client does not show the same reduced performance with a segment size of 2000.

It turned out that the st driver in the Linux kernel usually needs to flush its buffer when a file mark is written. There was an obvious relation to the segment size, so I started looking into MTWEOFI. MTWEOFI allows the st driver to preserve the content of its buffer, enabling the next file operation to start immediately.

Starting with Data Protector A.09.05, the MTWEOFI functionality can be enabled in the omnirc of the Linux Media Agent host. It is mandatory that the Linux kernel in use understands MTWEOFI. This is the case for RHEL 6.5 and later as well as for SLES11 and later (I have not checked SLES10).

# Enable MTWEOFI support on Linux Media Agent / Performance

Performance tests

The following performance numbers were collected during my tests to demonstrate the relationship between segment size and write throughput on Linux using Data Protector A.09.07. I used a single 16 GB file that was backed up directly to an LTO4 drive. The drive was configured with a 256k block size (default) and 32 agent buffers.

  1. Without MTWEOFI the drive would stop after every 2GB of data written resulting in a poor overall performance.
  2. A larger segment size (20000 over 2000) will cause the drive to stop less frequently resulting in a better performance.
  3. With MTWEOFI support the drive would not stop at all, even if a file mark needs to be written.

108,84 MB/s – default omnirc, segment size 2000 (default)
183,91 MB/s – default omnirc, segment size 20000
188,24 MB/s – modified omnirc, segment size 2000 (default)
190,48 MB/s – modified omnirc, segment size 20000

Test environment: Internal HP LTO4 SAS drive (Firmware U64D) attached to HP Smart Array P410 (Firmware 6.64) in HP ProLiant ML350 G6 running CentOS release 6.8, 2.6.32-642.1.1.el6.x86_64.
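
For a quick comparison, the measured values above can be expressed as a gain relative to the default configuration. This is only a small calculation sketch over the four numbers listed above; the labels and the function name are chosen for this illustration.

```python
# Throughput in MB/s, copied from the measurements above.
BASELINE = 108.84  # default omnirc, segment size 2000 (default)

RESULTS = {
    "default omnirc, segment size 20000": 183.91,
    "modified omnirc, segment size 2000": 188.24,
    "modified omnirc, segment size 20000": 190.48,
}


def gain_percent(value, baseline=BASELINE):
    """Relative throughput gain over the baseline, in percent."""
    return (value / baseline - 1.0) * 100.0


for label, throughput in RESULTS.items():
    print(f"{label}: {throughput:.2f} MB/s "
          f"({gain_percent(throughput):+.1f} % vs. default)")
```

With the modified omnirc the segment size hardly matters any more, which matches observation 3 in the list above.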


Resolve Cell Manager installation failures on RHEL7

Starting with Data Protector A.09.05, Red Hat Enterprise Linux 7.0 and 7.1 are supported Cell Manager platforms. But running the regular installation procedure on a fresh RHEL 7.1 system does not work, even with all requirements met.

[root@linux ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.1 (Maipo)
[root@linux ~]# uname -a
Linux jetlnx01.godyo.int 3.10.0-229.el7.x86_64 #1 SMP Thu Jan 29 18:37:38 EST 2015 x86_64 x86_64 x86_64 GNU/Linux

The installation script omnisetup.sh -CM -IS will fail during the certificate generation phase.

Installing OB2-CS packet
Preparing... ################################# [100%]
Updating / installing...
1:OB2-CS-A.09.00-1 ################################# [100%]
NOTE: No Data Protector A.09.00 Internal Database found. Initializing...
Configuring and starting up Internal Database... Done!
Configuring and starting up Internal Database Connection Pool... Done!
Initializing Internal Database version A.09.00... Done!

ERROR: Cell Manager certificates generation failed. (Return code = 3)
 For more detail please refer to /var/opt/omni/server/log/DPIDBsetup_3627.log
 warning: %post(OB2-CS-A.09.00-1.x86_64) scriptlet failed, exit status 3

When looking at the logs I found the root cause in /opt/omni/sbin/omnigencert.pl. The function ParseIpData() is not able to get any IP addresses from the /sbin/ifconfig output on the system. The command is part of net-tools-2.0-0.17.20131004git.el7.x86_64 required for installation. The ifconfig output is different compared to previous RHEL versions causing parsing failures.
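
To illustrate the kind of parsing difference involved, here is a minimal Python sketch. It is not the Perl code from omnigencert.pl and not my patch; it only shows one way to accept both the older ifconfig layout ("inet addr:…") and the net-tools 2.0 layout ("inet …"). The sample outputs are shortened, made-up examples using documentation addresses.

```python
import re

# Accepts "inet addr:192.0.2.10" (older net-tools) as well as
# "inet 192.0.2.10" (net-tools 2.0). "inet6" lines do not match.
INET_RE = re.compile(r"\binet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})\b")


def parse_ipv4_addresses(ifconfig_output):
    """Return all non-loopback IPv4 addresses found in ifconfig output."""
    addresses = [m.group(1) for m in INET_RE.finditer(ifconfig_output)]
    return [a for a in addresses if not a.startswith("127.")]


# Illustrative sample text only, not captured from a real system.
OLD_STYLE = """\
eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:192.0.2.10  Bcast:192.0.2.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
"""

NEW_STYLE = """\
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.0.2.10  netmask 255.255.255.0  broadcast 192.0.2.255
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
"""

print(parse_ipv4_addresses(OLD_STYLE))
print(parse_ipv4_addresses(NEW_STYLE))
```

A parser that only looks for the "inet addr:" prefix finds nothing in the second sample, which is the kind of failure described above.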

I’ve developed a patch that resolves the parsing errors. You can download the patch and the modified omnigencert.pl here. Just copy the modified omnigencert.pl from the tarball to /opt/omni/sbin after OB2-CS-A.09.00-1.x86_64 is installed and before the certificate creation starts. The installation will go through smoothly.

Update: Hewlett Packard Enterprise support provides a fix for this issue with SSPLNX900_005.


VMware Power On from 3PAR ZDB

This post will guide you through the basic configuration, backup and restore (Power On/Live Migrate) for VEAgent backups (VMware) hosted on 3PAR storage systems. This functionality was introduced with Data Protector A.09.04_104. While this guide is a good starting point, you should also check ZDBAdmin.pdf and ZDBIntegration.pdf for the complete documentation, the list of limitations and the troubleshooting procedures.

Storage integration in Data Protector

There are a few steps that must be done once to make the ZDB work. The first step should be the installation of the required storage provider (HP P6000 / HP 3PAR SMI-S Agent) on your backup host. Of course the Virtual Environment Integration is required to perform VMware backups. This guide assumes that you’re familiar with the required procedures (installation, vCenter import).

Make sure that the CIM provider on your 3PAR system is started.

BLACKHAWK cli% showcim
-Service- -State- --SLP-- SLPPort -HTTP-- HTTPPort -HTTPS- HTTPSPort PGVer CIMVer
Disabled Active Enabled 427 Enabled 5988 Enabled 5989 2.9.1 3.2.1
BLACKHAWK cli% startcim
CIM server will start in about 90 seconds

BLACKHAWK cli% showcim
-Service- -State- --SLP-- SLPPort -HTTP-- HTTPPort -HTTPS- HTTPSPort PGVer CIMVer
Enabled Active Enabled 427 Enabled 5988 Enabled 5989 2.9.1 3.2.1

Once the provider is up and running you should be able to register the storage system with Data Protector. You should check the connection before proceeding.

C:\>omnidbzdb --diskarray 3PAR --ompasswd --add blackhawk.syncer.de --ssl --user 3paradm --passwd password
HP 3PAR provider authentication data updated for user 3paradm at host blackhawk.syncer.de.

C:\>omnidbzdb --diskarray 3PAR --ompasswd --list
User Host Port Ssl
3paradm blackhawk.syncer.de 5989 Yes

C:\>omnidbzdb --diskarray 3PAR --ompasswd --check
Starting configuration check on host wildfire.syncer.de.
[Normal] From: SMISA@wildfire.syncer.de "SMISA" Time: 05.09.2015 09:13:51
Checking the HP 3PAR provider using this connection data:
Host: blackhawk.syncer.de
User: 3paradm
Namespace: root/tpd
Port: 5989
SSL mode: TRUE

[Normal] From: SMISA@wildfire.syncer.de "SMISA" Time: 05.09.2015 09:13:51
This HP 3PAR provider has access to the following unit:
WWN: 2FF70002AC012345
Description: HP_3PAR 7200c, ID: 12345, Serial number: 1612345, InForm OS version: 3.2.1 (MU3)

Configuration check finished.


Clean up after Data Protector IDB restore (Windows)

This is a follow-up on my post Clean up after Data Protector IDB restore which was intended for Linux/UNIX Cell Managers. It will guide you through the very same procedure to clean up after a successful IDB restore on a Windows Cell Manager. The same rules and warnings apply.

Please note: Data Protector program files and program data are installed to O:\OmniBack in this example. The default installation is C:\Program Files\OmniBack and C:\ProgramData\OmniBack.

  • Stop the services and check restored directory structure
O:\>omnisv stop
Cell Server services successfully stopped.

O:\>dir O:\OmniBack\server
Volume in drive O is OmniBack
Volume Serial Number is 261B-48D4

Directory of O:\OmniBack\server

07/02/2015 10:36 AM <DIR> .
07/02/2015 10:36 AM <DIR> ..
06/15/2015 04:46 PM <DIR> AppServer
07/02/2015 10:40 AM <DIR> db80
07/02/2015 10:37 AM <DIR> db80_restore
0 File(s) 0 bytes
5 Dir(s) 86,840,508,416 bytes free

O:\>dir O:\OmniBack\server\db80
Volume in drive O is OmniBack
Volume Serial Number is 261B-48D4

Directory of O:\OmniBack\server\db80

07/02/2015 10:40 AM <DIR> .
07/02/2015 10:40 AM <DIR> ..
06/15/2015 04:47 PM <DIR> dcbf
07/01/2015 08:46 PM <DIR> idb
07/01/2015 08:46 PM <DIR> jce
06/15/2015 09:38 PM <DIR> keystore
06/15/2015 04:46 PM <DIR> logfiles
06/15/2015 04:46 PM <DIR> meta
07/02/2015 10:36 AM <DIR> meta_2015_07_02-4_1435826212
07/02/2015 10:40 AM 13,276 mmd.ctx
06/15/2015 05:08 PM 16 mmd.id
06/15/2015 08:18 PM <DIR> msg
07/02/2015 10:36 AM <DIR> msg_2015_07_02-4_1435826212
07/02/2015 10:39 AM <DIR> pg
06/15/2015 04:46 PM <DIR> reportdb
06/15/2015 04:46 PM <DIR> smisdb
06/15/2015 04:46 PM <DIR> sqldb
06/15/2015 04:46 PM <DIR> sysdb
06/15/2015 04:46 PM <DIR> vssdb
06/15/2015 04:46 PM <DIR> xpdb
2 File(s) 13,292 bytes
18 Dir(s) 86,840,508,416 bytes free

O:\>dir O:\OmniBack\server\db80_restore
Volume in drive O is OmniBack
Volume Serial Number is 261B-48D4

Directory of O:\OmniBack\server\db80_restore

07/02/2015 10:37 AM <DIR> .
07/02/2015 10:37 AM <DIR> ..
07/02/2015 10:37 AM <DIR> idb
07/02/2015 10:37 AM <DIR> jce
07/02/2015 10:43 AM <DIR> pg
0 File(s) 0 bytes
5 Dir(s) 86,840,508,416 bytes free
  • Move the restored database into the target directory and adjust the directory junctions. Please note: The names of the directory junctions might vary on your Cell Manager.
O:\>rmdir /S /Q O:\OmniBack\server\db80\idb
O:\>rmdir /S /Q O:\OmniBack\server\db80\jce
O:\>rmdir /S /Q O:\OmniBack\server\db80\pg

O:\>move /Y O:\OmniBack\server\db80_restore\idb O:\OmniBack\server\db80
O:\>move /Y O:\OmniBack\server\db80_restore\jce O:\OmniBack\server\db80
O:\>move /Y O:\OmniBack\server\db80_restore\pg O:\OmniBack\server\db80

O:\>dir O:\OmniBack\server\db80\pg\pg_tblspc
Volume in drive O is OmniBack
Volume Serial Number is 261B-48D4

Directory of O:\OmniBack\server\db80\pg\pg_tblspc

07/02/2015 10:37 AM <DIR> .
07/02/2015 10:37 AM <DIR> ..
07/02/2015 10:37 AM <JUNCTION> 16387 [O:\OmniBack\server\db80_restore\idb]
07/02/2015 10:37 AM <JUNCTION> 16445 [O:\OmniBack\server\db80_restore\jce]
0 File(s) 0 bytes
4 Dir(s) 87,235,559,424 bytes free

O:\>rmdir /Q O:\OmniBack\server\db80\pg\pg_tblspc\16387
O:\>rmdir /Q O:\OmniBack\server\db80\pg\pg_tblspc\16445

O:\>mklink /J O:\OmniBack\server\db80\pg\pg_tblspc\16387 O:\OmniBack\server\db80\idb
Junction created for O:\OmniBack\server\db80\pg\pg_tblspc\16387 <<===>> O:\OmniBack\server\db80\idb

O:\>mklink /J O:\OmniBack\server\db80\pg\pg_tblspc\16445 O:\OmniBack\server\db80\jce
Junction created for O:\OmniBack\server\db80\pg\pg_tblspc\16445 <<===>> O:\OmniBack\server\db80\jce
  • Update the configuration files accordingly. For example, replace all occurrences of db80_restore with db80. While looking at the idb.config file, take note of the PGSUPERUSER and PGPORT values, as they are needed for the psql command.
O:\>notepad O:\OmniBack\server\db80\pg\postgresql.conf
O:\>notepad O:\OmniBack\server\db80\pg\postmaster.opts
O:\>notepad O:\OmniBack\Config\Server\idb\idb.config
  • Adjust the ImagePath registry value for the hpdp-idb service in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\hpdp-idb to reflect the previous changes

  • Additional clean up, depending on your configuration
O:\>rmdir /Q /S O:\OmniBack\server\db80_restore
O:\>rmdir /Q /S O:\OmniBack\server\db80\meta_2015_07_02-4_1435826212
O:\>rmdir /Q /S O:\OmniBack\server\db80\msg_2015_07_02-4_1435826212
O:\>rmdir /Q /S O:\OmniBack\log\server\auditing_2015_07_02-4_1435826212
  • Start the IDB, update the references to the table spaces before starting the remaining services and do basic verification
O:\>omnisv start -idb_only
The Internal database services successfully started.

O:\>omnisv status
    ProcName      Status  [PID]
    crs         : Down
    mmd         : Down
    kms         : Down
    hpdp-idb    : Active  [4136]
    hpdp-idb-cp : Active  [5528]
    hpdp-as     : Down
    omnitrig    : Down
    omniinet    : Down
    Sending of traps disabled
Status: At least one of the Cell Server processes/services is not running.

O:\>O:\OmniBack\idb\bin\psql.exe -U hpdp -h localhost -p 7112 postgres

postgres=# SELECT spcname, spclocation FROM pg_tablespace;
  spcname   |         spclocation
 pg_default |
 pg_global  |
 hpdpidb    | O:/OmniBack/server/db80_restore/idb
 hpjce      | O:/OmniBack/server/db80_restore/jce
(4 rows)

postgres=# UPDATE pg_tablespace SET spclocation = 'O:/OmniBack/server/db80/idb' WHERE spcname = 'hpdpidb';
postgres=# UPDATE pg_tablespace SET spclocation = 'O:/OmniBack/server/db80/jce' WHERE spcname = 'hpjce';

postgres=# \q

O:\>omnisv start
Cell Server services successfully started.

O:\>omnidbcheck -extended
Check Level             Mode                    Status
Database connection     -connection             OK
Schema consistency      -schema_consistency     OK
Datafiles consistency   -verify_db_files        OK
Database consistency    -database_consistency   OK
Media consistency       -media_consistency      OK
SIBF(readability)       -sibf                   OK
DCBF(presence and size) -bf                     OK
OMNIDC(consistency)     -dc                     OK

O:\>omnib -idb_list IDB_BACKUP
[Normal] From: BSM@windows.syncer.de "IDB_BACKUP" Time: 7/2/2015 11:15:17 AM
Backup session 2015/07/02-5 started.

[Normal] From: BSM@windows.syncer.de "IDB_BACKUP" Time: 7/2/2015 11:15:17 AM
OB2BAR application on "windows.syncer.de" successfully started.

[Normal] From: OB2BAR_POSTGRES_BAR@windows.syncer.de "DPIDB" Time: 7/2/2015 11:15:18 AM
Checking the Internal Database consistency

[Normal] From: OB2BAR_POSTGRES_BAR@windows.syncer.de "DPIDB" Time: 7/2/2015 11:15:19 AM
Check of the Internal Database consistency succeeded

[Normal] From: OB2BAR_POSTGRES_BAR@windows.syncer.de "DPIDB" Time: 7/2/2015 11:15:19 AM
Putting the Internal database into the backup mode finished
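
A leftover reference to db80_restore in one of the edited configuration files is easy to miss. The following sketch scans the three files from the configuration step above and reports the lines that still contain the old directory name. The function name is made up for this illustration, and the paths assume the O:\OmniBack layout used in this example; adjust them to your installation.

```python
from pathlib import Path

# Files edited in the configuration step above (example layout O:\OmniBack).
CONFIG_FILES = [
    r"O:\OmniBack\server\db80\pg\postgresql.conf",
    r"O:\OmniBack\server\db80\pg\postmaster.opts",
    r"O:\OmniBack\Config\Server\idb\idb.config",
]


def find_leftover_references(paths, needle="db80_restore"):
    """Map each existing file to the line numbers that still contain needle."""
    hits = {}
    for path in paths:
        candidate = Path(path)
        if not candidate.is_file():
            continue  # silently skip files that do not exist on this host
        lines = candidate.read_text(errors="replace").splitlines()
        numbers = [n for n, line in enumerate(lines, start=1) if needle in line]
        if numbers:
            hits[str(candidate)] = numbers
    return hits


if __name__ == "__main__":
    leftovers = find_leftover_references(CONFIG_FILES)
    for name, numbers in leftovers.items():
        print(f"{name}: db80_restore still referenced on lines {numbers}")
    if not leftovers:
        print("No references to db80_restore found.")
```

This only covers the text files; the ImagePath registry value and the table space locations in the IDB still need to be checked as described in the steps above.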


List non-migrated DCBFs

If you need to deal with non-migrated DCBFs after an upgrade to Data Protector 8 or later, you might find this SQL query useful. It will display all non-migrated DCBFs (dcbf_version = -1) with references in the IDB. Don’t remove the old catalog using perl omnimigrate.pl -remove_old_catalog before all v1 DCBFs have been successfully migrated to v2. The migration is done using perl omnimigrate.pl -start_catalog_migration.

SELECT medium_name, dcbf_directory, dcbf_version FROM dp_dcbf_info i, dp_dcbf_directory d
WHERE i.dcbf_directory_seq_id = d.dp_numkey and dcbf_version != '2'
ORDER BY medium_name ASC, dcbf_directory ASC;

Save the query as dcbf_version.sql and run omnidbutil -run_script dcbf_version.sql -detail.
