Troubleshooting clusters

If you experience problems when you use Backup Exec in a cluster environment, review the questions and answers in this section.

Table: Cluster troubleshooting questions and answers

Question

Answer

After I recovered my cluster and all shared disks, the cluster service will not start. Why won’t it start and how can I get it started?

The cluster service may not start because the disk signature on the Quorum disk is different from the original signature. To replace the disk signature, use Dumpcfg.exe if you have the Microsoft Windows 2000 Resource Kit, or use Clusterrecovery from the Microsoft Windows Server 2003 Resource Kit. For example, type:

dumpcfg.exe /s 12345678 0

Replace 12345678 with the disk signature and replace 0 with the disk number. You can find the disk signature and the disk number in the event log.
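Because a mistyped signature or disk number is easy to miss, it can help to assemble the command line from the two values in the event log before you run it. The following Python sketch only builds and checks the string shown above; it does not run Dumpcfg.exe. The function name build_dumpcfg_command is hypothetical, and the check for eight hexadecimal digits is an assumption based on the example value 12345678.

```python
import re


def build_dumpcfg_command(signature: str, disk_number: int) -> str:
    """Return the dumpcfg.exe command line for a disk signature and disk number.

    Hypothetical helper for illustration; it only formats the command
    "dumpcfg.exe /s <signature> <disk number>" from the example above.
    """
    # Assumption: the signature is written as 8 hexadecimal digits,
    # matching the form of the example value 12345678.
    if not re.fullmatch(r"[0-9A-Fa-f]{8}", signature):
        raise ValueError(f"unexpected disk signature: {signature!r}")
    # Disk numbers are non-negative integers (0 in the example above).
    if disk_number < 0:
        raise ValueError(f"unexpected disk number: {disk_number!r}")
    return f"dumpcfg.exe /s {signature} {disk_number}"


if __name__ == "__main__":
    # Values taken from the example above; replace them with the
    # signature and disk number recorded in the event log.
    print(build_dumpcfg_command("12345678", 0))
```

Compare the printed line with the values in the event log before you type it at the command prompt.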

If you do not have the Microsoft Windows 2000 Resource Kit, you can start the cluster service with the -Fixquorum option and then change the Quorum disk signature.

See Changing the Quorum disk signature.

I used the Checkpoint Restart option for my backups. During one of my backups, a Microsoft cluster failover occurred. Multiple backup sets were created. When I try to verify or restore using these backup sets, an “Unexpected End of Data” error occurs on the set that contains the data that was backed up prior to the failover. Why does this occur? Is my data safe?

You received this error because the failover occurred in the middle of backing up the resource, so the backup set was never closed on the media. However, the objects that were only partially backed up in the first backup set were backed up again in full during the restart, which preserves data integrity. All of the objects on the media for the given backup set can therefore still be restored and verified.
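The reasoning in this answer can be illustrated with a toy model. The Python sketch below is not Backup Exec's on-media format or its restart logic; the function names are hypothetical, and it assumes that the restarted job resumes at the object that was interrupted and backs it up again in full. It shows that every object has at least one complete copy across the two backup sets, even though the first set ends with a partial object.

```python
def backup_with_restart(objects, fail_at):
    """Model a backup that is interrupted by a failover and then restarted.

    objects : ordered list of object names to back up
    fail_at : index of the object that is being written when failover occurs

    Returns two backup sets as lists of (name, is_complete) pairs.
    """
    # First set: everything before the failover is complete, and the
    # object in progress is only partially written (set never closed).
    first_set = [(name, True) for name in objects[:fail_at]]
    first_set.append((objects[fail_at], False))

    # Assumption: the restart resumes at the interrupted object and
    # writes it, and everything after it, in full.
    restart_set = [(name, True) for name in objects[fail_at:]]
    return first_set, restart_set


def fully_protected(objects, backup_sets):
    """Return True if every object has a complete copy in some set."""
    complete = {
        name
        for backup_set in backup_sets
        for name, is_complete in backup_set
        if is_complete
    }
    return all(name in complete for name in objects)


if __name__ == "__main__":
    files = ["a.doc", "b.doc", "c.doc", "d.doc"]
    first, restart = backup_with_restart(files, fail_at=2)
    print("first set:  ", first)
    print("restart set:", restart)
    print("every object has a complete copy:",
          fully_protected(files, [first, restart]))
```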

I clustered a primary SAN server with a secondary SAN server. Now the device and media service on the secondary server fails. Why?

This occurs when the secondary server becomes the active node and attempts to connect to the Backup Exec database on the primary server, which is no longer available. To correct this, you must use the Backup Exec Utility (BEUTILITY.EXE) or reinstall the secondary server to be a primary server.

An Advanced Disk Based backup failed due to the application virtual server failover. How do I clean up Veritas Storage Foundation for Windows cluster disk groups and their associated volumes?

If the application virtual server fails over while you use the Veritas Storage Foundation for Windows (SFW) snapshot provider to perform an advanced disk-based backup, the backup job fails. The original cluster disk group that the snapped volumes belong to has moved from the primary node to a secondary node, so the snapped volumes cannot resynchronize with the original volumes.

The following is a description of the steps that occur for an advanced disk-based backup:

  • The snapped volumes are split from the original volumes.

  • The previously split snapped volumes are placed into a new cluster disk group.

  • The new cluster disk group is removed from the physical node where the production virtual server is currently online and then added to the Symantec Backup Exec media server.

  • The new cluster disk group will eventually be removed from the media server and then added back into the physical node where it previously resided, regardless of where the production virtual server is currently located.

  • The new cluster disk group joins the original cluster disk group if it is located in the same node.

  • The snapped volumes resynchronize with the original volumes.

If the production virtual server fails over from the currently active node to a secondary node during this process, the new cluster disk group cannot rejoin the original cluster disk group.

See Manually joining two cluster disk groups and resynchronizing volumes.
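The sequence above, and the reason that a failover breaks it, can be sketched as a small simulation. The Python model below is illustrative only: the class, the function names, and the node names are hypothetical, and it does not call any Backup Exec or SFW interface. It only tracks which physical node hosts each cluster disk group.

```python
from dataclasses import dataclass


@dataclass
class DiskGroup:
    name: str
    node: str  # physical node that currently hosts the disk group


def can_rejoin(snapshot_group: DiskGroup, original_group: DiskGroup) -> bool:
    """The new disk group can join the original only on the same node."""
    return snapshot_group.node == original_group.node


def run_backup(failover_during_backup: bool) -> bool:
    """Walk through the steps; return True if the groups can rejoin."""
    original = DiskGroup("original", node="node-A")

    # The snapped volumes are split and placed into a new cluster disk
    # group on the node where the production virtual server is online.
    snapshot = DiskGroup("snapshot", node=original.node)
    home_node = snapshot.node

    # The new disk group is moved to the media server for the backup.
    snapshot.node = "media-server"

    if failover_during_backup:
        # The production virtual server, and with it the original
        # disk group, moves to a secondary node.
        original.node = "node-B"

    # The new disk group returns to the node where it previously
    # resided, regardless of where the virtual server is now.
    snapshot.node = home_node

    # Joining and resynchronizing is possible only on the same node.
    return can_rejoin(snapshot, original)


if __name__ == "__main__":
    print("no failover, groups can rejoin:", run_backup(False))
    print("failover, groups can rejoin:   ", run_backup(True))
```

In the failover case the two disk groups end up on different nodes, which is the situation that the manual procedure referenced above addresses.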

After I performed a manual failover of a Veritas cluster resource, my backup jobs hang. Why won’t the backup jobs terminate?

If a manual failover of a Veritas cluster resource occurs, Veritas Cluster Server does not dismount MountV resources that still have open handles. Let all backup jobs complete before you perform a manual failover. If a backup job does hang, you must manually cancel the job before you can complete a manual cleanup process.
