Saturday, March 29, 2008

Implementing Veritas Flashsnap for database backups

Snapshot technology is a fast and efficient way to back up large databases. Unlike RMAN, which is Oracle based, snapshots use mirroring technologies managed by the system or the storage. There are many variants of snapshots available; the most commonly used method is one in which an entire copy of the database is made. This permits off-host backups and can be incrementally refreshed on a daily basis.

Veritas Flashsnap is a Point in Time data mirroring solution built on Veritas technologies. It is tightly integrated into Veritas Volume Manager (VxVM) and Veritas File System (VxFS), and it requires a separate license to be purchased and enabled (at least for the 4.x versions). It is important to note that Flashsnap is VxVM volume based rather than LUN based (as Shadow Image and TimeFinder are).

This article is an attempt to create a reference configuration for implementing Flashsnap to back up an Oracle database using Storage Foundation for Oracle 4.0.

Flashsnap offers several options for the kind of Point in Time mirror to be deployed. This article focuses on the full-sized instant snapshot option, in which a mirror of the entire database is taken on dedicated LUNs that can then be imported onto a Media Server for backup to tape.

Flashsnap has inherent advantages over Shadow Image (HDS) and TimeFinder (EMC) technologies:

  • Since it is VxVM volume based, the LUNs used for the mirror can be on a different storage array (e.g. a low-cost array using SATA drives) from the primary storage. With Shadow Image or TimeFinder, you would have to dedicate storage on the same primary array the host already uses.
  • The rate of synchronization between source (database) volumes and target (dedicated backup) volumes can be controlled with fine granularity.
  • Unlike array-based solutions, VxVM allows you to split and deport the target volumes as a different disk group from the parent, so there is no need for complicated scripts to change the disk group name when you wish to import the backup LUNs on the source system to perform a recovery.
  • In most large shops, the primary storage is an enterprise array shared between many hosts. With array-based technologies, synchronizing a large number of LUNs has a noticeable impact on the primary array and therefore on every host attached to it. With Flashsnap, the impact is confined to the host running the Flashsnap operation rather than the primary array.
  • The source and target volumes need not have the same configuration and layout. Unlike array-based technologies, one can use different LUN configurations and different LUN sizes.
  • A Flashsnap solution can be deployed in less than 30 minutes. It is easy to implement and works well when configured and managed correctly from the start.
System Configuration

The system configuration and specifications used in this article are documented below.
  • Platform – SunFire E4900 with 8 CPU’s and 32GB of RAM
  • OS Version – Solaris 9
  • Database – Oracle 9204
  • Veritas Volume Manager – Storage Foundation for Oracle Enterprise Edition 4.0 (MP2)
  • HBAs – 4 dual-port Emulex LP10000 (64-bit/66 MHz) cards, driver version 6.01f, installed in 66 MHz slots across the 2 I/O boards of the E4900.
  • Number of volumes requiring backup - 18
  • Number of LUNs – 48
  • Total size – 3.2 TB
  • Storage Array for Source LUNs – Hitachi 9585 using 72GB 10K RPM FCAL drives in a RAID 1 configuration.
  • Storage Array for SNAP LUNs – Hitachi 9585 using 143GB 10K RPM FCAL drives in a RAID 5 configuration (4D+1).
License Check

The first task is to check the license for Flashsnap. Both Flashsnap and FastResync need to be enabled. The Flashsnap license also allows the vxdg split/join commands to be performed, allowing volumes to be split into a different disk group.

bash-2.03# /opt/VRTSvlic/bin/vxlicrep

VERITAS License Manager vxlicrep utility version 3.02.005
Copyright (C) 1996-2004 VERITAS Software Corp. All Rights reserved.

Creating a report on all VERITAS products installed on this system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----------------***********************-----------------

License Key = 4251216366621396656112014869701
Product Name = Warning!!! Key not associated with a valid product

Key = Valid
License Type = PERMANENT_NODE_LOCK
Node Lock Type = (Hostid and Architecture ID)
Editions Product = YES

Features :=
POINT_IN_TIME = Enabled

-----------------***********************-----------------

License Key = 65879700989905626340292027039
Product Name = VERITAS Volume Manager
Key = Valid
License Type = PERMANENT_NODE_LOCK
Node Lock Type = (Hostid and Architecture ID)

Features :=
FASTRESYNC = Enabled
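
As a quick cross-check, the features that the VxVM configuration daemon considers licensed can also be listed with vxdctl:

bash-2.03# vxdctl license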

VxVM Tunables

To ensure that Veritas Volume Manager is adequately tuned to meet enterprise demands, the variables below were set.

/etc/system changes from default:

set maxphys=8388608

set vxio:vol_maxio=16384

set vxio:vol_maxioctl=131072

set vxio:vol_maxspecialio=16384

set vxio:vol_default_iodelay=10

set vxio:voliomem_chunk_size=131072

set vxio:voliomem_maxpool_sz=134217728

The volpagemod_max_memsz variable is set to 1024m in a startup script.

/usr/sbin/vxtune volpagemod_max_memsz 1024m

The number of vxiod daemons is set to 32 by editing the vxvm-startup2 init script.
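
The daemon count can also be checked, and changed on the running system, with the vxiod command; the init script edit simply makes the value persistent across reboots. For example:

bash-2.03# vxiod
bash-2.03# vxiod set 32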

Emulex Configuration Changes (lpfc.conf)

lun-queue-depth=30;
tgt-queue-depth=256;
num-iocbs=2048;
num-bufs=1024;


Storage Connectivity

The SAN was rated at 2Gbit/sec and the connectivity is as shown below.

Primary Storage - lpfc0 and lpfc1 were on IO Board 0 and lpfc2 and lpfc3 were on IO Board 1.


Backup storage was connected in the same manner, except that the dedicated HBAs were numbered lpfc4 to lpfc7.

Flashsnap Configuration

Henceforth the volumes to be backed up are called Primary Volumes and the mirror copies of those volumes are called Backup Volumes. The same convention applies to Primary and Backup LUNs.
The basic configuration steps are:

  • Ensure that the host sees the backup LUNs as presented by the Storage.
  • Configure the backup LUNs and add them to the same diskgroup as the Primary Volumes.
  • Create Backup Volumes of the same size as the Primary Volumes using the backup LUNs.
  • Prepare the Primary and Backup Volumes for Flashsnap.
  • Perform the initial sync.
  • Split the Backup Volumes onto a different diskgroup and import on the media server for backup.
  • Going forward, do an incremental refresh as required. This entails importing the Backup Diskgroup and joining it to the Primary Diskgroup before performing the refresh.
Steps 1 and 2:

It is assumed that Step 1 (the host sees the backup LUNs) and Step 2 (the backup LUNs are initialized and added to the same diskgroup as the Primary Volumes) have been completed successfully. For the sake of clarity, the Backup LUNs are initialized and added to the Primary Diskgroup using ‘SN’ as the prefix for the disk name.

For example:

root@snaptest:> vxdisk list |more

DEVICE TYPE DISK GROUP STATUS
c2t18d0s2 auto:cdsdisk testdg00 testdg online
c2t18d1s2 auto:cdsdisk testdg01 testdg online
c2t18d2s2 auto:cdsdisk testdg02 testdg online
c2t18d3s2 auto:cdsdisk testdg03 testdg online
c2t18d4s2 auto:cdsdisk testdg04 testdg online
c2t18d5s2 auto:cdsdisk testdg05 testdg online
c2t18d6s2 auto:cdsdisk testdg06 testdg online
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
c8t40d71s2 auto:cdsdisk SNtestdg71 testdg online
c8t40d72s2 auto:cdsdisk SNtestdg72 testdg online
c8t40d73s2 auto:cdsdisk SNtestdg73 testdg online
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Here disks starting with testdg refer to the Primary LUNs and the disks starting with SN refer to the Backup LUNs.
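
For reference, a Backup LUN can be initialized and added to the Primary Diskgroup under the SN naming convention roughly as follows (a sketch; the device name is taken from the listing above):

bash-2.03# /etc/vx/bin/vxdisksetup -i c8t40d71
bash-2.03# vxdg -g testdg adddisk SNtestdg71=c8t40d71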

Step 3:

Step 3 is to create the Backup Volumes with the same size as the Primary Volumes.
In the example shown below, one of the Primary Volumes is a 4-column stripe:

v test05 - ENABLED ACTIVE 556785664 SELECT test05-01 fsgen
pl test05-01 test05 ENABLED ACTIVE 556785664 STRIPE 4/2048 RW
sd testdg22-01 test05-01 testdg22 0 139196416 0/0 c5t16d22 ENA
sd testdg23-01 test05-01 testdg23 0 139196416 1/0 c5t16d23 ENA
sd testdg24-01 test05-01 testdg24 0 139196416 2/0 c5t16d24 ENA
sd testdg25-01 test05-01 testdg25 0 139196416 3/0 c5t16d25 ENA

While one can create the Backup Volume with any layout, it is ideal to maintain the same layout as the Primary Volume. It is important to note that the Backup Volumes are created from the Backup LUNs using the -U fsgen option. The source volumes would also ideally have been created with the -U fsgen option.

The volume SNtest05 below is going to be the Backup Volume for Primary Volume test05 (a sample vxassist command to create it follows the output). All Primary Volumes that need to be backed up using Flashsnap need Backup Volumes created in the same way.

v SNtest05 - ENABLED ACTIVE 556785664 SELECT SNtest05-01 fsgen
pl SNtest05-01 SNtest05 ENABLED ACTIVE 556785664 STRIPE 4/2048 RW
sd SNtestdg22-01 SNtest05-01 SNtestdg22 0 139196416 0/0 c8t40d22 ENA
sd SNtestdg23-01 SNtest05-01 SNtestdg23 0 139196416 1/0 c8t40d23 ENA
sd SNtestdg24-01 SNtest05-01 SNtestdg24 0 139196416 2/0 c8t40d24 ENA
sd SNtestdg25-01 SNtest05-01 SNtestdg25 0 139196416 3/0 c8t40d25 ENA
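
One way to create a Backup Volume with this layout is vxassist. The command below is a sketch based on the size and disk names shown above; the length is specified in sectors and allocation is restricted to the named Backup LUNs:

bash-2.03# vxassist -g testdg -U fsgen make SNtest05 556785664 layout=stripe ncol=4 stripeunit=2048 SNtestdg22 SNtestdg23 SNtestdg24 SNtestdg25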

Step 4:

Step 4 is to prepare the Primary and Backup Volumes for Flashsnap. This basically involves creating DCO logs for the volumes. Primary Volumes need to use one of the Primary LUNs for their DCO logs, and Backup Volumes likewise need to use one of the Backup LUNs. To meet performance requirements, it is ideal to dedicate disks solely to the DCO logs; these disks are marked reserved and are not used in any volume creation (see the vxedit example after the listing below).
In the example given below, testdg06 is the disk used for the DCO logs of all the Primary Volumes and SNtestdg71 is used for the Backup Volumes.

root@snaptest:> vxdisk -g testdg list |more
DEVICE TYPE DISK GROUP STATUS
c2t18d0s2 auto:cdsdisk testdg00 testdg online
c2t18d1s2 auto:cdsdisk testdg01 testdg online
c2t18d6s2 auto:cdsdisk testdg06 testdg online reserved
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
c8t40d71s2 auto:cdsdisk SNtestdg71 testdg online reserved
c8t40d72s2 auto:cdsdisk SNtestdg72 testdg online
c8t40d73s2 auto:cdsdisk SNtestdg73 testdg online
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
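
The dedicated DCO disks are flagged as reserved with vxedit so that vxassist will not allocate space from them unless they are named explicitly, for example:

bash-2.03# vxedit -g testdg set reserve=on testdg06
bash-2.03# vxedit -g testdg set reserve=on SNtestdg71
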
In order to prepare a volume for snap, we run the vxsnap prepare command as shown below:

Primary Volume

bash-2.03# vxsnap -g testdg prepare test05 regionsize=32k alloc=testdg06

root@snaptest:> vxprint -hrt test05
Disk group: testdg

dm testdg22 c2t18d22s2 auto 2048 139197696 -
dm testdg23 c2t18d23s2 auto 2048 139197696 -
dm testdg24 c2t18d24s2 auto 2048 139197696 -
dm testdg25 c2t18d25s2 auto 2048 139197696 -

v test05 - ENABLED ACTIVE 556785664 SELECT test05-01 fsgen
pl test05-01 test05 ENABLED ACTIVE 556785664 STRIPE 4/2048 RW
sd testdg22-01 test05-01 testdg22 0 139196416 0/0 c2t18d22 ENA
sd testdg23-01 test05-01 testdg23 0 139196416 1/0 c2t18d23 ENA
sd testdg24-01 test05-01 testdg24 0 139196416 2/0 c2t18d24 ENA
sd testdg25-01 test05-01 testdg25 0 139196416 3/0 c2t18d25 ENA
dc test05_dco test05 test05_dcl
v test05_dcl - ENABLED ACTIVE 76704 SELECT - gen
pl test05_dcl-01 test05_dcl ENABLED ACTIVE 76704 CONCAT - RW
sd testdg06-08 test05_dcl-01 testdg06 360944 76704 0 c2t18d7 ENA


Note the DCL plex test05_dcl created after running the vxsnap prepare command.

Backup Volume

bash-2.03# vxsnap -g testdg prepare SNtest05 regionsize=32k alloc=SNtestdg71

root@snaptest:> vxprint -hrt SNtest05
Disk group: testdg

dm SNtestdg22 c8t40d22s2 auto 2048 139197696 -
dm SNtestdg23 c8t40d23s2 auto 2048 139197696 -
dm SNtestdg24 c8t40d24s2 auto 2048 139197696 -
dm SNtestdg25 c8t40d25s2 auto 2048 139197696 -

v SNtest05 - ENABLED ACTIVE 556785664 SELECT SNtest05-01 fsgen
pl SNtest05-01 SNtest05 ENABLED ACTIVE 556785664 STRIPE 4/2048 RW
sd SNtestdg22-01 SNtest05-01 SNtestdg22 0 139196416 0/0 c8t40d22 ENA
sd SNtestdg23-01 SNtest05-01 SNtestdg23 0 139196416 1/0 c8t40d23 ENA
sd SNtestdg24-01 SNtest05-01 SNtestdg24 0 139196416 2/0 c8t40d24 ENA
sd SNtestdg25-01 SNtest05-01 SNtestdg25 0 139196416 3/0 c8t40d25 ENA
dc SNtest05_dco SNtest05 SNtest05_dcl
v SNtest05_dcl - ENABLED ACTIVE 76704 SELECT - gen
pl SNtest05_dcl-01 SNtest05_dcl ENABLED ACTIVE 76704 CONCAT - RW
sd SNtestdg71-01 SNtest05_dcl-01 SNtestdg71 360944 76704 0 c8t40d71 ENA


Using a regionsize of 32k allows for faster resynchronizations though the initial sync may take more time.

To verify that FastResync and instant snapshot operations have been enabled successfully for the volume, the following checks can be performed:

bash-2.03# vxprint -g testdg -F%fastresync test05
on

bash-2.03# vxprint -g testdg -F%instant test05
on


Step 5:

Step 5 is to run the initial sync, which associates the Backup Volumes with the Primary Volumes and makes a mirror copy of each Primary Volume onto its Backup Volume. It is performed using the vxsnap command.
In this example, the iosize parameter has been set to 12M, which is aggressive; the default is 1M. Running at 12M imposes a significant overhead on the system while the sync operation runs, especially if a large number of volumes are synchronized at the same time.

bash-2.03# vxsnap -g testdg -o iosize=12m make source=test05/snapvol=SNtest05

The above command creates a snap record and associates the Primary Volume with the Backup Volume. The synchronization runs automatically in the background. Progress can be monitored using vxtask list or vxsnap syncwait.
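
For example, the running synchronization tasks and their percentage complete can be listed with vxtask:

bash-2.03# vxtask -l list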

bash-2.03# vxsnap -g testdg syncwait SNtest05

The syncwait command will return to the prompt only when the snap process has completed.
A vxprint output of the Primary Volume or the Backup Volume will now show the sp record.

v test05 - ENABLED ACTIVE 556785664 SELECT test05-01 fsgen
pl test05-01 test05 ENABLED ACTIVE 556785664 STRIPE 4/2048 RW
sd testdg22-01 test05-01 testdg22 0 139196416 0/0 c2t18d22 ENA
sd testdg23-01 test05-01 testdg23 0 139196416 1/0 c2t18d23 ENA
sd testdg24-01 test05-01 testdg24 0 139196416 2/0 c2t18d24 ENA
sd testdg25-01 test05-01 testdg25 0 139196416 3/0 c2t18d25 ENA
dc test05_dco test05 test05_dcl
v test05_dcl - ENABLED ACTIVE 76704 SELECT - gen
pl test05_dcl-01 test05_dcl ENABLED ACTIVE 76704 CONCAT - RW
sd testdg06-08 test05_dcl-01 testdg06 360944 76704 0 c2t18d7 ENA
sp SNtest05_snp test05 test05_dco

Step 6:

Step 6 is to split the Primary Diskgroup and separate the Backup Volumes into a different diskgroup. Doing so allows the Backup Diskgroup (and its volumes) to be deported and then imported on the Backup/Media Server.

bash-2.03# vxdg split testdg SNtestdg SNtest05

This command moves the volume SNtest05 to a new diskgroup SNtestdg. There are multiple ways to do a split – either by specifying all the volumes in the command or by using the -o expand option:

bash-2.03# vxdg split testdg SNtestdg SNtest05 SNvol2 SNvol3 etc

or

bash-2.03# vxdg -o expand split testdg SNtestdg SNtestdg71

bash-2.03# vxdg deport SNtestdg

The -o expand option splits the Primary Diskgroup and moves all volumes that use the named disk into the SNtestdg diskgroup – in this case the disk is SNtestdg71.

Once this is complete, then a deport of SNtestdg can be performed and SNtestdg can be imported on the Media Server for backup to Tape.
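
On the media server, the import could look roughly like this (a sketch; the mount point /backup/test05 is an illustrative assumption):

bash-2.03# vxdg import SNtestdg
bash-2.03# vxvol -g SNtestdg startall
bash-2.03# fsck -F vxfs /dev/vx/rdsk/SNtestdg/SNtest05
bash-2.03# mount -F vxfs /dev/vx/dsk/SNtestdg/SNtest05 /backup/test05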

Step 7:

To perform an incremental refresh of the Backup Volumes from the Primary Volumes, the Backup Diskgroup SNtestdg needs to be imported on the primary host and joined to testdg before the refresh is performed.

bash-2.03# vxdg import SNtestdg

bash-2.03# vxdg join SNtestdg testdg

bash-2.03# vxvol -g testdg startall


bash-2.03# vxsnap -g testdg refresh SNtest05 source=test05


bash-2.03# vxsnap -g testdg syncwait SNtest05


Proper Backup of an Oracle Database – sequence

To take a proper and valid backup of an Oracle database, the sequence of operations below should be performed (a scripted sketch of the sequence follows the list). For this to succeed, the archive logs must be placed on a dedicated mount point.

  • Import and join the Backup Diskgroup to the Primary Diskgroup.
  • Put all tablespaces except the temporary tablespace into backup mode (begin backup).
  • Run a refresh of all the volumes, including the archive logs, and wait until it completes.
  • Take all tablespaces except the temporary tablespace out of backup mode (end backup).
  • Refresh the archive log volume (only) once more and wait until it completes.
  • Split the Backup Volumes into the Backup Diskgroup and deport the Backup Diskgroup.
  • Import on Media server and run backup to tape.

The above sequence can be run every day as part of the backup cycle.
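
A minimal scripted sketch of this sequence is given below. The volume, diskgroup and tablespace names are illustrative assumptions (test05/SNtest05 for data, arch01/SNarch01 for the archive logs); on Oracle 9i each tablespace is placed into and taken out of backup mode individually, so the sqlplus sections need to be expanded to cover every tablespace and the refresh commands to cover every volume in the database.

#!/bin/ksh
# Sketch only - adapt volume, diskgroup and tablespace names to the environment.

# 1. Import the Backup Diskgroup and join it back to the Primary Diskgroup
vxdg import SNtestdg
vxdg join SNtestdg testdg
vxvol -g testdg startall

# 2. Put the tablespaces (all except temporary) into hot backup mode
sqlplus -s "/ as sysdba" <<EOF
alter tablespace system begin backup;
alter tablespace users begin backup;
EOF

# 3. Refresh the data and archive log volumes and wait for completion
vxsnap -g testdg refresh SNtest05 source=test05
vxsnap -g testdg refresh SNarch01 source=arch01
vxsnap -g testdg syncwait SNtest05
vxsnap -g testdg syncwait SNarch01

# 4. Take the tablespaces out of backup mode
sqlplus -s "/ as sysdba" <<EOF
alter tablespace system end backup;
alter tablespace users end backup;
EOF

# 5. Refresh the archive log volume once more so the backup is recoverable
vxsnap -g testdg refresh SNarch01 source=arch01
vxsnap -g testdg syncwait SNarch01

# 6. Split off the Backup Volumes and deport the Backup Diskgroup
vxdg -o expand split testdg SNtestdg SNtestdg71
vxdg deport SNtestdg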

Performance Impact to the host during the initial sync and further refreshes

The full refresh of 3.2 TB took around 6 hours, which works out to an average write throughput of roughly 150 MB/sec with a peak of around 250 MB/sec. The system showed an aggregate throughput of 300-500 MB/sec (read+write). However, there was a considerable load on the system, with a run queue of around 20 for the duration of the sync. Subsequent refreshes for a database of 1.2 TB (used size) with a 50% read-write ratio (150 GB of redo per day) take around 30 minutes on average.