Tuesday 30 January 2024

How to Configure a Two-Node Serviceguard Cluster on HP-UX

 ------------------------------------------------------------------------------------------------------------------------------------------

Configuring a Two-Node Serviceguard Cluster on HP-UX

------------------------------------------------------------------------------------------------------------------------------------------

https://ahmedsharifalvi.wordpress.com/2011/07/11/configuring-a-two-node-serviceguard-cluster-part-1/

------------------------------------------------------------------------------------------------------------------------------------------

Here I will describe the configuration procedure for a basic two-node HP Serviceguard cluster on HP-UX. I will use an Oracle RDBMS as the cluster package. I will only show the configuration steps; I am not going to discuss theoretical concepts such as why you would use a cluster or what a single point of failure is. There is plenty of discussion of these topics on the internet. During the configuration I felt the lack of a well-documented, step-by-step guide, so this post is my attempt to provide one.

Hardware Configuration:


    I will use two HP 9000 series rp3440 servers, each with 2 CPUs and 4 GB of physical memory.


    Each server has four network interface cards. We will use three of them.


    For shared storage I will use an HP MSA 1000 array.


Software Configuration:


Operating system: HP-UX 11i v2 (11.23) MCOE (Mission Critical Operating Environment).


The HP-UX 11i Mission Critical Operating Environment provides all the capabilities of the base HP-UX 11i and Enterprise Operating Environments, plus certain critical add-on products for additional multi-system availability and performance management. The benefit is that you do not need to install Serviceguard and the related filesets manually; they are installed during the OS installation.
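To confirm that the Serviceguard software is already present after an MCOE installation, you can query the installed software; exact bundle names vary by release, so treat this as a quick sketch:

# swlist -l bundle | grep -i serviceguard

# cmversion        # prints the installed Serviceguard version if the product is there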


For the heartbeat I will use a point-to-point Ethernet connection.


Here are the steps:


1. Configure the hardware first. Connect all required network cables. For this example, connect two NICs on each server to the public Ethernet network and connect the heartbeat NICs with a point-to-point cable. Assign the LUNs from the storage array. The amount of shared storage required depends on the application; for this example, five or six LUNs of 50 GB each will be more than enough.
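Before going further, it is worth confirming that HP-UX sees all four NICs and that the heartbeat link is up. A minimal check (the PPA number and the remote MAC address below are placeholders for your own values):

# lanscan                          # list LAN interfaces with their hardware path, MAC and PPA

# linkloop -i 1 0x<remote-MAC>     # optional link-level test from lan1 to the other node's heartbeat NIC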


2. Install the operating system. Ensure that the shared LUNs are visible from both nodes. The output of the ioscan -fnC disk command will look something like this:


# ioscan -fnC disk

Class     I  H/W Path       Driver     S/W State   H/W Type     Description

============================================================================

disk      0  0/0/2/0.0.0.0  sdisk      CLAIMED     DEVICE       TEAC    DV-28E-N

                           /dev/dsk/c0t0d0   /dev/rdsk/c0t0d0

disk      1  0/1/1/0.0.0    sdisk      CLAIMED     DEVICE       HP 146 GST3146707LC

                           /dev/dsk/c2t0d0   /dev/rdsk/c2t0d0

disk     16  0/1/1/0.1.0    sdisk      CLAIMED     DEVICE       HP 146 GST3146707LC

                           /dev/dsk/c2t1d0   /dev/rdsk/c2t1d0

disk      2  0/2/1/0/4/0.1.0.0.0.0.1    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d1   /dev/rdsk/c4t0d1

disk      4  0/2/1/0/4/0.1.0.0.0.0.2    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d2   /dev/rdsk/c4t0d2

disk      6  0/2/1/0/4/0.1.0.0.0.0.3    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d3   /dev/rdsk/c4t0d3

disk      8  0/2/1/0/4/0.1.0.0.0.0.4    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d4   /dev/rdsk/c4t0d4

disk     10  0/2/1/0/4/0.1.0.0.0.0.5    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d5   /dev/rdsk/c4t0d5
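If some of the MSA LUNs are missing on one node, rescanning the I/O tree and recreating the device special files usually brings them in (standard HP-UX 11.23 commands; run them on the node that cannot see the devices):

# ioscan -fnC disk     # probe the hardware paths again

# insf -e              # reinstall special files for newly discovered devices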


3. Assign the IP addresses on both nodes and make sure the nodes can ping each other on both the public interface IPs and the heartbeat IPs. The /etc/rc.config.d/netconf file will look something like this:


#cat /etc/rc.config.d/netconf

HOSTNAME="node01"

OPERATING_SYSTEM=HP-UX

LOOPBACK_ADDRESS=127.0.0.1

INTERFACE_NAME[0]="lan0"

IP_ADDRESS[0]="10.10.96.162"

SUBNET_MASK[0]="255.255.255.0"

BROADCAST_ADDRESS[0]=""

INTERFACE_STATE[0]=""

DHCP_ENABLE[0]=0

INTERFACE_MODULES[0]=""


INTERFACE_NAME[1]="lan1"

IP_ADDRESS[1]="1.1.1.34"

SUBNET_MASK[1]="255.255.255.0"

BROADCAST_ADDRESS[1]=""

INTERFACE_STATE[1]=""

DHCP_ENABLE[1]=0

INTERFACE_MODULES[1]=""


ROUTE_DESTINATION[0]=default

ROUTE_MASK[0]=""

ROUTE_GATEWAY[0]="10.10.96.1"

ROUTE_COUNT[0]=""


GATED=0

GATED_ARGS=""

RDPD=0

RARP=0


DEFAULT_INTERFACE_MODULES=""
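With networking up on both nodes, a quick sanity check from node01 (the addresses are the ones used in this example; stop ping with Ctrl-C):

# ping 10.10.96.164    # public IP of node02

# ping 1.1.1.24        # heartbeat IP of node02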


4. Add entries for all IP addresses (public and heartbeat) to the /etc/hosts file. Make sure that the /etc/hosts file is identical on both nodes. The following is a sample file:


# cat /etc/hosts

10.10.96.162   node01

10.10.96.164   node02

10.10.96.163   crm_db        #service IP / package IP

1.1.1.34         node01hb

1.1.1.24         node02hb

127.0.0.1       localhost       loopback


5. Create a VG to use as the lock VG. It will act as the quorum device. Here is the procedure:


Create the physical volume (PV) first:


# pvcreate -f /dev/rdsk/c4t0d5

Physical volume "/dev/rdsk/c4t0d5" has been successfully created.


Note that you must use the raw device file (/dev/rdsk/...) when creating the PV, and the block device file (/dev/dsk/...) when creating the VG.


Create the group file for the volume group:


# mkdir /dev/vglock

# mknod /dev/vglock/group c 64 0x010000


The minor number (0x010000) of the group file must be unique across all VGs. To check the minor numbers of the existing group files, use 'ls -l /dev/*/group'.


Now create the volume group named vglock. This VG will be used as the lock VG:


# vgcreate -s 32 vglock /dev/dsk/c4t0d5

Increased the number of physical extents per physical volume to 6399.

Volume group "/dev/vglock" has been successfully created.

Volume Group configuration for /dev/vglock has been saved in /etc/lvmconf/vglock.conf


Deactivate the VG on node01, export the VG information to a map file, and transfer the file to the second node (node02 here) using the following commands:


# vgchange -a n vglock

Volume group "vglock" has been successfully changed.


# vgexport -p -v -s -m /etc/lvmconf/vglock.map vglock

Beginning the export process on Volume Group "vglock".

/dev/dsk/c4t0d5


# rcp /etc/lvmconf/vglock.map node02:/etc/lvmconf/


If you do not deactivate the VG before exporting it, you will get a warning, which you can ignore.


Use the following commands to create the required group file on the second node and import the lock VG using the map file that was transferred from the first node. The minor number of a VG must be the same on all cluster nodes.


# mkdir /dev/vglock


# mknod /dev/vglock/group c 64 0x010000


# vgimport -s -v -m /etc/lvmconf/vglock.map vglock

Beginning the import process on Volume Group "vglock".

Volume group "/dev/vglock" has been successfully created.


6. Now you are ready to create the cluster. Execute the following commands to generate an ASCII cluster configuration file:


# cd /etc/cmcluster

# cmquerycl -v -n node01 -n node02 -C crmcluster.ascii

Looking for other clusters ... Done

Gathering storage information

Found 23 devices on node node01

Found 23 devices on node node02

Analysis of 46 devices should take approximately 5 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Found 2 volume groups on node node01

Found 2 volume groups on node node02

Analysis of 4 volume groups should take approximately 1 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Note: Disks were discovered which are not in use by either LVM or VxVM.

      Use pvcreate(1M) to initialize a disk for LVM or,

      use vxdiskadm(1M) to initialize a disk for VxVM.

Volume group /dev/vglock is configured differently on node node01 than on node node02

Volume group /dev/vglock is configured differently on node node02 than on node node01

Gathering network information

Beginning network probing

Completed network probing


Node Names:   node01

              node02


Bridged networks (local node information only - full probing was not performed):


1       lan0           (node01)


2       lan1           (node01)


5       lan0           (node02)


6       lan1           (node02)


IP subnets:


IPv4:


10.10.96.0         lan0      (node01)

                   lan0      (node02)


1.1.1.0             lan1      (node01)

                    lan1      (node02)


IPv6:


Possible Heartbeat IPs:


10.10.96.0                        10.10.96.162        (node01)

                                  10.10.96.164        (node02)


1.1.1.0                            1.1.1.34            (node01)

                                   1.1.1.24            (node02)


Possible Cluster Lock Devices:


/dev/dsk/c4t0d5    /dev/vglock          66 seconds


LVM volume groups:


/dev/vg00               node01


/dev/vglock             node01

                        node02


/dev/vg00               node02


LVM physical volumes:


/dev/vg00

/dev/dsk/c2t1d0    0/1/1/0.1.0                   node01


/dev/vglock

/dev/dsk/c5t0d7    0/2/1/0/4/0.1.0.255.0.0.7     node01


/dev/dsk/c4t0d7    0/2/1/0/4/0.1.0.0.0.0.7       node02

/dev/dsk/c5t0d7    0/2/1/0/4/0.1.0.255.0.0.7     node02


/dev/vg00

/dev/dsk/c2t1d0    0/1/1/0.1.0                   node02


LVM logical volumes:


Volume groups on node01:

/dev/vg00/lvol1                           FS MOUNTED   /stand

/dev/vg00/lvol2

/dev/vg00/lvol3                           FS MOUNTED   /

/dev/vg00/lvol4                           FS MOUNTED   /home

/dev/vg00/lvol5                           FS MOUNTED   /tmp

/dev/vg00/lvol6                           FS MOUNTED   /opt

/dev/vg00/lvol7                           FS MOUNTED   /usr

/dev/vg00/lvol8                           FS MOUNTED   /var


Volume groups on node02:

/dev/vg00/lvol1                           FS MOUNTED   /stand

/dev/vg00/lvol2

/dev/vg00/lvol3                           FS MOUNTED   /

/dev/vg00/lvol4                           FS MOUNTED   /home

/dev/vg00/lvol5                           FS MOUNTED   /tmp

/dev/vg00/lvol6                           FS MOUNTED   /opt

/dev/vg00/lvol7                           FS MOUNTED   /usr

/dev/vg00/lvol8                           FS MOUNTED   /var


Writing cluster data to crmcluster.ascii.


The above command builds the /etc/cmcluster/crmcluster.ascii file. This file defines the nodes, the disks, the LAN cards, and any other resources that will be part of the cluster. You can edit this file and change various cluster parameters such as HEARTBEAT_INTERVAL and NODE_TIMEOUT. Here is the content of the file:


# cat crmcluster.ascii 


CLUSTER_NAME            crmcluster


FIRST_CLUSTER_LOCK_VG           /dev/vglock


NODE_NAME               node01

  NETWORK_INTERFACE     lan0

    HEARTBEAT_IP        10.10.96.162

  NETWORK_INTERFACE     lan1

    HEARTBEAT_IP        1.1.1.34

  FIRST_CLUSTER_LOCK_PV /dev/dsk/c5t0d7


NODE_NAME               node02

  NETWORK_INTERFACE     lan0

    HEARTBEAT_IP        10.10.96.164

  NETWORK_INTERFACE     lan1

    HEARTBEAT_IP        1.1.1.24

  FIRST_CLUSTER_LOCK_PV /dev/dsk/c4t0d7


HEARTBEAT_INTERVAL           1000000

NODE_TIMEOUT                 2000000

AUTO_START_TIMEOUT           600000000

NETWORK_POLLING_INTERVAL     2000000

NETWORK_FAILURE_DETECTION    INOUT

MAX_CONFIGURED_PACKAGES      150

VOLUME_GROUP                 /dev/vglock


Once the cluster configuration file has been edited, use the cmcheckconf command to check the file for errors:


# cd /etc/cmcluster

# cmcheckconf -v -C crmcluster.ascii

Checking cluster file: crmcluster.ascii

Note : a NODE_TIMEOUT value of 2000000 was found in line 127. For a

significant portion of installations, a higher setting is more appropriate.

Refer to the comments in the cluster configuration ascii file or Serviceguard

manual for more information on this parameter.

Checking nodes ... Done

Checking existing configuration ... Done

Gathering storage information

Found 2 devices on node node01

Found 3 devices on node node02

Analysis of 5 devices should take approximately 1 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Found 2 volume groups on node node01

Found 2 volume groups on node node02

Analysis of 4 volume groups should take approximately 1 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Volume group /dev/vglock is configured differently on node node01 than on node node02

Volume group /dev/vglock is configured differently on node node02 than on node node01

Gathering network information

Beginning network probing (this may take a while)

Completed network probing

Checking for inconsistencies

Adding node node01 to cluster crmcluster

Adding node node02 to cluster crmcluster

cmcheckconf: Verification completed with no errors found.

Use the cmapplyconf command to apply the configuration.


After the file has been verified to contain no errors, use the cmapplyconf command to create and distribute the cluster binary file:


# cmapplyconf -v -C crmcluster.ascii

Checking cluster file: crmcluster.ascii

Note : a NODE_TIMEOUT value of 2000000 was found in line 127. For a

significant portion of installations, a higher setting is more appropriate.

Refer to the comments in the cluster configuration ascii file or Serviceguard

manual for more information on this parameter.

Checking nodes ... Done

Checking existing configuration ... Done

Node node01 is refusing Serviceguard communication.

Please make sure that the proper security access is configured on node

node01 through either file-based access (pre-A.11.16 version) or role-based

access (version A.11.16 or higher) and/or that the host name lookup

on node node01 resolves the IP address correctly.

cmapplyconf: Failed to gather configuration information


As you can see from the above output, the command failed to run successfully. Check the output carefully: it says there is a problem with name resolution. I was stuck at this point for a significant amount of time. Then I found that there was no /etc/nsswitch.conf file, although nsswitch.files, nsswitch.dns and nsswitch.nis were present in the /etc directory. So I simply used /etc/nsswitch.files as /etc/nsswitch.conf. You can use either of the following commands:


# cp /etc/nsswitch.files /etc/nsswitch.conf


or


# mv /etc/nsswitch.files /etc/nsswitch.conf
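For reference, the nsswitch.files template resolves every service through local files only, so after this step the hosts line in /etc/nsswitch.conf will look like the following (append dns at the end if you also want DNS lookups):

hosts: files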


Then run cmapplyconf again and press y when it asks for confirmation:


# cd /etc/cmcluster

# cmapplyconf -v -C crmcluster.ascii

Checking cluster file: crmcluster.ascii

Note : a NODE_TIMEOUT value of 2000000 was found in line 127. For a

significant portion of installations, a higher setting is more appropriate.

Refer to the comments in the cluster configuration ascii file or Serviceguard

manual for more information on this parameter.

Checking nodes ... Done

Checking existing configuration ... Done

Volume group /dev/vglock is configured differently on node node01 than on node node02

Volume group /dev/vglock is configured differently on node node02 than on node node01

Checking for inconsistencies

Modifying configuration on node node01

Modifying configuration on node node02

Modifying node node01 in cluster crmcluster

Modifying node node02 in cluster crmcluster


Modify the cluster configuration ([y]/n)? y

Marking/unmarking volume groups for use in the cluster

Completed the cluster creation


Now you can start the cluster using the cmruncl command:


# cmruncl -v

cmruncl: Validating network configuration...

cmruncl: Network validation complete

Waiting for cluster to form ..... done

Cluster successfully formed.

Check the syslog files on all nodes in the cluster to verify that no warnings occurred during startup.
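As the output suggests, you can watch the system log on each node while the cluster forms (standard HP-UX syslog location):

# tail -f /var/adm/syslog/syslog.log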


To check the cluster status, use the cmviewcl command:


# cmviewcl


CLUSTER        STATUS

crmcluster    up


  NODE           STATUS       STATE

  node01        up           running

  node02        up           running

HP-UX Cluster Part 2

------------------------------------------------------------------------------------------------------------------------------------------------------------------

https://ahmedsharifalvi.wordpress.com/2011/08/01/configuring-a-two-node-serviceguard-cluster-part-2/

------------------------------------------------------------------------------------------------------------------------------------------------------------------

PKG Configuration

------------------------------------------------------------------------------------------------------------------------------------------------------------------

# ls /dev/rdsk/c0t2d0

/dev/rdsk/c0t2d0

Physical Volume (PV) Creation:

# pvcreate -f /dev/rdsk/c0t2d0

Physical volume "/dev/rdsk/c0t2d0" has been successfully created.

# ls -l /dev/*/group

crw-r-----   1 root       sys         64 0x000000 Nov 12 16:06 /dev/vg00/group

crw-rw-rw-   1 root       sys         64 0x010000 Dec 16 17:43 /dev/vglock/group


# mkdir /dev/vgappl

# mknod /dev/vgappl/group c 64 0x020000


# vgcreate -s 32 vgappl /dev/dsk/c0t2d0

Increased the number of physical extents per physical volume to 1119.

Volume group "/dev/vgappl" has been successfully created.

Volume Group configuration for /dev/vgappl has been saved in /etc/lvmconf/vgappl.conf


# lvcreate -L 200 -n lvappl /dev/vgappl

Warning: rounding up logical volume size to extent boundary at size "224" MB.

Logical volume "/dev/vgappl/lvappl" has been successfully created with

character device "/dev/vgappl/rlvappl".

Logical volume "/dev/vgappl/lvappl" has been successfully extended.

Volume Group configuration for /dev/vgappl has been saved in /etc/lvmconf/vgappl.conf


# lvcreate -L 200 -n lvapp2 /dev/vgappl

Warning: rounding up logical volume size to extent boundary at size "224" MB.

Logical volume "/dev/vgappl/lvapp2" has been successfully created with

character device "/dev/vgappl/rlvapp2".

Logical volume "/dev/vgappl/lvapp2" has been successfully extended.

Volume Group configuration for /dev/vgappl has been saved in /etc/lvmconf/vgappl.conf


Create file system / mount point:


# newfs -F vxfs -o largefiles /dev/vgappl/rlvappl

    version 7 layout

    229376 sectors, 229376 blocks of size 1024, log size 1024 blocks

    largefiles supported


# newfs -F vxfs -o largefiles /dev/vgappl/rlvapp2

    version 7 layout

    229376 sectors, 229376 blocks of size 1024, log size 1024 blocks

    largefiles supported


# mkdir /app1 /app2     # create the mount points if they do not exist yet

# mount /dev/vgappl/lvappl /app1

# mount /dev/vgappl/lvapp2 /app2


# umount /app1

# umount /app2


# vgchange -a n vgappl     # deactivate the volume group

Volume group "vgappl" has been successfully changed.


# vgexport -p -v -s -m /etc/lvmconf/vgappl.map vgappl

Beginning the export process on Volume Group "vgappl".

/dev/dsk/c0t2d0

vgexport: Preview of vgexport on volume group "vgappl" succeeded.


# scp /etc/lvmconf/vgappl.map lodii018v:/etc/lvmconf/


On 2nd node:


# mkdir /dev/vgappl

# mknod /dev/vgappl/group c 64 0x020000

# ls -l  /dev/*/group

crw-r-----   1 root       sys         64 0x000000 Oct 22 11:13 /dev/vg00/group

crw-rw-rw-   1 root       sys         64 0x020000 Jan  8 07:34 /dev/vgappl/group

crw-rw-rw-   1 root       sys         64 0x010000 Dec 16 17:51 /dev/vglock/group


# vgimport   -v -s -m /etc/lvmconf/vgappl.map vgappl

Beginning the import process on Volume Group "vgappl".

Logical volume "/dev/vgappl/lvappl" has been successfully created

with minor number 1.

Logical volume "/dev/vgappl/lvapp2" has been successfully created

with minor number 2.

vgimport: Volume group "/dev/vgappl" has been successfully created.

Warning: A backup of this volume group may not exist on this machine.

Please remember to take a backup using the vgcfgbackup command after activating the volume group.
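As the message suggests, activate the imported VG and take an LVM configuration backup before using it (a short sketch; vgcfgbackup saves the configuration to /etc/lvmconf/vgappl.conf by default):

# vgchange -a y vgappl    # activate the imported volume group

# vgcfgbackup vgappl      # back up its LVM configuration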


# mkdir /app1

#  mkdir /app2


# mount /dev/vgappl/lvappl  /app1   # if the mount fails because the VG is not active, activate it first

# vgchange -a y vgappl

# mount /dev/vgappl/lvapp2  /app2

--------------------------------------------------------------------------------------------------------------

Configure pkg1 on the primary node

--------------------------------------------------------------------------------------------------------------

# mkdir /etc/cmcluster/nfs

# cd /etc/cmcluster/nfs

# cmmakepkg -m sg/failover -m sg/filesystem -m sg/service pkg1.ascii

# ls

pkg1.ascii



The next step is to edit and customize the file with the vi editor.

After editing, the file will look something like this:


# vi /etc/cmcluster/nfs/pkg1.ascii

PACKAGE_NAME                    pkg1

PACKAGE_TYPE                    FAILOVER

FAILOVER_POLICY               CONFIGURED_NODE

FAILBACK_POLICY               MANUAL


NODE_NAME                       lodii018q

NODE_NAME                       lodii018v


AUTO_RUN                        YES

LOCAL_LAN_FAILOVER_ALLOWED      YES

NODE_FAIL_FAST_ENABLED          NO


#RUN_SCRIPT              /etc/cmcluster/nfs/pkg1.ctrl

RUN_SCRIPT_TIMEOUT              NO_TIMEOUT

#HALT_SCRIPT             /etc/cmcluster/nfs/pkg1.ctrl

HALT_SCRIPT_TIMEOUT             NO_TIMEOUT


#SERVICE_NAME                   pkg1

#SERVICE_FAIL_FAST_ENABLED      NO

#SERVICE_HALT_TIMEOUT           300


SUBNET 10.130.44.0


fs_name /dev/vgappl/lvappl

fs_server ""

fs_directory /app1

fs_type vxfs

fs_mount_opt "-o llock"

#fs_umount_opt 

#fs_fsck_opt
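Depending on your Serviceguard version and the modules included by cmmakepkg, the package usually also needs the volume group listed so that it is activated when the package starts. If your template contains a vg parameter, that is a single extra line (shown here as an assumption; adjust to your template):

vg vgappl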


# cmcheckconf -v -P pkg1.ascii

Begin package verification...

Checking existing configuration ... Done

Attempting to add package pkg1.

Validating package pkg1 via /etc/cmcluster/scripts/mscripts/master_control_script.sh ...

Waiting for up to 1200 seconds for the validation.

Validation for package pkg1 succeeded via /etc/cmcluster/scripts/mscripts/master_control_script.sh.

Maximum configured packages parameter is 300.

Configuring 1 package(s).

Adding the package configuration for package pkg1.

cmcheckconf: Verification completed with no errors found.

Use the cmapplyconf command to apply the configuration


Apply the package configuration file.


# cmapplyconf -v -P pkg1.ascii

Begin package verification...

Checking existing configuration ... Done

Attempting to add package pkg1.

Validating package pkg1 via /etc/cmcluster/scripts/mscripts/master_control_script.sh ...

Waiting for up to 1200 seconds for the validation.

Validation for package pkg1 succeeded via /etc/cmcluster/scripts/mscripts/master_control_script.sh.

Maximum configured packages parameter is 300.

Configuring 1 package(s).

Adding the package configuration for package pkg1.

Modify the package configuration ([y]/n)? y


Start the cluster if it is not already running, then run the package:


# cmrunpkg pkg1

Running package pkg1 on node lodii018q

Successfully started package pkg1 on node lodii018q

cmrunpkg: All specified packages are running


# cmviewcl -p pkg1


PACKAGE        STATUS       STATE        AUTO_RUN     NODE

pkg1           up           running      enabled      lodii018q



# cmviewcl -v -f line -p pkg1

---------------------------------------------------------------------Enjoy----------------------------------------------------------------------------------


---------------------------------------------

Basic cluster commands

-----------------------------------------------

Serviceguard Commands


    cmruncl -v #start entire cluster    --> working 

    cmhaltcl # stop entire cluster         --> working

    cmviewcl # check status of cluster     --> working

    cmrunnode -v nodename  #start a single node  --> working

    cmhaltnode -f -v nodename  # stop a node        --> working

    cmgetconf -C config_name # get current configuration  --> working

    cmrunpkg -n nodename package_name  # start package on node    -->  working

    cmmodpkg -e package_name # enable switching -->  working

    cmhaltpkg package_name #stop package                    --> working
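A minimal failover test, assuming the cluster and pkg1 are up (package and node names are the ones from this example):

    cmhaltpkg pkg1                # stop the package on its current node

    cmrunpkg -n lodii018v pkg1    # start it manually on the adoptive node

    cmmodpkg -e pkg1              # re-enable automatic switching for the package

    cmviewcl -v -p pkg1           # verify where the package is now running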

