Tuesday, 30 January 2024

How to Configure a Two-Node Serviceguard Cluster on HP-UX

------------------------------------------------------------------------------------------------------------------------------------------

https://ahmedsharifalvi.wordpress.com/2011/07/11/configuring-a-two-node-serviceguard-cluster-part-1/

------------------------------------------------------------------------------------------------------------------------------------------

This post describes how to configure a basic two-node HP Serviceguard cluster on HP-UX, with Oracle RDBMS as the cluster package. I will show only the configuration steps and will not discuss theory, such as why you would use a cluster or what a single point of failure is; there is plenty of material on those topics on the internet. While configuring my own cluster I could not find a well-documented, step-by-step guide, so I will try to make this one.

Hardware Configuration:


    I will use two HP 9000 series RP3440 servers, each with 2 CPUs and 4 GB of physical memory.


    Each server has four network interface cards (NICs). We will use three of them.


    For shared storage I will use an HP MSA 1000.


Software Configuration:


The operating system is HP-UX 11.23 MCOE (Mission Critical Operating Environment).


The HP-UX 11i Mission Critical Operating Environment provides all the capabilities of the base HP-UX 11i and Enterprise Operating Environments, plus certain critical add-on products for multi-system availability and performance management. The benefit is that you do not need to install Serviceguard and the related filesets manually; they are installed during the OS installation.


For the heartbeat I will use a point-to-point Ethernet connection.


Here are the steps:


1. Configure the hardware first. Connect all required network cables. For this example, connect two NICs on each server to the public Ethernet network and connect the heartbeat NICs to each other with a point-to-point cable. Assign the LUNs from the storage. The amount of shared storage required depends on the application. For this example, five or six LUNs of 50 GB each will be more than enough.


2. Install the operating system. Ensure that the shared LUNs are visible from both nodes. The output of the ioscan -fnC disk command will look something like this:


# ioscan -fnC disk

Class     I  H/W Path       Driver     S/W State   H/W Type     Description

============================================================================

disk      0  0/0/2/0.0.0.0  sdisk      CLAIMED     DEVICE       TEAC    DV-28E-N

                           /dev/dsk/c0t0d0   /dev/rdsk/c0t0d0

disk      1  0/1/1/0.0.0    sdisk      CLAIMED     DEVICE       HP 146 GST3146707LC

                           /dev/dsk/c2t0d0   /dev/rdsk/c2t0d0

disk     16  0/1/1/0.1.0    sdisk      CLAIMED     DEVICE       HP 146 GST3146707LC

                           /dev/dsk/c2t1d0   /dev/rdsk/c2t1d0

disk      2  0/2/1/0/4/0.1.0.0.0.0.1    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d1   /dev/rdsk/c4t0d1

disk      4  0/2/1/0/4/0.1.0.0.0.0.2    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d2   /dev/rdsk/c4t0d2

disk      6  0/2/1/0/4/0.1.0.0.0.0.3    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d3   /dev/rdsk/c4t0d3

disk      8  0/2/1/0/4/0.1.0.0.0.0.4    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d4   /dev/rdsk/c4t0d4

disk     10  0/2/1/0/4/0.1.0.0.0.0.5    sdisk      CLAIMED     DEVICE  HP MSA VOLUME

                           /dev/dsk/c4t0d5   /dev/rdsk/c4t0d5
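
To confirm that both nodes see the same shared LUNs, it can help to extract just the MSA device files from the ioscan output and compare the count on each node. The following is a minimal sketch; the function name list_msa_disks is my own, and it assumes the output layout shown above (a "disk" line followed by a device file line).

```shell
#!/bin/sh
# list_msa_disks: read `ioscan -fnC disk` output on stdin and print the block
# device file of every disk whose description contains "MSA VOLUME".
# (Hypothetical helper, not part of HP-UX or Serviceguard.)
list_msa_disks() {
    awk '
        # a line starting with "disk" describes a new device; remember
        # whether it is an MSA volume
        /^disk/ { want = ($0 ~ /MSA VOLUME/); next }
        # the device file line that follows lists /dev/dsk/... first
        want && $1 ~ /^\/dev\/dsk\// { print $1; want = 0 }
    '
}

# Example (run on each node and compare the counts):
#   ioscan -fnC disk | list_msa_disks | wc -l
```

Device file names may differ between nodes even for the same LUN, so comparing the number of LUNs (or the hardware paths) is usually more meaningful than comparing names.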


3. Assign the IP addresses on both nodes and make sure each node can ping the other using both the public IP address and the heartbeat IP address. The /etc/rc.config.d/netconf file will look something like this:


# cat /etc/rc.config.d/netconf

HOSTNAME="node01"

OPERATING_SYSTEM=HP-UX

LOOPBACK_ADDRESS=127.0.0.1

INTERFACE_NAME[0]="lan0"

IP_ADDRESS[0]="10.10.96.162"

SUBNET_MASK[0]="255.255.255.0"

BROADCAST_ADDRESS[0]=""

INTERFACE_STATE[0]=""

DHCP_ENABLE[0]=0

INTERFACE_MODULES[0]=""


INTERFACE_NAME[1]="lan1"

IP_ADDRESS[1]="1.1.1.34"

SUBNET_MASK[1]="255.255.255.0"

BROADCAST_ADDRESS[1]=""

INTERFACE_STATE[1]=""

DHCP_ENABLE[1]=0

INTERFACE_MODULES[1]=""


ROUTE_DESTINATION[0]=default

ROUTE_MASK[0]=""

ROUTE_GATEWAY[0]="10.10.96.1"

ROUTE_COUNT[0]=""


GATED=0

GATED_ARGS=""

RDPD=0

RARP=0


DEFAULT_INTERFACE_MODULES=""


4. Add all IP addresses (public and heartbeat) to the /etc/hosts file. Make sure that the /etc/hosts files on both nodes are identical. The following is a sample file:


# cat /etc/hosts

10.10.96.162   node01

10.10.96.164   node02

10.10.96.163   crm_db        #service IP / package IP

1.1.1.34         node01hb

1.1.1.24         node02hb

127.0.0.1       localhost       loopback
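
One way to check that the two /etc/hosts files really carry the same entries is to compare them after stripping comments and extra whitespace. This is only a sketch: the function names normalize_hosts and hosts_match and the path /tmp/hosts.node02 are assumptions made for illustration.

```shell
#!/bin/sh
# normalize_hosts FILE: drop comments and blank lines, squeeze whitespace
# to single spaces, and sort, so that formatting differences are ignored.
normalize_hosts() {
    sed -e 's/#.*$//' "$1" | awk 'NF { $1 = $1; print }' | sort
}

# hosts_match FILE_A FILE_B: succeed (exit status 0) when both files
# contain the same set of entries.
hosts_match() {
    a=$(normalize_hosts "$1")
    b=$(normalize_hosts "$2")
    [ "$a" = "$b" ]
}

# Example, run on node01 (the copy path is hypothetical):
#   rcp node02:/etc/hosts /tmp/hosts.node02
#   hosts_match /etc/hosts /tmp/hosts.node02 && echo "hosts files match"
```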


5. Create a volume group (VG) to use as the cluster lock VG. It will act as the quorum device. Here is the procedure:


Create the physical volume (PV) first:


# pvcreate -f /dev/rdsk/c4t0d5

Physical volume "/dev/rdsk/c4t0d5" has been successfully created.


Note that you must specify the raw (character) device file (/dev/rdsk/..) when creating the PV, and the block device file (/dev/dsk/..) when creating the VG.


Create the group file for the volume group:


# mkdir /dev/vglock

# mknod /dev/vglock/group c 64 0x010000


The minor number (0x010000) of the group file must be unique among all VGs on the node. To check the minor numbers of the existing group files, use 'ls -l /dev/*/group'.
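
Picking an unused minor number by eye is easy with two VGs but error-prone with many. The sketch below reads the 'ls -l /dev/*/group' listing and prints the lowest free value of the form 0xNN0000. The function name next_free_minor is hypothetical, and the sketch assumes the listing format shown in this post.

```shell
#!/bin/sh
# next_free_minor: read `ls -l /dev/*/group` output on stdin and print the
# lowest minor number of the form 0xNN0000 that is not yet in use.
# (Hypothetical helper; it only inspects the text it is given.)
next_free_minor() {
    awk '
        # collect every field that looks like a group-file minor number
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^0x[0-9a-fA-F][0-9a-fA-F]0000$/)
                    used[tolower($i)] = 1
        }
        END {
            for (n = 0; n < 256; n++) {
                candidate = sprintf("0x%02x0000", n)
                if (!(candidate in used)) { print candidate; exit }
            }
        }
    '
}

# Example:
#   ls -l /dev/*/group | next_free_minor
```

Remember that the value chosen must also be free on the other node, because the same minor number is used there when the VG is imported.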


Now create the volume group named vglock. This VG will be used as lock VG:


# vgcreate -s 32 vglock /dev/dsk/c4t0d5

Increased the number of physical extents per physical volume to 6399.

Volume group "/dev/vglock" has been successfully created.

Volume Group configuration for /dev/vglock has been saved in /etc/lvmconf/vglock.conf


Deactivate the VG on node01, export the VG information to a map file, and transfer the file to the second node (node02 here) using the following commands:


# vgchange -a n vglock

Volume group "vglock" has been successfully changed.


# vgexport -p -v -s -m /etc/lvmconf/vglock.map vglock

Beginning the export process on Volume Group "vglock".

/dev/dsk/c4t0d5


# rcp /etc/lvmconf/vglock.map node02:/etc/lvmconf/


If you do not deactivate the VG before exporting, you will get a warning, which you can ignore.


Use the following commands to create the required group file on the second node and import the lock VG using the map file transferred from the first node. The minor number of a VG must be the same on all cluster nodes.


# mkdir /dev/vglock


# mknod /dev/vglock/group c 64 0x010000


# vgimport -s -v -m /etc/lvmconf/vglock.map vglock

Beginning the import process on Volume Group "vglock".

Volume group "/dev/vglock" has been successfully created.


6. Now you are ready to create the cluster. Run the following command to create an ASCII cluster configuration file:


# cd /etc/cmcluster

# cmquerycl -v -n node01 -n node02 -C crmcluster.ascii

Looking for other clusters ... Done

Gathering storage information

Found 23 devices on node node01

Found 23 devices on node node02

Analysis of 46 devices should take approximately 5 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Found 2 volume groups on node node01

Found 2 volume groups on node node02

Analysis of 4 volume groups should take approximately 1 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Note: Disks were discovered which are not in use by either LVM or VxVM.

      Use pvcreate(1M) to initialize a disk for LVM or,

      use vxdiskadm(1M) to initialize a disk for VxVM.

Volume group /dev/vglock is configured differently on node node01 than on node node02

Volume group /dev/vglock is configured differently on node node02 than on node node01

Gathering network information

Beginning network probing

Completed network probing


Node Names:   node01

              node02


Bridged networks (local node information only - full probing was not performed):


1       lan0           (node01)


2       lan1           (node01)


5       lan0           (node02)


6       lan1           (node02)


IP subnets:


IPv4:


10.10.96.0         lan0      (node01)

                   lan0      (node02)


1.1.1.0             lan1      (node01)

                    lan1      (node02)


IPv6:


Possible Heartbeat IPs:


10.10.96.0                        10.10.96.162        (node01)

                                  10.10.96.164        (node02)


1.1.1.0                            1.1.1.34            (node01)

                                   1.1.1.24            (node02)


Possible Cluster Lock Devices:


/dev/dsk/c4t0d5    /dev/vglock          66 seconds


LVM volume groups:


/dev/vg00               node01


/dev/vglock             node01

                        node02


/dev/vg00               node02


LVM physical volumes:


/dev/vg00

/dev/dsk/c2t1d0    0/1/1/0.1.0                   node01


/dev/vglock

/dev/dsk/c5t0d7    0/2/1/0/4/0.1.0.255.0.0.7     node01


/dev/dsk/c4t0d7    0/2/1/0/4/0.1.0.0.0.0.7       node02

/dev/dsk/c5t0d7    0/2/1/0/4/0.1.0.255.0.0.7     node02


/dev/vg00

/dev/dsk/c2t1d0    0/1/1/0.1.0                   node02


LVM logical volumes:


Volume groups on node01:

/dev/vg00/lvol1                           FS MOUNTED   /stand

/dev/vg00/lvol2

/dev/vg00/lvol3                           FS MOUNTED   /

/dev/vg00/lvol4                           FS MOUNTED   /home

/dev/vg00/lvol5                           FS MOUNTED   /tmp

/dev/vg00/lvol6                           FS MOUNTED   /opt

/dev/vg00/lvol7                           FS MOUNTED   /usr

/dev/vg00/lvol8                           FS MOUNTED   /var


Volume groups on node02:

/dev/vg00/lvol1                           FS MOUNTED   /stand

/dev/vg00/lvol2

/dev/vg00/lvol3                           FS MOUNTED   /

/dev/vg00/lvol4                           FS MOUNTED   /home

/dev/vg00/lvol5                           FS MOUNTED   /tmp

/dev/vg00/lvol6                           FS MOUNTED   /opt

/dev/vg00/lvol7                           FS MOUNTED   /usr

/dev/vg00/lvol8                           FS MOUNTED   /var


Writing cluster data to crmcluster.ascii.


The above command builds the /etc/cmcluster/crmcluster.ascii file. This file defines the nodes, the disks, the LAN cards, and any other resources that are part of the cluster. You can edit it to change cluster parameters such as HEARTBEAT_INTERVAL and NODE_TIMEOUT. Here is the content of the file:


# cat crmcluster.ascii 


CLUSTER_NAME            crmcluster


FIRST_CLUSTER_LOCK_VG           /dev/vglock


NODE_NAME               node01

  NETWORK_INTERFACE     lan0

    HEARTBEAT_IP        10.10.96.162

  NETWORK_INTERFACE     lan1

    HEARTBEAT_IP        1.1.1.34

  FIRST_CLUSTER_LOCK_PV /dev/dsk/c5t0d7


NODE_NAME               node02

  NETWORK_INTERFACE     lan0

    HEARTBEAT_IP        10.10.96.164

  NETWORK_INTERFACE     lan1

    HEARTBEAT_IP        1.1.1.24

  FIRST_CLUSTER_LOCK_PV /dev/dsk/c4t0d7


HEARTBEAT_INTERVAL           1000000

NODE_TIMEOUT                 2000000

AUTO_START_TIMEOUT           600000000

NETWORK_POLLING_INTERVAL     2000000

NETWORK_FAILURE_DETECTION    INOUT

MAX_CONFIGURED_PACKAGES      150

VOLUME_GROUP                 /dev/vglock


Once the cluster configuration file is edited, you need to use the cmcheckconf command to check the file for errors:


# cd /etc/cmcluster

# cmcheckconf -v -C crmcluster.ascii

Checking cluster file: crmcluster.ascii

Note : a NODE_TIMEOUT value of 2000000 was found in line 127. For a

significant portion of installations, a higher setting is more appropriate.

Refer to the comments in the cluster configuration ascii file or Serviceguard

manual for more information on this parameter.

Checking nodes ... Done

Checking existing configuration ... Done

Gathering storage information

Found 2 devices on node node01

Found 3 devices on node node02

Analysis of 5 devices should take approximately 1 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Found 2 volume groups on node node01

Found 2 volume groups on node node02

Analysis of 4 volume groups should take approximately 1 seconds

0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%

Volume group /dev/vglock is configured differently on node node01 than on node node02

Volume group /dev/vglock is configured differently on node node02 than on node node01

Gathering network information

Beginning network probing (this may take a while)

Completed network probing

Checking for inconsistencies

Adding node node01 to cluster crmcluster

Adding node node02 to cluster crmcluster

cmcheckconf: Verification completed with no errors found.

Use the cmapplyconf command to apply the configuration.


After the file has been verified as containing no errors, the cmapplyconf command is used to create and distribute the cluster binary file:


# cmapplyconf -v -C crmcluster.ascii

Checking cluster file: crmcluster.ascii

Note : a NODE_TIMEOUT value of 2000000 was found in line 127. For a

significant portion of installations, a higher setting is more appropriate.

Refer to the comments in the cluster configuration ascii file or Serviceguard

manual for more information on this parameter.

Checking nodes ... Done

Checking existing configuration ... Done

Node node01 is refusing Serviceguard communication.

Please make sure that the proper security access is configured on node

node01 through either file-based access (pre-A.11.16 version) or role-based

access (version A.11.16 or higher) and/or that the host name lookup

on node node01 resolves the IP address correctly.

cmapplyconf: Failed to gather configuration information


As the output above shows, the command failed. Read the message carefully: it points to a problem with host name resolution. I was stuck at this point for quite some time before I noticed that /etc/nsswitch.conf did not exist, although nsswitch.files, nsswitch.dns and nsswitch.nis were all present in /etc. The fix was to create /etc/nsswitch.conf from /etc/nsswitch.files. Either of the following commands will do; cp keeps the original template, while mv renames it:


# cp /etc/nsswitch.files /etc/nsswitch.conf


or


# mv /etc/nsswitch.files /etc/nsswitch.conf
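
If you prefer to script this fix so that it is safe to run on both nodes, a guarded version might look like the sketch below. The function name ensure_nsswitch and its optional directory argument are assumptions added for illustration; the only real action it takes is the cp shown above.

```shell
#!/bin/sh
# ensure_nsswitch [ETC_DIR]: create nsswitch.conf from nsswitch.files only
# when nsswitch.conf is missing. ETC_DIR defaults to /etc.
# (Hypothetical helper wrapping the cp command shown above.)
ensure_nsswitch() {
    etc_dir=${1:-/etc}
    if [ -f "$etc_dir/nsswitch.conf" ]; then
        echo "nsswitch.conf already exists, leaving it untouched"
    elif [ -f "$etc_dir/nsswitch.files" ]; then
        cp "$etc_dir/nsswitch.files" "$etc_dir/nsswitch.conf"
        echo "created nsswitch.conf from nsswitch.files"
    else
        echo "no nsswitch.files template found in $etc_dir" >&2
        return 1
    fi
}

# Example (as root, on each node):
#   ensure_nsswitch
```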


Then run cmapplyconf again and press y when it asks for confirmation:


# cd /etc/cmcluster

# cmapplyconf -v -C crmcluster.ascii

Checking cluster file: crmcluster.ascii

Note : a NODE_TIMEOUT value of 2000000 was found in line 127. For a

significant portion of installations, a higher setting is more appropriate.

Refer to the comments in the cluster configuration ascii file or Serviceguard

manual for more information on this parameter.

Checking nodes ... Done

Checking existing configuration ... Done

Volume group /dev/vglock is configured differently on node node01 than on node node02

Volume group /dev/vglock is configured differently on node node02 than on node node01

Checking for inconsistencies

Modifying configuration on node node01

Modifying configuration on node node02

Modifying node node01 in cluster crmcluster

Modifying node node02 in cluster crmcluster


Modify the cluster configuration ([y]/n)? y

Marking/unmarking volume groups for use in the cluster

Completed the cluster creation


Now you will be able to start the cluster using the cmruncl command:


# cmruncl -v

cmruncl: Validating network configuration...

cmruncl: Network validation complete

Waiting for cluster to form ..... done

Cluster successfully formed.

Check the syslog files on all nodes in the cluster to verify that no warnings occurred during startup.


To check the cluster status you can use the cmviewcl command:


# cmviewcl


CLUSTER        STATUS

crmcluster    up


  NODE           STATUS       STATE

  node01        up           running

  node02        up           running
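
For a quick health check, the node table in the cmviewcl output can be filtered for nodes that are not "up" and "running". This is a rough sketch that assumes the column layout shown above; the function name nodes_not_running is hypothetical.

```shell
#!/bin/sh
# nodes_not_running: read `cmviewcl` output on stdin and print the name of
# every node whose STATUS is not "up" or whose STATE is not "running".
# (Hypothetical helper; prints nothing when all nodes are healthy.)
nodes_not_running() {
    awk '
        $1 == "NODE" { in_nodes = 1; next }                      # node table header
        $1 == "PACKAGE" || $1 == "CLUSTER" { in_nodes = 0; next } # other tables
        in_nodes && NF >= 3 && ($2 != "up" || $3 != "running") { print $1 }
    '
}

# Example:
#   cmviewcl | nodes_not_running
```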

HP-UX Cluster part-2

------------------------------------------------------------------------------------------------------------------------------------------------------------------

https://ahmedsharifalvi.wordpress.com/2011/08/01/configuring-a-two-node-serviceguard-cluster-part-2/

------------------------------------------------------------------------------------------------------------------------------------------------------------------

PKG Configuration

------------------------------------------------------------------------------------------------------------------------------------------------------------------

# ls /dev/rdsk/c0t2d0

/dev/rdsk/c0t2d0

Physical Volume (PV) Creation:

# pvcreate -f /dev/rdsk/c0t2d0

Physical volume "/dev/rdsk/c0t2d0" has been successfully created.

# ls -l /dev/*/group

crw-r-----   1 root       sys         64 0x000000 Nov 12 16:06 /dev/vg00/group

crw-rw-rw-   1 root       sys         64 0x010000 Dec 16 17:43 /dev/vglock/group


# mkdir /dev/vgappl

# mknod /dev/vgappl/group c 64 0x020000


# vgcreate -s 32 vgappl /dev/dsk/c0t2d0

Increased the number of physical extents per physical volume to 1119.

Volume group "/dev/vgappl" has been successfully created.

Volume Group configuration for /dev/vgappl has been saved in /etc/lvmconf/vgappl.conf


# lvcreate -L 200 -n lvappl /dev/vgappl

Warning: rounding up logical volume size to extent boundary at size "224" MB.

Logical volume "/dev/vgappl/lvappl" has been successfully created with

character device "/dev/vgappl/rlvappl".

Logical volume "/dev/vgappl/lvappl" has been successfully extended.

Volume Group configuration for /dev/vgappl has been saved in /etc/lvmconf/vgappl.conf


# lvcreate -L 200 -n lvapp2 /dev/vgappl

Warning: rounding up logical volume size to extent boundary at size "224" MB.

Logical volume "/dev/vgappl/lvapp2" has been successfully created with

character device "/dev/vgappl/rlvapp2".

Logical volume "/dev/vgappl/lvapp2" has been successfully extended.

Volume Group configuration for /dev/vgappl has been saved in /etc/lvmconf/vgappl.conf
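
The rounding warning in the lvcreate output appears because LVM allocates space in whole physical extents, and the VG was created with 32 MB extents (-s 32). The arithmetic can be sketched as follows; the function name round_up_to_extent is my own.

```shell
#!/bin/sh
# round_up_to_extent SIZE_MB EXTENT_MB: print SIZE_MB rounded up to the next
# multiple of EXTENT_MB, which is how a requested LV size is adjusted.
round_up_to_extent() {
    size_mb=$1
    extent_mb=$2
    echo $(( (size_mb + extent_mb - 1) / extent_mb * extent_mb ))
}

# Example: a 200 MB request with 32 MB extents needs 7 extents
#   round_up_to_extent 200 32
```

For the values used here the result is 224, which matches the size reported in the warning.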


Create file system / mount point:


# newfs -F vxfs -o largefiles /dev/vgappl/rlvappl

    version 7 layout

    229376 sectors, 229376 blocks of size 1024, log size 1024 blocks

    largefiles supported


# newfs -F vxfs -o largefiles /dev/vgappl/rlvapp2

    version 7 layout

    229376 sectors, 229376 blocks of size 1024, log size 1024 blocks

    largefiles supported


# mkdir -p /app1 /app2     # create the mount points if they do not exist yet

# mount /dev/vgappl/lvappl /app1

# mount /dev/vgappl/lvapp2 /app2


# umount /app1

# umount /app2


# vgchange -a n vgappl     # deactivate the volume group

Volume group "vgappl" has been successfully changed.


# vgexport -p -v -s -m /etc/lvmconf/vgappl.map vgappl

Beginning the export process on Volume Group "vgappl".

/dev/dsk/c0t2d0

vgexport: Preview of vgexport on volume group "vgappl" succeeded.


# scp /etc/lvmconf/vgappl.map lodii018v:/etc/lvmconf/


On 2nd node:


# mkdir /dev/vgappl

# mknod /dev/vgappl/group c 64 0x020000

# ls -l  /dev/*/group

crw-r-----   1 root       sys         64 0x000000 Oct 22 11:13 /dev/vg00/group

crw-rw-rw-   1 root       sys         64 0x020000 Jan  8 07:34 /dev/vgappl/group

crw-rw-rw-   1 root       sys         64 0x010000 Dec 16 17:51 /dev/vglock/group


# vgimport   -v -s -m /etc/lvmconf/vgappl.map vgappl

Beginning the import process on Volume Group "vgappl".

Logical volume "/dev/vgappl/lvappl" has been successfully created

with minor number 1.

Logical volume "/dev/vgappl/lvapp2" has been successfully created

with minor number 2.

vgimport: Volume group "/dev/vgappl" has been successfully created.

Warning: A backup of this volume group may not exist on this machine.

Please remember to take a backup using the vgcfgbackup command after activating the volume group.


# mkdir /app1

# mkdir /app2


# vgchange -a y vgappl     # activate the volume group first; mount fails if it is not active

# mount /dev/vgappl/lvappl /app1

# mount /dev/vgappl/lvapp2 /app2

--------------------------------------------------------------------------------------------------------------

Configure PKG1 on the primary node (node 1)

--------------------------------------------------------------------------------------------------------------

# mkdir /etc/cmcluster/nfs

# cd /etc/cmcluster/nfs

# cmmakepkg -m sg/failover -m sg/filesystem -m sg/service pkg1.ascii

# ls

pkg1.ascii



The next step is to customize the file with the vi editor. After editing, the file will look something like this:


# vi /etc/cmcluster/nfs/pkg1.ascii

PACKAGE_NAME                    pkg1

PACKAGE_TYPE                    FAILOVER

FAILOVER_POLICY               CONFIGURED_NODE

FAILBACK_POLICY               MANUAL


NODE_NAME                       lodii018q

NODE_NAME                       lodii018v


AUTO_RUN                        YES

LOCAL_LAN_FAILOVER_ALLOWED      YES

NODE_FAIL_FAST_ENABLED          NO


#RUN_SCRIPT              /etc/cmcluster/nfs/pkg1.ctrl

RUN_SCRIPT_TIMEOUT              NO_TIMEOUT

#HALT_SCRIPT             /etc/cmcluster/nfs/pkg1.ctrl

HALT_SCRIPT_TIMEOUT             NO_TIMEOUT


#SERVICE_NAME                   pkg1

#SERVICE_FAIL_FAST_ENABLED      NO

#SERVICE_HALT_TIMEOUT           300


SUBNET 10.130.44.0


fs_name /dev/vgappl/lvappl

fs_server  " "

fs_directory /app1

fs_type vxfs

fs_mount_opt "-o llock"

#fs_umount_opt 

#fs_fsck_opt


# cmcheckconf -v -P pkg1.ascii

Begin package verification...

Checking existing configuration ... Done

Attempting to add package pkg1.

Validating package pkg1 via /etc/cmcluster/scripts/mscripts/master_control_script.sh ...

Waiting for up to 1200 seconds for the validation.

Validation for package pkg1 succeeded via /etc/cmcluster/scripts/mscripts/master_control_script.sh.

Maximum configured packages parameter is 300.

Configuring 1 package(s).

Adding the package configuration for package pkg1.

cmcheckconf: Verification completed with no errors found.

Use the cmapplyconf command to apply the configuration


Apply the package configuration file.


# cmapplyconf -v -P pkg1.ascii

Begin package verification...

Checking existing configuration ... Done

Attempting to add package pkg1.

Validating package pkg1 via /etc/cmcluster/scripts/mscripts/master_control_script.sh ...

Waiting for up to 1200 seconds for the validation.

Validation for package pkg1 succeeded via /etc/cmcluster/scripts/mscripts/master_control_script.sh.

Maximum configured packages parameter is 300.

Configuring 1 package(s).

Adding the package configuration for package pkg1.

Modify the package configuration ([y]/n)? y


Start the cluster if it is not already running. Run the package.


# cmrunpkg pkg1

Running package pkg1 on node lodii018q

Successfully started package pkg1 on node lodii018q

cmrunpkg: All specified packages are running


# cmviewcl -p pkg1


PACKAGE    STATUS   STATE     AUTO_RUN   NODE

pkg1       up       running   enabled    lodii018q



# cmviewcl -v -f line -p pkg1

---------------------------------------------------------------------Enjoy----------------------------------------------------------------------------------


---------------------------------------------

Basic cluster commands

-----------------------------------------------

Serviceguard Commands


    cmruncl -v                            # start the entire cluster       --> working

    cmhaltcl                              # stop the entire cluster        --> working

    cmviewcl                              # check the status of the cluster --> working

    cmrunnode -v nodename                 # start a single node            --> working

    cmhaltnode -f -v nodename             # stop a node                    --> working

    cmgetconf -C config_name              # get the current configuration  --> working

    cmrunpkg -n nodename package_name     # start a package on a node      --> working

    cmmodpkg -e package_name              # enable package switching       --> working

    cmhaltpkg package_name                # stop a package                 --> working

