DevOps

Thursday, 11 November 2010

Quick Reference Guide for VCS Commands

Start HA

# sudo hastart [-stale | -force]

To Start When All Systems in ADMIN_WAIT

# sudo hasys -force <system>

Run it from a system to force VCS to use this system’s configuration file.

To Start HA on a Single Node Cluster (do not use on multisystem cluster)

# sudo hastart -onenode

Stop HA

# sudo hastop -all [-force]

# sudo hastop -local [-force | -evacuate | -noautodisable]

# sudo hastop -local [-force | -evacuate -noautodisable]

# sudo hastop -sys <system> [-force | -evacuate | -noautodisable]

# sudo hastop -sys <system> [-force | -evacuate -noautodisable]


-all stops HAD on all systems in cluster and takes service groups offline

-local stops HAD on the local system only

-force stops HAD but leaves service groups online

-evacuate, when combined with -local or -sys, switches the active groups to another server before stopping

-noautodisable ensures that service groups are not disabled

-sys stops HAD on specified server

Adding a User

# sudo haconf -makerw

# sudo hauser -add <user> -priv Administrator (this example shows adding a user with Administrator privileges)

# sudo haconf -dump -makero

# sudo hauser -display
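
The steps above can be wrapped so that the configuration is returned to read-only even when adding the user fails. This is only a minimal sketch: the function name add_vcs_admin and the HA_SUDO variable are hypothetical, and it assumes the user name goes directly after -add.

```shell
#!/bin/sh
# HA_SUDO is a hypothetical knob: defaults to "sudo"; set it empty when already root.
HA_SUDO=${HA_SUDO-sudo}

# add_vcs_admin USER: hypothetical wrapper around the three commands above.
add_vcs_admin() {
    user=$1
    # Open the configuration for writing; give up if that fails.
    $HA_SUDO haconf -makerw || return 1
    # Add the user and remember the exit status.
    $HA_SUDO hauser -add "$user" -priv Administrator
    rc=$?
    # Always save the configuration and return it to read-only.
    $HA_SUDO haconf -dump -makero || return 1
    return "$rc"
}
```

Usage would be something like add_vcs_admin alice, followed by sudo hauser -display to confirm the result.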

Query VCS

Query Service Groups

State of Group

# sudo hagrp -state [<group>] [-sys <system>]

List of Group Resources

# sudo hagrp -resources <group>

List of Group Dependencies

# sudo hagrp -dep [<group>]

Display Group on a System

# sudo hagrp -display [<group>] [-sys <system>]

Display Service Group Attributes

# sudo hagrp -display [<group>] [-attribute <attribute>] [-sys <system>]


Query Resources

List Resource Dependencies

# sudo hares -dep [<resource>]

Resource Info

# sudo hares -display [<resource>]

Display Resources of a Service Group

# sudo hares -display -group <group>

Display Resources of a Resource Type

# sudo hares -display -type <type>

Display Attributes of a System

# sudo hares -display -sys <system>



Query Resource Types

List Resource Types

# sudo hatype -list

List All Resources of a Particular Type

# sudo hatype -resources <type>

Info about a Resource Type

# sudo hatype -display <type>



Query Agents

Agent Run Time Status

# sudo haagent -display [<agent>]



Query Systems

List of Systems in Cluster

# sudo hasys -list

Info about each system

# sudo hasys -display [<system>]



Query Clusters

Value of a specific cluster attribute

# sudo haclus -value <attribute>

Info about the cluster

# sudo haclus -display



Query Status
Status of all service groups, including resources

# sudo hastatus

Status of a particular group

# sudo hastatus -group <group> [-group <group>]...

Status of cluster faults, including faulted groups, resources, systems, etc.

# sudo hastatus -summ



Administration



Administer Service Groups

Start Service Group and Bring Resources online

# sudo hagrp -online <group> -sys <system>

Stop a Service Group and take its resources offline

# sudo hagrp -offline <group> -sys <system>

Stop a Service Group only if all resources are probed

# sudo hagrp -offline [-ifprobed] <group> -sys <system>

Switch a Service group from one system to another

# sudo hagrp -switch <group> -to <system>

Freeze a Service group (disable onlining, offlining, and failover)

# sudo hagrp -freeze <group> [-persistent]
Use -persistent to keep the group frozen across a reboot.

Unfreeze a Service group

# sudo hagrp -unfreeze <group> [-persistent]

Enable a Service group

# sudo hagrp -enable <group> [-sys <system>]

Disable a Service group

# sudo hagrp -disable <group> [-sys <system>]

Enable all Resources in a service group

# sudo hagrp -enableresources <group>

Disable all Resources in a service group

# sudo hagrp -disableresources <group>

Clear Faulted, non-persistent resources in a service group

# sudo hagrp -clear <group> [-sys <system>]

Initiates online process

Clear Resources in ADMIN_WAIT state in a service group

# sudo hagrp -clearadminwait [-fault] <group> -sys <system>



Administer Resources

Bring Resources online

# sudo hares -online <resource> -sys <system>

Bring Resources offline

# sudo hares -offline <resource> -sys <system>

Prompt a Resource Agent to immediately monitor the resource

# sudo hares -probe <resource> -sys <system>

Clear a Resource

# sudo hares -clear <resource> [-sys <system>]

State change from RESOURCE_FAULTED to RESOURCE_OFFLINE

Initiates online process.



Administer Systems

Force a system to start while in ADMIN_WAIT

# sudo hasys -force <system>

Display system node ID as defined in /etc/llttab

# sudo hasys -nodeid

Freeze a system (prevent groups from being brought online or switched)

# sudo hasys -freeze [-persistent] [-evacuate] <system>

Un-Freeze a system (re-enable online and switching)

# sudo hasys -unfreeze [-persistent] <system>



Administer Agents

Starting/Stopping VCS Agent Manually

# sudo haagent -start <agent> -sys <system>

# sudo haagent -stop <agent> -sys <system>



Check Low Level Transport Status (configured with /etc/llttab, /etc/llthosts)



# lltstat -vvn

LLT node information:

    Node        State   Link   Status   Address
  * 0 node1     OPEN
                        ce1    UP       00:14:4F:4A:E0:7E
                        ce2    UP       00:14:4F:4A:E0:7E
                        ce0    UP       00:14:4F:4A:E0:7E
    1 node2     OPEN
                        ce1    UP       00:14:4F:3E:90:EC
                        ce2    UP       00:14:4F:3E:90:EC
                        ce0    UP       00:14:4F:3E:90:EC
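
Output like the listing above can be scanned for links that are not UP. The sketch below is only an illustration: the function name llt_down_links is hypothetical, it assumes the column layout shown here (a node line followed by its link lines, with a down link reported as DOWN), and it needs a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk).

```shell
#!/bin/sh
# llt_down_links: hypothetical filter that reads "lltstat -vvn"-style text
# on stdin and prints "node link status" for every link that is not UP.
llt_down_links() {
    awk '
        # Node lines look like "* 0 node1 OPEN" (local node) or "1 node2 OPEN".
        $1 == "*"       { node = $3; next }
        $1 ~ /^[0-9]+$/ { node = $2; next }
        # Link lines look like "ce1 UP 00:14:4F:4A:E0:7E" or "ce1 DOWN".
        $2 == "UP" || $2 == "DOWN" {
            if ($2 != "UP") print node, $1, $2
        }
    '
}
```

It would be used as lltstat -vvn | llt_down_links; no output means every listed link is UP.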



Files


Log File - /var/VRTSvcs/log/engine_A.log

Configuration File - /etc/VRTSvcs/conf/config/main.cf



HA GUI



From server -

# sudo hagui &



Or, install the Windows version on your desktop.


Make Changes to main.cf



Stop HA.

# sudo hastop -all

# sudo hastop -local


Edit the main.cf file or make changes via the GUI or command line.

Verify.

# sudo hacf -verify /etc/VRTSvcs/conf/config
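
The stop, edit, verify cycle above can be sketched as a small script. This is an illustration, not a tested procedure: the run and edit_main_cf names and the DRY_RUN switch are hypothetical, and the backup copy of main.cf is an extra precaution that is not part of the steps above. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# DRY_RUN=1 prints each command instead of executing it (hypothetical switch).
DRY_RUN=${DRY_RUN:-1}
CONF_DIR=/etc/VRTSvcs/conf/config

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# edit_main_cf: stop HA, back up and edit main.cf, then verify the result.
edit_main_cf() {
    run sudo hastop -all || return 1
    # Assumption: keep a backup copy before editing.
    run sudo cp "$CONF_DIR/main.cf" "$CONF_DIR/main.cf.bak" || return 1
    run sudo "${EDITOR:-vi}" "$CONF_DIR/main.cf"
    # hacf -verify checks the syntax of the configuration in CONF_DIR.
    run sudo hacf -verify "$CONF_DIR" || return 1
}
```

Calling edit_main_cf as-is lists the four commands; setting DRY_RUN=0 would run them for real, after which HA still has to be started again with hastart.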



Change LLT Configuration



Originally, ce9 and ce1 were configured as the LLT interfaces on node1 and node2.

For some reason ce1 was not connecting (view with lltstat -vvn). The AT&T tech moved the connection to ce2 and it worked (confirmed with lltstat -vvn).



Now ce1 needs to be removed from the LLT configuration and ce2 added.

Summary: add ce2 on the primary and the secondary, then remove ce1 from the secondary and then from the primary.

Node1

Add ce2 -

# lltconfig -t ce2 -d /dev/ce:2

Node2

Add ce2 and remove ce1 -

# lltconfig -t ce2 -d /dev/ce:2

# lltconfig -u ce1


Node1

Remove ce1

# lltconfig -u ce1


If you have a reference to the old interface in /etc/VRTSvcs/conf/config/main.cf, change it from ce1 to ce2.

Be careful about doing this while the cluster is running.

Ex:

PrivNIC ora_PrivNIC (

Device = { ce9 = 0, ce1 = 1 } (update to look like line below)

Device = { ce9 = 0, ce2 = 1 }



Update /etc/llttab (example)

link ce1 /dev/ce:1 - ether - - (update to look like line below)

link ce2 /dev/ce:2 - ether - -
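
The llttab change above can be previewed before touching the real file. A minimal sketch, assuming the link directive has the interface tag in the second field and the device in the third, as in the example; the function name swap_llt_link is hypothetical, and a POSIX awk is needed (on Solaris, nawk or /usr/xpg4/bin/awk).

```shell
#!/bin/sh
# swap_llt_link FILE OLD_IF NEW_IF NEW_DEVICE
# Hypothetical helper: prints FILE with the "link OLD_IF ..." line rewritten
# to use NEW_IF and NEW_DEVICE. The file itself is not modified.
swap_llt_link() {
    file=$1 old=$2 new=$3 dev=$4
    awk -v old="$old" -v new="$new" -v dev="$dev" '
        # Only touch the link directive for the old interface.
        $1 == "link" && $2 == old { $2 = new; $3 = dev }
        { print }
    ' "$file"
}
```

For example, swap_llt_link /etc/llttab ce1 ce2 /dev/ce:2 > /tmp/llttab.new writes a candidate file that can be reviewed and then copied into place.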



Workarounds


For “Manual Intervention May be Needed for Reseeding”

After rebooting a server that is part of a cluster, it does not rejoin the cluster and reports the message "manual intervention may be needed for reseeding." Is 'gabconfig -x' a safe workaround?


# hastop -all -force

This stops the HAD daemon but keeps the applications up.

Then start HAD on all the nodes:

# hastart


For “RESOURCES NOT PROBED”

When I do an 'hastatus -summ' I see that my csgnic resources, for both cluster servers, in the ClusterService group are in the "RESOURCES NOT PROBED" list.



Run the following command on all the nodes where the resource is not probed:

# hares -probe <resource> -sys <system>

I tried this and got ‘VCS WARNING V-16-1-10270 Resource is not enabled’ error.

So, I went into the GUI, right-clicked on the resource and enabled it for both servers.

Then, I was able to probe the resource successfully (either by the GUI or using the command above).



If the group is offline because of the probe problem, do this to bring it online (I didn’t have to do this).

# hagrp -online ClusterService -sys <system>

Then run this to make sure that all the resources are probed and the service group is online.

# hastatus -sum
