Archive for the ‘hp-ux’ Category

Troubleshooting ASM 11.2 disk discovery

December 14th, 2011

I was doing an installation at a customer site when they asked whether there was anything specific about running GI 11.2 on HP-UX, as this was their first encounter with 11g. I replied that there was nothing special: just make sure the ownership of the raw disks is correct and that the ASM discovery string is right. They said all of this had been done as described in the documentation, but the disks still could not be discovered. This made me curious, so I asked them to give me access to the system so I could have a look.

The system was running the latest HP-UX 11.31 and we were going to install Oracle GI 11.2.0.2; the LUN was presented from HP EVA storage.

I couldn’t believe what they were saying and wanted them to show me exactly what they were doing. Unfortunately they were right: after a software-only installation of GI 11.2.0.2 we tried to create an ASM instance with asmca, but no disks were discovered although everything looked correct.

While looking around I remembered that the disk owner patch is mandatory on HP-UX, and the installation guide says so explicitly. I asked the customer, who said that all the required patches were installed, but when I checked, this patch was missing. The patch number given in the installation guide is PHCO_41479, but the latest version is PHCO_41903. Also, running kfed against a disk on a system without the patch shows the following:

KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

I installed the patch, double checked everything and, thinking that the missing patch could have been the reason we were not seeing the disk, tried to discover it again, but without success. ASM still couldn’t see the disk, so I had to go deeper and find out what asmca was actually doing. For this I had to trace the system calls, and on HP-UX the utility for that is tusc. There is a MOS note describing how to trace system calls and which utilities to use on the different Unix flavors [ID 110888.1].

I ran asmca, attached to its process with tusc, and then changed the discovery string to point exactly at the disk I wanted to use (in my case /dev/rdisk/disk3). This is the part of the trace that matters:

access("/dev/rdisk/disk3", W_OK|R_OK) ........................................................................... = 0
.......
open("/dev/rdisk/disk3", O_RDONLY|O_NDELAY|0x800, 0) ............................................................ = 7
lseek(7, 8192, SEEK_SET) ........................................................................................ = 8192
read(7, "L V M R E C 0 1 \r/ % aeN e2\va0".., 1024) ............................................................. = 1024
lseek(7, 73728, SEEK_SET) ....................................................................................... = 73728
read(7, "L V M R E C 0 1 \r/ % aeN e2\va0".., 1024) ............................................................. = 1024
close(7) ........................................................................................................ = 0

The disk is first successfully tested for read and write access and is then opened read-only in non-blocking mode. Then 1024 bytes are read at offset 8192 of /dev/rdisk/disk3. This looked like an LVM header, AHA! So it seems that the disk was once used as an LVM physical volume. Although the disk is not part of any volume group, it still has an LVM header, and that’s why asmca is not showing it as CANDIDATE. It turned out that the storage admins had not recreated the virtual disk on the storage, and the LUN had once been used for LVM on another server.
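The same offset can be inspected directly, without tracing asmca. This is only a sketch: it assumes the signature is the plain ASCII string LVMREC01 at byte offset 8192 (tusc prints it with spaces between the characters), and has_lvm_header is a name made up for illustration.

```shell
# has_lvm_header DEVICE
# Succeeds if the 8 bytes at offset 8192 of DEVICE spell "LVMREC01".
# Assumption: the LVM signature is plain ASCII at that offset, as the trace suggests.
has_lvm_header() {
    # Read one aligned 8 KB block at offset 8192, keep its first 8 bytes,
    # and drop NUL bytes so a zeroed disk simply yields an empty string.
    sig=$(dd if="$1" bs=8192 skip=1 count=1 2>/dev/null |
          dd bs=8 count=1 2>/dev/null | tr -d '\000')
    [ "$sig" = "LVMREC01" ]
}
```

Something like `has_lvm_header /dev/rdisk/disk3 && echo "old LVM header present"` would then flag a disk like the one in this post before asmca is even started.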

After zeroing the start of the disk with dd, the header looks better and the disk shows up as CANDIDATE:

oracle@vm:/$ dd if=/dev/zero of=/dev/rdisk/disk3 bs=1024k count=10

Now the tusc output shows that the header is filled with zeros:

read(7, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0".., 1024) ........................................................ = 1024

Just for troubleshooting purposes I also tried to read the disk header with kfed; it showed the same error before and after the dd:

KFED-00322: file not found; arguments: [kfbtTraverseBlock] [Invalid OSM block type] [] [0]

If you are not sure whether the disk contains valuable information, you could import the physical volume and activate the volume group. In my case I was sure that the disk could be wiped, and a simple dd did the job.

Regards,
Sve

Categories: hp-ux, oracle, storage

How to setup virtual iLO Remote Console

September 14th, 2011

In the new release of HP Integrity Virtual Machines 4.30, HP added a new feature which allows you to access the guest console by logging into a specific IP address. This feature is called virtual iLO Remote Console and should be configured with hpvmmodify.

Looking at the guest status (hpvmstatus -p) the following is observed:
[Remote Console]
Remote Console not configured

This is how to configure the IP address of the virtual iLO Remote Console:
root@vmhost1:/# hpvmmodify -p 1 -K 192.168.11.193 -L 255.255.255.0

What the command does is configure an IP alias, as can be seen in netstat:
lan0:110  1500 192.168.11.0    192.168.11.193  0                  0     0                  0     0

Running hpvmstatus again, we see that the Remote Console is now configured:
[Remote Console]
Remote Console Ip Address:      192.168.11.193
Remote Console Net Mask:        255.255.255.0

Now about the users: by default the root user of the host is able to log in to all virtual consoles, although it’s not listed in the following section:
[Authorized Administrators]
Oper Groups             :
Admin Groups            :
Oper Users              :
Admin Users             :

Let’s create a dedicated user for the first guest and add it as a console administrator:
root@vmhost1:/root# useradd -d /var/opt/hpvm/guests/vm01 -c 'vm01 console' adminvm1
root@vmhost1:/root# passwd adminvm1
root@vmhost1:/root# hpvmmodify -p 1 -u adminvm1:admin

Running hpvmstatus again, we can see that the user has been added to the administrators:
[Authorized Administrators]
Oper Groups             :
Admin Groups            :
Oper Users              :
Admin Users             : adminvm1

Now let’s try to log in with the newly created user:
sve@angmar:~$ ssh adminvm1@192.168.11.193
Password:

vMP MAIN MENU

CO: Console
CM: Command Menu
CL: Console Log
SL: Show Event Logs
VM: Virtual Machine Menu
HE: Main Help Menu
X: Exit Connection

[vm01] vMP>

Voila, it’s working!

To enable access to the virtual iLO Remote Console through telnet (which is not supported by default), two patches have to be applied on the host: PHCO_41595 and PHNE_41452. Two more things should be taken into consideration. First, the virtual iLO Remote Console uses the host system’s SSH server host keys, so if the guest is migrated to another system these host keys will change. Second, guest administrator accounts are not migrated during Online VM Migration.

Regards,
Sve

Categories: hp-ux

How to mount an ISO image under HP-UX 11iv3

September 12th, 2011

So what do I do if I want to mount an ISO image under HP-UX?

Previously I had to create a logical volume, copy the content of the ISO onto it and then mount the logical volume, like this:

root@vmhost1:/# ll /tmp/test.iso
-rw-r--r--   1 root       sys        105906176 Sep  8 11:57 /tmp/test.iso
root@vmhost1:/# lvcreate -L 102 -n lvtest vg00
Warning: rounding up logical volume size to extent boundary at size "128" MB.
Logical volume "/dev/vg00/lvtest" has been successfully created with
character device "/dev/vg00/rlvtest".
Logical volume "/dev/vg00/lvtest" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
root@vmhost1:/# dd if=/tmp/test.iso of=/dev/vg00/rlvtest bs=1024k
101+0 records in
101+0 records out
root@vmhost1:/# mount /dev/vg00/lvtest /mnt
root@vmhost1:/# bdf /mnt
/dev/vg00/lvtest    102888  102888       0  100% /mnt
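The size arithmetic behind the lvcreate call can be sketched as follows. The 32 MB extent size is an assumption: the transcript only shows 102 MB being rounded up to 128 MB, which would also fit a 64 or 128 MB extent. lv_size_mb is a hypothetical helper name.

```shell
# lv_size_mb BYTES EXTENT_MB
# Prints the LV size in MB needed to hold BYTES, rounded up to a whole
# number of extents. EXTENT_MB is the volume group's extent size (assumed input).
lv_size_mb() {
    bytes=$1
    extent_mb=$2
    mb=$(( (bytes + 1048575) / 1048576 ))                      # round up to whole MB
    echo $(( (mb + extent_mb - 1) / extent_mb * extent_mb ))   # round up to extent
}

# The 105906176-byte image is exactly 101 MB, which matches the
# "101+0 records" reported by dd with bs=1024k.
lv_size_mb 105906176 32    # prints 128
```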

 

With the ISOIMAGE-ENH enhancement you can now mount and unmount ISO images directly (as in Linux):

root@vmhost1:/# kcmodule fspd=loaded
root@vmhost1:/# mount -F cdfs /tmp/test.iso /mnt
root@vmhost1:/# bdf /mnt
/dev/fspd1          102888  102888       0  100% /mnt

 

If you get an error like this one:
mount: /tmp/test.iso is an invalid operand
then most probably you’ve forgotten to load the fspd module.

The package can be selected during OS installation or downloaded from the HP Software Depot; it’s available only for HP-UX 11iv3.

Regards,
Sve

Categories: hp-ux

HP EVA4400/6400/8400 now ship with XCS v10000000

September 7th, 2011

Two months ago HP released new firmware for the EVA family, including the 4400/6400/8400, P6300 and P6500: version XCS 10000000. As I blogged previously, the EVA4400 could not compare with midrange storage systems from other vendors. It introduced a few new features which turned out to be incomplete, and some other features weren’t included at all. Now, besides re-branding the EVA family as HP P6000 EVA, HP has also introduced a lot of new features with the new release. Some of the really useful ones:

  • Thin provisioning – Something the other vendors have had for a long time; it was about time the EVA got it. This feature dynamically increases the space allocated to a virtual disk.
  • Large LUN support – Because this feature was missing, last year I spent a month migrating data from several big disks to another disk group with LVM. Snapshot replication of LUNs greater than 2 TB is now supported, as is expanding and shrinking large LUNs.
  • Online virtual disk migration (change Vraid or disk group) – Another ‘must have’ feature: change a virtual disk’s redundancy level or disk group membership without impacting host I/O. Previously one had to create another virtual disk at the required level and move the data over from the old one.

There are also a lot of fixes included in this release, and some of them sound really scary. For more information refer to the release notes.

Regards,
Sve

Categories: hp-ux, storage

Change of network interfaces in Oracle 10g RAC

July 12th, 2011

I was doing planned downtime on one of the 10.2.0.4 RAC systems, and just before starting the second node I was told that during the downtime its network interfaces had been aggregated. These servers run HP-UX, where the default network interfaces were lan0 for the public network and lan1 for the interconnect. After aggregation they became lan900 and lan901 respectively, so I asked the guys to revert the change, as I knew that the Clusterware would suffer from it.

I decided to create a test scenario at the office, but on Linux (it was faster to deploy and test). Apart from the interface names everything else should be the same. I’m using eth0 for the public network and eth1 for the private one. For the purpose of the demonstration I’m going to change the public network interface on the second node from eth0 to eth2. This also requires modifying nodeapps, as the VIP runs on this interface.

I installed Oracle 10.2.0.4 RAC on two nodes, oelvm5 and oelvm6, with the orcl database. This is what the cluster configuration looks like before changing the interface:

[oracle@oelvm5 bin]$ ./oifcfg getif
eth0 192.168.143.0 global public
eth1 172.16.143.0 global cluster_interconnect

[oracle@oelvm5 bin]$ srvctl config nodeapps -n oelvm5 -a
VIP exists.: /oelvm5-vip/192.168.143.159/255.255.255.0/eth0

[oracle@oelvm5 bin]$ srvctl config nodeapps -n oelvm6 -a
VIP exists.: /oelvm6-vip/192.168.143.160/255.255.255.0/eth0

At this point I changed the interface from eth0 to eth2 on the second node and restarted the node. After the change, the listener on the second node is unable to start and its VIP is relocated to the first node. I’m using a very handy script to get the cluster resource status as formatted output; here is its output after the node booted:

[oracle@oelvm5 bin]$ crsstatus
HA Resource Target State
----------- ------ -----
ora.orcl.db ONLINE ONLINE on oelvm5
ora.orcl.orcl1.inst ONLINE ONLINE on oelvm5
ora.orcl.orcl2.inst ONLINE ONLINE on oelvm6
ora.oelvm5.ASM1.asm ONLINE ONLINE on oelvm5
ora.oelvm5.LISTENER_OELVM5.lsnr ONLINE ONLINE on oelvm5
ora.oelvm5.gsd ONLINE ONLINE on oelvm5
ora.oelvm5.ons ONLINE ONLINE on oelvm5
ora.oelvm5.vip ONLINE ONLINE on oelvm5
ora.oelvm6.ASM2.asm ONLINE ONLINE on oelvm6
ora.oelvm6.LISTENER_OELVM6.lsnr ONLINE OFFLINE
ora.oelvm6.gsd ONLINE ONLINE on oelvm6
ora.oelvm6.ons ONLINE ONLINE on oelvm6
ora.oelvm6.vip ONLINE ONLINE on oelvm5

The following can also be observed in $ORA_CRS_HOME/log/{HOST}/racg/ora.{HOST}.vip.log:
2011-05-28 16:20:39.157: [ RACG][3909306080] [4865][3909306080][ora.oelvm6.vip]: checkIf: interface eth0 is down
Invalid parameters, or failed to bring up VIP (host=node2)

So now it’s obvious: the VIP could not be started on the second node because interface eth0 is down. To change the public network interface, first use oifcfg to delete the current interface and then add the correct one. Then, for the node on which the interface changed, Clusterware has to be stopped and nodeapps updated from the other node.

If you are running in production and not using services, consider using crs_relocate on the VIP resource. It relocates the VIP address to the other node immediately, so none of the clients suffer connection timeouts. In my lab the VIP was easily relocated with just crs_relocate, but in the production environment ASM and the listener were dependent on the VIP and I had to stop them first. I’m not sure, but I think this was because there were two homes, one for ASM and one for the database.

Then change the public interface/subnet on the affected node. While Clusterware is running, delete the old interface using oifcfg and then add the correct one:

[oracle@oelvm6 ~]$ oifcfg delif -global eth0
[oracle@oelvm5 ~]$ oifcfg getif
eth1 172.16.143.0 global cluster_interconnect
[oracle@oelvm6 ~]$ oifcfg setif -global eth2/192.168.143.0:public

Now we have a correct configuration:

[oracle@oelvm5 ~]$ oifcfg getif
eth2 192.168.143.0 global public
eth1 172.16.143.0 global cluster_interconnect
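When this step is scripted, the result can be checked rather than eyeballed. A minimal sketch, assuming the four-column `oifcfg getif` output shown above; public_ifs is a made-up helper name.

```shell
# public_ifs
# Reads "oifcfg getif" output on stdin and prints the names of the
# interfaces registered as public, one per line.
# Assumption: the interface type is the last column, as in the output above.
public_ifs() {
    awk '$NF == "public" { print $1 }'
}
```

With the configuration above, `oifcfg getif | public_ifs` would be expected to print eth2, and a script could refuse to continue with the nodeapps change if it prints anything else.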

Because the VIP runs on this same interface, nodeapps for this node has to be updated as well. To do this, stop Clusterware on the affected node and execute srvctl from the other node. The other node has to be up and running in order to make the change:

[root@oelvm6 ~]# /etc/init.d/init.crs stop
Shutting down Oracle Cluster Ready Services (CRS):
May 30 11:33:15.380 | INF | daemon shutting down
Stopping resources. This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
Shutdown has begun. The daemons should exit soon.

[oracle@oelvm5 ~]$ srvctl config nodeapps -n oelvm6 -a
VIP exists.: /oelvm6-vip/192.168.143.160/255.255.255.0/eth0
[root@oelvm5 ~]# srvctl modify nodeapps -n oelvm6 -A oelvm6-vip/255.255.255.0/eth2
[oracle@oelvm5 ~]$ srvctl config nodeapps -n oelvm6 -a
VIP exists.: /oelvm6-vip/192.168.143.160/255.255.255.0/eth2

Finally, start Clusterware on the second node. It will automatically relocate its VIP address back and start all the resources:

[root@oelvm6 ~]# /etc/init.d/init.crs start
Startup will be queued to init within 30 seconds.

The change is now reflected and the node applications are running fine:

[oracle@oelvm6 racg]$ crsstatus
HA Resource Target State
----------- ------ -----
ora.orcl.db ONLINE ONLINE on oelvm5
ora.orcl.orcl1.inst ONLINE ONLINE on oelvm5
ora.orcl.orcl2.inst ONLINE ONLINE on oelvm6
ora.oelvm5.ASM1.asm ONLINE ONLINE on oelvm5
ora.oelvm5.LISTENER_OELVM5.lsnr ONLINE ONLINE on oelvm5
ora.oelvm5.gsd ONLINE ONLINE on oelvm5
ora.oelvm5.ons ONLINE ONLINE on oelvm5
ora.oelvm5.vip ONLINE ONLINE on oelvm5
ora.oelvm6.ASM2.asm ONLINE ONLINE on oelvm6
ora.oelvm6.LISTENER_OELVM6.lsnr ONLINE ONLINE on oelvm6
ora.oelvm6.gsd ONLINE ONLINE on oelvm6
ora.oelvm6.ons ONLINE ONLINE on oelvm6
ora.oelvm6.vip ONLINE ONLINE on oelvm6
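Problems like the offline listener and the relocated VIP from the first listing can be picked out with a small filter. This is a sketch that assumes the column layout of the crsstatus output shown in this post (name, target, state, "on", node) and the ora.<node>.vip naming; crs_problems is a hypothetical name and not part of the crsstatus script.

```shell
# crs_problems
# Reads crsstatus-style lines on stdin and prints resources whose target is
# ONLINE but whose state is not, plus VIPs running on a node other than their own.
crs_problems() {
    awk '
        $2 != "ONLINE" || NF < 3 { next }          # skip header, separator, disabled resources
        $3 != "ONLINE" { print $1 ": " $3; next }  # should be up but is not
        $1 ~ /\.vip$/ {
            split($1, part, ".")                   # ora.<node>.vip
            if (part[2] != $5) print $1 ": running on " $5
        }
    '
}
```

Fed the listing taken right after the interface change, it would report the oelvm6 listener as OFFLINE and the oelvm6 VIP as running on oelvm5; for the final listing it prints nothing.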

Regards,
Sve

Categories: hp-ux, oracle

Oracle to stop writing software for Itanium processor

March 23rd, 2011

A few months after Oracle changed the CPU core factor for the new Itanium 93XX series from 0.5 to 1, they have now announced that they are discontinuing all software development for Intel Itanium chips. This comes almost a year after Microsoft announced that Windows Server 2008 R2 would be the last version of Windows Server to support Itanium. Red Hat also dropped Itanium support in 2009, and the last supported version was RHEL 5.6. Until now HP and IBM were the two major competitors in enterprise mission-critical servers: HP with Superdome on dual-core Itanium chips and now Superdome2 on quad-core Itanium chips, and IBM with their p795 with 4/8 cores. I have worked with HP-UX for six years now and I really liked the Itanium, but IBM’s chips are far more powerful, already have 8 cores per chip, and used to have the same CPU core factor as Itanium. Today’s announcement probably heralds the end of an era.

Oracle has published a press release, along with a support table of the last product versions to be supported on Itanium. In response, Intel reaffirmed its commitment to the development of the Itanium chip; the statement is in Intel’s newsroom.

Regards,
Sve

Categories: hp-ux

Oracle DB 10.2.0.3 LISTENER (VIP) goes down on HP-UX 11.23 without reason

January 5th, 2011

Happy New Year!

For a long time I’ve been receiving complaints that the listener on one of the nodes of a two-node RAC goes offline from time to time. Without any obvious reason the VIP of the second node fails, the listener is stopped and the VIP is relocated to the first node. Since the VIP is relocated, there are no problems as long as all the clients are configured correctly. In this case some of the clients were connecting explicitly to the second node and were unable to connect to the database. The database is 10.2.0.3 RAC, installed on two nodes running HP-UX 11.23 with the December 2008 patch bundles.

The following can be observed in $CRS_HOME/log/$HOSTNAME/crsd/crsd.log:
2010-10-25 06:11:12.492: [ CRSAPP][8336] CheckResource error for ora.db2.vip error code = 1
2010-10-25 06:11:12.522: [ CRSRES][8336] In stateChanged, ora.db2.vip target is ONLINE
2010-10-25 06:11:12.522: [ CRSRES][8336] ora.db2.vip on db2 went OFFLINE unexpectedly
2010-10-25 06:11:12.523: [ CRSRES][8336] StopResource: setting CLI values
2010-10-25 06:11:12.527: [ CRSRES][8336] Attempting to stop `ora.db2.vip` on member `db2`
2010-10-25 06:11:13.182: [ CRSRES][8336] Stop of `ora.db2.vip` on member `db2` succeeded.
2010-10-25 06:11:13.185: [ CRSRES][8336] ora.db2.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2010-10-25 06:11:13.188: [ CRSRES][8336] ora.db2.vip failed on db2 relocating.
2010-10-25 06:11:13.231: [ CRSRES][8336] StopResource: setting CLI values
2010-10-25 06:11:13.235: [ CRSRES][8336] Attempting to stop `ora.db2.LISTENER_DB2.lsnr` on member `db2`
2010-10-25 06:12:31.183: [ CRSRES][8336] Stop of `ora.db2.LISTENER_DB2.lsnr` on member `db2` succeeded.
2010-10-25 06:12:31.211: [ CRSRES][8336] Attempting to start `ora.db2.vip` on member `db1`
2010-10-25 06:12:38.327: [ CRSRES][8336] Start of `ora.db2.vip` on member `db1` succeeded.

The following can be seen in the alert log:
ALTER SYSTEM SET service_names='' SCOPE=MEMORY SID='oradb2';

There are a couple of bugs logged about this. There is also a MOS note regarding the problem:
HP-UX Itanium: RACGMAIN Received SIGSEGV On CheckResource Causing a Crash of a Resource [ID 763724.1]

The solution is to change the binding mode of the executables that use shared libraries from deferred (“delay”) binding to immediate binding with chatr, using the following shell loops. This has to be applied to both the CRS and DB homes, and all Oracle processes should be stopped first:

cd $ORACLE_HOME/bin/
for i in crs_relocate.bin crs_start.bin crs_stop.bin crsd.bin evmd.bin racgons.bin racgeut racgevtf racgmain; do chatr -B immediate $i; done

cd $CRS_HOME/bin/
for i in crs_relocate.bin crs_start.bin crs_stop.bin crsd.bin evmd.bin racgons.bin racgeut racgevtf racgmain; do chatr -B immediate $i; done

In the three months since implementing this solution I haven’t seen the problem again!

Regards,
Sve

Categories: hp-ux, oracle

Many open files on HP-UX after RAC upgrade to 10.2.0.4 – racgimon file handle leak

July 23rd, 2010

Two months after patching a customer database to 10.2.0.4 I received a call telling me that the database was hanging. Usually this happens when they miss the archive log backup and the database stops. This time there was enough space available, so that was not the problem. I logged in to the first node and started looking around: weird things were happening, some commands were failing and others were hanging. Then I realized that this was not an ordinary case and started looking deeper. It turned out to be an Oracle bug on HP-UX, for which there is both a patch and a workaround.

The customer was running HP-UX 11.23 (September 2006) with the September 2008 patch bundles. The database was Oracle RAC Enterprise Edition 10.2.0.2.

This problem had a very big impact: although the database runs in RAC, it was not accessible and there were a lot of locks. Rebooting the node or killing the processes does the job.

After some reading I figured out that this happens only on HP-UX, after patching the database to 10.2.0.4, and only on the first node.

Here are some symptoms:


Executing sar -v shows the current and maximum size of the system file table (the last used/max pair on each line):

12:00:00   N/A   N/A 328/4200  0  1374/286108 0  41906/65536 0
12:02:00   N/A   N/A 330/4200  0  1376/286108 0  41944/65536 0
12:04:00   N/A   N/A 336/4200  0  1390/286108 0  41999/65536 0
12:06:00   N/A   N/A 331/4200  0  1377/286108 0  41983/65536 0
12:08:00   N/A   N/A 330/4200  0  1376/286108 0  41976/65536 0
12:10:00   N/A   N/A 330/4200  0  1377/286108 0  41935/65536 0
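To watch how full the file table is, the used/max pair can be turned into a percentage. A minimal sketch, assuming the file table is the second-to-last column of each sample line, as in the output above; file_table_pct is a name invented for this example.

```shell
# file_table_pct
# Reads "sar -v" sample lines on stdin and prints "time used/max pct%" for
# the file table, taken here from the second-to-last column.
file_table_pct() {
    awk 'NF >= 3 && $(NF-1) ~ /^[0-9]+\/[0-9]+$/ {
        split($(NF-1), f, "/")
        # %d truncates, so 63.9 is reported as 63
        printf "%s %s %d%%\n", $1, $(NF-1), f[1] * 100 / f[2]
    }'
}
```

For the first sample above this prints `12:00:00 41906/65536 63%`; header and blank lines are skipped because they do not carry a used/max pair in that column.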


With lsof the following open files are seen:

racgimon   3506 oracle   14u   REG             64,0x9        1552   29678 /oracle/ora10g/dbs/hc_baandb1.dat
racgimon   3506 oracle   28u   REG             64,0x9        1552   29678 /oracle/ora10g/dbs/hc_baandb1.dat
racgimon   3506 oracle   30u   REG             64,0x9        1552   29678 /oracle/ora10g/dbs/hc_baandb1.dat
racgimon   3506 oracle   37u   REG             64,0x9        1552   29678 /oracle/ora10g/dbs/hc_baandb1.dat


The process holding the open files:

 oracle  3506     1  0  Nov  5  ?        18:16 /oracle/ora10g/bin/racgimon startd baandb


In the log $ORACLE_HOME/log/<NodeName>/racg/imon_<InstanceName>.log the following error appears every minute:

2009-12-02 12:12:35.454: [    RACG][73] [3506][73][ora.baandb.baandb1.inst]: GIMH: GIM-00104: Health check failed to connect to instance.
GIM-00090: OS-dependent operation:mmap failed with status: 12
GIM-00091: OS failure message: Not enough space
GIM-00092: OS failure occurred at: sskgmsmr_13
2009-12-02 12:13:35.474: [    RACG][73] [3506][73][ora.baandb.baandb1.inst]: GIMH: GIM-00104: Health check failed to connect to instance.
GIM-00090: OS-dependent operation:mmap failed with status: 12
GIM-00091: OS failure message: Not enough space
GIM-00092: OS failure occurred at: sskgmsmr_13


When the file table gets full, weird things start to happen. The following can be seen in the syslog:

Nov  5 08:00:02 db1 vmunix: file: table is full
Nov  5 08:00:03 db1 vmunix: file: table...
Nov  5 08:00:03 db1 vmunix: file...
Nov  5 08:00:03 db1 vmunix: file...
Nov  5 08:01:13 db1 vmunix: file: table is full
Nov  5 08:11:15 db1  above message repeats 34260 times


The following can also be seen in the alert log:

ORA-00603: ORACLE server session terminated by fatal error
ORA-27544: Failed to map memory region for export
ORA-27300: OS system dependent operation:socket failed with status: 23
ORA-27301: OS failure message: File table overflow
ORA-27302: failure occurred at: sskgxpcre1


Solution:
The base bug is 6931689 (SS10204-HP-PARISC64-080216.080324 HEALTH CHECK FAILED TO CONNECT TO INSTANCE), but it’s not public. It’s fixed in CRS 10.2.0.4 Bundle Patch #2, but the current CRS bundle is PSU2, patch# 8705958: TRACKING BUG FOR 10.2.0.4.2 PSU FOR CRS, which is around 41 MB.
Patch# 8705958 should be applied to all Oracle homes: although the bug is in the database, CRS should always be at an equal or higher version.

To apply this patch the OPatch version must be at least 10.2.0.4.7, which can be downloaded as patch# 6880880. At the time of writing the latest version was 10.2.0.4.9, and it’s 34 MB. To install it, simply download it and unzip it under ORACLE_HOME.

I didn’t go with the patch because I had read some scary stuff on OTN, and thanks to Ivan Kartik I put a dirty workaround in place instead. He proposed a very good script that checks whether more than 20000 files are open and, if so, kills the racgimon process. The effect is visible in sar -v:

13:56:00   N/A   N/A 307/4200  0  1352/286108 0  44102/65536 0
13:58:00   N/A   N/A 307/4200  0  1353/286108 0  44119/65536 0
14:00:01   N/A   N/A 309/4200  0  1355/286108 0  44135/65536 0
14:02:01   N/A   N/A 307/4200  0  1353/286108 0  44153/65536 0
14:04:01   N/A   N/A 301/4200  0  1336/286108 0  2583/65536 0
14:06:01   N/A   N/A 306/4200  0  1347/286108 0  2610/65536 0
14:08:01   N/A   N/A 299/4200  0  1333/286108 0  2583/65536 0
14:10:01   N/A   N/A 300/4200  0  1335/286108 0  2571/65536 0
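The original script is not reproduced here. The following is only a sketch of the same idea, not Ivan Kartik's code: it assumes the open files are counted from lsof output in the format shown earlier (command in the first column, PID in the second), and racgimon_leak_pids is a made-up name. It only reports PIDs and leaves the kill to the caller.

```shell
# racgimon_leak_pids LIMIT
# Reads lsof output on stdin and prints the PID of every racgimon process
# that holds more than LIMIT open files.
racgimon_leak_pids() {
    awk -v limit="$1" '
        $1 == "racgimon" { count[$2]++ }
        END { for (pid in count) if (count[pid] > limit) print pid }
    '
}

# Possible use from cron (dry run: prints instead of killing):
#   lsof | racgimon_leak_pids 20000 | while read pid; do
#       echo "would kill racgimon pid $pid"
#   done
```

Counting per process rather than looking at the system-wide table avoids killing racgimon when something else is filling the file table, at the cost of a slower lsof run.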

The workaround fixed the problem. This article was written half a year ago; according to MOS, the bug is now fixed in 10.2.0.5, which was released at the beginning of June.

Regards,
Sve

Categories: hp-ux, oracle

Patch Set 10.2.0.5 for Oracle Database Server

June 9th, 2010

Just a note that a few days ago patch set 10.2.0.5 was released for HP-UX Itanium and IBM AIX. The patch set is available for download from My Oracle Support as patch number 8202632.

Regards,
Sve

Categories: hp-ux, oracle

Oracle 11g R2 installer fails on HP-UX 11iv3

May 20th, 2010

Running the installer of any of the products (client, grid, database) of Oracle Database 11g Release 2 on HP-UX 11iv3 (Itanium) fails with:
“An internal error occurred within cluster verification framework”

After starting ./runInstaller, an error window with this message pops up.

The following error can also be seen in installAction$DATE.log:

SEVERE: [FATAL] An internal error occurred within cluster verification framework
Unable to get the current group.

This happens because patch PHCO_40381 is not installed. The list of patches to be installed is in section 2.3.4, Patch Requirement, of the Database Installation Guide for HP-UX.

The first one is:
PHCO_40381 11.31 Disk Owner Patch

The patch is available from ITRC. It is 205 KB in size and fixes the behavior of the diskowner command. Installing the patch does not require a reboot of the server.

After the installation of the patch, runInstaller starts successfully.

There is also a MOS note about this problem:
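Since a missing patch is easy to overlook, the check can be scripted before starting the installer. This is a sketch under assumptions: it expects a patch listing such as the output of `swlist -l patch` on stdin and simply searches it for the patch ID; has_patch is a hypothetical helper name.

```shell
# has_patch PATCH_ID...
# Reads a patch listing on stdin and succeeds if any of the given patch IDs
# appears in it. Several IDs can be passed to accept a superseding patch.
has_patch() {
    listing=$(cat)
    for id in "$@"; do
        if printf '%s\n' "$listing" | grep "$id" >/dev/null; then
            return 0
        fi
    done
    return 1
}
```

It would be used along the lines of `swlist -l patch | has_patch PHCO_40381 || echo "disk owner patch missing"`. Passing more than one ID covers the case where a newer patch has replaced the one named in the installation guide.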
HP-UX: 11gR2 runInstaller Fails with “An internal error occurred within cluster verification framework” [ID 983713.1]

Regards,
Sve

Categories: hp-ux, oracle