Archive for the ‘linux’ Category

How to configure Link Aggregation Control Protocol on Exadata

May 13th, 2015

During a recent X5 installation I had to configure Link Aggregation Control Protocol (LACP) on the client network of the compute nodes. Although the ports were running at 10Gbit/s and the default active/passive bonding configuration works perfectly fine, the customer wanted traffic and workload distributed evenly across their core switches.

Link Aggregation Control Protocol (LACP), also known as 802.3ad, is a method of combining multiple physical network connections into one logical connection to increase throughput and provide redundancy should one of the links fail. The protocol requires both the server and the switch(es) to have the same settings for LACP to work properly.

To configure LACP on Exadata you need to change the bondeth0 parameters.

On each of the compute nodes open the following file:

/etc/sysconfig/network-scripts/ifcfg-bondeth0

and replace the line saying BONDING_OPTS with this one:

BONDING_OPTS="mode=802.3ad xmit_hash_policy=layer3+4 miimon=100 downdelay=200 updelay=5000 num_grat_arp=100"

and then restart the network interface:

ifdown bondeth0
ifup bondeth0
Determining if ip address 192.168.1.10 is already in use for device bondeth0...
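
Since the same change has to be made on every compute node, a quick way to confirm the setting is consistent across the cluster is dcli (a sketch, assuming a dbs_group file listing the compute nodes and passwordless SSH as root):

dcli -g ~/dbs_group -l root "grep ^BONDING_OPTS /etc/sysconfig/network-scripts/ifcfg-bondeth0"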

You can check the status of the interface by querying the proc filesystem. Make sure both interfaces are up and running at the same speed. The essential part confirming that LACP is working is shown below:

cat /proc/net/bonding/bondeth0

802.3ad info
LACP rate: slow
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2
Actor Key: 33
Partner Key: 34627
Partner Mac Address: 00:23:04:ee:be:c8
I had a problem where the client network did NOT come up after a server reboot. This was happening because during system boot the 10Gbit interfaces go through multiple resets, causing very fast link state changes. Here is the status of the bond at that time:

cat /proc/net/bonding/bondeth0

802.3ad info
LACP rate: slow
Aggregator selection policy (ad_select): stable
bond bondeth0 has no active aggregator

The solution was to decrease downdelay to 200 (the value already shown in the BONDING_OPTS above). The issue is described in this note:
Bonding Mode 802.3ad Using 10Gbps Network – Slave NICs Fail to Come Up Consistently after Reboot (Doc ID 1621754.1)

 

Categories: linux, oracle Tags:

RHEL6 udev and EMC PowerPath

January 26th, 2015

I'm working on an Oracle database migration project where the customer has chosen commodity x86 hardware with RHEL6 and EMC storage.

I've done many similar installations in the past and have always used the native Linux MPIO (DM-Multipath) to load balance and fail over I/O paths. This time, however, EMC PowerPath does the load balancing and failover and the native MPIO is disabled. From my point of view it's much the same whether I use /dev/emcpower* or /dev/mapper/* devices. PowerPath obviously has some advantages over the native MPIO, though I can't really judge them yet; there is a good paper from EMC comparing it with the native MPIO in different operating systems.

As mentioned before, the aggregated logical names (pseudo names) with EMC PowerPath can be found under /dev/emcpowerX. I partitioned the disks with GPT tables and aligned the first partition to match the storage sector size. I also added the following line to the udev rules to make sure my devices would get the proper permissions:

ACTION=="add", KERNEL=="emcpowerr1", OWNER:="oracle", GROUP:="dba", MODE="0600"

I restarted the server, and later udev, to make sure ownership and permissions were picked up correctly. Upon running asmca to create ASM with the first disk group I got the following errors:

Configuring ASM failed with the following message:
One or more disk group(s) creation failed as below:
Disk Group DATA01 creation failed with the following message:
ORA-15018: diskgroup cannot be created
ORA-15031: disk specification '/dev/emcpowerr1' matches no disks
ORA-15025: could not open disk "/dev/emcpowerr1"
ORA-15056: additional error message

Well, that's strange; I was sure the file had the correct permissions. However, listing the file proved that it didn't. I repeated the process several times and always got the same result; a simple touch command reproduces it:

[root@testdb ~]# ls -al /dev/emcpowerr1
brw-rw---- 1 oracle dba 120, 241 Jan 23 12:35 /dev/emcpowerr1
[root@testdb ~]# touch /dev/emcpowerr1
[root@testdb ~]# ls -al /dev/emcpowerr1
brw-rw---- 1 root root 120, 241 Jan 23 12:35 /dev/emcpowerr1

Something was changing the ownership of the file and I didn't know what. You'll be no less surprised than I was to find that Linux has an auditing framework similar to the one in the Oracle database.

auditctl allows you to audit any file for any syscall run against it. In my case I wanted to know which process was changing the ownership and permissions of my device file. Another helpful command is ausyscall, which maps syscall names to numbers. In other words, I wanted to know the chmod syscall number on a 64-bit platform (it does matter):

[root@testdb ~]# ausyscall x86_64 chmod --exact
90

Then I set up auditing for all chmod calls against my device file:

[root@testdb ~]# auditctl -a exit,always -F path=/dev/emcpowerr1 -F arch=b64 -S chmod
[root@testdb ~]# touch /dev/emcpowerr1
[root@testdb ~]# tail -f /var/log/audit/audit.log
type=SYSCALL msg=audit(1422016631.416:4208): arch=c000003e syscall=90 success=yes exit=0 a0=7f3cfbd36960 a1=61b0 a2=7fff5c59b830 a3=0 items=1 ppid=60056 pid=63212 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="udevd" exe="/sbin/udevd" key=(null)
type=CWD msg=audit(1422016631.416:4208):  cwd="/"
type=PATH msg=audit(1422016631.416:4208): item=0 name="/dev/emcpowerr1" inode=28418 dev=00:05 mode=060660 ouid=54321 ogid=54322 rdev=78:f1 nametype=NORMAL
[root@testdb ~]# auditctl -D
No rules

Gotcha! So it was udev changing the permissions, but why?

I spent half a day going through logs and tracing udev but couldn't find anything.

At the end of the day I found a Red Hat article describing exactly the same problem. The solution was to use "add|change" in the ACTION directive instead of only "add".

So here is the rule you need in order for udev to set persistent ownership/permissions on EMC PowerPath device files in RHEL 6:

[root@testdb ~]# cat /etc/udev/rules.d/99-oracle-asm.rules
ACTION=="add|change", KERNEL=="emcpowerr1", OWNER:="oracle", GROUP:="dba", MODE="0600"
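
If several PowerPath pseudo devices need the same ownership, a wildcard match is a reasonable variation (a sketch, not part of the original rule; adjust the pattern to the partitions you actually present to ASM):

ACTION=="add|change", KERNEL=="emcpower*1", OWNER:="oracle", GROUP:="dba", MODE="0600"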

Hope it helps and you don't have to spend half a day on it as I did.

Sve

Categories: linux, oracle Tags:

runInstaller fails at CreateOUIProcess with permission denied

January 16th, 2015

Just a short post on a problem I encountered recently.

I had to install 11.2 GI and right after running the installer I got a message saying permission denied. Below is the exact error:

[oracle@testdb grid]$ ./runInstaller -silent -showProgress -waitforcompletion -responseFile /u01/software/grid/response/grid_install_20140114.rsp
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB.   Actual 7507 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 8191 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2015-01-15_12-12-20PM. Please wait ...Error in CreateOUIProcess(): 13
: Permission denied

Quickly tracing the process, I could see that it failed to execute the Java installer:

27316 execve("/tmp/OraInstall2015-01-15_12-05-40PM/jdk/jre/bin/java", ["/tmp/OraInstall2015-01-15_12-05-"..., "-Doracle.installer.library_loc=/"..., "-Doracle.installer.oui_loc=/tmp/"..., "-Doracle.installer.bootstrap=TRU"..., "-Doracle.installer.startup_locat"..., "-Doracle.installer.jre_loc=/tmp/"..., "-Doracle.installer.nlsEnabled=\"T"..., "-Doracle.installer.prereqConfigL"..., "-Doracle.installer.unixVersion=2"..., "-mx150m", "-cp", "/tmp/OraInstall2015-01-15_12-05-"..., "oracle.install.ivw.crs.driver.CR"..., "-scratchPath", "/tmp/OraInstall2015-01-15_12-05-"..., "-sourceLoc", ...], [/* 22 vars */]) = -1 EACCES (Permission denied)

I had never seen this problem before; similar behaviour can occur with SELinux enabled, but that wasn't the case here.

Then I remembered that while formatting a partition for /u01 and adding it to fstab, I had noticed that /tmp didn't have the default mount options:

/dev/mapper/vglocal00-tmp00 /tmp                    ext4    defaults,noexec 1 2

Indeed, the noexec option will not let you execute binaries from that partition. This server was built by a hosting provider and I guess this was part of their default deployment process.

After removing the option and remounting /tmp (mount -o remount /tmp), the installer ran successfully.
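
If remounting /tmp is not an option, a possible alternative (a sketch; the scratch path is an assumption) is to point the installer at a directory on a filesystem mounted without noexec, using the TMP and TMPDIR environment variables:

mkdir -p /u01/software/tmp
export TMP=/u01/software/tmp
export TMPDIR=/u01/software/tmp
./runInstaller -silent -showProgress -waitforcompletion -responseFile /u01/software/grid/response/grid_install_20140114.rsp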

Categories: linux, oracle Tags:

Troubleshooting Oracle DBFS mount issues

March 13th, 2014

On Exadata the local drives on the compute nodes are not big enough for larger exports, so DBFS is often configured. In my case I had a 1.2 TB DBFS file system mounted under /dbfs_direct/.

While doing some exports yesterday I found that my DBFS wasn't mounted, and a quick crsctl command to bring it online failed:

[oracle@exadb01 ~]$ crsctl start resource dbfs_mount -n exadb01
 CRS-2672: Attempting to start 'dbfs_mount' on 'exadb01'
 CRS-2674: Start of 'dbfs_mount' on 'exadb01' failed
 CRS-2679: Attempting to clean 'dbfs_mount' on 'exadb01'
 CRS-2681: Clean of 'dbfs_mount' on 'exadb01' succeeded
 CRS-4000: Command Start failed, or completed with errors.

It doesn't give you any error message or reason why it's failing, and neither do the other database and Grid Infrastructure logs. The only useful approach is to enable tracing for the DBFS client and see what's happening. To enable tracing, edit the mount script and insert the following MOUNT_OPTIONS:

vi $GI_HOME/crs/script/mount-dbfs.sh
MOUNT_OPTIONS=trace_level=1,trace_file=/tmp/dbfs_client_trace.$$.log,trace_size=100

Now start the resource one more time to get the log file generated. You can also do the same directly with the client from the command line:

[oracle@exadb01 ~]$ dbfs_client dbfs_user@ -o allow_other,direct_io,trace_level=1,trace_file=/tmp/dbfs_client_trace.$$.log /dbfs_direct
Password:
Fail to connect to database server.

 

After checking the log file it's clear why DBFS was failing to mount: the DBFS database user's password had expired:

tail /tmp/dbfs_client_trace.100641.log.0
[43b6c940 03/12/14 11:15:01.577723 LcdfDBPool.cpp:189         ] ERROR: Failed to create session pool ret:-1
[43b6c940 03/12/14 11:15:01.577753 LcdfDBPool.cpp:399         ] ERROR: ERROR 28001 - ORA-28001: the password has expired
[43b6c940 03/12/14 11:15:01.577766 LcdfDBPool.cpp:251         ] DEBUG: Clean up OCI session pool...
[43b6c940 03/12/14 11:15:01.577805 LcdfDBPool.cpp:399         ] ERROR: ERROR 24416 - ORA-24416: Invalid session Poolname was specified.
[43b6c940 03/12/14 11:15:01.577844 LcdfDBPool.cpp:444         ] CRIT : Fail to set up database connection.

 

The account had a default profile which had the default PASSWORD_LIFE_TIME of 180 days:

SQL> select username, account_status, expiry_date, profile from dba_users where username='DBFS_USER';

USERNAME                       ACCOUNT_STATUS                   EXPIRY_DATE       PROFILE
------------------------------ -------------------------------- ----------------- ------------------------------
DBFS_USER                      EXPIRED                          03-03-14 14:56:12 DEFAULT

Elapsed: 00:00:00.02
SQL> select password from sys.user$ where name= 'DBFS_USER';

PASSWORD
------------------------------
A4BC1A17F4AAA278

Elapsed: 00:00:00.00
SQL> alter user DBFS_USER identified by values 'A4BC1A17F4AAA278';

User altered.

Elapsed: 00:00:00.03
SQL> select username, account_status, expiry_date, profile from dba_users where username='DBFS_USER';

USERNAME                       ACCOUNT_STATUS                   EXPIRY_DATE       PROFILE
------------------------------ -------------------------------- ----------------- ------------------------------
DBFS_USER                      OPEN                             09-09-14 11:09:43 DEFAULT


SQL> select * from dba_profiles where resource_name = 'PASSWORD_LIFE_TIME';

PROFILE                        RESOURCE_NAME                    RESOURCE LIMIT
------------------------------ -------------------------------- -------- ----------------------------------------
DEFAULT                        PASSWORD_LIFE_TIME               PASSWORD 180

 

After resetting the database user's password, DBFS mounted successfully!

If you are using a dedicated database for DBFS, make sure you set PASSWORD_LIFE_TIME to UNLIMITED to avoid similar issues.
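
A minimal sketch of that change (run it in the DBFS database; if the DBFS user has its own profile, alter that profile instead of DEFAULT):

SQL> ALTER PROFILE DEFAULT LIMIT PASSWORD_LIFE_TIME UNLIMITED;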

 

 

Categories: linux, oracle Tags: , ,

OEM 12c installation fails if parallel_max_servers too high

February 21st, 2014

Just a quick post on an OEM 12c installation I did recently, where the installation failed at the repository configuration step with the following error:

ORA-12801: error signaled in parallel query server P151

This is caused by a known bug; the workaround is to decrease the number of parallel query servers in the repository database and start the installation over. The database had cpu_count set to 64 and parallel_max_servers set to 270. After setting parallel_max_servers to a lower value the installation completed successfully.
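
A sketch of the change on the repository database (the exact value is site-specific; see the note below for guidance):

SQL> ALTER SYSTEM SET parallel_max_servers=64 SCOPE=BOTH;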

For more information refer to:
EM 12c: Enterprise Manager Cloud Control 12c Installation Fails At Repository Configuration With Error: ORA-12805: parallel query server died unexpectedly (Doc ID 1539444.1)

 

Categories: linux, oracle Tags:

RMAN fails to allocate channel with Tivoli Storage Manager

February 6th, 2014

I was recently configuring backups on a customer's Exadata with IBM TSM Data Protection for Oracle and ran into a weird RMAN error. The configuration was Oracle Database 11.2, TSM client version 6.1 and TSM server version 5.5, and this was the error:

[oracle@oraexa01 ~]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Wed Jan 29 16:41:54 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: TESTDB (DBID=2128604199)

RMAN> run {
2> allocate channel c1 device type 'SBT_TAPE';
3> }

using target database control file instead of recovery catalog
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of allocate command on c1 channel at 01/29/2014 16:42:01
ORA-19554: error allocating device, device type: SBT_TAPE, device name:
ORA-27000: skgfqsbi: failed to initialize storage subsystem (SBT) layer
Linux-x86_64 Error: 106: Transport endpoint is already connected
Additional information: 7011
ORA-19511: Error received from media manager layer, error text:
SBT error = 7011, errno = 106, sbtopen: system error

You get this message because the Tivoli Storage Manager API error log file (the errorlogname option specified in the dsm.sys file) is not writable by the oracle user.

Just change the file permissions or change the parameter to point to a file under /<writable_path>/ and retry your backup:

[root@oraexa01 ~]# chmod a+w /usr/tivoli/tsm/client/ba/bin/dsmerror.log
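
Alternatively, a sketch of the dsm.sys option pointing the API error log to a location the oracle user can write to (the path here is just an example, not from the original configuration):

* in the server stanza of dsm.sys
ERRORLOGNAME   /home/oracle/tsm/dsmerror.log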

This time RMAN allocates channel successfully:

[oracle@oraexa01 ~]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Wed Jan 29 16:42:52 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: TESTDB (DBID=2128604199)

RMAN> run {
2> allocate channel c1 device type 'SBT_TAPE';
3> }

using target database control file instead of recovery catalog
allocated channel: c1
channel c1: SID=807 instance=TESTDB device type=SBT_TAPE
channel c1: Data Protection for Oracle: version 5.5.1.0
released channel: c1
Categories: linux, oracle Tags: , ,

Oracle GI 12.1 error when using NFS

January 16th, 2014

I had quite an interesting case recently where I had to build a stretched cluster for a customer using Oracle GI 12.1, placing the quorum voting disk on NFS. There is a document on OTN regarding stretched clusters and using NFS as a third location for the voting disk, but at the moment it only covers 11.2. Assuming there is no difference in the NFS parameters, I used the Linux parameters from that document and mounted the NFS share on the cluster nodes.

Later on when I tried to add the third voting disk within the ASM disk group I got this strange error:

SQL> ALTER DISKGROUP OCRVOTE ADD  QUORUM DISK '/vote_nfs/vote_3rd' SIZE 10000M /* ASMCA */
Thu Nov 14 11:33:55 2013
NOTE: GroupBlock outside rolling migration privileged region
Thu Nov 14 11:33:55 2013
Errors in file /install/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_26408.trc:
ORA-17503: ksfdopn:3 Failed to open file /vote_nfs/vote_3rd
ORA-17500: ODM err:Operation not permitted
Thu Nov 14 11:33:55 2013
Errors in file /install/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_33427.trc:
ORA-17503: ksfdopn:3 Failed to open file /vote_nfs/vote_3rd
ORA-17500: ODM err:Operation not permitted
NOTE: Assigning number (1,3) to disk (/vote_nfs/vote_3rd)
NOTE: requesting all-instance membership refresh for group=1
Thu Nov 14 11:33:55 2013
ORA-15025: could not open disk "/vote_nfs/vote_3rd"
ORA-17503: ksfdopn:3 Failed to open file /vote_nfs/vote_3rd
ORA-17500: ODM err:Operation not permitted
WARNING: Read Failed. group:1 disk:3 AU:0 offset:0 size:4096
path:Unknown disk
incarnation:0xeada1488 asynchronous result:'I/O error'
subsys:Unknown library krq:0x7f715f012d50 bufp:0x7f715e95d600 osderr1:0x0 osderr2:0x0
IO elapsed time: 0 usec Time waited on I/O: 0 usec
NOTE: Disk OCRVOTE_0003 in mode 0x7f marked for de-assignment
Errors in file /install/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_33427.trc  (incident=83441):
ORA-00600: internal error code, arguments: [kfgscRevalidate_1], [1], [0], [], [], [], [], [], [], [], [], []
ORA-15080: synchronous I/O operation failed to read block 0 of disk 3 in disk group OCRVOTE

This happens because 12c uses direct NFS by default, and direct NFS initiates connections from ports above 1024. On the other hand, the NFS server has a default export option, secure, which requires incoming connections to originate from ports below 1024:

secure  This option requires that requests originate on an Internet port less than IPPORT_RESERVED (1024). This option is on by default. To turn it off, specify insecure.

The solution is to add the insecure option to the export on the NFS server, remount the NFS share on the cluster nodes, and then retry the operation above.
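
A sketch of what the export entry on the NFS server might look like (the path and client names are assumptions, not from the original setup):

# /etc/exports on the NFS server
/export/quorum  racnode1(rw,sync,insecure) racnode2(rw,sync,insecure)
# re-export without restarting the NFS services
exportfs -ra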

For more information refer to:
12c GI Installation with ASM on NFS Disks Fails with ORA-15018 ORA-15072 ORA-15080 (Doc ID 1555356.1)

 

Categories: linux, oracle Tags: , ,

Missing IPV6 address for local network interfaces causes service timeout

October 4th, 2012

A year ago I installed a WebLogic server and an Oracle database server for a customer. A few months later my colleagues asked me whether something had changed in the environment or something had happened at the data center, because they had started to see Java exceptions for an unknown hostname for a web service they were calling, and it was happening only from time to time. We checked the firewall rules, DNS and so on, but everything seemed to be working fine. The only workaround they came up with at the time was to add the hostname to the hosts file of the WebLogic server. These were the software versions:
Oracle Enterprise Linux 6.2 (64bit)
Weblogic Server 10.3.5
JDK 1.6.0_31

Then, six months later, the problem showed up again. The new version of the application was calling another web service, which obviously was missing from the hosts file, and this time I decided to investigate the problem and find out what was really happening. After I received the email I immediately logged in to the server and fired several nslookup and ping requests at the host that was causing problems; both were successful and returned the correct result. I double-checked the hosts file, nsswitch.conf and all the network settings; everything was correct. Meanwhile the WebLogic server log kept getting java.net.UnknownHostException for the very same host.

Obviously the problem required a different approach. I found a useful Java test program that calls InetAddress.getByName and in some way simulates the application behaviour (the web service hostname was intentionally changed); this is the program:

[root@srv tmp]# cd /tmp/
cat > DomainResolutionTest.java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.io.PrintWriter;
import java.io.StringWriter;

public class DomainResolutionTest {

public static void main(String[] args) {
if (args.length == 0) args = new String[] { "sve.to" };

try {
InetAddress ip = InetAddress.getByName(args[0]);
System.out.println(ip.toString());
}catch (UnknownHostException uhx) {
System.out.println("ERROR: " + uhx.getMessage() + "\n" + getStackTrace(uhx));
Throwable cause = uhx.getCause();
if (cause != null) System.out.println("CAUSE: " + cause.getMessage());
}

}

public static String getStackTrace(Throwable t)
{
StringWriter sw = new StringWriter();
PrintWriter pw = new PrintWriter(sw, true);
t.printStackTrace(pw);
pw.flush();
sw.flush();
return sw.toString();
}

}

Then just compile the program and execute it:

[root@srv tmp]# javac DomainResolutionTest.java
[root@srv tmp]# java DomainResolutionTest
sve.to/95.154.250.125
[root@srv tmp]# java DomainResolutionTest
sve.to/95.154.250.125

Running the program several times returned the correct address and no error occurred, but looping it for some time returned the exception I was looking for:

while 1>0; do java DomainResolutionTest; done > 2
^C
[root@srv tmp]# wc -l 2
2648 2
[root@srv tmp]# grep Unknown 2
java.net.UnknownHostException: sve.to
[root@srv tmp]# less 2
......
sve.to/95.154.250.125
ERROR: sve.to
java.net.UnknownHostException: sve.to
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1202)
at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
at java.net.InetAddress.getAllByName(InetAddress.java:1083)
at java.net.InetAddress.getAllByName(InetAddress.java:1019)
at java.net.InetAddress.getByName(InetAddress.java:969)
at DomainResolutionTest.main(DomainResolutionTest.java:12)

sve.to/95.154.250.125
....

From what I understand, the Java process tries to look up the IP address of the requested host, but first it takes all IP addresses of the local interfaces (eth0 and lo), including their default IPv6 addresses, and then tries to resolve those addresses back to hostnames. Although I hadn't configured IPv6 addresses on the interfaces, they already had default (link-local) ones, because the OS had IPv6 enabled by default. During the installation I had removed the localhost6 (::1) record from the hosts file, which later caused this error; a hosts record for the eth0 IP address was also missing.

The problem may be that the JVM performs both IPv6 and IPv4 queries, and if the DNS server is not configured to handle IPv6 queries the application might throw an unknown host exception; if the DNS server handles IPv6 queries badly, the application has to wait for the IPv6 query to time out. The usual workaround is to make Java use only the IPv4 stack by running the Java process with the -Djava.net.preferIPv4Stack=true parameter, thus avoiding the failing IPv6 lookups. Unfortunately, running the above Java program with this parameter still returned UnknownHostException.
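
For reference, this is how the test program above was run with that flag:

java -Djava.net.preferIPv4Stack=true DomainResolutionTest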

It looks like a genuine bug with IPv6 in Java; I also saw a few bugs opened at Sun regarding this behaviour, but there was no solution. Finally, after adding hostnames for the local interfaces' IPv6 addresses to the hosts file, the exceptions disappeared:

::1                                          localhost6
fe80::20c:29ff:fe36:4144                     srv6

So for future installations I'll be explicitly disabling IPv6 on the systems I install. The easiest way to do that is:

cat >> /etc/sysctl.conf
#disable all ipv6 capabilities on a kernel level
net.ipv6.conf.all.disable_ipv6 = 1

Run sysctl -p afterwards (or reboot) for the setting to take effect.

Regards,
Sve

Categories: linux, oracle Tags: , ,

How to run standalone Oracle APEX Listener 2.0 with Oracle 11g XE and APEX 4.1.1

August 23rd, 2012

This is a short guide on how to run the standalone Oracle APEX Listener 2.0 beta with Oracle 11g XE. I'm using Oracle Enterprise Linux 5.7 for running the Oracle APEX Listener, and Oracle Database 11g XE with APEX 4.1.1. Although running the APEX Listener standalone is not supported, I'm using it to run several internal applications for company needs.

When using the APEX Listener with Oracle XE, APEX won't work properly and a white screen appears when APEX is opened. This is because the APEX images are stored in the XML DB repository, but the APEX Listener has to be run with the --apex-images parameter pointing to a directory containing the images on the filesystem. To solve this I downloaded the latest APEX patch and copied the images from the patch.

If you have another database running on the same machine, keep this in mind.

 

Install Oracle 11g XE and update Oracle APEX to latest version:

1. Download Oracle Database Express Edition 11g Release 2 for Linux x64

2. Install Oracle 11g XE:
rpm -ivh oracle-xe-11.2.0-1.0.x86_64.rpm

3. Configure Express Edition:
/etc/init.d/oracle-xe configure
Port: 1522
Password: secret

4. Update APEX to 4.1

Download APEX 4.1
cd /tmp
unzip -q apex_4.1.zip
cd apex
sqlplus / as sysdba
@apexins SYSAUX SYSAUX TEMP /i/
@apxldimg.sql /tmp/apex

5. Update APEX to version 4.1.1
Download patch set 13331096 from MOS

Disable Oracle XML DB HTTP server:

SQL> EXEC DBMS_XDB.SETHTTPPORT(0);
PL/SQL procedure successfully completed.

SQL> COMMIT;
Commit complete.

SQL> SELECT DBMS_XDB.GETHTTPPORT FROM DUAL;
GETHTTPPORT
-----------
0

Run apxpatch.sql to patch the system:

SQL> @apxpatch.sql

Update the Images Directory When Running the Embedded PL/SQL Gateway:

@apxldimg.sql /tmp/patch

Commit complete.

Once the update has finished, do not re-enable the Oracle XML DB HTTP server, because we'll be using the Oracle APEX Listener, which we will set up next.

 

Install APEX Listener 2.0.0

1. Download Oracle APEX Listener 2.0.0 beta

2. Download and install the latest JRE 1.6 version; currently the latest is 1.6.0_34

Unpack to /opt/jre1.6.0_34

3. Unlock and set password for apex_public_user at the Oracle XE database:
alter user APEX_PUBLIC_USER account unlock;
alter user APEX_PUBLIC_USER identified by secret;

4. Patch Oracle APEX to support RESTful  Services:
cd /oracle/apxlsnr/apex_patch/
sqlplus / as sysdba @catpatch.sql

Set passwords for both users APEX_LISTENER and APEX_REST_PUBLIC_USER.

5. Install Oracle APEX Listener:
mkdir /oracle/apxlsnr/
cd /oracle/apxlsnr/
unzip apex_listener.2.0.0.215.16.35.zip

Now this is the tricky part: for the XE edition the images are kept in the XML DB repository, so the images have to be copied from the patch to the listener home:
cp -r /tmp/patch/images .

6. Configure Oracle APEX Listener:
export JAVA_HOME=/opt/jre1.6.0_34
export PATH=$JAVA_HOME/bin:$PATH

Set APEX listener config dir:
java -jar apex.war configdir $PWD/config

Configure the listener:
java -jar apex.war

Once configuration is complete, the listener is started. It then has to be stopped and run with the appropriate parameters; use Ctrl-C to stop it.

7. Finally start the listener:
java -jar apex.war standalone --apex-images /oracle/apxlsnr/images

In case you want to run it in the background, here's how:
nohup java -jar apex.war standalone --apex-images /oracle/apxlsnr/images > apxlsnr.log &
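
To quickly check that the listener is serving APEX, you can hit it over HTTP (a sketch, assuming the default standalone port of 8080 was kept during configuration):

curl -I http://localhost:8080/apex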

 

Periodically I was seeing exceptions like these:
ConnectionPoolException [error=BAD_CONFIGURATION]

Caused by: oracle.ucp.UniversalConnectionPoolException: Universal Connection Pool already exists in the Universal Connection Pool Manager. Universal Connection Pool cannot be added to the Universal Connection Pool Manager

I found that if the APEX Listener is not configured with RESTful Services then these messages appear in the log and can be safely ignored.

 

Regards,
Sve

Categories: linux, oracle Tags: ,

How I run Oracle VM 2.2 guests with custom network configuration

August 15th, 2012

Recently I was given three virtual machines running Oracle Enterprise Linux 5 and Oracle 11gR2 RAC on Oracle VM 2.2.1, copied straight from /OVS/running_pool/. I had to get these machines up and running in my lab environment, but I found it hard to set up the network. I spent half a day debugging without success, but finally found a workaround, which I'll explain here.

Just a few technical notes first: Oracle VM (Xen) has three main network configurations within /etc/xen/xend-config.sxp:

Bridge Networking – this is the default configuration and the simplest to set up. Using this type of networking means that the VM guest should have an IP from the same network as the VM host. The VM guest can also take advantage of DHCP, if any. The following lines should be uncommented in /etc/xen/xend-config.sxp:
(network-script network-bridge)
(vif-script vif-bridge)

Routed Networking with NAT – this configuration is most common where a private LAN must be used, for example when your VM host runs on your notebook and you can't get another IP from the corporate or lab network. For this you have to set up a private LAN and NAT the VM guests so they can access the rest of the network. The following lines should be uncommented in /etc/xen/xend-config.sxp:
(network-script network-nat)
(vif-script vif-nat)

Two-way Routed Network – this configuration requires more manual steps but offers greater flexibility. It is exactly the same as the second one, except that the VM guests are exposed on the external network. For example, when a VM guest makes a connection to an external machine, its original IP is seen. The following lines should be uncommented in /etc/xen/xend-config.sxp:
(network-script network-route)
(vif-script vif-route)

Typically only one of the above can be used at a time, and the choice depends on the network setup. For the second and third configurations to work, a route must be added on the default gateway. For example, if my Oracle VM host has the IP address 192.168.143.10, then on the default gateway (192.168.143.1) a route has to be added to explicitly route all connection requests for my VM guests through my VM host, something like this:
route add -net 10.0.1.0 netmask 255.255.255.0 gw 192.168.143.10

Now back to the case itself. Each of the RAC nodes had two NICs – one for the public connections and one for the private interconnect used by GI and RAC. The public network was 10.0.1.X and the private one 192.168.1.X. What I wanted was to run the VM guests in my lab and access them directly with IP addresses from the lab network, which was 192.168.143.X. As we know, the default network configuration is bridged networking, so I went with that one. Having the VM guests' config files, all I had to do was change the first address of every guest:

From:
vif = ['mac=00:16:3e:22:0d:04, ip=10.0.1.11, bridge=xenbr0', 'mac=00:16:3e:22:0d:14, ip=192.168.1.11',]

To:
vif = ['mac=00:16:3e:22:0d:04, ip=192.168.143.151, bridge=xenbr0', 'mac=00:16:3e:22:0d:14, ip=192.168.1.11',]

This turned out to be a real nightmare; I spent half a day looking into why my VM guests didn't have access to the lab network. They had access to the VM host, but not to the outside world. Maybe it's because I'm running Oracle VM on top of VMware, but I finally gave up on this configuration.

Thus I had to use one of the other two network configurations – Routed Networking with NAT or Two-way Routed Network. In either case I didn't have access to the default gateway and so could not add a static route for my VM guests.

Here is how I solved it, so as to run the three-node RAC on Oracle VM Server 2.2.1, keep its original network configuration and access the nodes with IP addresses from my lab network (192.168.143.X). I put logical IPs for the VM guests on the VM host using ip (ifconfig could also be used) and then used iptables to change the packet destination to the VM guests' own addresses (10.0.1.X).

1. Change Oracle VM configuration to Two-way Routed Network, comment the lines for default bridge configuration and remove comments for routed networking:
(network-script network-route)
(vif-script vif-route)

2. Configure VM host itself for forwarding:
echo 1 > /proc/sys/net/ipv4/conf/all/proxy_arp
iptables -t nat -A POSTROUTING -s 10.0.1.0/24 -j MASQUERADE

3. Set network alias with the IP address that you want to use for the VM guests:
ip addr add 192.168.143.151/32 dev eth0 label eth0:1
ip addr add 192.168.143.152/32 dev eth0 label eth0:2
ip addr add 192.168.143.153/32 dev eth0 label eth0:3

4. Create iptables rules in PREROUTING chain that will redirect the request to VM guests original IPs once it receive it on the lab network IP:
iptables -t nat -A PREROUTING -d 192.168.143.151 -i eth0 -j DNAT --to-destination 10.0.1.11
iptables -t nat -A PREROUTING -d 192.168.143.152 -i eth0 -j DNAT --to-destination 10.0.1.12
iptables -t nat -A PREROUTING -d 192.168.143.153 -i eth0 -j DNAT --to-destination 10.0.1.13

5. Just untar the VM guest in /OVS/running_pool/

[root@ovm22 running_pool]# ls -al /OVS/running_pool/dbnode1/
total 26358330
drwxr-xr-x 2 root root        3896 Aug  6 17:27 .
drwxrwxrwx 6 root root        3896 Aug  3 11:18 ..
-rw-r--r-- 1 root root  2294367596 May 16  17:27 swap.img
-rw-r--r-- 1 root root  4589434792 May 16  17:27 system.img
-rw-r--r-- 1 root root 20107128360 May 16  17:27 u01.img
-rw-r--r-- 1 root root         436 Aug  6 11:20 vm.cfg

6. Run the guest:
xm create /OVS/running_pool/dbnode1/vm.cfg

Now I have a three node RAC, nodes have their original public IPs and I can access them using my lab network IPs. The mapping is like this:

Request to 192.168.143.151 -> the IP address is up on the VM host -> iptables on the VM host takes action -> the packet's destination IP address is changed to 10.0.1.11 -> a static route already in place on the VM host routes the packet to the vif interface of the VM guest.

Now I can access my dbnode1 (10.0.1.11) directly with its lab network IP 192.168.143.151.
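
One more note: the IP aliases, the proxy_arp setting and the iptables rules above do not survive a reboot. A rough sketch of one way to persist them on OEL5 (an assumption, not part of the original setup):

# save the NAT rules to /etc/sysconfig/iptables
service iptables save
# make proxy_arp permanent
echo "net.ipv4.conf.all.proxy_arp = 1" >> /etc/sysctl.conf
# re-create the guest aliases at boot
cat >> /etc/rc.local <<'EOF'
ip addr add 192.168.143.151/32 dev eth0 label eth0:1
ip addr add 192.168.143.152/32 dev eth0 label eth0:2
ip addr add 192.168.143.153/32 dev eth0 label eth0:3
EOF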

Regards,
Sve

Categories: linux, oracle, virtualization Tags: ,