Oracle Database 12.2 released for Exadata on-premises

February 11th, 2017

We live in exciting times: Oracle Database 12.2 for Exadata was released earlier today.

The 12.2 database has already been available on the Exadata Express Cloud Service and Database as a Service for a few months.

Today, it has been released for Exadata on-premises, five days earlier than the initial Oracle announcement of 15th Feb.

The documentation suggests that to run a 12.2 database you need to run at least Exadata storage software but it is better to go with the recommended version of which was released just a few days ago. Here are a few notes on running a 12.2 database on Exadata:

Recommended Exadata storage software: or higher
Supported Exadata storage software: or higher

Full Exadata offload functionality for Database 12.2, and IORM support for Database 12.2 container databases and pluggable databases requires Exadata or higher.

Exadata Storage Server version will be required for full Exadata functionality including  ‘Smart Scan offloaded filtering’, ‘storage indexes’ and’

Current Oracle Database and Grid Infrastructure version must be,, or  Upgrades from or directly to are not supported.

There is a completely new note on how to upgrade to 12.2 GI and RDBMS on Exadata:

12.2 Grid Infrastructure and Database Upgrade steps for Exadata Database Machine running and later on Oracle Linux (Doc ID 2111010.1)

The 12.2 GI and RDBMS binaries are available from MOS as well as edelivery:

Patch 25528839: Grid Software clone version
Patch 25528830: Database Software clone version

The recommended Exadata storage software for running 12.2 RDBMS:

Exadata release and patch (21052028) (Doc ID 2207148.1)

For more details about the Exadata storage software, refer to this slide deck, and of course here is a link to the 12.2 documentation, as we all need to start getting familiar with it.

Also, last week Oracle released the February version of OEDA to support the new Exadata SL6 hardware. It does not support the 12.2 database yet, and I guess we’ll see another release in February to support the 12.2 GI and RDBMS.

Happy upgrading! 🙂

Categories: oracle

Upgrading to Exadata or later – mind the 32bit packages

February 2nd, 2017

This might not be relevant anymore; shame on me for keeping it in draft for a few months. However, there are still people running older versions of Exadata storage software and it might still help someone out there.

With the release of Exadata storage software, Oracle announced that some 32bit (i686) packages will be removed from the OS as part of the upgrade.

This happened to me in the summer last year (blog post) and I thought back then that someone messed up the dependencies. After seeing it again for another customer a month later I thought it might be something else. So after checking the release notes for all the recent patches, I found this for the release:

Note that several i686 packages may be removed from the database node when being updated. Run with -N flag at prereq check time to see exactly what rpms will be removed from your system.

Now, this is all OK if you haven’t installed any additional packages. If, however, like many other customers you have packages like LDAP or Kerberos installed, then your dbnodeupdate pre-check will fail with “‘Minimum’ dependency check failed.” and broken dependencies, since all i686 packages will be removed as part of the dbnodeupdate pre-check.

The way around this is to run dbnodeupdate with the -N flag and check the logs to see which packages will be removed and what will be impacted. Then remove any packages you installed yourself. After the Exadata storage software update, you’d need to install the relevant versions of those packages again.

Having said that, I need to mention the note below on what software you are allowed to install on Exadata:

Is it acceptable / supported to install additional or 3rd party software on Exadata machines and how to check for conflicts? (Doc ID 1541428.1)


Categories: oracle

onecommand fails to change storage cell name

January 20th, 2017

It’s been a busy month – five Exadata deployments in the past three weeks and a new personal best – 2x Exadata X6-2 Eighth Racks with CoD and storage upgrade deployed in only 6hrs!

An issue I encountered with the first deployment was that onecommand wouldn’t change the storage cell names. The default cell names (not hostnames!) are based on where the cells are mounted within the rack and they are assigned by the elastic configuration script. The first cell name is ru02 (rack unit 02), the second cell is ru04, the third is ru06 and so on.

Now, if you are familiar with the cell and grid disks you would know that their names are based on the cell name. In other words, I got my cell, grid and ASM disks with the wrong names. Exachk would report the following failures for every grid disk:

Grid Disk name DATA01_CD_00_ru02 does not have cell name (exa01cel01) suffix
Naming convention not used. Cannot proceed further with
automating checks and repair for bug 12433293

Apart from exachk complaining, I wouldn’t feel comfortable with similar names on my Exadata.

Fortunately cell, grid and ASM disk names can be changed and here is how to do it:

Stop the cluster and CRS on each compute node:

/u01/app/ stop cluster -all
/u01/app/ stop crs

Log in to each storage server and rename the cell, the cell disks and the grid disks. Use the following to build the alter commands:

You don’t need to shut down the cell services, but the grid disks must not be in use, i.e. make sure to stop the cluster first!

cellcli -e alter cell name=exa01cel01
for i in `cellcli -e list celldisk | awk '{print $1}'`; do echo "cellcli -e alter celldisk $i name=$i"; done | sed -e "s/ru02/exa01cel01/2"
for i in `cellcli -e list griddisk | awk '{print $1}'`; do echo "cellcli -e alter griddisk $i name=$i"; done | sed -e "s/ru02/exa01cel01/2"
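The sed expression above replaces only the second occurrence of the old cell name on each line, so the current disk name is left alone and only the name= value changes. Here is a minimal Python sketch of the same idea, assuming you have already collected the disk names from cellcli. The function name and the sample disk names are hypothetical, and it only builds the commands for review without running anything:

```python
def build_rename_commands(kind, names, old_cell, new_cell):
    """Return cellcli commands that rename disks ending in old_cell.

    kind is "celldisk" or "griddisk"; names are the current disk names.
    Nothing is executed here, the commands are only generated for review.
    """
    commands = []
    suffix = "_" + old_cell
    for name in names:
        if not name.endswith(suffix):
            continue  # disk does not carry the old cell name, leave it alone
        new_name = name[: -len(old_cell)] + new_cell
        commands.append(f"cellcli -e alter {kind} {name} name={new_name}")
    return commands


if __name__ == "__main__":
    # Hypothetical sample names following the naming seen in the exachk output.
    sample = ["DATA01_CD_00_ru02", "RECO01_CD_00_ru02"]
    for command in build_rename_commands("griddisk", sample, "ru02", "exa01cel01"):
        print(command)
```

Review the generated commands before pasting them into a cell, the same way you would with the output of the shell loops.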

If you get the following error, restart the cell services and try again:

GridDisk DATA01_CD_00_ru02 alter failed for reason: CELL-02548: Grid disk is in use.

Start the cluster on each compute node:

/u01/app/ start crs


We’ve got all cell and grid disks fixed; now we need to rename the ASM disks. To rename an ASM disk you need to mount the diskgroup in restricted mode, i.e. mounted on one node only with no one using it. If the diskgroup is not in restricted mode you’ll get:

ORA-31020: The operation is not allowed, Reason: disk group is NOT mounted in RESTRICTED state.


Stop the second compute node, the default dbm01 database and the MGMTDB database:

srvctl stop database -d dbm01
srvctl stop mgmtdb

Mount diskgroups in restricted mode:

If you are running and high redundancy DATA diskgroup, it is VERY likely that the voting disks are in the DATA diskgroup. Because of that, you won’t be able to dismount the diskgroup. The only way I found around that was to force stop ASM and start it manually in restricted mode:

srvctl stop asm -n exa01db01 -f

sqlplus / as sysasm

startup mount restricted

alter diskgroup all dismount;
alter diskgroup data01 mount restricted;
alter diskgroup reco01 mount restricted;
alter diskgroup dbfs_dg mount restricted;


Rename the ASM disks. Use the following to build the alter commands:

select 'alter diskgroup ' || || ' rename disk ''' || || ''' to ''' || REPLACE(,'RU02','exa01cel01')  || ''';' from v$asm_disk d, v$asm_diskgroup g where d.group_number=g.group_number and like '%RU02%';

select 'alter diskgroup ' || || ' rename disk ''' || || ''' to ''' || REPLACE(,'RU04','exa01cel02') || ''';' from v$asm_disk d, v$asm_diskgroup g where d.group_number=g.group_number and like '%RU04%';

select 'alter diskgroup ' || || ' rename disk ''' || || ''' to ''' || REPLACE(,'RU06','exa01cel03') || ''';' from v$asm_disk d, v$asm_diskgroup g where d.group_number=g.group_number and like '%RU06%';
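To make the intent of these generator queries easier to follow, here is a minimal Python sketch that produces the same kind of ALTER DISKGROUP ... RENAME DISK statements from a list of (diskgroup, disk) pairs. The mapping of rack units to cell names follows the naming used in this post (ru02 is the first cell, ru04 the second, ru06 the third). The function name and the sample disk names are hypothetical, and nothing is run against ASM:

```python
# Rack-unit based name -> cell name, following the order described in this post.
CELL_BY_RACK_UNIT = {
    "RU02": "exa01cel01",
    "RU04": "exa01cel02",
    "RU06": "exa01cel03",
}


def build_asm_renames(disks, mapping=CELL_BY_RACK_UNIT):
    """disks is an iterable of (diskgroup_name, disk_name) pairs.

    Returns one rename statement per disk whose name ends with a known
    rack-unit suffix; other disks (e.g. quorum disks) are skipped.
    """
    statements = []
    for diskgroup, disk in disks:
        for rack_unit, cell in mapping.items():
            if disk.endswith("_" + rack_unit):
                new_disk = disk[: -len(rack_unit)] + cell
                statements.append(
                    f"alter diskgroup {diskgroup} rename disk '{disk}' to '{new_disk}';"
                )
                break
    return statements


if __name__ == "__main__":
    # Hypothetical sample input.
    for statement in build_asm_renames([("DATA01", "DATA01_CD_00_RU02"),
                                        ("RECO01", "RECO01_CD_03_RU06")]):
        print(statement)
```

Keeping the mapping in one place makes it harder to point two rack units at the same cell by mistake.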


Finally stop and start CRS on both nodes.


Just when I thought everything was OK, I discovered one more reference to those pesky names: the fail group names, which again are based on the storage cell name. The following will make it clearer:

select group_number,failgroup,mode_status,count(*) from v$asm_disk where group_number > 0 group by group_number,failgroup,mode_status;

GROUP_NUMBER FAILGROUP                      MODE_ST   COUNT(*)
------------ ------------------------------ ------- ----------
           1 RU02                           ONLINE          12
           1 RU04                           ONLINE          12
           1 RU06                           ONLINE          12
           1 EXA01DB01                      ONLINE           1
           1 EXA01DB02                      ONLINE           1
           2 RU02                           ONLINE          10
           2 RU04                           ONLINE          10
           2 RU06                           ONLINE          10
           3 RU02                           ONLINE          12
           3 RU04                           ONLINE          12
           3 RU06                           ONLINE          12

For each diskgroup we’ve got three fail groups (three storage cells). The other two fail groups EXA01DB01 and EXA01DB02 are the quorum disks.

Unfortunately, you cannot rename failgroups in ASM. My immediate thought was to drop each failgroup and add it back with the intention that it will resolve the problem. Unfortunately, since this was a quarter rack I couldn’t do it, here’s an excerpt from the documentation:

If a disk group is configured as high redundancy, then you can do this procedure on a Half Rack or greater. You will not be able to do this procedure on a Quarter Rack or smaller with high redundancy disk groups because ASM will not allow you to drop a failure group such that only one copy of the data remains (you’ll get an ORA-15067 error).

The last option was to recreate the diskgroups. I’ve done this many times before when the compatible.rdbms parameter was set too high and I had to install some earlier version of 11.2. However, since Oracle decided to move the voting disks to DATA this became a bit harder. I couldn’t drop DBFS_DG because that’s where the MGMTDB was created, and I couldn’t drop DATA01 either because of the voting disks and some parameter files. I could have renamed RECO01 diskgroup but decided to keep it “consistently wrong” across all three diskgroups.

Fortunately, this behaviour might change with the January 2017 release of OEDA. The following bug fix suggests that DBFS_DG will always be configured as high redundancy and host the voting disks:

24329542: oeda should make dbfs_dg as high redundancy and locate ocr/vote into dbfs_dg

There is also a feature request to support failgroup rename but it’s not very popular, to be honest. Until we get this feature, exachk will report the following failure:

failgroup name (RU02) for grid disk DATA01_CD_00_exa01cel01 is not cell name
Naming convention not used. Cannot proceed further with
automating checks and repair for bug 12433293

I’ve deployed five Exadata X6-2 machines so far and had this issue on all of them.

This issue seems to be caused by a bug in OEDA. The storage cell names should have been changed as part of the “Create Cell Disks” step of onecommand. I keep the logs from some older deployments, where it’s very clear that each cell was renamed as part of this step:

Initializing cells...

EXEC## |cellcli -e alter cell name = exa01cel01||root|

I couldn’t find that command in the logs of the deployments I did. Obviously, the solution for now is to manually rename the cell before you run the “Create Cell Disks” step of onecommand.

Update 04.02.2017:

This problem has been logged by someone else a month earlier under the following bug:


Categories: oracle

Unable to perform initial elastic configuration on Exadata X6

January 12th, 2017

I had the pleasure to deploy another Exadata in the first week of 2017 and got my first issue this year.

As we know, starting with Exadata X5 Oracle introduced the concept of Elastic Configuration. Apart from allowing you to mix and match the number of compute nodes and storage cells, they have also changed how the IP addresses are assigned on the admin (eth0) interface. Prior to X5, Exadata had default IP addresses set at the factory; the range of IP addresses was to but since this could collide with the customer’s network they changed the way those IPs are assigned. In short – the IP address on eth0 on the compute nodes and storage cells is assigned within to range. The first time a node boots it will assign its hostname and IP address based on the IB ports it’s connected to.

Now to the real problem. I was doing the usual stuff – changing ILOMs, setting up the Cisco and IB switches – and was about to perform the initial elastic configuration ( so I had to upload all the files I needed for the deployment to the first compute node. I changed my laptop address to an IP within the same range and was surprised to get a connection timeout when I tried to ssh to the first compute node ( I thought this was an unfortunate coincidence, since I had rebooted the IB switches at almost the same time I powered on the compute nodes, but I was wrong. For some reason, none of the servers got their eth0 IP addresses assigned, hence they were not accessible.

I was puzzled about what was causing this issue and spent the afternoon troubleshooting it. I thought Oracle had changed the way they assign the IP addresses, but the scripts haven’t been changed for a long time. It didn’t take long before I found out what was causing it. Three lines in the /sbin/ifup script were the reason the eth0 interface wasn’t up with the 172.16.2.X IP address:

if ip link show ${DEVICE} | grep -q "UP"; then
    exit 0
fi

These lines check whether the interface is already UP and, if so, exit before the script proceeds to bring the interface up. The eth0 interface is in fact already brought UP by the elastic configuration script, which does so to check whether there is a link on the interface. Then, at the end of that script, when ifup is invoked to bring the interface up, it stops executing because the interface is already UP.

The solution is really simple – comment out the three lines (lines 73-75) in the /sbin/ifup script and reboot each node.

This wasn’t the first X6 I had deployed and I never had this problem before, so I did some further investigation. The /sbin/ifup script is part of the initscripts package. It turns out that the check for the interface being UP was introduced in one minor version of the package and then removed in the latest package. Unfortunately, the last entry in the Changelog is from Apr 12 2016 so that’s not very helpful, but here’s a summary:

initscripts-9.03.53-1.0.1.el6.x86_64.rpm        11-May-2016 19:49   947.9 K   <-- not affected
initscripts-9.03.53-1.0.1.el6_8.1.x86_64.rpm    12-Jul-2016 16:42   948.0 K   <-- affected
initscripts-9.03.53-1.0.2.el6_8.1.x86_64.rpm    13-Jul-2016 08:26   948.1 K   <-- affected
initscripts-9.03.53-1.0.3.el6_8.2.x86_64.rpm    23-Nov-2016 05:06   948.3 K   <-- latest version, not affected

I have had this problem on three Exadata machines so far. So, if you are deploying a new Exadata in the next few days or weeks, it’s very likely that you will be affected, unless your Exadata was factory deployed after 23rd Nov 2016. That’s the day when the latest initscripts package was released.
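If you want to check a node quickly, the listing above can be turned into a small lookup. This is a minimal Python sketch that only knows the four builds listed in this post, so any other build is reported as not affected simply because it is unknown. The function name is hypothetical, and the input format (the package name as printed by `rpm -q initscripts`, with or without the .rpm suffix) is an assumption:

```python
# Builds marked "affected" in the listing above; everything else is treated as not affected.
AFFECTED_BUILDS = {
    "initscripts-9.03.53-1.0.1.el6_8.1",
    "initscripts-9.03.53-1.0.2.el6_8.1",
}


def is_affected(package):
    """Return True if the given initscripts build carries the 'already UP' check.

    Accepts names such as 'initscripts-9.03.53-1.0.2.el6_8.1.x86_64' or the
    same name with a trailing '.rpm'.
    """
    for suffix in (".rpm", ".x86_64"):
        if package.endswith(suffix):
            package = package[: -len(suffix)]
    return package in AFFECTED_BUILDS


if __name__ == "__main__":
    print(is_affected("initscripts-9.03.53-1.0.2.el6_8.1.x86_64"))
```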

Update 24.01.2017:

This problem has been fixed in

Where the latest package has been added to the patch (initscripts-9.03.53-1.0.3.el6_8.2.x86_64.rpm)

Categories: oracle

Installing OEM13c management agent in silent mode

November 1st, 2016

For many customers I work with, SSH won’t be available between the OEM and the monitored hosts. Therefore you cannot push the management agent on to the host from the OEM console. The customer might have to raise a CR to allow SSH between the hosts, but this might take a while and it’s really unnecessary.

In that case the management agent has to be installed in silent mode. That is when the agent isn’t pushed from OEM to the host but pulled by the host and installed. There are three ways to do that – using the AgentPull script, the agentDeploy script or an RPM file.

When using the AgentPull script, you download a script from the OMS and then run it. It will download the agent package and install it on the local host. Using the agentDeploy script is very similar, but to obtain it you use EMCLI. The third method, using an RPM file, is similar – using EMCLI you download an RPM file and install it on the system. These methods require HTTP/HTTPS access to the OMS; agentDeploy and the RPM file also require EMCLI to be installed. For that reason I always use the AgentPull method since it’s quicker and really straightforward. Another benefit of the AgentPull method is that if you don’t have HTTP/HTTPS access to the OEM you can simply copy and paste the script.

Download the script from the OEM first, using curl or wget. The monitored hosts usually don’t have direct HTTP access, but many of them can get it through an HTTP proxy. Download the script and make it executable:

curl "" --insecure -o
chmod +x

If a proxy server is not available then you can simply copy and paste the script, the location of the script on the OEM server is:


Make sure you edit the file and change the oms host and oms port parameters.

Check which platforms are available:

[oracle@exa01db01 ~]$ ./ -showPlatforms

Platforms    Version
Linux x86-64

Run the script by supplying the sysman password, the platform you want to install, the agent registration password and the agent base directory:

[oracle@exa01db01 ~]$ ./ LOGIN_USER=sysman LOGIN_PASSWORD=welcome1 \

It takes less than two minutes to install the agent and then at the end you’ll see the following messages:

Agent Configuration completed successfully
The following configuration scripts need to be executed as the "root" user. Root script to run : /u01/app/oracle/agent13c/agent_13.
Waiting for agent targets to get promoted...
Successfully Promoted agent and its related targets to Management Agent

Now log in as root and run the file.

To be honest, I use this method regardless of the circumstances, it’s so much easier and faster.


Categories: oracle

Exadata memory configuration

October 28th, 2016

Read this post if your Exadata compute nodes have 512/768GB of RAM or you plan to upgrade to the same.

There has been a lot of information about hugepages and I won’t go into too much detail. For efficiency, the (x86) CPU allocates RAM in chunks (pages) of 4K bytes and those pages can be swapped to disk. For example, if your SGA allocates 32GB this will take 8388608 pages, and given that a Page Table Entry consumes 8 bytes, that’s 64MB to look up. Hugepages, on the other hand, are 2M. Pages that are used as huge pages are reserved inside the kernel and cannot be used for other purposes. Huge pages cannot be swapped out under memory pressure, there is obviously decreased page table overhead, and page lookups are not required since the pages are not subject to replacement. The bottom line is that you need to use them, especially with the amount of RAM we get nowadays.
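The arithmetic behind those numbers can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes 8 bytes per page table entry for both page sizes and ignores the multi-level layout of real page tables and the fact that every process attached to the SGA has its own mappings. The function name is hypothetical:

```python
GIB = 1024 ** 3


def page_table_bytes(memory_bytes, page_bytes, entry_bytes=8):
    """Return (number of pages, bytes of page table entries) for one mapping."""
    pages = memory_bytes // page_bytes
    return pages, pages * entry_bytes


if __name__ == "__main__":
    # 32GB SGA mapped with 4K pages versus 2M hugepages.
    print(page_table_bytes(32 * GIB, 4 * 1024))
    print(page_table_bytes(32 * GIB, 2 * 1024 * 1024))
```

With 4K pages this gives 8388608 pages and 64MB of entries; with 2M pages the same SGA needs 16384 pages and 128KB of entries.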

For every new Exadata deployment I usually set the amount of hugepages to 60% of the physical RAM:
256GB RAM = 150 GB (75k pages)
512GB RAM = 300 GB (150k pages)
768GB RAM = 460 GB (230k pages)

This allows databases to allocate their SGA from the hugepages. If you want to allocate the exact number of hugepages that you need, Oracle has a script which will walk through all instances and give you the number of hugepages you need to set on the system; you can find the Doc ID in the references below.

This also brings up an important point – to make sure your databases don’t allocate from both 4K and 2M pages, make sure the parameter use_large_pages is set to ONLY for all databases. Starting with (I think) you’ll find hugepages information in the alert log when the database starts:
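For reference, here is how the 60% rule of thumb translates into a number of 2M pages. This is a minimal sketch with hypothetical function names; it computes an exact 60%, whereas the figures in the list above are rounded down to convenient sizes (150, 300 and 460 GB):

```python
def hugepages_for_ram(ram_gb, percent=60, hugepage_mb=2):
    """Number of hugepages covering `percent` of physical RAM (integer maths)."""
    return ram_gb * 1024 * percent // (100 * hugepage_mb)


def hugepages_for_size(size_gb, hugepage_mb=2):
    """Number of hugepages needed to cover size_gb of memory."""
    return size_gb * 1024 // hugepage_mb


if __name__ == "__main__":
    for ram_gb in (256, 512, 768):
        print(ram_gb, hugepages_for_ram(ram_gb))
    # The rounded sizes used in this post.
    for size_gb in (150, 300, 460):
        print(size_gb, hugepages_for_size(size_gb))
```

Either number is what would go into the vm.nr_hugepages kernel parameter.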

************************ Large Pages Information *******************
Per process system memlock (soft) limit = 681 GB

Total Shared Global Region in Large Pages = 2050 MB (100%)

Large Pages used by this instance: 1025 (2050 MB)
Large Pages unused system wide = 202863 (396 GB)
Large Pages configured system wide = 230000 (449 GB)
Large Page size = 2048 KB


Now there is one more parameter you need to change if you deploy or upgrade Exadata with 512/768GB of RAM. That is the total amount of shared memory, in pages, that the system can use at one time, or kernel.shmall. On Exadata, this parameter is set to 214G by default, which is enough if your compute nodes have only 256GB of RAM. As long as the sum of all database SGAs is less than 214GB that’s OK, but the moment you try to start another database that takes the total over that limit you’ll get the following error:

Linux-x86_64 Error: 28: No space left on device

For that reason, if you deploy or upgrade Exadata with 512G/768GB of physical RAM make sure you update kernel.shmall too!

Some Oracle docs suggest this parameter should be set to half of the physical memory, others suggest it should be set to all available memory. Here’s how to calculate it:

kernel.shmall = physical RAM size / pagesize

To get the pagesize run getconf PAGE_SIZE on the command prompt. You need to set shmall to at least match the size of the hugepages – because that’s where we’d allocate SGA memory from. So if you run Exadata with 768G of RAM and have 460 GB of hugepages you’ll set shmall to 120586240 (460GB / 4K pagesize).
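The same calculation as a minimal Python sketch. The function name is hypothetical; the page size defaults to 4K and, on Linux, can be read with os.sysconf, which returns the same value as getconf PAGE_SIZE:

```python
import os


def shmall_pages(shared_memory_gb, page_size_bytes=4096):
    """kernel.shmall value (in pages) needed to allow shared_memory_gb of shared memory."""
    return shared_memory_gb * 1024 ** 3 // page_size_bytes


if __name__ == "__main__":
    page_size = os.sysconf("SC_PAGE_SIZE")  # same value as `getconf PAGE_SIZE`
    # 460 GB of hugepages on a 768GB compute node, as in the example above.
    print(shmall_pages(460, page_size))
```

With a 4K page size this returns 120586240 for 460GB, matching the figure above.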

Using HUGEPAGES does not alter the calculation for configuring shmall!

HugePages on Linux: What It Is… and What It Is Not… (Doc ID 361323.1)
Upon startup of Linux database get ORA-27102: out of memory Linux-X86_64 Error: 28: No space left on device (Doc ID 301830.1)

Categories: oracle

Speaking at BGOUG and UKOUG

October 17th, 2016

It’s my pleasure to be speaking at BGOUG and UKOUG again this year.

This coming Wednesday, 19th Oct, I’ll be speaking at the UKOUG Systems SIG event here in London (agenda). I’ll talk about the Exadata implementations I did last year and the issues I encountered, as well as things you need to keep in mind when you plan to extend the system or attach it to a ZFS Storage Appliance or Exalytics.

Next is my all-time favorite user group conference, BGOUG. It’s held in Pravetz, Bulgaria between 11-13 November. With an excellent line-up of speakers, it is one not to miss (agenda). I’ll be speaking on Saturday at 10:00 about Protecting single instance databases with Oracle Clusterware 12c, for cases where you don’t have RAC, RAC One Node or 3rd party cluster licenses but still need high availability for your database. I’ll go through the cluster basics, the difference between single instance, RAC and RAC One Node and then more technical details around the implementation of a single instance failover cluster.

Finally, it’s the UKOUG Tech 16 with its massive 14 streams of sessions between 5-7 December and speakers from around the world (agenda). I’ll be speaking on Tuesday at 11:35 about Exadata extension use cases. I’ll talk about the Exadata extensions I did and what to keep in mind if you plan one, in particular the extension of an eighth rack to a quarter rack, the expansion of Exadata with more compute nodes or storage cells and the extension of an X3-8 two-rack configuration with another X4-8 rack.

I’d like to thank my company (Red Stack) for the support and BGOUG and UKOUG committees for accepting my sessions.

See you there!


Categories: oracle

OTN Appreciation Day: Oracle Data Guard Fast-Start Failover

October 11th, 2016

Thank you, Tim, for the great idea.

There are so many cool database features that one could spend weeks blogging about them.

A feature which I like very much is Oracle DataGuard Fast-Start Failover, FSFO for short.

Oracle DataGuard Fast-Start Failover was one of the many new features introduced in Oracle Database 10.2. It’s an addition to the already available DataGuard option to maintain standby databases. DataGuard FSFO is a feature that automatically, quickly, and reliably fails over to a designated, synchronized standby database in the event of loss of the production database, without requiring manual intervention.

In FSFO configuration there are three participants – primary database, standby database and an observer and they follow a very simple rule – whichever two can communicate with each other will determine the outcome of fast-start failover. The observer usually runs on a third machine, requires only Oracle client and will continuously monitor the primary and standby databases for possible failure conditions.
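The “whichever two can communicate” rule can be written down as a tiny decision function. This is a deliberately simplified Python sketch, not the broker’s actual logic: it ignores the failover threshold, the lag limit, the protection mode and whether the standby is synchronized, and the function name and return values are made up for illustration:

```python
def fsfo_outcome(primary_sees_standby, primary_sees_observer, observer_sees_standby):
    """Decide the outcome from which pairs of participants can still communicate."""
    if primary_sees_standby or primary_sees_observer:
        # The primary still has a partner, so it keeps its role.
        return "primary_continues"
    if observer_sees_standby:
        # The primary is isolated while observer and standby agree: fail over.
        return "failover_to_standby"
    # Nobody can talk to anybody: no pair can decide, so no failover happens.
    return "primary_stalls_no_failover"


if __name__ == "__main__":
    # Primary site lost, observer and standby still connected.
    print(fsfo_outcome(False, False, True))
```

The point of the rule is that a decision always needs two participants, which is what prevents an isolated primary and a failed-over standby from both acting as primary.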

FSFO solves the problem we used to have with clusters before – a “split brain” scenario where after a failure of the connection between the cluster nodes we end up having two primary databases. FSFO also gives you the option to establish an acceptable time limit (in seconds) that the designated standby is allowed to fall behind the primary database (in terms of redo applied), beyond which time a fast-start failover will not be allowed.

Oracle DataGuard Fast-Start Failover can be used only in a broker configuration in either maximum availability mode or maximum performance mode.

I don’t have a post on FSFO (yet) but here are the links to the documentation:

Oracle Database 12.1 Data Guard Concepts and Administration

Oracle Database 12.1 Data Guard Broker

Oracle Database 12.1 Fast-Start Failover

Categories: oracle

How to enable Exadata Write-Back Flash Cache

October 10th, 2016

Yes, this is well-known and the process has been described in Exadata Write-Back Flash Cache – FAQ (Doc ID 1500257.1), but what the note fails to make clear is that you do NOT have to restart cell services anymore, and hence you do NOT have to resync the griddisks!

I had to enable WBFC many times before and every time I’d restart the cell services, as the note suggests. Well, this is not required anymore: starting with it is no longer necessary to shut down the cellsrv service on the cells when changing the flash cache mode. This is not a big deal if you are deploying the Exadata just now, but it makes enabling/disabling WBFC for existing systems quicker and much easier.

The best way to do that is to use the script that Oracle has provided. It will do all the work for you: pre-checks and changing the mode, either rolling or non-rolling.

Here are the checks it does for you:

  • Storage cells are valid storage nodes running at least or later across all cells.
  • Griddisks status is ONLINE across all cells.
  • No ASM rebalance operations are running.
  • Flash cache state across all cells is “NORMAL”.

Enable Write-Back Flash Cache using a ROLLING method

Before you enable WBFC run a precheck to make sure the cells are ready and there are no faults.

./ -g cell_group -m WriteBack -o rolling -p

At the end of the script, which takes less than two minutes to run, you’ll see a message telling you whether the storage cells passed the prechecks:

All pre-req checks completed:                    [PASSED]
2016-10-10 10:53:03
exa01cel01: flashcache size: 5.82122802734375T
exa01cel02: flashcache size: 5.82122802734375T
exa01cel03: flashcache size: 5.82122802734375T

There are 3 storage cells to process.

Then, once you are ready you run the script to enable the WBFC:

./ -g cell_group -m WriteBack -o rolling

The script will go through the following steps on each cell, one cell at a time:

1. Recheck griddisks status to make sure none are OFFLINE
2. Drop flashcache
3. Change WBFC flashcachemode to WriteBack
4. Re-create the flashcache
5. Verify flashcachemode is in the correct state

On a Quarter Rack it took around four minutes to enable WBFC and you’ll see this message at the end:

2016-10-10 11:23:24
Setting flash cache to WriteBack completed successfully.

Disable Write-Back Flash Cache using a ROLLING method

Disabling WBFC is not something you do every day but sooner or later you might have to do it. I had to do it once for a customer who wanted to go back to WriteThrough because Oracle ACS said this was the default?!

The steps to disable WBFC are the same as enabling it except that we need to flush all the dirty blocks off the flashcache before we drop it.

Again, run the precheck script to make sure everything looks good:

./ -g cell_group -m WriteThrough -o rolling -p

If everything looks good then run the script:

./ -g cell_group -m WriteThrough -o rolling

The script will first FLUSH flashcache across all cells in parallel and wait until the flush is complete!

You can monitor the flush process using the following commands:

dcli -l root -g cell_group cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD
dcli -l root -g cell_group cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' "

The script will then go through the following steps on each cell, one cell at a time:

1. Recheck griddisks status to make sure none are OFFLINE
2. Drop flashcache
3. Change WBFC flashcachemode to WriteThrough
4. Re-create the flashcache
5. Verify flashcachemode is in the correct state

The time it takes to flush the cache depends on how many dirty blocks you’ve got in the flashcache and on the machine workload. I did two eighth racks and, unfortunately, I didn’t check the number of dirty blocks, but it took 75mins on the first one and 4hrs on the second.

Categories: oracle

Extending an Exadata Eighth Rack to a Quarter Rack

October 3rd, 2016

In the past year I’ve done a lot of Exadata deployments and probably half of them were eighth racks. It’s one of those temporary things – let’s do it now but we’ll change it later. It’s the same with the upgrades – I’ve never seen anyone doing an upgrade from an eighth rack to a quarter. However, a month ago one of our customers asked me to upgrade their three X5-2 HC 4TB units from an eighth to a quarter rack configuration.

What’s the difference between an eighth rack and a quarter rack?

X5-2 Eighth Rack and X5-2 Quarter rack have the same hardware and look exactly the same. The only difference is that only half of the compute power and storage space on an eighth rack is usable. In an eighth rack the compute nodes have half of their CPUs activated – 18 cores per server. It’s the same for the storage cells – 16 cores per cell, six hard disks and two flash cards are active.

While this is true for X3, X4 and X5 things have slightly changed for X6. Up until now, eighth rack configurations had all the hard disks and flash cards installed but only half of them were usable. The new Exadata X6-2 Eighth Rack High Capacity configuration has half of the hard disks and flash cards removed. To extend X6-2 HC to a quarter rack you need to add high capacity disks and flash cards to the system. This is only required for High Capacity configurations because X6-2 Eighth Rack Extreme Flash storage servers have all flash drives enabled.

What are the main steps of the upgrade:

  • Activate Database Server Cores
  • Activate Storage Server Cores and disks
  • Create eight new cell disks per cell – six hard disks and two flash disks
  • Create all grid disks (DATA01, RECO01, DBFS_DG) and add them to the disk groups
  • Expand the flashcache onto the new flash disks
  • Recreate the flashlog on all flash cards

Here are a few things you need to keep in mind before you start:

  • The compute node upgrade requires a reboot for the changes to take effect.
  • The storage cell upgrade does NOT require a reboot and is an online operation.
  • The upgrade work is low risk – your data is secure and redundant at all times.
  • This post is about X5 upgrade. If you were to upgrade X6 then before you begin you need to install the six 8 TB disks in HDD slots 6 – 11 and install the two F320 flash cards in PCIe slots 1 and 4.

Upgrade of the compute nodes

Well, this is really straightforward and you can do it at any time. Remember that you need to restart the server for the change to take effect:

dbmcli -e alter dbserver pendingCoreCount=36 force
DBServer exa01db01 successfully altered. Please reboot the system to make the new pendingCoreCount effective.

Reboot the server to activate the new cores. It will take around 10 minutes for the server to come back online.

Check the number of cores after server comes back:

dbmcli -e list dbserver attributes coreCount
cpuCount:               36/36


Make sure you’ve got the right number of cores. These systems allow capacity on demand (CoD), and in my case the customer wanted me to activate only 28 cores per server.
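
The check above can be scripted so that a mismatch is caught right after the reboot. This is only a minimal sketch: check_cores is a hypothetical helper (not part of dbmcli), and it assumes you pass it the bare "active/total" value as printed in the output above.

```shell
#!/bin/bash
# Hypothetical helper: compares the active core count with the number
# of cores you intended to enable (relevant with capacity on demand).
check_cores() {
    reported="$1"              # "active/total" value, e.g. "28/36"
    expected="$2"              # number of cores you meant to activate
    active="${reported%%/*}"   # drop "/total", keep the active count
    if [ "$active" -eq "$expected" ]; then
        echo "OK: $active cores active"
    else
        echo "MISMATCH: $active cores active, expected $expected"
        return 1
    fi
}

# Example with the value shown above. On a real server the first argument
# would be taken from: dbmcli -e list dbserver attributes coreCount
check_cores "36/36" 36
```

With capacity on demand you would pass the licensed number instead, for example check_cores "28/36" 28.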

Upgrade of the storage cells

Like I said earlier, the upgrade of the storage cells does NOT require a reboot and can be done online at any time.

The following needs to be done on each cell. You can, of course, use dcli but I wanted to do that cell by cell and make sure each operation finishes successfully.

1. First, upgrade the configuration from an eighth to a quarter rack:

[root@exa01cel01 ~]# cellcli -e list cell attributes cpuCount,eighthRack
cpuCount:               16/32
eighthRack:             TRUE

[root@exa01cel01 ~]# cellcli -e alter cell eighthRack=FALSE
Cell exa01cel01 successfully altered

[root@exa01cel01 ~]# cellcli -e list cell attributes cpuCount,eighthRack
cpuCount:               32/32
eighthRack:             FALSE


2. Create cell disks on top of the newly activated physical disks

Like I said – this is an online operation and you can do it at any time:

[root@exa01cel01 ~]# cellcli -e create celldisk all
CellDisk CD_06_exa01cel01 successfully created
CellDisk CD_07_exa01cel01 successfully created
CellDisk CD_08_exa01cel01 successfully created
CellDisk CD_09_exa01cel01 successfully created
CellDisk CD_10_exa01cel01 successfully created
CellDisk CD_11_exa01cel01 successfully created
CellDisk FD_02_exa01cel01 successfully created
CellDisk FD_03_exa01cel01 successfully created


3. Expand the flashcache on to the new flash cards

This is again an online operation and it can be run at any time:

[root@exa01cel01 ~]# cellcli -e alter flashcache all
Flash cache exa01cel01_FLASHCACHE altered successfully


4. Recreate the flashlog

The flashlog is always 512 MB in size, but to make use of the new flash cards it has to be recreated. Use the DROP FLASHLOG command to drop the flash log, and then use the CREATE FLASHLOG command to create it again. The DROP FLASHLOG command can be run at runtime, but it does not complete until all redo data on the flash disk is written to hard disk.

Here is an important note from Oracle:

If FORCE is not specified, then the DROP FLASHLOG command fails if there is any saved redo. If FORCE is specified, then all saved redo is purged, and Oracle Exadata Smart Flash Log is removed.

[root@exa01cel01 ~]# cellcli -e drop flashlog
Flash log exa01cel01_FLASHLOG successfully dropped
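
The drop has to be followed by a create, as described in the text. Here is a minimal sketch of the whole sequence for one cell. The run wrapper and the RUN variable are my own hypothetical convention: by default the script only prints the commands, and it executes them only when RUN=1 is set on the cell.

```shell
#!/bin/bash
# Dry-run wrapper (hypothetical convention): print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Drop the flash log, create it again across all flash cell disks,
# then list it to confirm the result. Stops at the first failure.
recreate_flashlog() {
    run cellcli -e drop flashlog || return 1
    run cellcli -e create flashlog all || return 1
    run cellcli -e list flashlog attributes name,size,status
}

recreate_flashlog
```

FORCE is deliberately left out of the drop, so the command fails rather than purging any saved redo.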


5. Create grid disks

The best way to do that is to query the size of the current grid disks and use it to create the new ones. Use the following queries to obtain the size for each grid disk. We query disk 02 because the first two disks (00 and 01) do not have DBFS_DG grid disks on them.

[root@exa01db01 ~]# dcli -g cell_group -l root cellcli -e "list griddisk attributes name, size where name like \'DATA.*02.*\'"
exa01cel01: DATA01_CD_02_exa01cel01        2.8837890625T
[root@exa01cel01 ~]# dcli -g cell_group -l root cellcli -e "list griddisk attributes name, size where name like \'RECO.*02.*\'"
exa01cel01: RECO01_CD_02_exa01cel01        738.4375G
[root@exa01cel01 ~]# dcli -g cell_group -l root cellcli -e "list griddisk attributes name, size where name like \'DBFS_DG.*02.*\'"
exa01cel01: DBFS_DG_CD_02_exa01cel01       33.796875G

Then you can either generate the commands and run them on each cell or use dcli to create them on all three cells:

dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_06_\`hostname -s\` celldisk=CD_06_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_07_\`hostname -s\` celldisk=CD_07_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_08_\`hostname -s\` celldisk=CD_08_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_09_\`hostname -s\` celldisk=CD_09_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_10_\`hostname -s\` celldisk=CD_10_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_11_\`hostname -s\` celldisk=CD_11_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_06_\`hostname -s\` celldisk=CD_06_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_07_\`hostname -s\` celldisk=CD_07_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_08_\`hostname -s\` celldisk=CD_08_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_09_\`hostname -s\` celldisk=CD_09_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_10_\`hostname -s\` celldisk=CD_10_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_11_\`hostname -s\` celldisk=CD_11_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_06_\`hostname -s\` celldisk=CD_06_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_07_\`hostname -s\` celldisk=CD_07_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_08_\`hostname -s\` celldisk=CD_08_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_09_\`hostname -s\` celldisk=CD_09_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_10_\`hostname -s\` celldisk=CD_10_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_11_\`hostname -s\` celldisk=CD_11_\`hostname -s\`,size=33.796875G"
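
If you prefer to generate the commands and run them cell by cell, the 18 lines above can be produced with a small loop. This is a sketch only: gen_griddisk_cmds is a hypothetical helper, and the prefixes and sizes are copied from the commands and query output above, so adjust them to what your own cells report.

```shell
#!/bin/bash
# Prints the "create griddisk" commands for cell disks 06-11 on one cell.
# Nothing is executed; review the output before running it on the cell.
gen_griddisk_cmds() {
    cell="$1"   # short hostname of the cell, e.g. exa01cel01
    for spec in DATA:2.8837890625T RECO:738.4375G DBFS_DG:33.796875G; do
        prefix="${spec%%:*}"   # grid disk name prefix
        size="${spec##*:}"     # size taken from the existing grid disks
        for n in 06 07 08 09 10 11; do
            echo "cellcli -e create griddisk ${prefix}_CD_${n}_${cell} celldisk=CD_${n}_${cell},size=${size}"
        done
    done
}

gen_griddisk_cmds exa01cel01
```

The output can be saved to a file per cell and executed there once it has been checked.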

6. The final step is to add newly created grid disks to ASM

Connect to the ASM instance using sqlplus as sysasm and disable the appliance mode:

SQL> ALTER DISKGROUP DATA01 set attribute 'appliance.mode'='FALSE';
SQL> ALTER DISKGROUP RECO01 set attribute 'appliance.mode'='FALSE';
SQL> ALTER DISKGROUP DBFS_DG set attribute 'appliance.mode'='FALSE';

Add the disks to the disk groups, you can either queue them on one instance or run them on both ASM instances in parallel:
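
A minimal sketch of what the ADD DISK statements could look like, generated from the shell so they can be reviewed first. Several things here are assumptions: the gen_add_disk_sql helper is hypothetical, the discovery strings assume the grid disk names created in step 5 (DATA_CD_06 to DATA_CD_11 and so on), and the rebalance power is only a placeholder you should pick for your own system.

```shell
#!/bin/bash
# Prints one ALTER DISKGROUP ... ADD DISK statement per disk group.
# The patterns match only the new grid disks 06-11 on every cell.
gen_add_disk_sql() {
    power="${1:-4}"   # rebalance power, placeholder default (assumption)
    while read -r dg prefix; do
        echo "ALTER DISKGROUP ${dg} ADD DISK 'o/*/${prefix}_CD_0[6-9]*','o/*/${prefix}_CD_1[0-1]*' REBALANCE POWER ${power};"
    done <<EOF
DATA01 DATA
RECO01 RECO
DBFS_DG DBFS_DG
EOF
}

gen_add_disk_sql 32
```

The printed statements are meant to be pasted into sqlplus connected as sysasm, either all on one instance or split across the two.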


Monitor the rebalance using select * from gv$asm_operation and, once it is done, change the appliance mode back to TRUE:

SQL> ALTER DISKGROUP DATA01 set attribute 'appliance.mode'='TRUE';
SQL> ALTER DISKGROUP RECO01 set attribute 'appliance.mode'='TRUE';
SQL> ALTER DISKGROUP DBFS_DG set attribute 'appliance.mode'='TRUE';

And at this point, you are done with the upgrade. I strongly recommend running the latest exachk report to make sure there are no issues with the configuration.

A problem you might encounter is that the flash is not fully utilized. In my case I had 128MB free on each card:

[root@exa01db01 ~]# dcli -g cell_group -l root "cellcli -e list celldisk attributes name,freespace where disktype='flashdisk'"
exa01cel01: FD_00_exa01cel01         128M
exa01cel01: FD_01_exa01cel01         128M
exa01cel01: FD_02_exa01cel01         128M
exa01cel01: FD_03_exa01cel01         128M
exa01cel02: FD_00_exa01cel02         128M
exa01cel02: FD_01_exa01cel02         128M
exa01cel02: FD_02_exa01cel02         128M
exa01cel02: FD_03_exa01cel02         128M
exa01cel03: FD_00_exa01cel03         128M
exa01cel03: FD_01_exa01cel03         128M
exa01cel03: FD_02_exa01cel03         128M
exa01cel03: FD_03_exa01cel03         128M

This seems to be a known bug and to fix it you need to recreate both flashcache and flashlog.
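
To spot the problem quickly across all cells, the dcli output can be filtered for flash cell disks that still report free space. A minimal sketch, assuming the three-column output format shown above and that a fully used disk reports 0; flash_with_freespace is a hypothetical helper name.

```shell
#!/bin/bash
# Reads lines such as "exa01cel01: FD_00_exa01cel01   128M" on stdin
# and prints only the flash cell disks whose free space is not 0.
flash_with_freespace() {
    awk '$3 != "" && $3 != "0" { print $1, $2, $3 }'
}

# Example with sample lines; on a real system pipe in the output of:
# dcli -g cell_group -l root "cellcli -e list celldisk attributes name,freespace where disktype='flashdisk'"
printf '%s\n' \
    "exa01cel01: FD_00_exa01cel01         128M" \
    "exa01cel02: FD_00_exa01cel02         0" | flash_with_freespace
```

An empty result means every flash cell disk is fully allocated.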

Extending an Eighth Rack to a Quarter Rack in Oracle Exadata Database Machine X4-2 and Later
Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)
Exachk fails due to incorrect flashcache size after upgrading from 1/8 to a 1/4 rack (Doc ID 2048491.1)
