
Database 11.2 bug causes huge number of alert log entries

A few days ago I received a call from a customer about a problem with their EM console and messages about a full file system. They were running DB 11.2.0.2 on OEL 5.7; the file system held only the binaries installation, and the database itself was using ASM. I quickly logged on and confirmed the file system was indeed full, and after looking around I figured out that all the free space had been eaten by the alert and trace diagnostic directories. The trace directory was full of 10MB files and the alert log was growing quickly with the following messages:
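In a situation like this, a quick way to see which diagnostic directories are eating the space is to summarise them with du. A minimal sketch, assuming the default ADR layout (the ADR_BASE path here is an assumption taken from the trace paths below; adjust it for your install):

```shell
#!/bin/bash
# Hedged sketch: report the largest diagnostic subdirectories under the ADR base.
# ADR_BASE is an assumption -- point it at your own diag directory.
ADR_BASE=${ADR_BASE:-/oracle/app/oracle/diag}

# Size (in KB) of every trace/ and alert/ directory, largest first
du -sk "$ADR_BASE"/rdbms/*/*/trace "$ADR_BASE"/rdbms/*/*/alert 2>/dev/null \
  | sort -rn | head
```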

 WARNING: failed to read mirror side 1 of virtual extent 2917 logical extent 0 of file 271 in group [1.2242406296] from disk DATA_0000 allocation unit 24394 reason error; if possible,will try another mirror side
Errors in file /oracle/app/oracle/diag/rdbms/baandb/baandb/trace/baandb_ora_17785.trc:
WARNING: Read Failed. group:1 disk:0 AU:24394 offset:1007616 size:8192
WARNING: failed to read mirror side 1 of virtual extent 2917 logical extent 0 of file 271 in group [1.2242406296] from disk DATA_0000 allocation unit 24394 reason error; if possible,will try another mirror side
Errors in file /oracle/app/oracle/diag/rdbms/baandb/baandb/trace/baandb_ora_17785.trc: 

At first I thought there was a storage problem, but the ASM views showed everything was fine, so these appeared to be false messages. I deleted all the trace files, but a few minutes later the file system was full again. It turned out that more than 60MB of logs were being generated per minute, around 7GB in two hours, and this huge volume of messages had already put a noticeable load on the machine.
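Until a permanent fix is in place, the trace files have to be cleaned up repeatedly. One way to sketch that with find (TRACE_DIR here is an assumption modelled on the trace paths above; on a live system Oracle's supported route is adrci, e.g. `adrci exec="purge -age 1440 -type trace"`):

```shell
#!/bin/bash
# Hedged sketch: delete *.trc files older than one day (1440 minutes).
# TRACE_DIR is an assumption -- point it at your own trace directory.
TRACE_DIR=${TRACE_DIR:-/oracle/app/oracle/diag/rdbms/baandb/baandb/trace}

# -print shows what is being removed; drop it for silent cleanup
find "$TRACE_DIR" -name '*.trc' -mmin +1440 -print -delete
```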

Then, after a quick MOS search, I found that this is Bug 10422126: FAILED TO READ MIRROR SIDE 1, and that there is a 70KB patch for 11.2.0.2.

The following MOS notes are also useful:
WARNING: 'Failed To Read Mirror Side 1' continuously reported in the alert log [ID 1289905.1]
Huge number of alert log entries: 'WARNING: IO Failed…' 'WARNING: failed to read mirror side 1 of virtual extent …' [ID 1274852.1]

After applying the patch everything returned to normal and no more false messages appeared in the logs. The bug is fixed in 11.2.0.3.
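A simple way to confirm the flood has really stopped is to sample how fast the alert log is growing before and after the patch. A hedged sketch (the ALERT_LOG path and the 60-second window are assumptions based on the trace paths above):

```shell
#!/bin/bash
# Hedged sketch: measure alert-log growth over a short sampling window.
# ALERT_LOG is an assumption -- use your own alert_<SID>.log path.
ALERT_LOG=${ALERT_LOG:-/oracle/app/oracle/diag/rdbms/baandb/baandb/trace/alert_baandb.log}
WINDOW=${WINDOW:-60}   # seconds to sample

before=$(stat -c %s "$ALERT_LOG")
sleep "$WINDOW"
after=$(stat -c %s "$ALERT_LOG")
echo "alert log grew by $(( (after - before) / 1024 )) KB in ${WINDOW}s"
```

On the broken system above this would report tens of MB per minute; after the patch it should report close to zero.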

Regards,
Sve

  1. Vasiliy
    December 27th, 2011 at 07:22 | #1

    Yesterday we got the same problem. The disk became full and the first node became unavailable. Hope this note helps, thank you!

  2. Hapless Sysadmin
    February 10th, 2012 at 03:14 | #2

    Yeah, we encountered this one as well. When will Oracle ever begin to focus on product quality? This system ran fine before upgrading to 11.2.

  3. Svetoslav Gyurov
    February 10th, 2012 at 10:11 | #3

    Yep, it’s even worse, because the behavior changes from patchset to patchset. We’d been running for six months and then suddenly, out of nowhere, we hit this bug.

    Regards,
    Sve

  4. MB
    June 8th, 2012 at 23:03 | #4

    This same thing happened to me, on a cluster that had been live for over a month. Just suddenly “bam”, and there it went.

  5. Aleks
    November 8th, 2012 at 10:53 | #5

    Yesterday we had the same problem after seven months of the system working fine.

    We appreciate your post here a lot. We ran into an additional problem; the solution is below.

    We used the Metalink document
    WARNING: 'Failed To Read Mirror Side 1' continuously reported in the alert log [ID 1289905.1]
    Following this document we downloaded the patch p10422126_112020_HPUX-IA64. While implementing the patch at section (2) Installation:
    1. when executing point 4, the owner of the GI home has to be grid;
    2. when executing point 5, bullet 2 (GI Home on a standalone server),
    ./roothas.pl -patch
    failed with the error: Undefined subroutine crspatch.pm line 86.
    Here we used the second workaround of the Metalink document
    roothas.pl -patch or rootcrs.pl -patch Fails with 'Undefined subroutine' [ID 1268390.1]; the first workaround did not help in our case.

    After this the services were restarted, and we started the database successfully.
    Regards, and thanks again for posting the solution.
