Exadata

Some exadata disk tips

July 29, 2013 Exadata, oracle

Here is a summary of the I/O errors that our Exadata has been reporting frequently of late.

data node alert

ORA-27603: Cell storage I/O error, I/O failed on disk o/192.168.10.5/DATA_DM01_CD_08_dm01cel03 at offset 17331625984 for data length 253952
ORA-27626: Exadata error: 201 (Generic I/O error)
WARNING: Read Failed. group:1 disk:32 AU:4132 offset:761856 size:253952
path:o/192.168.10.5/DATA_DM01_CD_08_dm01cel03
         incarnation:0x802360d9 asynchronous result:'I/O error'
         subsys:OSS iop:0x2b8c42c03640 bufp:0x2b8c42fc4e00 osderr:0xc9 osderr1:0x0
         Exadata error:'Generic I/O error'
         IO elapsed time: 18021514 usec Time waited on I/O: 18013517 usec
WARNING: failed to read mirror side 1 of virtual extent 2039 logical extent 0 of file 274 in group [1.540250240] from disk DATA_DM01_CD_08_DM01CEL03  allocation unit 4132 reason error; if possible, will try another mirror side
NOTE: successfully read mirror side 2 of virtual extent 2039 logical extent 1 of file 274 in group [1.540250240] from disk DATA_DM01_CD_05_DM01CEL02 allocation unit 4133

ASM alert

Wed Jun 19 08:45:30 2013
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_r000_76832.trc:
ORA-27603: Cell storage I/O error, I/O failed on disk o/192.168.10.4/DATA_DM01_CD_07_dm01cel02 at offset 1140850688 for data length 1048576
ORA-27626: Exadata error: 201 (Generic I/O error)
WARNING: Read Failed. group:1 disk:19 AU:272 offset:0 size:1048576

Sun Jul 28 23:05:07 2013
NOTE: repairing group 1 file 274 extent 2039
SUCCESS: extent 2039 of file 274 group 1 repaired - all online mirror sides found readable, no repair required

storage node alert

Jul 28 23:05:07 dm01cel03 kernel: sd 0:2:8:0: SCSI error: return code = 0x00070002
Jul 28 23:05:07 dm01cel03 kernel: end_request: I/O error, dev sdi, sector 33916368

For the I/O errors reported on both the DB side and the storage side, Oracle simply relies on ASM's default handling: the block is first read from the secondary extent, and a repair is then attempted on the primary extent. That repair takes one of two forms, both visible in the ASM alert log above:

1. SUCCESS: extent 4753 of file 502 group 1 repaired by relocating to a different AU on the same disk or the disk is offline

ASM uses the mirrored copy, which lets the disk re-allocate the data around any bad blocks in the physical disk media; in other words, a fresh physical region of one AU is allocated.

2. SUCCESS: extent 2039 of file 274 group 1 repaired - all online mirror sides found readable, no repair required

Here ASM re-read the extent, found all online mirror sides readable, and therefore did not need to rewrite anything.

Errors like these indicate that the storage disk's life is steadily being used up. As the number of physical bad blocks grows, once the disk crosses the critical threshold it should be replaced (and resynchronized with ASM fast mirror resync).
Note that on traditional storage this kind of error is rarely seen: with external redundancy, for example, the redundancy provided at the array level is generally safe enough, so in this respect the Exadata storage cells are not as impressive as the software features the platform offers. (Can we say the reliability of traditional enterprise storage far exceeds that of the Exadata/Sun storage? That may be a bit rash. Maybe...)
For more on the automatic ASM repair behavior described above, see my earlier article.
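
As a quick operational follow-up after incidents like this, the cumulative error counters that ASM keeps per disk can be checked. A minimal sketch (connecting as sysasm from the grid user is an assumption about the environment):

[grid@dm01db01 ~]$ sqlplus -S / as sysasm <<'EOF'
-- disks that have accumulated read or write errors
select name, path, read_errs, write_errs, mode_status
  from v$asm_disk
 where read_errs > 0 or write_errs > 0;
EOF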

While we are at it, a note on Req_mir_free_MB and Usable_file_MB in a normal-redundancy environment.

[grid@dm01db01 trace]$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  15593472  6102196          5197824          452186              0             N  DATA_DM01/
MOUNTED  NORMAL  N         512   4096  4194304    894720   893432           298240          297596              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3896064  1717684          1298688          209498              0             N  RECO_DM01/

Why is Total_MB/3 = Req_mir_free_MB? Req_mir_free_MB can be thought of as the hot-spare reservation: under normal redundancy on this machine the disks are split across three storage cells (three failure groups), so ASM must keep one third of the total free in order to take over for any primary or secondary copy after a cell failure. The space counted in Req_mir_free_MB can still be consumed, though; once Usable_file_MB is used up, ASM keeps writing into it.
But only Req_mir_free_MB/2 of that is truly writable space, because normal redundancy writes every piece of data twice. Once Req_mir_free_MB is exhausted there is effectively no hot spare left, and if a primary extent and its secondary then fail together, data is lost. A case illustrates this:

[grid@dm01db01 ~]$ asmcmd -p
ASMCMD [+] > lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  15593472  9918184          5197824         2360180              0             N  DATA_DM01/
MOUNTED  NORMAL  N         512   4096  4194304    894720   893432           298240          297596              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3896064    28248          1298688         -635220              0             N  RECO_DM01/

Usable_file_MB = -635220, i.e. (Free_MB - Req_mir_free_MB)/2 = (28248 - 1298688)/2; the negative value means writes have already eaten into the space reserved for restoring redundancy.

After the space was reclaimed:

ASMCMD [+] > lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  15593472  9918184          5197824         2360180              0             N  DATA_DM01/
MOUNTED  NORMAL  N         512   4096  4194304    894720   893432           298240          297596              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3896064  3860220          1298688         1280766              0             N  RECO_DM01/

In effect, Usable_file_MB recovered by (1280766 + 635220) MB at this point.
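
These relationships can be verified directly against the lsdg numbers above with plain shell arithmetic (a quick sanity check, nothing more):

# Req_mir_free_MB = Total_MB / 3  (three cells = three failure groups)
$ echo $(( 15593472 / 3 ))                # DATA_DM01
5197824
# Usable_file_MB = (Free_MB - Req_mir_free_MB) / 2
$ echo $(( (6102196 - 5197824) / 2 ))     # DATA_DM01
452186
$ echo $(( (28248 - 1298688) / 2 ))       # RECO_DM01 before the space was reclaimed
-635220
$ echo $(( (3860220 - 1298688) / 2 ))     # RECO_DM01 afterwards
1280766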

Update exadata flash disk firmware due to poor performance on X2-2

March 14, 2013 Exadata, oracle

ASR raised the following alert on this Exadata:

"
dm01cel02: Hardware Error has been detected in the Flash Logs on the Cell Node
dm01cel02: Please open up a Service Request with Oracle to resolve the Error
dm01cel02: View File /tmp/ASR-Flash-Fault-Check-Log for further Information of the Error
dm01cel02: This log File can be copied to the Oracle Service Request for Error Reference"

Running sundiag gave us the following information:

	 name:              	 dm01cel02_FLASHCACHE
	 cellDisk:          	 FD_11_dm01cel02,FD_09_dm01cel02,FD_13_dm01cel02,FD_14_dm01cel02,FD_10_dm01cel02,FD_03_dm01cel02,FD_07_dm01cel02,FD_06_dm01cel02,FD_01_dm01cel02,FD_15_dm01cel02,FD_04_dm01cel02,FD_08_dm01cel02,FD_02_dm01cel02,FD_12_dm01cel02,FD_05_dm01cel02
	 creationTime:      	 2012-06-26T17:41:47+08:00
	 degradedCelldisks: 	 FD_00_dm01cel02
	 effectiveCacheSize:	 341.953125G
	 id:                	 8141998e-5451-4ee9-bf27-f61982297ade
	 size:              	 364.75G
	 status:            	 warning  --------------->error status


	 name:              	 FLASH_1_0
	 diskType:          	 FlashDisk
	 luns:              	 1_0
	 makeModel:         	 "MARVELL SD88SA02"
	 physicalFirmware:  	 D20Y
	 physicalInsertTime:	 2012-05-10T02:43:53+08:00
	 physicalSize:      	 22.8880615234375G
	 slotNumber:        	 "PCI Slot: 1; FDOM: 0"
	 status:            	 poor performance   ------->error status

Following MOS note 1504776.1, we updated the firmware step by step.
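
Before patching, it is worth capturing the current firmware level and status of every flash disk on the cell so there is a baseline to compare against after the update (a minimal sketch, using the same attributes that are re-checked later):

[root@dm02cel02 ~]# cellcli -e "list physicaldisk attributes name, status, physicalFirmware \
>        where diskType = 'FlashDisk'"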

[root@dm02cel03 dev]# imageinfo 

Kernel version: 2.6.18-274.18.1.0.1.el5 #1 SMP Thu Feb 9 19:07:16 EST 2012 x86_64
Cell version: OSS_11.2.3.1.0_LINUX.X64_120304
Cell rpm version: cell-11.2.3.1.0_LINUX.X64_120304-1

Active image version: 11.2.3.1.0.120304
Active image activated: 2012-05-07 02:13:48 -0700
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7

In partition rollback: Impossible

Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.3.1.0.120304

Inactive image version: undefined
Rollback to the inactive partitions: Impossible

update firmware on storage cell node

1 Confirm the grid disk status; the following command should return no output:

[root@dm02cel02 ~]# cellcli -e 'list griddisk attributes name,asmmodestatus' | \
>        egrep -v 'UNUSED|ONLINE'

Or check the disk status from another node:

cellcli -e 'list griddisk attributes name,asmmodestatus' | \
       egrep -v 'UNUSED|ONLINE'
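
Since the cells are patched in a rolling fashion, it is also worth confirming that ASM can tolerate this cell's grid disks going offline; the patch script's own output below warns about ASMDEACTIVATIONOUTCOME. A minimal check that should likewise return no output (every grid disk must show Yes):

cellcli -e 'list griddisk attributes name,asmdeactivationoutcome' | \
       grep -v Yes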

2 Apply patch 14793859

[root@dm02cel02 14793859]# ./p14793859.sh
[INFO] Begin patch 14793859
[INFO] Resetting ILOM to ensure clean reboot. Wait 240 seconds.
[INFO] Patch has been staged.  Stop cell services, then run '/opt/oracle.cellos/CheckHWnFWProfile -U /opt/oracle.cellos/iso/cellbits' to activate new firmware and restart system.
[INFO] If restarting cells in a rolling manner, ensure ASMDEACTIVATIONOUTCOME=Yes for all griddisks before activating new firmware.
[root@dm02cel02 14793859]# 
[root@dm02cel02 14793859]# 
[root@dm02cel02 14793859]# 
[root@dm02cel02 14793859]# cellcli -e 'alter griddisk all inactive'
GridDisk DATA_DM02_CD_00_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_01_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_02_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_03_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_04_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_05_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_06_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_07_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_08_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_09_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_10_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_11_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_02_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_03_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_04_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_05_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_06_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_07_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_08_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_09_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_10_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_11_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_00_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_01_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_02_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_03_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_04_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_05_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_06_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_07_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_08_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_09_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_10_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_11_dm02cel02 successfully altered
[root@dm02cel02 14793859]# 
[root@dm02cel02 14793859]# 
[root@dm02cel02 14793859]# cellcli -e 'alter cell shutdown services all'

Stopping the RS, CELLSRV, and MS services...
The SHUTDOWN of services was successful.

3 Activate the new firmware

[root@dm02cel02 14793859]# /opt/oracle.cellos/CheckHWnFWProfile -U /opt/oracle.cellos/iso/cellbits
SUNFlashDOM: OK
[INFO] Reset the ILOM before trying ILOM update. Wait for 240 seconds as part of the reset for ILOM to be ready.
Sent cold reset command to MC
Now updating the ILOM and the BIOS ...
[INFO] Start ILOM firmware upgrade to version 3.0.16.10.d r74499. Attempt 1 of 2.
[INFO] Generated temporary ILOM user: iu_hhyll
[INFO] Generated temporary ILOM password: ********
[INFO] ipmitool user set name 3 iu_hhyll
[INFO] ipmitool user set password 3 ********
[INFO] ipmitool sunoem cli force "set /SP/users/iu_hhyll role=aucro"
Connected. Use ^D to exit.
-> set /SP/users/iu_hhyll role=aucro

Set 'role' to 'aucro'


-> Session closed
Disconnected
[INFO] export IPMI_PASSWORD=********
[INFO] ipmiflash -v -I lanplus -H 10.61.1.239 -U iu_hhyll -E write /tmp/firmware/SUNBIOS force script config delaybios warning=0
[INFO] unset IPMI_PASSWORD
[INFO] ipmitool user set name 3 ""
Set User Name command failed (user 3, name ): Invalid data field in request

[INFO] ILOM update, ipmiflash return code 0

[INFO] ILOM will be reloaded in case of successful firmware upgrade to version 3.0.16.10.d r74499
[INFO] In this case close Sun ILOM Remote Console and corresponding Internet Exporer windows.
[INFO] Re-open both windows in few minutes.

[INFO] Waiting for the service processor to come up for 180 seconds
[INFO] ILOM firmware upgrade completed with success
Update all DOM firmware for all hbas
flash_dom -h 1 -d /tmp/firmware/SUNFlashDOM -b All -s

Aura Firmware Update Utility, Version 1.2.7

Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved..

U.S. Government Rights - Commercial Software. Government users are subject
to the Sun Microsystems, Inc. standard license agreement and
applicable provisions of the FAR and its supplements.

Use is subject to license terms.

This distribution may include materials developed by third parties.

Sun, Sun Microsystems, the Sun logo, Sun StorageTek and ZFS are trademarks
or registered trademarks of Sun Microsystems, Inc. or its subsidiaries,
in the U.S. and other countries.



 1.  /proc/mpt/ioc0    LSI Logic SAS1068E C0     105      011b5c00     0

Using tag SD88SA02 to identify the DOMs
36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Found four DOMs.

36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 0

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 0
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 1

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 1
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 2

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 2
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 3

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 3
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded


Before the changes will take effect you must power down the system
After you have rebooted the OS you will need to reformat all of the 
updated DOMs.

flash_dom -h 2 -d /tmp/firmware/SUNFlashDOM -b All -s

Aura Firmware Update Utility, Version 1.2.7

Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved..

U.S. Government Rights - Commercial Software. Government users are subject
to the Sun Microsystems, Inc. standard license agreement and
applicable provisions of the FAR and its supplements.

Use is subject to license terms.

This distribution may include materials developed by third parties.

Sun, Sun Microsystems, the Sun logo, Sun StorageTek and ZFS are trademarks
or registered trademarks of Sun Microsystems, Inc. or its subsidiaries,
in the U.S. and other countries.



 2.  /proc/mpt/ioc1    LSI Logic SAS1068E C0     105      011b5c00     0

Using tag SD88SA02 to identify the DOMs
36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Found four DOMs.

36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 0

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 0
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 1

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 1
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 2

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 2
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 3

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 3
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded


Before the changes will take effect you must power down the system
After you have rebooted the OS you will need to reformat all of the 
updated DOMs.

flash_dom -h 3 -d /tmp/firmware/SUNFlashDOM -b All -s

Aura Firmware Update Utility, Version 1.2.7

Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved..

U.S. Government Rights - Commercial Software. Government users are subject
to the Sun Microsystems, Inc. standard license agreement and
applicable provisions of the FAR and its supplements.

Use is subject to license terms.

This distribution may include materials developed by third parties.

Sun, Sun Microsystems, the Sun logo, Sun StorageTek and ZFS are trademarks
or registered trademarks of Sun Microsystems, Inc. or its subsidiaries,
in the U.S. and other countries.



 3.  /proc/mpt/ioc2    LSI Logic SAS1068E C0     105      011b5c00     0

Using tag SD88SA02 to identify the DOMs
36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Found four DOMs.

36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 0

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 0
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 1

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 1
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 2

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 2
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 3

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 3
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded


Before the changes will take effect you must power down the system
After you have rebooted the OS you will need to reformat all of the 
updated DOMs.

flash_dom -h 4 -d /tmp/firmware/SUNFlashDOM -b All -s

Aura Firmware Update Utility, Version 1.2.7

Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved..

U.S. Government Rights - Commercial Software. Government users are subject
to the Sun Microsystems, Inc. standard license agreement and
applicable provisions of the FAR and its supplements.

Use is subject to license terms.

This distribution may include materials developed by third parties.

Sun, Sun Microsystems, the Sun logo, Sun StorageTek and ZFS are trademarks
or registered trademarks of Sun Microsystems, Inc. or its subsidiaries,
in the U.S. and other countries.



 4.  /proc/mpt/ioc3    LSI Logic SAS1068E C0     105      011b5c00     0

Using tag SD88SA02 to identify the DOMs
36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Found four DOMs.

36.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 0

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 0
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 1

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 1
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 2

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 2
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded
36.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y

Updating Marvell firmware on Bus: 0, Target: 3

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, off, off, off, off

     B___T  Type       Vendor   Product          Rev
 1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y
 2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y
 3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y
 4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y
Update to bus 0, target 3
Using mode 5
Using BufferID 0

Downloading image...
Download succeeded


Before the changes will take effect you must power down the system
After you have rebooted the OS you will need to reformat all of the 
updated DOMs.

[WARNING] The hardware and firmware are not supported. See details below

[BIOSVersion]
Requires:
 08120104 
Found:
 08080102

[BIOSDate]
Requires:
 05/08/2012 
Found:
 05/23/2011

[ILOMVersion]
Requires:
 3.0.16.10.d r74499 
Found:
 3.0.16.10 r65138

[PCISlot:HBA:LSIModel:LSIhw:MPThw:LSIfw:MPTBios:DOM:OSDevice:DOMMake:DOMModel:DOMfw:CountAuraCountDOM]
Requires:
 AllSlots AllHBAs SAS1068E B3orC0 105 011b5c00 06.26.00.00 AllDOMs NotApplicable MARVELL SD88SA02 D21Y  4_16
 
Found:
 AllSlots AllHBAs SAS1068E B3orC0 105 011b5c00 06.26.00.00 AllDOMs NotApplicable MARVELL SD88SA02 D20Y  4_16

[WARNING] The hardware and firmware are not supported. See details above
[INFO] Rebooting in 5 minutes for firmware updates to take effect ...

                                     
Cache Flush is successfully done on adapter 0.

Exit Code: 0x00
Starting ipmi drivers: [  OK  ]

Starting ipmi_watchdog driver: [  OK  ]

Starting ipmi_poweroff driver: [  OK  ]


                                     
Cache Flush is successfully done on adapter 0.

Exit Code: 0x00
[INFO] Power cycle using /tmp/firmware/SUNBIOSPowerCycle
Wait 180 seconds for the ILOM power cycle package to take effect. Then start the power down.


Broadcast message from root (pts/0) (Thu Mar 14 13:36:45 2013):




The system is going down for system halt NOW!


Connection closed by foreign host.

——–

after reboot

1 Confirm the firmware update is complete

[root@dm02cel02 ~]#  cellcli -e "list physicaldisk attributes name, physicalFirmware \
>        where diskType = 'FlashDisk'"
	 FLASH_1_0	 D21Y
	 FLASH_1_1	 D21Y
	 FLASH_1_2	 D21Y
	 FLASH_1_3	 D21Y
	 FLASH_2_0	 D21Y
	 FLASH_2_1	 D21Y
	 FLASH_2_2	 D21Y
	 FLASH_2_3	 D21Y
	 FLASH_4_0	 D21Y
	 FLASH_4_1	 D21Y
	 FLASH_4_2	 D21Y
	 FLASH_4_3	 D21Y
	 FLASH_5_0	 D21Y
	 FLASH_5_1	 D21Y
	 FLASH_5_2	 D21Y
	 FLASH_5_3	 D21Y
[root@dm02cel02 ~]# /opt/oracle.cellos/CheckHWnFWProfile -d | grep -A1 ILOMVersion
[ILOMVersion]
 3.0.16.10.d r74499
[root@dm02cel02 ~]# /opt/oracle.cellos/CheckHWnFWProfile -c loose
[SUCCESS] The hardware and firmware profile matches one of the supported profiles


2 Activate the grid disks

[root@dm02cel02 ~]# cellcli -e 'alter griddisk all active'
GridDisk DATA_DM02_CD_00_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_01_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_02_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_03_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_04_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_05_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_06_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_07_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_08_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_09_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_10_dm02cel02 successfully altered
GridDisk DATA_DM02_CD_11_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_02_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_03_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_04_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_05_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_06_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_07_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_08_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_09_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_10_dm02cel02 successfully altered
GridDisk DBFS_DG_CD_11_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_00_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_01_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_02_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_03_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_04_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_05_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_06_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_07_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_08_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_09_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_10_dm02cel02 successfully altered
GridDisk RECO_DM02_CD_11_dm02cel02 successfully altered

3 Recreate the CELLBOOT USB

[root@dm02cel02 ~]# /opt/oracle.cellos/make_cellboot_usb.sh -execute -force
Candidate for the Oracle Exadata Cell start up boot device     : /dev/sdm
Partition on candidate device                                  : /dev/sdm1
The current product version                                    : 11.2.3.1.0.120304
Label of the current Oracle Exadata Cell start up boot device  : CELLBOOT
The current CELLBOOT USB product version                       : 11.2.3.1.0.120304
[DEBUG] set_cell_boot_usb: cell usb        : /dev/sdm
[DEBUG] set_cell_boot_usb: mnt sys         : /
[DEBUG] set_cell_boot_usb: preserve        : preserve
[DEBUG] set_cell_boot_usb: mnt usb         : /mnt/usb.make.cellboot
[DEBUG] set_cell_boot_usb: lock            : /tmp/usb.make.cellboot.lock
[DEBUG] set_cell_boot_usb: serial console  : 
[DEBUG] set_cell_boot_usb: kernel mode     : kernel
[DEBUG] set_cell_boot_usb: mnt iso save    : 
Create CELLBOOT USB on device /dev/sdm

The number of cylinders for this disk is set to 2825.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 2825.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): Command action
   e   extended
   p   primary partition (1-4)
Partition number (1-4): First cylinder (1-2825, default 1): Last cylinder or +size or +sizeM or +sizeK (1-2825, default 2825): 
Command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
489600 inodes, 978513 blocks
48925 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1002438656
30 block groups
32768 blocks per group, 32768 fragments per group
16320 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 32 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
tune2fs 1.39 (29-May-2006)
Setting maximal mount count to -1
Setting interval between checks to 0 seconds
//opt/oracle.cellos/iso ~
Copying ./imgboot.lst to /mnt/usb.make.cellboot/. ...
...

Copying ./imgboot.lst.14793859 to /mnt/usb.make.cellboot/. ...
Copying ./initrd.img.14793859 to /mnt/usb.make.cellboot/. ...
Running "tar -x -j -p -v -C /mnt/usb.make.cellboot -f //opt/oracle.cellos/iso/cellbits/cellboot.tbz initrd-*.img vmlinuz-* grub/" ...
grub/
grub/xfs_stage1_5
grub/vstafs_stage1_5
grub/menu.lst
grub/e2fs_stage1_5
grub/grub.conf
grub/oracle.xpm.gz
grub/device.map
grub/stage1
grub/jfs_stage1_5
grub/reiserfs_stage1_5
grub/minix_stage1_5
grub/ufs2_stage1_5
grub/stage2
grub/iso9660_stage1_5
grub/fat_stage1_5
grub/ffs_stage1_5
initrd-2.6.18-194.3.1.0.2.el5.img
vmlinuz-2.6.18-194.3.1.0.2.el5
Copying //opt/oracle.cellos/tmpl/oracle.xpm.gz to /mnt/usb.make.cellboot/grub/oracle.xpm.gz ...
[DEBUG] set_grub_conf_n_initrd: mnt sys        : /
[DEBUG] set_grub_conf_n_initrd: grub template  : USB_grub.in
[DEBUG] set_grub_conf_n_initrd: boot dir       : /mnt/usb.make.cellboot
[DEBUG] set_grub_conf_n_initrd: kernel param   : 2.6.18-274.18.1.0.1.el5
[DEBUG] set_grub_conf_n_initrd: marker         : I_am_CELLBOOT_usb
[DEBUG] set_grub_conf_n_initrd: mode           : 
[DEBUG] set_grub_conf_n_initrd: Image id file: //opt/oracle.cellos/image.id
[DEBUG] set_grub_conf_n_initrd: System device where image id exists: /dev/md5
[DEBUG] set_grub_conf_n_initrd: Kernel version: 2.6.18-274.18.1.0.1.el5
[DEBUG] set_grub_conf_n_initrd: System device with image_id (/dev/md5) and kernel version (2.6.18-274.18.1.0.1.el5) are in sync
[DEBUG] set_grub_conf_n_initrd: Full kernel version: 2.6.18-274.18.1.0.1.el5
[DEBUG] set_grub_conf_n_initrd: system device for the next boot: /dev/md5
[DEBUG] set_grub_conf_n_initrd: initrd for the next boot: /mnt/usb.make.cellboot/initrd-2.6.18-274.18.1.0.1.el5.img
[INFO] set_grub_conf_n_initrd: Set /dev/md5 in /mnt/usb.make.cellboot/I_am_CELLBOOT_usb
[INFO] Set kernel 2.6.18-274.18.1.0.1.el5 and system device /dev/md5 in generated /mnt/usb.make.cellboot/grub/grub.conf
[INFO] Set /dev/md5 in /mnt/usb.make.cellboot/initrd-2.6.18-274.18.1.0.1.el5.img
43450 blocks
log/
log/do_image.sh.log
log/cellos.11.2.3.1.0.120304.20120507.021428.PDT.tar.gz
log/cellos.11.2.3.1.0.120304.20120507.032629.EDT.tar.gz


    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename.]
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  16 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd0)1+16 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.


4 Check the ASM disk resync status

[root@dm02cel02 ~]# cellcli -e list griddisk attributes name, asmmodestatus
	 DATA_DM02_CD_00_dm02cel02	 SYNCING  ---> recovering
	 DATA_DM02_CD_01_dm02cel02	 SYNCING
	 DATA_DM02_CD_02_dm02cel02	 SYNCING
	 DATA_DM02_CD_03_dm02cel02	 SYNCING
	 DATA_DM02_CD_04_dm02cel02	 SYNCING
	 DATA_DM02_CD_05_dm02cel02	 SYNCING
	 DATA_DM02_CD_06_dm02cel02	 SYNCING
	 DATA_DM02_CD_07_dm02cel02	 SYNCING
	 DATA_DM02_CD_08_dm02cel02	 SYNCING
	 DATA_DM02_CD_09_dm02cel02	 SYNCING
	 DATA_DM02_CD_10_dm02cel02	 SYNCING
	 DATA_DM02_CD_11_dm02cel02	 SYNCING
	 DBFS_DG_CD_02_dm02cel02  	 ONLINE   ---> complete
	 DBFS_DG_CD_03_dm02cel02  	 ONLINE
	 DBFS_DG_CD_04_dm02cel02  	 ONLINE
	 DBFS_DG_CD_05_dm02cel02  	 ONLINE
	 DBFS_DG_CD_06_dm02cel02  	 ONLINE
	 DBFS_DG_CD_07_dm02cel02  	 ONLINE
	 DBFS_DG_CD_08_dm02cel02  	 ONLINE
	 DBFS_DG_CD_09_dm02cel02  	 ONLINE
	 DBFS_DG_CD_10_dm02cel02  	 ONLINE
	 DBFS_DG_CD_11_dm02cel02  	 ONLINE
	 RECO_DM02_CD_00_dm02cel02	 OFFLINE  ---> not start recovering
	 RECO_DM02_CD_01_dm02cel02	 OFFLINE
	 RECO_DM02_CD_02_dm02cel02	 OFFLINE
	 RECO_DM02_CD_03_dm02cel02	 OFFLINE
	 RECO_DM02_CD_04_dm02cel02	 OFFLINE
	 RECO_DM02_CD_05_dm02cel02	 OFFLINE
	 RECO_DM02_CD_06_dm02cel02	 OFFLINE
	 RECO_DM02_CD_07_dm02cel02	 OFFLINE
	 RECO_DM02_CD_08_dm02cel02	 OFFLINE
	 RECO_DM02_CD_09_dm02cel02	 OFFLINE
	 RECO_DM02_CD_10_dm02cel02	 OFFLINE
	 RECO_DM02_CD_11_dm02cel02	 OFFLINE
[root@dm02cel02 ~]# cellcli -e list griddisk attributes name, asmmodestatus
	 DATA_DM02_CD_00_dm02cel02	 SYNCING
	 DATA_DM02_CD_01_dm02cel02	 SYNCING
	 DATA_DM02_CD_02_dm02cel02	 SYNCING
	 DATA_DM02_CD_03_dm02cel02	 SYNCING
	 DATA_DM02_CD_04_dm02cel02	 SYNCING
	 DATA_DM02_CD_05_dm02cel02	 SYNCING
	 DATA_DM02_CD_06_dm02cel02	 SYNCING
	 DATA_DM02_CD_07_dm02cel02	 SYNCING
	 DATA_DM02_CD_08_dm02cel02	 SYNCING
	 DATA_DM02_CD_09_dm02cel02	 SYNCING
	 DATA_DM02_CD_10_dm02cel02	 SYNCING
	 DATA_DM02_CD_11_dm02cel02	 SYNCING
	 DBFS_DG_CD_02_dm02cel02  	 ONLINE
	 DBFS_DG_CD_03_dm02cel02  	 ONLINE
	 DBFS_DG_CD_04_dm02cel02  	 ONLINE
	 DBFS_DG_CD_05_dm02cel02  	 ONLINE
	 DBFS_DG_CD_06_dm02cel02  	 ONLINE
	 DBFS_DG_CD_07_dm02cel02  	 ONLINE
	 DBFS_DG_CD_08_dm02cel02  	 ONLINE
	 DBFS_DG_CD_09_dm02cel02  	 ONLINE
	 DBFS_DG_CD_10_dm02cel02  	 ONLINE
	 DBFS_DG_CD_11_dm02cel02  	 ONLINE
	 RECO_DM02_CD_00_dm02cel02	 SYNCING
	 RECO_DM02_CD_01_dm02cel02	 SYNCING
	 RECO_DM02_CD_02_dm02cel02	 SYNCING
	 RECO_DM02_CD_03_dm02cel02	 SYNCING
	 RECO_DM02_CD_04_dm02cel02	 SYNCING
	 RECO_DM02_CD_05_dm02cel02	 SYNCING
	 RECO_DM02_CD_06_dm02cel02	 SYNCING
	 RECO_DM02_CD_07_dm02cel02	 SYNCING
	 RECO_DM02_CD_08_dm02cel02	 SYNCING
	 RECO_DM02_CD_09_dm02cel02	 SYNCING
	 RECO_DM02_CD_10_dm02cel02	 SYNCING
	 RECO_DM02_CD_11_dm02cel02	 SYNCING
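
The DATA and RECO grid disks stay in SYNCING for a while. A small sketch that simply polls until nothing on this cell is left offline or resyncing (the 60-second interval is arbitrary); the ASM alert log excerpts below show the same disks being replaced and brought back online:

# poll until every grid disk on this cell is back to ONLINE in ASM
while cellcli -e 'list griddisk attributes name, asmmodestatus' | egrep -q 'SYNCING|OFFLINE'; do
        sleep 60
done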
NOTE: Found o/192.168.10.4/DATA_DM02_CD_00_dm02cel02 for disk DATA_DM02_CD_00_DM02CEL02
SUCCESS: disk DATA_DM02_CD_00_DM02CEL02 (12.3746554323) replaced in diskgroup DATA_DM02
..

NOTE: Found o/192.168.10.4/DATA_DM02_CD_11_dm02cel02 for disk DATA_DM02_CD_11_DM02CEL02
SUCCESS: disk DATA_DM02_CD_11_DM02CEL02 (23.3746554331) replaced in diskgroup DATA_DM02

NOTE: disk 12 (DATA_DM02_CD_00_DM02CEL02) in group 1 (DATA_DM02) is online for writes
..
NOTE: disk 23 (DATA_DM02_CD_11_DM02CEL02) in group 1 (DATA_DM02) is online for writes

NOTE: Found o/192.168.10.4/RECO_DM02_CD_00_dm02cel02 for disk RECO_DM02_CD_00_DM02CEL02
SUCCESS: disk RECO_DM02_CD_00_DM02CEL02 (12.3746554389) replaced in diskgroup RECO_DM02
..

NOTE: Found o/192.168.10.4/RECO_DM02_CD_11_dm02cel02 for disk RECO_DM02_CD_11_DM02CEL02
SUCCESS: disk RECO_DM02_CD_11_DM02CEL02 (23.3746554395) replaced in diskgroup RECO_DM02



Thu Mar 14 13:48:47 2013
NOTE: disk 12 (RECO_DM02_CD_00_DM02CEL02) in group 3 (RECO_DM02) is online for reads
..
NOTE: disk 23 (RECO_DM02_CD_11_DM02CEL02) in group 3 (RECO_DM02) is online for reads


Thu Mar 14 13:55:58 2013
NOTE: disk 12 (DATA_DM02_CD_00_DM02CEL02) in group 1 (DATA_DM02) is online for reads
..
NOTE: disk 23 (DATA_DM02_CD_11_DM02CEL02) in group 1 (DATA_DM02) is online for reads

5 After all disk statuses are ONLINE, check that the flash disks are working properly.

CellCLI> LIST METRICCURRENT WHERE objectType = 'FLASHCACHE'
	 FC_BYKEEP_OVERWR       	 FLASHCACHE	 0.000 MB
	 FC_BYKEEP_OVERWR_SEC   	 FLASHCACHE	 0.000 MB/sec
	 FC_BYKEEP_USED         	 FLASHCACHE	 0.000 MB
	 FC_BY_USED             	 FLASHCACHE	 9,988 MB
	 FC_IO_BYKEEP_R         	 FLASHCACHE	 0.000 MB
	 FC_IO_BYKEEP_R_SEC     	 FLASHCACHE	 0.000 MB/sec
	 FC_IO_BYKEEP_W         	 FLASHCACHE	 0.000 MB
	 FC_IO_BYKEEP_W_SEC     	 FLASHCACHE	 0.000 MB/sec
	 FC_IO_BY_R             	 FLASHCACHE	 5,790 MB
	 FC_IO_BY_R_MISS        	 FLASHCACHE	 12,530 MB
	 FC_IO_BY_R_MISS_SEC    	 FLASHCACHE	 1.540 MB/sec
	 FC_IO_BY_R_SEC         	 FLASHCACHE	 5.707 MB/sec
	 FC_IO_BY_R_SKIP        	 FLASHCACHE	 120,480 MB
	 FC_IO_BY_R_SKIP_SEC    	 FLASHCACHE	 202 MB/sec
	 FC_IO_BY_W             	 FLASHCACHE	 12,110 MB
	 FC_IO_BY_W_SEC         	 FLASHCACHE	 1.415 MB/sec
	 FC_IO_ERRS             	 FLASHCACHE	 0
	 FC_IO_RQKEEP_R         	 FLASHCACHE	 0 IO requests
	 FC_IO_RQKEEP_R_MISS    	 FLASHCACHE	 0 IO requests
	 FC_IO_RQKEEP_R_MISS_SEC	 FLASHCACHE	 0.0 IO/sec
	 FC_IO_RQKEEP_R_SEC     	 FLASHCACHE	 0.0 IO/sec
	 FC_IO_RQKEEP_R_SKIP    	 FLASHCACHE	 0 IO requests
	 FC_IO_RQKEEP_R_SKIP_SEC	 FLASHCACHE	 0.0 IO/sec
	 FC_IO_RQKEEP_W         	 FLASHCACHE	 0 IO requests
	 FC_IO_RQKEEP_W_SEC     	 FLASHCACHE	 0.0 IO/sec
	 FC_IO_RQ_R             	 FLASHCACHE	 579,959 IO requests
	 FC_IO_RQ_R_MISS        	 FLASHCACHE	 386,847 IO requests
	 FC_IO_RQ_R_MISS_SEC    	 FLASHCACHE	 48.2 IO/sec
	 FC_IO_RQ_R_SEC         	 FLASHCACHE	 651 IO/sec
	 FC_IO_RQ_R_SKIP        	 FLASHCACHE	 220,180 IO requests
	 FC_IO_RQ_R_SKIP_SEC    	 FLASHCACHE	 263 IO/sec
	 FC_IO_RQ_W             	 FLASHCACHE	 455,710 IO requests
	 FC_IO_RQ_W_SEC         	 FLASHCACHE	 47.6 IO/sec

CellCLI> LIST METRICCURRENT WHERE objectType = 'FLASHLOG'  
	 FL_ACTUAL_OUTLIERS           	 FLASHLOG	 0 IO requests
	 FL_BY_KEEP                   	 FLASHLOG	 0
	 FL_DISK_FIRST                	 FLASHLOG	 155,818 IO requests
	 FL_DISK_IO_ERRS              	 FLASHLOG	 0 IO requests
	 FL_EFFICIENCY_PERCENTAGE     	 FLASHLOG	 100 %
	 FL_EFFICIENCY_PERCENTAGE_HOUR	 FLASHLOG	 100 %
	 FL_FLASH_FIRST               	 FLASHLOG	 11,176 IO requests
	 FL_FLASH_IO_ERRS             	 FLASHLOG	 0 IO requests
	 FL_FLASH_ONLY_OUTLIERS       	 FLASHLOG	 0 IO requests
	 FL_IO_DB_BY_W                	 FLASHLOG	 5,106 MB
	 FL_IO_DB_BY_W_SEC            	 FLASHLOG	 0.768 MB/sec
	 FL_IO_FL_BY_W                	 FLASHLOG	 5,605 MB
	 FL_IO_FL_BY_W_SEC            	 FLASHLOG	 1.955 MB/sec
	 FL_IO_W                      	 FLASHLOG	 166,994 IO requests
	 FL_IO_W_SKIP_BUSY            	 FLASHLOG	 0 IO requests
	 FL_IO_W_SKIP_BUSY_MIN        	 FLASHLOG	 0.0 IO/sec
	 FL_IO_W_SKIP_LARGE           	 FLASHLOG	 0 IO requests
	 FL_PREVENTED_OUTLIERS        	 FLASHLOG	 0 IO requests

6 When all these steps are completed, proceed to the next storage server.

———————————–

Update the database server nodes.

1 After the rolling update of all cell nodes, update the database nodes one by one.

2 Shut down and disable Oracle Clusterware. Database instances and other
cluster resources will be stopped in this step.

# GRID_HOME/grid/bin/crsctl stop crs
# GRID_HOME/grid/bin/crsctl disable crs
3 Apply patch 14793859

# cd /root/14793859
# ./p14793859.sh
4 Activate the new firmware
/opt/oracle.cellos/CheckHWnFWProfile -U /opt/oracle.cellos/iso/cellbits

5 Confirm the ILOM update is complete.

# /opt/oracle.cellos/CheckHWnFWProfile -d | grep -A1 ILOMVersion
[ILOMVersion]
3.0.16.10.d r74499

Run the CheckHWnFWProfile utility to verify that the firmware profile is supported:

# /opt/oracle.cellos/CheckHWnFWProfile -c loose
[SUCCESS] The hardware and firmware profile matches one of the supported profiles

6 Enable and start Oracle Clusterware.

# GRID_HOME/grid/bin/crsctl enable crs
# GRID_HOME/grid/bin/crsctl start crs

7 Proceed to the next database server; once all servers are done, the firmware update is complete.
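
Before moving on to the next database server it is sensible to confirm that Clusterware and its resources are fully up again; a minimal sketch using the same GRID_HOME convention as above:

# GRID_HOME/grid/bin/crsctl check crs
# GRID_HOME/grid/bin/crsctl stat res -t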

An Exadata X2-2 hang case

November 3, 2012 Exadata, oracle

Exadata is not a silver bullet. I have always felt that the DB-side servers in an Exadata rack are underpowered, yet Oracle can always say, quite confidently, that what they sell is software rather than hardware. Garbage SQL can still crush an Exadata quite easily.

The system is an X2-2 quarter rack running Red Hat Enterprise Linux Server release 5.7 (Tikanga).

[root@dm01cel02 ~]# imageinfo 

Kernel version: 2.6.18-274.18.1.0.1.el5 #1 SMP Thu Feb 9 19:07:16 EST 2012 x86_64
Cell version: OSS_11.2.3.1.0_LINUX.X64_120304
Cell rpm version: cell-11.2.3.1.0_LINUX.X64_120304-1

Active image version: 11.2.3.1.0.120304
Active image activated: 2012-05-09 14:12:31 -0700
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7

In partition rollback: Impossible

Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.3.1.0.120304

Inactive image version: undefined
Rollback to the inactive partitions: Impossible

On this BI Exadata, node 2 hung during the afternoon.

Node1 log:

2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(2) using key(2) for grock(CLSN.ONSPROC.MASTER)
2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSPROC.MASTER), commissioner(2/2)
2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSPROC.MASTER) count(2) master(1) event(18), incarn 2, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSPROC.MASTER), updateseq(25), status(0), sendresp(1)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSPROC.MASTER), updateseq(25) msgseq(26), lastupdt<0x2aaab45140e0>, ignoreseq(0)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(-1) using key(2) for grock(CLSN.ONSPROC.MASTER)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSPROC.MASTER), commissioner(-1/0)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSPROC.MASTER) count(2) master(1) event(18), incarn 0, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSPROC.MASTER), updateseq(26), status(0), sendresp(1)
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(25) msgseq(26), lastupdt<0x2aaab809aca0>, ignoreseq(0)
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmAddMember: member (2/0x2aaaaca7bd60) added. pbsz(0) prsz(0) flags 0x0 to grock (0x2aaab807bc60/CLSN.ONSNETPROC.MASTER)
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(1), incarn 8, mbrc 2, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmCommonAddMember: global group grock CLSN.ONSNETPROC.MASTER member(2/Remote) node(2) flags 0x0 0xaca7bd60
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(26), status(0), sendresp(1)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(26) msgseq(27), lastupdt<0x2aaaaca468a0>, ignoreseq(0)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(2) using key(2) for grock(CLSN.ONSNETPROC.MASTER)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(2/2)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(18), incarn 2, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(27), status(0), sendresp(1)
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(27) msgseq(28), lastupdt<0x14eba350>, ignoreseq(0)
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), private data(2052), incarn(7)
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(8), incarn 7, mbrc 0, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(28), status(0), sendresp(1)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(28) msgseq(29), lastupdt<0x14e56670>, ignoreseq(0)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(-1) using key(2) for grock(CLSN.ONSNETPROC.MASTER)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(18), incarn 0, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(29), status(0), sendresp(1)

Meanwhile node 1 also hung, waiting on resources it could not obtain:

2:  0: waiting for 'enq: TO - contention' 
3:  0: waiting for 'rdbms ipc message'  
4:  0: waiting for 'VKTM Logical Idle Wait' 
5:  0: waiting for 'rdbms ipc message'  
6:  0: waiting for 'DIAG idle wait'     
7:  0: waiting for 'rdbms ipc message'  
8:  0: waiting for 'PING'               
9:  0: waiting for 'rdbms ipc message'  
10: 0: waiting for 'rdbms ipc message'  
11: 0: waiting for 'DIAG idle wait'     
12: 0: waiting for 'rdbms ipc message'  
13: 0: waiting for 'ges remote message' 
14: 0: waiting for 'gcs remote message' 
15: 0: waiting for 'gcs remote message' 
16: 0: waiting for 'rdbms ipc message'  
17: 0: waiting for 'GCR sleep'          
18: 0: waiting for 'rdbms ipc message'  
19: 0: waiting for 'rdbms ipc message'  
20: 0: waiting for 'rdbms ipc message'  
21: 0: waiting for 'rdbms ipc message'  
22: 0: waiting for 'rdbms ipc message'  
23: 0: waiting for 'ges inquiry response' 
24: 0: waiting for 'DFS lock handle'    
...
158:0: waiting for 'library cache lock' 
159:0: waiting for 'library cache lock' 
160:0: waiting for 'library cache lock' 
162:0: waiting for 'LNS ASYNC end of log' 
163:0: waiting for 'rdbms ipc message'  
164:0: waiting for 'wait for unread message on broadcast channel' 
165:0: waiting for 'Streams AQ: qmn coordinator idle wait' 
166:0: waiting for 'Streams AQ: qmn slave idle wait' 
167:1: waited for 'Streams AQ: waiting for time management or cleanup tasks' 
168:0: waiting for 'rdbms ipc message'  
170:0: waiting for 'class slave wait'   
173:0: waiting for 'library cache pin'  
     Cmd: Select
174:0: waiting for 'SQL*Net message from client' 
175:0: waiting for 'SQL*Net message from client' 
     Cmd: PL/SQL Execute
188:0: waiting for 'library cache lock' 
208:0: waiting for 'SQL*Net message from client' 
241:0: waiting for 'enq: PS - contention'[Enqueue PS-00000001-00000E0F] 
247:0: waiting for 'enq: FB - contention' 
     Cmd: Insert
261:0: waiting for 'SQL*Net message from client' 
292:0: waiting for 'library cache lock' 
293:0: waiting for 'library cache lock' 
294:0: waiting for 'library cache lock' 
295:0: waiting for 'library cache lock' 
296:0: waiting for 'library cache lock' 
297:0: waiting for 'library cache lock' 
298:0: waiting for 'library cache lock' 
299:0: waiting for 'library cache lock' 
Blockers
~~~~~~~~

        Above is a list of all the processes. If they are waiting for a resource
        then it will be given in square brackets. Below is a summary of the
        waited upon resources, together with the holder of that resource.
        Notes:
        ~~~~~
         o A process id of '???' implies that the holder was not found in the
           systemstate.

                    Resource Holder State
Enqueue DR-00000000-00000000    ??? Blocker
Enqueue CF-00000000-00000000    23: 0: waiting for 'ges inquiry response'
Enqueue TT-00000001-00000000    ??? Blocker
Enqueue PS-00000001-00000E0F    ??? Blocker

Object Names
~~~~~~~~~~~~
Enqueue DR-00000000-00000000                                  
Enqueue CF-00000000-00000000                                  
Enqueue TT-00000001-00000000                                  
Enqueue PS-00000001-00000E0F

—————————————————————————————————————-

Node 2: the load spiked to 665 within moments

-bash-3.2# uptime
13:49:56 up 93 days, 3:50, 4 users, load average: 665.00, 585.62, 416.01
-bash-3.2# su - oracle
[oracle@dm01db02 ~]$ ps -ef |grep -i local=no |awk '{print $2}'|xargs kill -9
kill 104896: No such process
kill 104996: No such process
kill 116568: No such process
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$ uptime
13:52:09 up 93 days, 3:52, 12 users, load average: 542.61, 584.73, 438.93
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$ ora active
13:52:12 up 93 days, 3:52, 12 users, load average: 499.96, 575.19, 436.62
select b.sid,substr(b.username,1,10) username,decode(program, null, machine,replace(program,' (TNS V1-V3)','')||decode(machine,null,'@'||terminal)) machine,
*

Node 2 log:

Nov  2 13:07:55 dm01db02 kernel: osysmond.bin: page allocation failure. order:5, mode:0xd0
Nov  2 13:07:55 dm01db02 kernel:
Nov  2 13:07:55 dm01db02 kernel: Call Trace:
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8000f6ee>] __alloc_pages+0x2ef/0x308
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff800179d2>] cache_grow+0x139/0x3c7
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8005bdfe>] cache_alloc_refill+0x138/0x188
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff800deaa4>] __kmalloc+0x95/0x9f
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8002de21>] __alloc_skb+0x5c/0x12e
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff80025f4d>] tcp_sendmsg+0x184/0xb07
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff80063002>] thread_return+0x62/0xfe
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff80054f00>] sock_sendmsg+0xf8/0x14a
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff800a2f69>] autoremove_wake_function+0x0/0x2e
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8003e10e>] do_futex+0x2c2/0xce5
Nov  2 13:08:28 dm01db02 kernel:  [<ffffffff8022e041>] sys_sendto+0x131/0x164
Nov  2 13:08:28 dm01db02 kernel:  [<ffffffff800b9042>] audit_filter_syscall+0x87/0xad
Nov  2 13:08:28 dm01db02 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0




By this point the kernel could no longer get memory:


[root@dm01db02 log]# cat /proc/meminfo 
MemTotal:     98848188 kB
MemFree:      23464372 kB
Buffers:         22128 kB
Cached:       28832676 kB
SwapCached:     171616 kB
Active:       31461904 kB
Inactive:      1366720 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     98848188 kB
LowFree:      23464372 kB
SwapTotal:    25165816 kB
SwapFree:     24766720 kB
Dirty:            1044 kB

System load at 13:07:

top - 13:07:30 up 93 days,  3:07,  0 users,  load average: 18.21, 9.87, 7.92
Tasks: 1029 total,  36 running, 991 sleeping,   0 stopped,   2 zombie
Cpu(s): 20.6%us, 73.9%sy,  0.0%ni,  5.1%id,  0.2%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  98848188k total, 98193652k used,   654536k free,      972k buffers
Swap: 25165816k total,   673764k used, 24492052k free, 30687764k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                  
114139 oracle    18   0 33.3g  10g 9.1g R 100.0 10.8   1:48.71 oracleedw12 (LOCAL=NO)                                                                  
109482 oracle    25   0 32.9g  16g  16g R 100.0 17.8 134:19.39 oracleedw12 (LOCAL=NO)                                                                  
  1403 root      19  -5     0    0    0 R 100.0  0.0  11:46.46 [kswapd0]     ---------------> SWAP                                                                             
105153 oracle    25   0 32.3g 134m  89m R 100.0  0.1   0:08.46 oracleedw12 (LOCAL=NO)                                                                  
122184 oracle    16   0 32.2g 8.1g 8.0g S 100.0  8.5   0:45.68 oracleedw12 (LOCAL=NO)                                                                  
105965 root      24   0     0    0    0 Z 100.0  0.0   0:05.62 [rds-ping] <defunct>                                                                    
105068 oracle    25   0 32.6g 1.3g 1.0g R 100.0  1.4   0:22.82 oracleedw12 (LOCAL=NO)                                                                  
104996 oracle    25   0 32.6g 1.7g 1.3g R 100.0  1.8   0:31.10 oracleedw12 (LOCAL=NO)                                                                  
 67163 oracle    25   0 33.4g  15g  14g R 99.4 16.1 167:26.91 oracleedw12 (LOCAL=NO)      

Each of these top sessions was burning 100% of a CPU.

Nov  2 13:33:39 dm01db02 kernel: Node 0 DMA: 6*4kB 2*8kB 3*16kB 3*32kB 3*64kB 3*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 2*4096kB = 9720kB
Nov  2 13:33:39 dm01db02 kernel: Node 0 DMA32: 0*4kB 0*8kB 3*16kB 1*32kB 0*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 92*4096kB = 380752kB
Nov  2 13:33:39 dm01db02 kernel: Node 0 Normal: 105*4kB 33*8kB 0*16kB 1*32kB 11*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 11*4096kB = 50444kB
Nov  2 13:33:39 dm01db02 kernel: Node 0 HighMem: empty
Nov  2 13:33:39 dm01db02 kernel: 7498545 pagecache pages
Nov  2 13:33:39 dm01db02 kernel: Swap cache: add 3728475, delete 3674006, find 632575/817933, race 97+346
Nov  2 13:33:39 dm01db02 kernel: Free swap  = 21173180kB
Nov  2 13:33:39 dm01db02 kernel: Total swap = 25165816kB
Nov  2 13:33:39 dm01db02 kernel: Free swap:       21173140kB
Nov  2 13:33:39 dm01db02 kernel: 25690112 pages of RAM
Nov  2 13:33:39 dm01db02 kernel: 978065 reserved pages
Nov  2 13:33:39 dm01db02 kernel: 555281842 pages shared
Nov  2 13:33:39 dm01db02 kernel: 54880 pages swap cached

DRM information:

2012-11-02 12:58:21.409428 :
End DRM(3959) for pkey transfer request(s) from 1
* received DRM start msg from 1 (cnt 3, last 1, rmno 3960)

*** 2012-11-02 13:02:45.382
Rcvd DRM(3960) AFFINITY Transfer pkey 2220292.0 to 2 oscan 1.1
Rcvd DRM(3960) AFFINITY Transfer pkey 2221641.0 to 1 oscan 1.1
Rcvd DRM(3960) AFFINITY Transfer pkey 2221642.0 to 1 oscan 1.1
ftd (30) received from node 1 (8 0.30/0.0)
all ftds received

* kjxftdn: break from kjxftdn, post lmon later
ftd (33) received from node 1 (8 0.33/0.0)
all ftds received

* kjxftdn: break from kjxftdn, post lmon later

*** 2012-11-02 13:02:45.561
ftd (35) received from node 1 (8 0.35/0.0)
all ftds received

* kjxftdn: break from kjxftdn, post lmon later
ftd (37) received from node 1 (8 0.37/0.0)
all ftds received
ASH WORST MINUTES FOR DRM FREEZE WAITS:

APPROACH: These are the minutes where the avg drm freeze time
was the highest (in milliseconds).

MINUTE              INST_ID EVENT                            TOTAL_WAIT_TIME      WAITS   AVG_TIME_WAITED
---------------- ---------- ------------------------------ ----------------- ---------- -----------------
Nov02_1359                2 gcs drm freeze in enter server         10757.718         30           358.591
Nov02_1429                2 gcs drm freeze in enter server          2495.686          8           311.961
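
The listing above is the output of an ASH-style query. A minimal sketch of such a query is below, assuming the Diagnostics Pack is licensed; it reads gv$active_session_history (use dba_hist_active_sess_history for samples that have already aged out of memory), and time_waited is reported in microseconds:

-- Worst minutes for DRM freeze waits, summarised from ASH samples
SELECT TO_CHAR(sample_time, 'MonDD_HH24MI')  minute,
       inst_id,
       event,
       ROUND(SUM(time_waited) / 1000, 3)     total_wait_time_ms,
       COUNT(*)                              waits,
       ROUND(AVG(time_waited) / 1000, 3)     avg_time_waited_ms
FROM   gv$active_session_history
WHERE  event = 'gcs drm freeze in enter server'
GROUP  BY TO_CHAR(sample_time, 'MonDD_HH24MI'), inst_id, event
ORDER  BY avg_time_waited_ms DESC;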


DYNAMIC_REMASTER_STATS
This shows where time is spent during DRM operations.

Instance: 1
Remaster Ops: 7
Remaster Time: 2275
Remastered Objects: 127
Quiesce Time: 165
Freeze Time: 10
Cleanup Time: 131
Replay Time: 108
Fixwrite Time: 184
Sync Time: 1674
Resources Cleaned: 0
Replayed Locks Sent: 822
Replayed Locks Received: 1299002
Current Objects: 39

Instance: 2
Remaster Ops: 7
Remaster Time: 2263
Remastered Objects: 127
Quiesce Time: 739
Freeze Time: 13
Cleanup Time: 218
Replay Time: 829
Fixwrite Time: 426
Sync Time: 37
Resources Cleaned: 0
Replayed Locks Sent: 91404
Replayed Locks Received: 571022
Current Objects: 114
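
The per-instance figures above can be pulled from gv$dynamic_remaster_stats in 11.2; a minimal sketch (column list trimmed to the fields shown above, statistics are cumulative since instance startup):

-- Where DRM time is being spent, per instance
SELECT inst_id,
       remaster_ops,
       remaster_time,
       remastered_objects,
       quiesce_time,
       freeze_time,
       cleanup_time,
       replay_time,
       fixwrite_time,
       sync_time,
       replayed_locks_sent,
       replayed_locks_received,
       current_objects
FROM   gv$dynamic_remaster_stats;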

Attached is a simplified CHM report, collected with OCLUMON:

system_dump_chm.txt

Using the netstat output captured before the process was killed, we find the corresponding IP:PORT:

tcp        0      0 10.0.1.204:1522             10.1.0.90:50956             ESTABLISHED 27295/oracleedw12  

From the IP we identify the client machine: 10.1.0.90 (xen19-vm05).

From the machine name we can then locate the offending sql_id, for example:
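
A minimal sketch of that lookup, assuming the sessions from the suspect client are still connected (the machine name is the one found above):

-- Find the sql_id of sessions connecting from the suspect client machine
SELECT inst_id, sid, serial#, machine, port, sql_id, event
FROM   gv$session
WHERE  machine = 'xen19-vm05'
AND    type = 'USER';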

Summary

Around 12:30 the QRY users kicked off a large volume of operations, accumulating roughly an hour of activity, which probably triggered heavy DRM. Over this prolonged period CPU idle dropped to a fraction of a percent and memory was exhausted; since the Exadata database servers are not especially powerful, the node effectively hung. Also note that the PTE (page table) memory consumed by a very large number of processes can exhaust memory in exactly the same way.

Please note that even though Exadata uses RDS, the node allocation policy for this workload still needs to be adjusted. The DBAs will discuss whether to disable DRM and RML; as a follow-up, the offending SQL will be reviewed with the BI team, and Resource Manager may be enabled to throttle the QRY operations.
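
For reference, the approach most commonly cited for switching DRM off in 11gR2 uses hidden parameters; the sketch below is an assumption of that approach, not a recommendation, and should be confirmed with Oracle Support before touching a production system:

-- Commonly cited underscore parameters for disabling DRM (assumption: verify with Oracle Support first)
ALTER SYSTEM SET "_gc_policy_time"   = 0     SCOPE = SPFILE SID = '*';  -- turn off DRM object affinity
ALTER SYSTEM SET "_gc_undo_affinity" = FALSE SCOPE = SPFILE SID = '*';  -- turn off undo affinity
-- A restart of all instances is required for the change to take effect.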

Oracle Platinum Services

October 23, 2012 Exadata, oracle No comments

Oracle Platinum Services is a special entitlement available under standard Oracle support: it provides customers with additional services beyond the scope of the standard contract in order to maximise the ownership experience for Exadata users. Production machines running one of Oracle's certified standard configurations maintain a two-way connection with Oracle's call center. Platinum Services represents a new support model in which customers can troubleshoot together with remote engineers; Exadata faults are handled by the remote call center, including remote fault monitoring, with faster response and recovery times and update/patch deployment, all at no additional cost. Oracle Platinum Services goes far beyond traditional IT support.

Oracle currently offers this service for Exadata, Exalogic, and SuperCluster.

SPARC SuperCluster :ORACLE SUPER CLUSTER SUN T4-4

For this service Oracle supplies a gateway machine free of charge, deploys OEM 12c, configures a VPN, and uses ASR to send reports to the remote call center.

Oracle backs the service with its strongest team of experts; these SRs are handled directly by the US organisation and may be routed straight to the development teams.

With the Exadata X3 series, Oracle Platinum Services may be bundled into the standard offering, which shows how determined Oracle is to push this service and also reflects its cloud strategy: this is a way to better promote an integrated, engineered-systems style of service. Our company will be the first pilot site for this service in mainland China; looking forward to it!

Exadata Smart Flash Logging

September 5, 2012 Exadata, oracle No comments

Oracle Exadata introduced the Smart Flash Logging feature in storage software 11.2.2.4.0. In an Exadata environment, when Oracle needs to write redo the cell issues the write in parallel to both disk and flash, and the RDBMS is told to carry on as soon as either side completes; the feature is aimed at improving the response time and throughput of redo writes on Exadata. (The feature allows redo writes to be written to both flash cache and disk controller cache, with an acknowledgement sent to the RDBMS as soon as either of these writes completes; this improves response times and throughput.) As to whether ESFL can actually reduce log file sync (LFS) waits, I remain skeptical.
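
One way to judge whether Smart Flash Logging actually helps a given workload is to compare the redo-write latency distribution on the database side before and after enabling it. A minimal sketch using the standard v$event_histogram view (nothing Exadata-specific is assumed):

-- Distribution of redo write and commit waits; the long tail is what ESFL is meant to cut
SELECT event, wait_time_milli, wait_count
FROM   v$event_histogram
WHERE  event IN ('log file parallel write', 'log file sync')
ORDER  BY event, wait_time_milli;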

Requirements:
Version 11.2.2.4 at the cell level
Version 11.2.0.3 at the DB level (when it comes out) or 11.2.0.2 BP11

[root@dm01cel01 ~]# imageinfo 

Kernel version: 2.6.18-274.18.1.0.1.el5 #1 SMP Thu Feb 9 19:07:16 EST 2012 x86_64
Cell version: OSS_11.2.3.1.0_LINUX.X64_120304
Cell rpm version: cell-11.2.3.1.0_LINUX.X64_120304-1

Active image version: 11.2.3.1.0.120304
Active image activated: 2012-05-09 14:03:04 -0700
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7

In partition rollback: Impossible

Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.3.1.0.120304

Inactive image version: undefined
Rollback to the inactive partitions: Impossible

CellCLI>  list metriccurrent where objectType='FLASHLOG'
	 FL_ACTUAL_OUTLIERS           	 FLASHLOG	 0 IO requests
	 FL_BY_KEEP                   	 FLASHLOG	 0
	 FL_DISK_FIRST                	 FLASHLOG	 24,627,386 IO requests
	 FL_DISK_IO_ERRS              	 FLASHLOG	 0 IO requests
	 FL_EFFICIENCY_PERCENTAGE     	 FLASHLOG	 100 %
	 FL_EFFICIENCY_PERCENTAGE_HOUR	 FLASHLOG	 100 %
	 FL_FLASH_FIRST               	 FLASHLOG	 960,101 IO requests
	 FL_FLASH_IO_ERRS             	 FLASHLOG	 0 IO requests
	 FL_FLASH_ONLY_OUTLIERS       	 FLASHLOG	 0 IO requests
	 FL_IO_DB_BY_W                	 FLASHLOG	 1,395,420 MB
	 FL_IO_DB_BY_W_SEC            	 FLASHLOG	 10.797 MB/sec
	 FL_IO_FL_BY_W                	 FLASHLOG	 1,457,415 MB
	 FL_IO_FL_BY_W_SEC            	 FLASHLOG	 10.966 MB/sec
	 FL_IO_W                      	 FLASHLOG	 25,587,487 IO requests
	 FL_IO_W_SKIP_BUSY            	 FLASHLOG	 0 IO requests
	 FL_IO_W_SKIP_BUSY_MIN        	 FLASHLOG	 0.0 IO/sec
	 FL_IO_W_SKIP_LARGE           	 FLASHLOG	 0 IO requests
	 FL_PREVENTED_OUTLIERS        	 FLASHLOG	 732 IO requests

CellCLI> list flashlog detail
	 name:              	 dm01cel01_FLASHLOG
	 cellDisk:          	 FD_01_dm01cel01,FD_05_dm01cel01,FD_12_dm01cel01,FD_08_dm01cel01,FD_13_dm01cel01,FD_11_dm01cel01,FD_09_dm01cel01,FD_15_dm01cel01,FD_00_dm01cel01,FD_02_dm01cel01,FD_03_dm01cel01,FD_04_dm01cel01,FD_10_dm01cel01,FD_14_dm01cel01,FD_07_dm01cel01,FD_06_dm01cel01
	 creationTime:      	 2012-06-26T17:41:40+08:00
	 degradedCelldisks: 	 
	 effectiveSize:     	 512M
	 efficiency:        	 100.0
	 id:                	 4146aa41-632a-4210-8652-0b9da4216ac6
	 size:              	 512M
	 status:            	 normal

The ratio of flash-first to disk-first redo writes is only about 3.9% (FL_FLASH_FIRST 960,101 / FL_DISK_FIRST 24,627,386 in the metrics above), which is not a very high proportion. In an Exadata system the flash log cannot fully replace redo writes to disk; see Harrison's part 1 and part 2, which show how weak SSDs can be at sequential writes.

Kevin Closson has also pointed out what ESFL is really for:

“Exadata Smart Flash Log (ESFL) is really just a way to flow LGWR traffic through different adaptors (PCI Flash). In fact, simply plugging in another LSI HDD controller with a few SATA drives dedicated to redo streaming writes would actually do quite well if not as well as ESFL. That is a topic for a different post.”

How to disable ESFL

To disable Smart Flash Log for all databases
— Use DROP FLASHLOG CellCLI command in the storage servers
To disable Smart Flash Log for an individual database
— Use ALTER IORMPLAN dbplan=((name=test, flashLog=off))

REF: Exadata Smart Flash Cache Features and the Oracle Exadata Database Machine

Exadata offloading and Smart Scan

August 25, 2012 Exadata, oracle No comments

A summary of smart scan on the Exadata X2-2 put together by my colleague aaqwsh.

Exadata migration tips

August 7, 2012 Exadata, migration, oracle No comments

The Exadata migration is complete. We used a physical Data Guard switchover, building the standby with 11g active duplicate. The total data size was about 2 TB, the copy took roughly 10 hours, and throughput reached close to 60 MB/s.
The advantages of this approach are:
1. No downtime on the primary database.
2. The new diskgroups can use the Exadata best-practice AU_SIZE=4M.
3. The copy does not generate concentrated I/O, although it is still advisable to run it at night.
4. You can simply switch over and keep the original database as the standby (a quick status check is sketched after this list).
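
Around the switchover itself, the role transition can be sanity-checked from both sides with a quick query; a minimal sketch:

-- Run on both the primary and the standby before and after the switchover
SELECT name, database_role, open_mode, switchover_status
FROM   v$database;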

Once the work is complete, we can check the cluster resource status:

[grid@dm01db01 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA_DM01.dg
               ONLINE  ONLINE       dm01db01                                     
               ONLINE  ONLINE       dm01db02                                     
ora.DBFS_DG.dg
               ONLINE  ONLINE       dm01db01                                     
               ONLINE  ONLINE       dm01db02                                     
ora.LISTENER.lsnr
               ONLINE  ONLINE       dm01db01                                     
               ONLINE  ONLINE       dm01db02                                     
ora.RECO_DM01.dg
               ONLINE  ONLINE       dm01db01                                     
               ONLINE  ONLINE       dm01db02                                     
ora.asm
               ONLINE  ONLINE       dm01db01                 Started             
               ONLINE  ONLINE       dm01db02                 Started             
ora.gsd
               OFFLINE OFFLINE      dm01db01                                     
               OFFLINE OFFLINE      dm01db02                                     
ora.net1.network
               ONLINE  ONLINE       dm01db01                                     
               ONLINE  ONLINE       dm01db02                                     
ora.ons
               ONLINE  ONLINE       dm01db01                                     
               ONLINE  ONLINE       dm01db02                                     
ora.registry.acfs
               ONLINE  ONLINE       dm01db01                                     
               ONLINE  ONLINE       dm01db02                                     
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       dm01db02                                     
ora.cvu
      1        ONLINE  ONLINE       dm01db02                                     
ora.dm01db01.vip
      1        ONLINE  ONLINE       dm01db01                                     
ora.dm01db02.vip
      1        ONLINE  ONLINE       dm01db02                                     
ora.edw1.db
      1        ONLINE  ONLINE       dm01db01                 Open                
      2        ONLINE  ONLINE       dm01db02                 Open                
ora.oc4j
      1        ONLINE  ONLINE       dm01db02                                     
ora.scan1.vip
      1        ONLINE  ONLINE       dm01db02                                     
[grid@dm01db01 ~]$ 


SQL> /

GROUP_NUMBER NAME							  VALUE
------------ ------------------------------------------------------------ ------------------------------------------------------------
	   1 access_control.enabled					  FALSE
	   1 access_control.umask					  066
	   1 au_size							  4194304
	   1 cell.smart_scan_capable					  TRUE
	   1 compatible.asm						  11.2.0.2.0
	   1 compatible.rdbms						  11.2.0.2.0
	   1 disk_repair_time						  3.6h
	   1 idp.boundary						  auto
	   1 idp.type							  dynamic
	   1 sector_size						  512
	   1 template.ARCHIVELOG.mirror_region				  0

As you can see, au_size=4194304, the Oracle-recommended best AU size for Exadata. For background on the 4M value, see the earlier post "exadata AU_SIZE".
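
The listing above looks like the output of a query against v$asm_attribute. Note that au_size is fixed when the diskgroup is created and cannot be changed afterwards; a minimal sketch of creating a diskgroup with the 4M best-practice value (the grid disk pattern and diskgroup name here are illustrative, not necessarily the ones used in this migration):

-- AU size must be chosen at creation time; 4M is the Exadata best practice
CREATE DISKGROUP DATA_DM01 NORMAL REDUNDANCY
  DISK 'o/*/DATA_DM01_CD_*'
  ATTRIBUTE 'au_size'                 = '4M',
            'compatible.asm'          = '11.2.0.2.0',
            'compatible.rdbms'        = '11.2.0.2.0',
            'cell.smart_scan_capable' = 'TRUE';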

The chart referenced above shows that a 4M AU size gives a fairly obvious improvement for Exadata smart scan I/O. Testing also showed that 11g active duplicate has no effect on "cell physical IO bytes saved during optimized RMAN file restore".

SQL> SELECT name, value/1024/1024 MB from v$sysstat a WHERE
  2  a.name = 'physical read total bytes' OR
a.name = 'physical write total bytes' OR
a.name = 'cell physical IO interconnect bytes' OR
a.name = 'cell physical IO bytes eligible for predicate offload' OR
a.name = 'cell physical IO bytes saved during optimized file creation' OR
a.name = 'cell physical IO bytes saved during optimized RMAN file restore' OR
a.name = 'cell IO uncompressed bytes' OR
a.name = 'cell physical IO interconnect bytes returned by smart scan' OR
a.name = 'cell physical IO bytes saved by storage index';  3    4    5    6    7    8    9   10  

NAME									 MB
---------------------------------------------------------------- ----------
physical read total bytes					 2043013.32
physical write total bytes					 187975.958
cell physical IO interconnect bytes				 1497804.22
cell physical IO bytes saved during optimized file creation		  0
cell physical IO bytes saved during optimized RMAN file restore 	  0
cell physical IO bytes eligible for predicate offload		 1042550.99
cell physical IO bytes saved by storage index			 127372.633
cell physical IO interconnect bytes returned by smart scan	 119367.207
cell IO uncompressed bytes					 915887.641
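
From these numbers, smart scan returned roughly 119,367 MB over the interconnect out of about 1,042,551 MB that was eligible for predicate offload, i.e. only around 11% of the eligible data actually travelled to the database servers. A minimal sketch of computing that ratio directly from v$sysstat (statistic names exactly as shown above):

-- Rough smart scan efficiency: bytes returned over the interconnect vs bytes eligible for offload
SELECT ROUND(100 * ss.value / NULLIF(el.value, 0), 2) pct_returned_by_smart_scan
FROM   v$sysstat ss, v$sysstat el
WHERE  ss.name = 'cell physical IO interconnect bytes returned by smart scan'
AND    el.name = 'cell physical IO bytes eligible for predicate offload';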

The Exadata cell-related statistics in v$sysstat are described below:

Statistic Description

cell flash cache read hits

The number of read requests that were a cache hit on exadata flash cache.

cell IO uncompressed bytes

The total size of uncompressed data that is processed on the cell. For scan on hybrid-columnar-compressed tables, this statistic is the size of data after decompression.

cell physical IO interconnect bytes returned by smart scan

The number of bytes that are returned by the cell for Smart Scan only, and does not include bytes for other database I/O.

cell physical IO bytes saved by storage index

The number of bytes saved by storage index.

cell physical IO bytes eligible for predicate offload

The total number of I/O bytes processed with physical disks when processing was offloaded to the cell.

cell physical IO bytes sent directly to DB node to balance CPU usage

The number of I/O bytes sent back to the database server for processing due to CPU usage on Exadata Cell.

cell physical IO bytes saved during optimized file creation

The number of I/O bytes saved by the database host by offloading the file creation operation to cells. This statistic shows the Exadata Cell benefit due to optimized file creation operations.

cell physical IO bytes saved during optimized RMAN file restore

The number of I/O bytes saved by the database host by offloading the RMAN file restore operation to cells. This statistic shows the Exadata Cell benefit due to optimized RMAN file restore operations.

cell physical IO interconnect bytes

The number of I/O bytes exchanged over the interconnection between the database host and cells.

physical read requests optimized

Total number of read requests satisfied either by using Exadata Smart Flash Cache or storage index.

physical read total bytes

Total amount of I/O bytes for reads processed with physical disks. This includes when processing was offloaded to the cell and when processing was not offloaded.

physical read total bytes optimized

Total number of bytes read from Exadata Smart Flash Cache or storage index.

physical write total bytes

Total amount of I/O bytes for writes processed with physical disks. This includes when processing was offloaded to the cell and when processing was not offloaded.

Oracle's documentation also lists several approaches to Exadata migration.

Tip:

ASM Online Migration: This method is only applicable if your database is already using ASM and you do not need to adjust the ASM allocation unit (AU) size. 
To use this method you must also be able to connect your current database storage to Database Machine and migrate the database instances to Database 
Machine. After migrating the database instances, the data migration is very simple, just add new Exadata-based grid disks to your ASM disk groups and drop 
existing disks from your ASM disk groups to migrate the data.

Of course there are many other migration methods as well: OGG, DSG, even DDS. One point worth making is that although ASM Online Migration does not strictly require the AU_SIZE to match (the best practice on Exadata is 4M), migrating between diskgroups with different AU_SIZE values will affect performance after the migration.
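
For the ASM Online Migration approach in the Tip above, the data movement itself is a single rebalance: add the Exadata grid disks and drop the legacy disks in one statement. A minimal sketch (diskgroup name, grid disk patterns, legacy disk names and power level are all illustrative):

-- Add the Exadata grid disks and drop the legacy disks in one rebalance operation
ALTER DISKGROUP DATA
  ADD  DISK 'o/*/DATA_CD_*_dm01cel01',
            'o/*/DATA_CD_*_dm01cel02',
            'o/*/DATA_CD_*_dm01cel03'
  DROP DISK legacy_disk_01, legacy_disk_02
  REBALANCE POWER 11;

-- The old storage can be disconnected once the rebalance has completed
SELECT group_number, operation, state, power, est_minutes
FROM   v$asm_operation;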

Some Exadata Docs

July 15, 2012 Exadata, oracle No comments

The documents below give an overview of Exadata hardware and software, and explain what information to collect, and how, when troubleshooting, covering hardware, software, network, ILOM, and so on. There are also some recommendations on remote disaster recovery for Exadata.

Doc ID 888828.1 Database Machine and Exadata Storage Server 11g Release 2 (11.2) Supported Firmware and Software Versions

Doc ID 1187674.1 Master Note for Oracle Database Machine and Exadata Storage Server

Doc ID 735323.1 Exadata Storage Server Diagnostic Collection Guide

Doc ID 1070954.1 Oracle Exadata Database Machine exachk or HealthCheck

Doc ID 1308924.1 Disaster Recovery Best Practices for Exadata Database Machine using Data Guard

Doc ID 761868.1 Disk Failures

Doc ID 1053498.1 Networking layer

Doc ID 1064019.1 Memory, PCI or Power Supply components

Doc ID 1062544.1 ILOM, ILO or LO100 components

Exadata RDS

July 10, 2012 Exadata, oracle No comments

Oracle RAC RDS reference:


Oracle Real Application Clusters (RAC) allows Oracle Database to run any packaged or custom application, unchanged across a set of clustered servers. This provides high availability and flexible scalability. If a clustered server fails, then database continues running on the rest of the servers. When you need additional processing power, you can add additional servers to the cluster online.
A key hardware component of this technology is the private interconnect network between the clustered servers. The interconnect is used by the cluster for inter-node messaging. The interconnect is also used by RAC to implement cache fusion technology.
Reliable Datagram Sockets (RDS) protocol provides reliable datagram services multiplexing UDP packets over InfiniBand connection improving performance to Oracle RAC. It provides high performance cluster interconnect for Oracle 10g RAC, utilizing InfiniBand which has 10X to 30X bandwidth advantage and 10X to 30X latency reduction vs. Gigabit Ethernet.

Only the database servers are configured with a 64K MTU. Presumably this is to benefit TCP/IP traffic between the database servers, and between the database servers and any external host that is linked to the IB switch. You may be surprised to know that the IB ports on the storage cells are configured with the standard 1,500 byte MTU size. The large MTU size is not necessary on the storage cells, because I/O between the database grid and the storage grid utilizes the RDS protocol, which is much more efficient for database I/O and bypasses the TCP/IP protocol stack altogether.

As you can see from the diagram, using the RDS protocol to bypass the TCP processing cuts out a portion of the overhead required to transfer data across the network. Note that the RDS protocol is also used for interconnect traffic between RAC nodes.

Currently both the Exadata database servers and the cells use the RDS protocol, which makes communication across the whole machine much smoother.

On the cell side you can see this in the log:

CELL communication is configured to use 1 interface(s):
192.168.10.3
Memory reserved for cellsrv: 22355 MBMemory for other processes: 1600 MB.
Auto Online Feature 1.2
CellServer MD5 Binary Checksum: a9f851b13ee6a072d7724e21c0c99118
OS Hugepage status:
Total/free hugepages available=4013/4013; hugepage size=2048KB
MS_ALERT HUGEPAGE CLEAR
Cache Allocation: Num 1MB hugepage buffers: 8000 Num 1MB non-hugepage buffers: 0
Cache Allocation: BufferSize: 512. Num buffers: 5000. Start Address: 2AACA2E00000
Cache Allocation: BufferSize: 2048. Num buffers: 5000. Start Address: 2AACA3072000
Cache Allocation: BufferSize: 4096. Num buffers: 5000. Start Address: 2AACA3A37000
Cache Allocation: BufferSize: 8192. Num buffers: 10000. Start Address: 2AACA4DC0000
Cache Allocation: BufferSize: 16384. Num buffers: 5000. Start Address: 2AACA9BE1000
Cache Allocation: BufferSize: 32768. Num buffers: 5000. Start Address: 2AACAEA02000
Cache Allocation: BufferSize: 65536. Num buffers: 5000. Start Address: 2AACB8643000
Cache Allocation: BufferSize: 10485760. Num buffers: 23. Start Address: 2AACCBEC4000
CELL communication is configured to use 1 interface(s):
192.168.10.3
IPC version: Oracle RDS/IP (generic)
IPC Vendor 1 Protocol 3
Version 4.1

The database servers likewise use RDS between themselves, which can be verified with oradebug:

SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.
SQL>
SQL> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/edw/edw1/trace/edw1_ora_81207.trc

[oracle@dm01db01 ~]$ more /u01/app/oracle/diag/rdbms/edw/edw1/trace/edw1_ora_81207.trc |grep RDS
SKGXP:[2ba9e2cb3e08.43]{ctx}: SSKGXPT 0x2ba9e2cb5548 flags 0×5 { READPENDING } sockno 5 IP 192.168.10.1 RDS 38133 lerr 0
SKGXP:[2ba9e2cb3e08.96]{ctx}: SSKGXPT 0x2ba9e3033e50 flags 0×0 sockno 11 IP 192.168.10.1 RDS 49737 lerr 0
SKGXP:[2ba9e2cb3e08.97]{ctx}: SKGXPGPID Internet address 192.168.10.1 RDS port number 49737, mask 22
[oracle@dm01db01 ~]$

[root@dm01cel01 ExadataRDS]# rds-info -I

RDS IB Connections:
LocalAddr RemoteAddr LocalDev RemoteDev
192.168.10.3 192.168.10.3 fe80::21:2800:1ef:6ba3 fe80::21:2800:1ef:6ba3
192.168.10.3 192.168.10.2 fe80::21:2800:1ef:6ba3 fe80::21:2800:1ef:783f
192.168.10.3 192.168.10.1 fe80::21:2800:1ef:6ba3 fe80::21:2800:1ef:788b

Two new X2-2 machines arrive

May 29, 2012 Exadata, oracle 3 comments


Oracle's delivery service is certainly thorough: they sent a truck this size just to deliver two machines 🙂