Internals

Capture binds always actually ?

August 28, 2012 Internals, oracle No comments

我们可以通过v$sql_bind_capture 捕获到 bind values 这篇文章主要阐述v$sql_bind_capture与v$sql_plan的一些联系

SQL*Plus: Release 11.2.0.3.0 Production on Tue Aug 28 14:07:13 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 4275781632 bytes
Fixed Size		    2235208 bytes
Variable Size		  889193656 bytes
Database Buffers	 3372220416 bytes
Redo Buffers		   12132352 bytes
Database mounted.
Database opened.
SQL> drop table liu;

Table dropped.

SQL> conn liu/liu
Connected.
SQL> select * from liu;

       ID1	  ID2	     ID3
---------- ---------- ----------
	13	   14	      15
	 7	    8	       9
	10	   11	      12
	 1	    2	       3
	 4	    5	       6

SQL> variable N1 number;
variable N2 number;
variable N3 number;
begin
 :N1 := 1; 
 :N2 := 2;
 :N3 := 3;
 END;
 /
print N1;
print N2;
print N3; SQL> SQL> SQL>   2    3    4    5    6  
PL/SQL procedure successfully completed.

SQL> 
	N1
----------
	 1

SQL> 
	N2
----------
	 2

SQL> 

	N3
----------
	 3

SQL> SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3;

       ID1	  ID2	     ID3
---------- ---------- ----------
	 1	    2	       3

SQL> @find_sql
Enter value for sql_text: %FROM LIU WHERE ID1%
old   6: where upper(sql_text) like upper(nvl('&sql_text',sql_text))
new   6: where upper(sql_text) like upper(nvl('%FROM LIU WHERE ID1%',sql_text))
Enter value for sql_id: 
old   8: and sql_id like nvl('&sql_id',sql_id)
new   8: and sql_id like nvl('',sql_id)

SQL_ID					CHILD_NUMBER  PLAN_HASH      EXECS  AVG_ETIME	 AVG_LIO SQL_TEXT
--------------------------------------- ------------ ---------- ---------- ---------- ---------- --------------------------------------------------------------------------------
2y32qkr40b82a					   0 3882242470 	 1    .012357	      47 SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3


SQL> select * from table(dbms_xplan.display_cursor('2y32qkr40b82a',null,'typical +peeked_binds'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID	2y32qkr40b82a, child number 0
-------------------------------------
SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3

Plan hash value: 3882242470

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |	 |	 |     3 (100)| 	 |
|*  1 |  TABLE ACCESS FULL| LIU  |     1 |    39 |     3   (0)| 00:00:01 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :N1 (NUMBER): 1
   2 - :N2 (NUMBER): 2
   3 - :N3 (NUMBER): 3

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter(("ID1"=:N1 AND "ID2"=:N2 AND "ID3"=:N3))

Note
-----
   - dynamic sampling used for this statement (level=2)


29 rows selected.

SQL> select value_string from v$sql_bind_capture where sql_id='2y32qkr40b82a' 
  2  ;

VALUE_STRING
--------------------------------------------------------------------------------
1
2
3

继续改变bind values :

SQL> variable N1 number;
variable N2 number;
variable N3 number;
begin
 :N1 := 4; 
 :N2 := 5;
 :N3 := 6;
 END;
 /
print N1;
print N2;
print N3;SQL> SQL> SQL>   2    3    4    5    6  
PL/SQL procedure successfully completed.

SQL> 
	N1
----------
	 4

SQL> 
	N2
----------
	 5

SQL> 

	N3
----------
	 6

SQL> SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3;

       ID1	  ID2	     ID3
---------- ---------- ----------
	 4	    5	       6
	 
SQL> select * from table(dbms_xplan.display_cursor('2y32qkr40b82a',null,'typical +peeked_binds'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID	2y32qkr40b82a, child number 0
-------------------------------------
SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3

Plan hash value: 3882242470

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |	 |	 |     3 (100)| 	 |
|*  1 |  TABLE ACCESS FULL| LIU  |     1 |    39 |     3   (0)| 00:00:01 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :N1 (NUMBER): 1
   2 - :N2 (NUMBER): 2
   3 - :N3 (NUMBER): 3

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter(("ID1"=:N1 AND "ID2"=:N2 AND "ID3"=:N3))

Note
-----
   - dynamic sampling used for this statement (level=2)


29 rows selected.


SQL>  select value_string from v$sql_bind_capture where sql_id='2y32qkr40b82a' ;

VALUE_STRING
--------------------------------------------------------------------------------
1
2
3

没有变化,继续强制产生一个child cursor

SQL> insert into liu values (128000,128000,128000);

1 row created.

SQL> variable N1 number;
variable N2 number;
variable N3 number;
begin
 :N1 := 128000; 
 :N2 := 128000;
 :N3 := 128000;
 END;
 /SQL> SQL> SQL>   2    3    4    5    6  

PL/SQL procedure successfully completed.

SQL> alter session set  optimizer_mode=first_rows_100;

Session altered.

SQL> SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3;

       ID1	  ID2	     ID3
---------- ---------- ----------
    128000     128000	  128000
    

Enter value for sql_text: %FROM LIU WHERE ID1%
old   6: where upper(sql_text) like upper(nvl('&sql_text',sql_text))
new   6: where upper(sql_text) like upper(nvl('%FROM LIU WHERE ID1%',sql_text))
Enter value for sql_id: 
old   8: and sql_id like nvl('&sql_id',sql_id)
new   8: and sql_id like nvl('',sql_id)

SQL_ID					CHILD_NUMBER  PLAN_HASH      EXECS  AVG_ETIME	 AVG_LIO SQL_TEXT
--------------------------------------- ------------ ---------- ---------- ---------- ---------- --------------------------------------------------------------------------------
2y32qkr40b82a					   0 3882242470 	 1    .002613	      23 SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3
2y32qkr40b82a					   1 3882242470 	 2    .006787	    65.5 SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3


SQL> select * from table(dbms_xplan.display_cursor('2y32qkr40b82a',null,'typical +peeked_binds'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID	2y32qkr40b82a, child number 0
-------------------------------------
SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3

Plan hash value: 3882242470

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |	 |	 |     3 (100)| 	 |
|*  1 |  TABLE ACCESS FULL| LIU  |     1 |    39 |     3   (0)| 00:00:01 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :N1 (NUMBER): 128000
   2 - :N2 (NUMBER): 128000
   3 - :N3 (NUMBER): 128000

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter(("ID1"=:N1 AND "ID2"=:N2 AND "ID3"=:N3))

Note
-----
   - dynamic sampling used for this statement (level=2)

SQL_ID	2y32qkr40b82a, child number 1
-------------------------------------
SELECT * FROM LIU WHERE ID1= :N1 AND ID2= :N2 AND ID3= :N3


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3882242470

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |	 |	 |     3 (100)| 	 |
|*  1 |  TABLE ACCESS FULL| LIU  |     1 |    39 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - :N1 (NUMBER): 1
   2 - :N2 (NUMBER): 2
   3 - :N3 (NUMBER): 3

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("ID1"=:N1 AND "ID2"=:N2 AND "ID3"=:N3))

Note

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
-----
   - dynamic sampling used for this statement (level=2)


58 rows selected.


SQL> select value_string from v$sql_bind_capture where sql_id='2y32qkr40b82a'

VALUE_STRING
--------------------------------------------------------------------------------
1
2
3
128000
128000
128000

6 rows selected.

在产生子游标的情况下同时捕获到,我们来看下other_xml的值

select other_xml from v$sql_plan where sql_id=’2y32qkr40b82a’ and other_xml is not null

<other_xml><info type="db_version">11.2.0.3</info>
<info type="parse_schema"><![CDATA["LIU"]]></info><info type="dynamic_sampling">2</info>
<info type="plan_hash">3882242470</info><info type="plan_hash_2">281948136</info>
<peeked_binds><bind nam=":N1" pos="1" dty="2" pre="0" scl="0" mxl="22">c30d51</bind>
<bind nam=":N2" pos="2" dty="2" pre="0" scl="0" mxl="22">c30d51</bind>
<bind nam=":N3" pos="3" dty="2" pre="0" scl="0" mxl="22">c30d51</bind></peeked_binds>
<outline_data><hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint>
<hint><![CDATA[OPTIMIZER_FEATURES_ENABLE('11.2.0.3')]]></hint>
<hint><![CDATA[DB_VERSION('11.2.0.3')]]></hint>
<hint><![CDATA[FIRST_ROWS(100)]]></hint>
<hint><![CDATA[OUTLINE_LEAF(@"SEL$1")]]></hint>
<hint><![CDATA[FULL(@"SEL$1" "LIU"@"SEL$1")]]></hint>
</outline_data></other_xml>
SQL> select display_raw('c30d51','NUMBER') from dual;

DISPLAY_RAW('C30D51','NUMBER')
--------------------------------------------------------------------------------
128000

确实在other_xml中捕获到了这个值,这里要注意如果关闭”_optim_peek_user_binds” other_xml将无法捕获这些values

继续 我们开启两个窗口做loop循环:

session 1:

SQL>  declare 
N1  number;
N2  number;
N3  number;
v_count number;
begin
for i in 1..1000000000 loop
N1 := 4; 
N2 := 5;
N3 := 6;
SELECT count(*) into v_count FROM LIU WHERE ID1= N1 AND ID2= N2 AND ID3= N3;
end loop;
end;  2    3    4    5    6    7    8    9   10   11   12   13  
 14  /
 
 
session 2:

SQL>  declare 
N1  number;
N2  number;
N3  number;
v_count number;
begin
for i in 1..1000000000 loop
N1 := 7; 
N2 := 8;
N3 := 9;
SELECT count(*) into v_count FROM LIU WHERE ID1= N1 AND ID2= N2 AND ID3= N3;
end loop;
end;  2    3    4    5    6    7    8    9   10   11   12   13  
 14  /
 
session 3

SQL> @find_sql
Enter value for sql_text: %LIU%
old   6: where upper(sql_text) like upper(nvl('&sql_text',sql_text))
new   6: where upper(sql_text) like upper(nvl('%LIU%',sql_text))
Enter value for sql_id: 
old   8: and sql_id like nvl('&sql_id',sql_id)
new   8: and sql_id like nvl('',sql_id)

SQL_ID					CHILD_NUMBER  PLAN_HASH      EXECS  AVG_ETIME	 AVG_LIO SQL_TEXT
--------------------------------------- ------------ ---------- ---------- ---------- ---------- --------------------------------------------------------------------------------
33xdcqmh004pw					   0 4104423748     889189 .000024552 7.00001574 SELECT COUNT(*) FROM LIU WHERE ID1= :B3 AND ID2= :B2 AND ID3= :B1
3s3f5r30shvz4					   0 4104423748 	 1    .003918	       7 SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB
												 ) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB)
												 NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ IGNORE_WHERE_CLAU
												 SE NO_PARALLEL("LIU") FULL("LIU") NO_PARALLEL_INDEX("LIU") */ 1 AS C1, CASE WHEN
												  "LIU"."ID1"=:B1 AND "LIU"."ID2"=:B2 AND "LIU"."ID3"=:B3 THEN 1 ELSE 0 END AS C2
												  FROM "LIU"."LIU" "LIU") SAMPLESUB

g8h776pgzzxym					   0	      0 	 0   41.31562	 6161555  declare N1  number; N2  number; N3  number; v_count number; begin for i in 1..1
												 000000000 loop N1 := 4; N2 := 5; N3 := 6; SELECT count(*) into v_count FROM LIU
												 WHERE ID1= N1 AND ID2= N2 AND ID3= N3; end loop; end;

SQL_ID					CHILD_NUMBER  PLAN_HASH      EXECS  AVG_ETIME	 AVG_LIO SQL_TEXT
--------------------------------------- ------------ ---------- ---------- ---------- ---------- --------------------------------------------------------------------------------


SQL> alter system set "_cursor_bind_capture_interval"=1;

System altered.

15:19:38 SQL> /

VALUE_STRING
--------------------------------------------------------------------------------
4
5
6

Elapsed: 00:00:00.00
15:19:40 SQL> /

VALUE_STRING
--------------------------------------------------------------------------------
7
8
9

Elapsed: 00:00:00.01
15:19:41 SQL> /

VALUE_STRING
--------------------------------------------------------------------------------
7
8
9

Elapsed: 00:00:00.00
15:20:58 SQL> /

VALUE_STRING
--------------------------------------------------------------------------------
7
8
9

Elapsed: 00:00:00.00
15:20:59 SQL> /

VALUE_STRING
--------------------------------------------------------------------------------
7
8
9

Elapsed: 00:00:00.00
15:20:59 SQL> /

VALUE_STRING
--------------------------------------------------------------------------------
4
5
6

可以看到Capture binds 在不停的变化,而此时通过dbms_xplan peeking binds 则不会产生变化,这也验证了JL的这篇文章。
这篇文章的主旨为 “Capture binds always actually?”.通过实验已经证明了 – No 对于这个问题衍生出了peeked binds,Dion也曾总结过这个问题 可以总结为

– Peeked binds value of dbms_xplan.display_cursor is extracted from v$sql_plan.other_xml

– Captured and Peeked binds will refresh when generating child cursors

– Captured bind value is extraced from X$KQLFBC view.

– Peeked binds and Captured bind are not same thing

另外还可以通过一些手段去tracking bind values :10046,logminer等 下面给个示范:

SQL> oradebug setorapid 41;
Oracle pid: 41, Unix process pid: 30705, image: oracle@db-2-15 (TNS V1-V3)
SQL> 
SQL> 
SQL> oradebug dump errorstack 4;
Statement processed.
SQL> oradebug dump errorstack 4;
Statement processed.
SQL> oradebug tracefile_name;
/home/oracle/diag/rdbms/yhdpri/yhd/trace/yhd_ora_30705.trc

—– Bind Info (kkscoacd) —–
Bind#0
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
oacflg=13 fl2=206001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=7f3c6bfd2b78 bln=22 avl=03 flg=09
value=1231
Bind#1
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
oacflg=13 fl2=206001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=7f3c6bfd2ba8 bln=22 avl=04 flg=09
value=21231
Bind#2
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
oacflg=13 fl2=206001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=7f3c6bfd2bd8 bln=22 avl=03 flg=09
value=3455
Frames pfr 0x7f3c6bff26b0 siz=4400 efr 0x7f3c6bff25f8 siz=4384
Cursor frame dump
enxt: 5.0×00000010 enxt: 4.0×00000048 enxt: 3.0×00000078 enxt: 2.0×00000060
enxt: 1.0×00001000
pnxt: 1.0×00000010
kxscphp=0x7f3c6bff0898 siz=984 inu=864 nps=544
kxscdfhp=0x7f3c6bff0778 siz=984 inu=88 nps=0
kxscbhp=0x7f3c6bff0c90 siz=984 inu=184 nps=0
kxscwhp=0x7f3c6bff0db0 siz=18680 inu=8448 nps=0
kxscefhp=0x7f3c6bff0ed0 siz=4056 inu=40 nps=40
Starting SQL statement dump
SQL Information
user_id=0 user_name=SYS module=sqlplus@db-2-15 (TNS V1-V3) action=
sql_id=33xdcqmh004pw plan_hash_value=-190543548 problem_type=4
—– Current SQL Statement for this session (sql_id=33xdcqmh004pw) —–
SELECT COUNT(*) FROM LIU WHERE ID1= :B3 AND ID2= :B2 AND ID3= :B1

Refhttp://kerryosborne.oracle-guy.com/2009/07/creating-test-scripts-with-bind-variables/

处理坏块一例(2)

August 17, 2012 Internals, maintain, oracle 2 comments

帮朋友check一次中国银联数据库的坏块问题(并不是严格意义上的坏块),大致情况为smon需要trascation recover —>特定的block 发现无法读取 —>transaction recover出现问题 —>smon terminated

Wed Aug 15 19:02:57 2012
SMON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
Wed Aug 15 19:02:57 2012
Errors in file /oracle/admin/settora/bdump/settora_smon_4755904.trc:
ORA-00600: internal error code, arguments: [kdourp_inorder2], [43], [43], [7], [44], [], [], []
Doing block recovery for file 10 block 176769
Doing block recovery for file 26 block 3259716

Block recovery from logseq 108030, block 49455 to scn 41320651147
Wed Aug 15 19:02:58 2012
Recovery of Online Redo Log: Thread 1 Group 5 Seq 108030 Reading mem 0
Mem# 0: /orasvr/settora/redo05.log
Block recovery completed at rba 108030.49555.16, scn 9.2665945485
ORACLE Instance settora (pid = 8) – Error 600 encountered while recovering transaction (7, 19) on object 331910.

过一段时间数据库就会down掉,从日志信息里面可以看到:

Wed Aug 15 19:23:18 2012
SMON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
Wed Aug 15 19:23:18 2012
Errors in file /oracle/admin/settora/bdump/settora_smon_4755904.trc:
ORA-00600: internal error code, arguments: [kdourp_inorder2], [43], [43], [7], [44], [], [], []
Wed Aug 15 19:23:20 2012
Errors in file /oracle/admin/settora/bdump/settora_pmon_4690184.trc:
ORA-00474: SMON process terminated with error
Wed Aug 15 19:23:20 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 4690184

执行下面SQL

select /*+full(a)*/ from PRO_SETTLE3.SHOPACCOUNT a;

returns:ORA-00600: internal error code, arguments: [kdourp_inorder2], [43], [43], [7], [44], [], [], []

使用dbv工具来检测:

$ dbv file=/orasvr1/settle5/settle000.dbf blocksize=8192

DBVERIFY: Release 10.2.0.4.0 - Production on Thu Aug 16 09:40:49 2012

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = /orasvr1/settle5/settle000.dbf
kdrchk:  row is marked as both Last and Next continue
          prow=0x1101aec98 flag=0x07
Block Checking: DBA = 42119809, Block Type = KTB-managed data block
data header at 0x1101ae07c
kdbchk: bad row tab 0, slot 51
Page 176769 failed with check code 6253


DBVERIFY - Verification complete

Total Pages Examined         : 386688
Total Pages Processed (Data) : 282817
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 98760
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 2707
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2404
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 2666197038 (9.2666197038)

Total Pages Failing (Data) : 1

Page 176769 被标识为corruption

Total Pages Examined         : 512      Number of blocks looked at
Total Pages Processed (Data) : 1        Number of TABLE blocks seen
Total Pages Failing   (Data) : 1        Number of TABLE blocks with internal inconsistencies
Total Pages Processed (Index): 0        Number of INDEX blocks seen
Total Pages Failing   (Index): 0        Number of INDEX block with internal inconsistencies
Total Pages Empty            : 507      Number of unused blocks seen
Total Pages Marked Corrupt   : 2        Number of blocks with corrupt cache wrappers
Total Pages Influx           : 0        Number of pages we re-read as the page looked like it was being modified when it was first read.
SQL> select file#,block#,blocks from v$database_block_corruption;
 
no rows selected

在数据库中并没有关于坏块的记录存在。这并不是一个普通意义上的坏块。对于这种块使用corrupt skip是没有用的skip_corrupt_blocks只能使oracle跳过Oracle能够读出的块,而如果在操作系统层read调用就失败的,则不能跳过该块。
oracle 在read page 176769时发生了error 6253 (cannot read arguments from address file,其实这并不是真正的root cause,我在和huangyong讨论过后基本认定为一个内部的inconsistent block) oracle 不能 mark 为 corrupt block.

这里的思路为使用rowid来跳过这些有问题的block 大致方法如下

SQL> SELECT dbms_rowid.rowid_create(1,331910,10,176769,0) LOW_RID from DUAL;

LOW_RID
——————
AABRCGAAKAAArKBAAA

SQL> SELECT dbms_rowid.rowid_create(1,331910,10,176770,0) HI_RID from dual;

HI_RID
——————
AABRCGAAKAAArKCAAA

SQL> SELECT dbms_rowid.rowid_create(1,331910,26,3259716,0) LOW_RID from DUAL;

LOW_RID
——————
AABRCGAAaAAMb1EAAA

SQL> SELECT dbms_rowid.rowid_create(1,331910,26,3259717,0) HI_RID from DUAL;

HI_RID
——————
AABRCGAAaAAMb1FAAA

SQL> create table PRO_SETTLE3.SHOPACCOUNT_1 as select /*+ rowid(a) */ * from PRO_SETTLE3.SHOPACCOUNT a where rowid < 'AABRCGAAKAAArKBAAA'; Table created. SQL> insert into PRO_SETTLE3.SHOPACCOUNT_1 select /*+ rowid(a) */ * from PRO_SETTLE3.SHOPACCOUNT a where rowid >= ‘AABRCGAAKAAArKCAAA’ and rowid < 'AABRCGAAaAAMb1EAAA'; 49626 rows created. SQL> insert into PRO_SETTLE3.SHOPACCOUNT_1 select /*+ rowid(a) */ * from PRO_SETTLE3.SHOPACCOUNT a where rowid >= ‘AABRCGAAaAAMb1FAAA’;

208837 rows created.

SQL> select count(*) from PRO_SETTLE3.SHOPACCOUNT_1;

COUNT(*)
———-
262256

之后drop掉这张有问题的table restart database 恢复正常.

同样我们可以采用Kerry Osborne的脚本来实现.这里稍微修改了下 增加了选择tablespace 的功能temp

eg:

[oracle@testdb ~]$ dd if=/dev/zero  of=/data/oracle/oradata/wuxuan1/liu.dbf   bs=8192 seek=1798  count=1 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 5.2e-05 seconds, 158 MB/s
[oracle@testdb ~]$ sqlplus liu/liu

SQL*Plus: Release 11.2.0.3.0 Production on Fri Aug 17 02:32:20 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Error accessing PRODUCT_USER_PROFILE
Warning:  Product user profile information not loaded!
You may need to run PUPBLD.SQL as SYSTEM

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select count(*) from temp;
select count(*) from temp
                     *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 1798)
ORA-01110: data file 4: '/data/oracle/oradata/wuxuan1/liu.dbf'


SQL> @temp	

WARNING: This script may issue a DROP TABLE command. Do not execute it unless you have read through it
and are comfortable you know what it does.

Ready? (hit ctl-C to quit)  
Enter value for owner_name: LIU
old   8: v_owner_name varchar2(30) := upper('&owner_name');
new   8: v_owner_name varchar2(30) := upper('LIU');
Enter value for table_name: TEMP
old   9: v_table_name varchar2(30) := upper('&table_name');
new   9: v_table_name varchar2(30) := upper('TEMP');
Enter value for tablespace: LIU
old  11: v_tablespace varchar2(30) := upper('&tablespace');
new  11: v_tablespace varchar2(30) := upper('LIU');


Saved 13247 records in TEMP_SAVED.
201 bads records in TEMP_BAD.

PL/SQL procedure successfully completed.

SQL> 
SQL> select count(*) from temp_saved;

  COUNT(*)
----------
     13247
     
SQL> select  * from  v$database_block_corruption;

     FILE#     BLOCK#	  BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
	 4	  258	       1		  0 ALL ZERO
	 4	 1798	       1		  0 ALL ZERO

block 1798 已经被标识为media corruption

手动构造rowid

SQL> select * from temp_bad where rownum<5; 

OLD_ROWID			 OLD_FILE OLD_OBJECT  OLD_BLOCK    OLD_ROW
------------------------------ ---------- ---------- ---------- ----------
ERROR_MESSAGE
--------------------------------------------------------------------------------
4.1798.181			    13646	   4	   1798        181
ORA-01578: ORACLE data block corrupted (file # 4, block # 1798)
ORA-01110: data file 4: '/data/oracle/oradata/wuxuan1/liu.dbf'

4.1798.182			    13646	   4	   1798        182
ORA-01578: ORACLE data block corrupted (file # 4, block # 1798)
ORA-01110: data file 4: '/data/oracle/oradata/wuxuan1/liu.dbf'

4.1798.183			    13646	   4	   1798        183

OLD_ROWID			 OLD_FILE OLD_OBJECT  OLD_BLOCK    OLD_ROW
------------------------------ ---------- ---------- ---------- ----------
ERROR_MESSAGE
--------------------------------------------------------------------------------
ORA-01578: ORACLE data block corrupted (file # 4, block # 1798)
ORA-01110: data file 4: '/data/oracle/oradata/wuxuan1/liu.dbf'

4.1798.184			    13646	   4	   1798        184
ORA-01578: ORACLE data block corrupted (file # 4, block # 1798)
ORA-01110: data file 4: '/data/oracle/oradata/wuxuan1/liu.dbf'

SQL> select dbms_rowid.rowid_create(1,13646,4,1798,181) from dual;

DBMS_ROWID.ROWID_C
------------------
AAADVOAAEAAAAcGAC1

SQL> select dbms_rowid.rowid_create(1,13646,4,1798,182) from dual;

DBMS_ROWID.ROWID_C
------------------
AAADVOAAEAAAAcGAC2




SQL> select * from temp where rowid ='AAADVOAAEAAAAcGAC1';
select * from temp where rowid ='AAADVOAAEAAAAcGAC1'
              *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 1798)
ORA-01110: data file 4: '/data/oracle/oradata/wuxuan1/liu.dbf'


SQL> select * from temp where rowid ='AAADVOAAEAAAAcGAC2';
select * from temp where rowid ='AAADVOAAEAAAAcGAC2'
              *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 1798)
ORA-01110: data file 4: '/data/oracle/oradata/wuxuan1/liu.dbf'

Reference:
Extracting Data from a Corrupt Table using ROWID Range Scans in Oracle8 and higher [ID 61685.1]
Extracting Data from a Corrupt Table using DBMS_REPAIR or Event 10231 [ID 33405.1]
Extract rows from a CORRUPT table creating ROWID from DBA_EXTENTS [ID 422547.1]

Bitmap index tips

June 7, 2012 Internals, oracle No comments

对于BITMAP index的使用 一直存在一些争议 我们来看看下面几个观点:

1. Low cardinality – Some dabase vendor, like Oracle, provides very practical suggestion

If the number of distinct values of a column is less than 1% of the number of rows in the table, or if the values in a column are repeated more than 100 times, then the column is a candidate for a bitmap index.
B-tree indexes are most effective for high-cardinality data: that is, data with many possible values, such as CUSTOMER_NAME or PHONE_NUMBER.
There are 100 or more rows for each distinct value in the indexed column. When this limit is met, the bitmap index will be much smaller than a regular index, and you will be able to create the index much faster than a regular index.

2. No or little insert/update

Updating bitmap indexes take a lot of resources. Here are the suggestions
Building and maintaining an index structure can be expensive, and it can consume resources such as disk space, CPU, and I/O capacity. Designers must ensure that the benefits of any index outweigh the negatives of index maintenance.
Use this simple estimation guide for the cost of index maintenance: each index maintained by an INSERT, DELETE, or UPDATE of the indexed keys requires about three times as much resource as the actual DML operation on the table. What this means is that if you INSERT into a table with three indexes, then it will be approximately 10 times slower than an INSERT into a table with no indexes. For DML, and particularly for INSERT-heavy applications, the index design should be seriously reviewed, which might require a compromise between the query and INSERT performance.

3. Multiple Columns

One of the advantage is that multiple bitmap indexes can be merged and the column does not have to selective!More than one column in the table has an index that the optimizer can use to improve performance on a table scan Combining bitmap indexes on non-selective columns allows efficient AND and OR operations with a great number of rowids with minimal I/O





对于观点1 观点2 其实就是说明bitmap index 适合cardinality distinct小,并且并发DML很少的环境(OLAP) 但是对于大量的插入更新操作在cardinality很小的情况下 无论是索引的大小 还是插入的时间 bitmap index 还是占有一定的优势的 但是随着distinct的上升 无论index size还是插入的时间 都会发生恐怖的变化,我们可以做一个简单的对比:

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jun 7 16:59:35 2012

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>

Elapsed: 00:00:00.00
SQL> create index user_data2.idx_temp_liu_id on user_data2.temp_liu(id);

Index created.

Elapsed: 00:00:00.46
SQL> insert into user_data2.temp_liu select 1 from dual connect by level < 40000001; 40000000 rows created. Elapsed: 00:03:36.70 SQL> commit;

Commit complete.

b-tree index distinct count =1 耗时 3:36.70

Elapsed: 00:00:01.14
SQL> select sum(bytes)/1024/1024/1024 from dba_segments where segment_name =’IDX_TEMP_LIU_ID’;

SUM(BYTES)/1024/1024/1024
————————-
1.06738281

Elapsed: 00:00:00.85
SQL> drop index user_data2.IDX_TEMP_LIU_ID;

Index dropped.

Elapsed: 00:00:01.75
SQL> create bitmap index user_data2.IDX_TEMP_LIU_ID on user_data2.temp_liu (id);

Index created.

Elapsed: 00:00:06.79
SQL> drop index user_data2.IDX_TEMP_LIU_ID;

Index dropped.

Elapsed: 00:00:00.01
SQL> truncate table user_data2.temp_liu;

Table truncated.

Elapsed: 00:00:02.02
SQL> create bitmap index user_data2.IDX_TEMP_LIU_ID on user_data2.temp_liu (id);

Index created.

Elapsed: 00:00:00.01
SQL> select count(*) from user_data2.temp_liu;

COUNT(*)
———-
0

Elapsed: 00:00:00.00
SQL> insert into user_data2.temp_liu select 1 from dual connect by level < 40000001; 40000000 rows created. Elapsed: 00:03:10.28 SQL> commit;

Commit complete.

Elapsed: 00:00:00.06
SQL> select sum(bytes)/1024/1024/1024 from dba_segments where segment_name =’IDX_TEMP_LIU_ID’;

SUM(BYTES)/1024/1024/1024
————————-
.005859375

Elapsed: 00:00:01.95





可以看到 耗时小于b-tree 但是index size << b-tree index size 下面测试下distinct count =2的对比


SQL> truncate table user_data2.temp_liu;

Table truncated.

Elapsed: 00:00:00.11
SQL> drop index user_data2.IDX_TEMP_LIU_ID;

Index dropped.

Elapsed: 00:00:00.01
SQL> create index user_data2.idx_temp_liu_id on user_data2.temp_liu(id);

Index created.

Elapsed: 00:00:00.18
SQL>
SQL> insert into user_data2.temp_liu select mod(rownum,2) from dual connect by level < 40000001; 40000000 rows created. Elapsed: 00:10:33.48 SQL> commit;

Commit complete.

Elapsed: 00:00:01.09

SQL> truncate table user_data2.temp_liu ;

Table truncated.

Elapsed: 00:00:02.31
SQL> drop index user_data2.IDX_TEMP_LIU_ID;

Index dropped.

Elapsed: 00:00:00.01
SQL> create bitmap index user_data2.IDX_TEMP_LIU_ID on user_data2.temp_liu (id);

Index created.

Elapsed: 00:00:00.17
SQL> insert into user_data2.temp_liu select mod(rownum,2) from dual connect by level < 40000001; 40000000 rows created. Elapsed: 00:03:43.77



b-tree 耗时接近10分钟 bitmap index 耗时 3:43s 差距继续扩大 bitmap_index << b-tree index


SQL> drop index user_data2.IDX_TEMP_LIU_ID;

Index dropped.

Elapsed: 00:00:02.12
SQL> truncate table user_data2.temp_liu ;

Table truncated.

Elapsed: 00:00:01.04

SQL> insert into user_data2.temp_liu select ceil (rownum/2000000) from dual connect by level < 10000001; 10000000 rows created. Elapsed: 00:00:19.68 SQL> truncate table user_data2.temp_liu;

Table truncated.

Elapsed: 00:00:00.27
SQL> create bitmap index user_data2.IDX_TEMP_LIU_ID on user_data2.temp_liu (id);

Index created.

Elapsed: 00:00:00.01
SQL> insert into user_data2.temp_liu select ceil (rownum/2000000) from dual connect by level < 10000001; 10000000 rows created. Elapsed: 00:00:45.71 SQL> select sum(bytes) from dba_segments where segment_name=’IDX_TEMP_LIU_ID’;

SUM(BYTES)
———-
2097152

Elapsed: 00:00:00.13
SQL> truncate table user_data2.temp_liu;

Table truncated.

Elapsed: 00:00:00.28
SQL> drop index user_data2.IDX_TEMP_LIU_ID
2 ;

Index dropped.

Elapsed: 00:00:00.01
SQL> create index user_data2.IDX_TEMP_LIU_ID on user_data2.temp_liu (id);

Index created.

Elapsed: 00:00:00.01
SQL> insert into user_data2.temp_liu select ceil (rownum/2000000) from dual connect by level < 10000001; 10000000 rows created. Elapsed: 00:00:41.29 SQL> select sum(bytes) from dba_segments where segment_name=’IDX_TEMP_LIU_ID’;

SUM(BYTES)
———-
277872640

Elapsed: 00:00:00.21
SQL> select distinct id from user_data2.temp_liu ;

ID
———-
3
4
1
2
5

Elapsed: 00:00:01.60





可以看到 插入速度并没有得到明显的提升 但是size 仍然bitmap << b-tree


SQL> truncate table user_data2.temp_liu;

Table truncated.

Elapsed: 00:00:00.24
SQL> create bitmap index user_data2.IDX_TEMP_LIU_ID on user_data2.temp_liu (id);

Index created.

Elapsed: 00:00:00.02
SQL>
SQL>
SQL>
SQL> insert into user_data2.temp_liu select ceil (rownum/5) from dual connect by level < 10000001; 10000000 rows created. Elapsed: 00:01:39.25 SQL> commit;

Commit complete.

Elapsed: 00:00:00.22
SQL> select sum(bytes) from dba_segments where segment_name=’IDX_TEMP_LIU_ID’;

SUM(BYTES)
———-
104857600

Elapsed: 00:00:00.01





对比之前的插入操作 插入1000万数据 耗费时间 distinct count=5 << distinct count=2000000


SQL> truncate table user_data2.temp_liu ;

Table truncated.

Elapsed: 00:00:00.37
SQL> drop index user_data2.IDX_TEMP_LIU_ID
2 ;

Index dropped.

Elapsed: 00:00:00.01
SQL> create index user_data2.IDX_TEMP_LIU_ID on user_data2.temp_liu (id);

Index created.

Elapsed: 00:00:00.02
SQL> insert into user_data2.temp_liu select ceil (rownum/5) from dual connect by level < 10000001; 10000000 rows created. Elapsed: 00:00:38.16 SQL> select sum(bytes) from dba_segments where segment_name=’IDX_TEMP_LIU_ID’;

SUM(BYTES)
———-
177209344





可以看到 插入速度已经远远大于 bitmap-index 此时的size 也已经接近b-tree size



总结一下这个小实验 在distinct 大量增加的环境下bitmap 索引完全失去了优势 大量的维护代价 不但让索引的大小成倍增加 维护索引的代价也是剧增 并且在OLTP这种高并发的环境下BITMAP显得那么格格不入 BITMAP index 只能被用在DW DSS等低并发环境,当一条insert 执行的时候,如果column value=x的,由于要更新bitmap,需要lock value=x的所有行,所以千万别在oltp中使用bitmap index

可以给个流程图:

FROM Richard Foote

Bitmap indexes should only be considered in Data Warehouse, low concurrency DML type environments due to their locking implications and certainly pre 10g, Bitmap indexes had growth issues after significant DML changes. However it’s a complete nonsense to suggest that Bitmap indexes should only be considered with columns with “few” distinct values, else things will run 100s of times slower.





关于DML 操作对于bitmap index的影响 我们可以先看一下bitmap index block的结构:



In Oracle 9.2 and below, insertion is very inefficient
When a row is inserted
if leaf row already exists for ROWID then bitmap is updated
otherwise a new leaf row is inserted for the eight bits around the new row

In Oracle 10g, insertion is more efficient
rows are inserted into existing leaf rows where possible
if necessary start ROWID or end ROWID is adjusted





大量的DML 操作造成的 ROWID adjusted 的代价是很大的 所以在9i的库 我们经常发现bitmap index变的十分巨大(曾经见过一个2T的bitmap index)

上面的写了很多 看来bitmap index 真是一无是处了,但是在很多特定情况下 bitmap 还是具有一定优势的 就如上面第三个观点所说的 Multiple Columns的情况下
bitmap index merge :
Combining bitmap indexes on non-selective columns allows efficient AND and OR operations with a great number of rowids with minimal I/O

其实可以通过下面的图客观的反映出来

另外bitmap join index也是DW环境一个很不错的选择

总结 when to use bitmap index ?


1.Low cardinality

2.little concurrency DML statement

3.Multiple Columns using index merge in DW

4.bitmap join index in DW

5.just use it when you think your OLTP system is not bad enough 🙂

some change in cache buffer handle

June 7, 2012 Internals, oracle No comments

今天开发因为连接数不够 强烈要求加大processes值,突然想到了processes对_cursor_db_buffers_pinned的影响,,小记之:

当会话需要pin住buffer header时它首先要获去buffer handle,实例所存在的buffer handle总数由(each process can cache buffer handles*processes决定),也就是说在10g中可以认为 total buffer handle (_db_handles)=5*processes(默认_db_handles_cached为5),而在11g中这个值发生了变化(in 11g _db_handles_cached =8) 而一个process在同一时间内所能pin住的最大handle数由(_cursor_db_buffers_pinned)所决定,我们可以演示一下 _cursor_db_buffers_pinned是怎么算出来的:

SQL> select * from v$version;

BANNER
——————————————————————————–
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
PL/SQL Release 11.2.0.3.0 – Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 – Production
NLSRTL Version 11.2.0.3.0 – Production

SQL>

COMPONENT CURRENT_SIZE
————————————————– ———————–
shared pool 8791261184
large pool 16777216
java pool 16777216
streams pool 33554432
DEFAULT buffer cache 59827552256

SQL> show parameter process;

NAME TYPE VALUE
———————————— ———– ——————————
aq_tm_processes integer 0
db_writer_processes integer 6
gcs_server_processes integer 12
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 2100

59827552256/8192/2100=3477.699047619048

NAME VALUE DESCRIPTION
——————————————— ——————– ———————————————————————-
_cursor_db_buffers_pinned 3505 additional number of buffers a cursor can pin at once

可以看到 3478-2 ≈ 3505

_db_handles_cached – default 5
_db_handles – derived: possibly processes * _db_handles_cached
_cursor_db_buffers_pinned – derived: possibly (db_block_buffers / processes) – 2
_session_kept_cursor_pins – may do for cursors what _db_handles does for buffer headers (10g only)

关于这几个参数
_db_handles_cached in 11g -default 8
_session_kept_cursor_pins 在11gR2已经没有了这个参数
11g:
[oracle@testdb ~]$ ora params _session_kept_cursor_pins

no rows selected

10g:
[oracle@testdb2 ~]$ ora params _session_kept_cursor_pins

NAME VALUE DESCRIPTION
——————————————— ——————– ———————————————————————-
_session_kept_cursor_pins 0 Number of cursors pins to keep in a session

关于这块内容刘兄已经解释的很清楚:

“当会话需要pin住buffer header时它首先要获取buffer handle,得到buffer handle的过程中首先要抢占cache buffer handles栓,为了避免对于cache buffer handles栓的过度争用,每个会话被允许cache一小撮buffer handles,也叫保留集(reserved set)。该保留集的上限由隐式参数_db_handles_cached(默认为5)所控制,在此基础上会话在执行不是十分复杂的SQL时不必反复申请栓。
同时存在一些十分复杂的语句,需要同时pin住大量的缓存,此时若存在空闲的handle,会话被允许索要手头5个cached buffer handles以外的handle。也为了限制单个会话的资源占用量,Oracle通过内部算法总是尽量让每个进程能够pin住”合理份额“的buffer,由此单个会话同一时间所能持有的总handle数等于db_block_buffers/processes,隐式参数_cursor_db_buffers_pinned指出了该值。另cache buffer handles并没有子栓,仅有父栓本身,所以如果系统中有大量复杂SQL,会话所能cache的buffer handles远不足以满足需求时,该栓就成为整个系统性能的瓶颈,甚至可能出现潜在的死锁/饿死问题。”




可以参见这篇文章:Know more about Cache Buffer Handle

Oracle latch internals

June 6, 2012 Internals, oracle 4 comments

Oracle latch internals


Server latch implementation called KSL, e.g. every table with name starting with x$ksl… is latch
related and every SGA has ksl… structure as well. In order to get complete list of x$ksl table,
connect with sysdba permissions to oracle and run query:

SELECT kqftanam
FROM x$kqfta
WHERE kqftanam LIKE ‘X$KSL%’

eg:

[oracle@testdb ~]$ ora si

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jun 6 15:37:53 2013

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> SELECT kqftanam
FROM x$kqfta
WHERE kqftanam LIKE ‘X$KSL%’ 2 3 ;

KQFTANAM
——————————
X$KSLLTR
X$KSLHOT
X$KSLLCLASS
X$KSLECLASS
X$KSLEMAP
X$KSLLD
X$KSLED
X$KSLCS
X$KSLSCS
X$KSLES
X$KSLSESHIST

KQFTANAM
——————————
X$KSLEI
X$KSLLW
X$KSLPO
X$KSLWSC
X$KSLWH
X$KSLWT

17 rows selected.

There are two structures associated with a latch in KSL – ksllt and kslla.

latch instance structure
struct ksllt {
union ksllt.kslltlat kslltlat;
eword kslltlvl; /* latch level, to enforce hierarchy */
ub2 kslltnum; /* latch number, to index descriptors */
uword kslltwhr; /* the context from where the latch is held */
ubig_ora kslltwhy; /* why the latch is being acquired */
ub4 kslltwgt; /* count of times gotten wait */
ub4 kslltngt; /* count of times gotten nowait */
eword kslltefd; /* error frame depth when gotten, this is used to */
ub4 kslltwff; /* count of wait gets that failed first try */
ub4 kslltnfa; /* count of times failed nowait get */
ub4 kslltntg; /* count of total gets of latch */
ub4 kslltwts; /* number of waiters */
ub4 ksllttmr; /* down/hold time timer */
kstat kslltwtt; /* wait time for this latch */
kstat ksllthdt; /* hold time for this latch */
kstat kslltdnt; /* down time for this latch */
kstat kslltwtw; /* wait time for associated wait list latch */
ub4 kslltwxs; /* count of X process waiting because of S holders */
ub4 kslltwsx; /* count of S process waiting because of X holder */
ub4 kslltwsw; /* count of S process waiting because of X waiter */
ub4 kslltmxs; /* X misses owing to S holders */
ub4 kslltmsx; /* S misses owing to X holders */
ub4 kslltmxw; /* S misses owing to X waiters */
ub4 kslltwsl; /* count of times slept when failed to get latch */
struct ksllt * kslltchg; /* last child gotten as part of get-parent op */
ub4 kslltwkc; /* count of wakeups that have been done */
ub4 kslltwth; /* count of sleeps that were holding another latch */
ub4 yields_ksllt; /* count of yields */
BitMask for “struct ksllt.yields_ksllt”
KSLHSTMAX BIT 4 // if you change this, must change x$ table

ub4 ksllthst[4];
ub4 gets_wl_ksllt; /* number of gets of wait list latch */
ub4 misses_wl_ksllt; /* number of misses on wait list latch */
ub4 sleeps_wl_ksllt; /* number of sleeps on wait list latch */
ub4 yields_wl_ksllt; /* number of yields on wait list latch */
struct kslla * waitproc_ksllt; /* log change to wait list */
kgglk waiters_ksllt; /* head of waiter list */
kgglk kslltlnk; /* linked list of children */
void * kslltrec; /* recovery info for this latch */
b1 kslltsp1[3]; /* 1 spare 1 byte field */
uword kslltcnm; /* child number */
uword class_ksllt; /* latch class */
skgslt wl_ksllt;
}

struct kslla {
ksllt * ksllalat[( 8 + 7 )]; /* latches owned at each level */
ksllt * ksllalaq; /* latch being acquired */
ksllt * wl_flux_kslla; /* latch whose wait list I’m acquiring */
ksllt * ksllaxbs; /* EXWAIT bit set in shared latch */
ksllt * ksllawat; /* latch being waited for: not protected by any latch, */
ubig_ora ksllawhy; /* context for the latch req */
uword ksllawere; /* location from where the req is being made */
word ksllaevt; /* event waiting for */
uword ksllalow; /* bit array of latches owned, for levels [0..9] */
ubig_ora ksllaevx; /* extension to the event id ksllaevt */
kgglk waiters_kslla; /* link on waiter list */
struct ksllt * ksllawtr; /* latch waiter list this process is on. */
struct ksllt * ksllaspn; /* latch this process is spinning on */
uword ksllawst; /* process wait state (asleep/awake) */
ub4 ksllamd[( 8 + 7 )]; /* For r/w latches, mode in which latch is held */
uword ksllalpe; /* latch post expected */
kgglk ksllapwl; /* Link in post/wait queue the process is on */
kslpwq * ksllapwq; /* post/wait queue the process is on */
sword ksllapwi; /* Index of p/w queue the process was last on */
krmid ksllapid; /* the res ID for the last post received */
krmid ksllasid; /* the res ID for the last post sent */
uword ksllaprv; /* loc ID for the last post received */
uword ksllapsn; /* loc ID (where) for the last post sent */
ub4 ksllapsc; /* count of # of posts sent by the process */
ub4 ksllaprc; /* count of # of posts received by the process */
struct ksupr * ksllapos; /* the last process to post me */
struct ksupr * ksllalpo; /* the last process posted by me */
kgslfx ksllafac; /* context for wait/post facilities */
uword ksllawrk; /* is there work for a process about to be posted */
ub2 ksllasfa; /* service managed wait facility */
BitMask for “struct kslla.ksllasfa”
KSLLASFA_IO BIT 0x0001 // ksfd I/O pending
KSLLASFA_IPC BIT 0x0002 // ksxp I/O pending

}

struct kslld {
text * kslldnam; /* human-readable latch name */
void (*kslldcln); /* _ sword action, ksllt *lp _ cleanup proc*/
void (*kslldgcl); /* _ kgsmp *gp, sword action, struct kgslt *lp _ */
size_t kslldrsz; /* recovery structure size */
b1 kslldlvl; /* latch level */
b1 kslldpnt; /* TRUE if parent of a class of latches */
ub1 kslldlng; /* TRUE if latch can be held for long periods */
b1 kslldg2c; /* parent allowing wait get of 2 children */
uword kslldshl; /* TRUE if latch is shareable */
uword kslldcls; /* latch class */
}

struct ksllt is a definition of the latch itself and all these structures (latches itself and pointers to
hold latches v processes) are visible on instance level via fixed SGA, while kslla is a process
related structure and is visible as part of X$KSUPR(Kernel Services User Process) structure

All available kslla structures can be identified by using the following query:

SELECT c.kqfconam field_name, c.kqfcooff offset , kqfcotyp,
DECODE (kqfcotyp ,
0, ‘hex string’,
1, ‘varchar2’ ,
2, ‘number’ ,
11, ‘word’,
12, ‘datetime’ ,
23, ‘raw’ ,
‘other’
) typ,
kqfcosiz sz
FROM x$kqfco c , x$kqfta t
WHERE t.indx = c.kqfcotab
AND kqftanam = ‘X$KSUPR’
AND kqfconam LIKE ‘KSLLA%’
AND c.kqfcooff > 0
ORDER BY offset ASC

For example, if you are using Oracle 9.2.0.5 you will see the following results

KSLLALAQ ,296 ,0,hex string ,8
KSLLAWAT ,320 ,0,hex string ,8
KSLLAWHY ,328 ,11,word,8
KSLLAWER ,336 ,11,word,4
KSLLALOW ,344 ,0,hex string ,4
KSLLASPN ,384 ,0,hex string ,8
KSLLAPRV ,544 ,11,word,4
KSLLAPSN ,548 ,11,word,4
KSLLAPSC ,552 ,11,word,4
KSLLAPRC ,556 ,11,word,4

Only ksllawat and ksllaspn are actually available via V$PROCESS; ksllawat represents latch waits and ksllaspn represents latch spins. Other fields can be used mostly for debugging purposes such as halt analysis, etc. For example, ksllalaq has pointer ksllt* to acquired latch set by KSLBEGIN, pointing to the head of linked list used in latch wait posting implementation (KSLBEGIN is a begin macro in the defined in the Oracle ksl.h code). Once a process acquires a latch at a certain level, it cannot acquire anymore latch at level that is equal or less than that level (unless it acquires nowait).

To support this issue, another nondocumented column ksllalow from X$KSUPR is used.
To see all the latches in the fixed SGA, the following query can be used:

select k.ksmfsadr , ksmfsnam, ksmfstyp, ksmfssiz, kslldnam, kslldlvl
from x$ksmfsv k , x$kslld a
where k .ksmfstyp like ‘%ksllt%’ and k.ksmfsadr = a.kslldadr
order by ksmfsnam

X$KSLLD : [K]ernel [S]ervice [L]atch [L]ock [D]escriptor
KSLLDLVL : [K]ernel [S]ervice [L]atch [L]ock [D]escriptor [L]atch Le[V]e[L]

eg:

SQL> select * from v$version;

BANNER
——————————————————————————–
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
PL/SQL Release 11.2.0.3.0 – Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 – Production
NLSRTL Version 11.2.0.3.0 – Production

SQL> select distinct kslldlvl from x$kslld order by 1;

KSLLDLVL
———-
0
1
2
3
4
5
6
7
8
9
10

KSLLDLVL
———-
11
14
15
16

15 rows selected.

SQL> select * from v$version;

BANNER
—————————————————————-
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 – 64bi
PL/SQL Release 10.2.0.5.0 – Production
CORE 10.2.0.5.0 Production
TNS for Linux: Version 10.2.0.5.0 – Production
NLSRTL Version 10.2.0.5.0 – Production

SQL> select distinct kslldlvl from x$kslld order by 1;

KSLLDLVL
———-
0
1
2
3
4
5
6
7
8
9
11

KSLLDLVL
———-
12
13

13 rows selected.

11g got more levels than 10g but Most latches have a level between 0 and 8

Let’s see how we can lock / unlock enqueues latch ksqeql_ via oradebug. This latch is used to protect operation on KSE (enqueue) related structures in memory. Assum e , value of k.ks mf sadr of this latch th at received from the previous query was 000000038000A0C0 (0x000000038000A0C0 = 15032426688)

In order to lock the latch, we can use function

word kslgetl(ksllt *latch_to_lock , word wait )
sqlplus “/ as sysdba”
oradebug setmypid
SQL> oradebug call kslgetl 15032426688 1

Function returned the value 1, meaning that we locked the latch.

Now let’s try to connect to Oracle. You can see that your session was halt because you are holding enqueue latch, therefore Oracle is unable to update even its own X$ table entries. Let’s return to oradebug and will release (free) the enqueue latch.

SQL> oradebug call kslfre 15032426688Function returned 8000A0C0

This time yours another session continued immediately.
Now let’s check the enqueu latch by queriynig v$latch:

select wait_time
from v$latch
where name = ‘enqueues’

The wait time returned is extremely big (in exam ple it was 619714031, e.g. 10 m i nutes).
List of all latch related tables as following:

GV$LATCH
GV$LATCH_CHILDREN
GV$LATCH_PARENT
GV$DLM_LATCH

*Note: list excludes x$kslld which is process internal structure and not represented in the SGA at
whole.

GV$DLM_LATCH is a special fixed table, used in the OPS/RAC environment only to monitor distributed lock manager latch. This fixed view exists only for historical reasons, since DLM did not used KSL ksllt and kslla latch related structures prior to version 8.1.5. DLM had own kjsllt and kjslla structures. DLM uses standard KSL structures from version 8.1.5 and up, therefore DLM latches can be monitored via V$LATCH fixed views.
GV$LATCHHOLDER is a process, e.g. X$KSUPR, fixed array of process structures,
GV$LATCH_MISSES is table which is non-directly points to the latch structures.

The next interesting question is where in the Oracle code each latch is used. It is possible to see the latch contention on some latch. However its name would be meaningless so at least you can identify where in oracle code (up to function name) either this latch has been locked. Such identification can be done, in normal case by running the following query:

select parent_name, location, nwfail_count, sleep_count from v$latch_misses;

where column location divided to 1 – 3 parts divided by ‘:’
1) Oracle kernel function.
2) Optional kslbegin (macro to lock latch) or operation name.
3) Optional description or comment if single function has several locks/unlocks for the same
latch.

For example, let’s look at the function “kcbgtcr” used to “get cache buffer for consistent read” or in other words it reads buffer from disk to buffer cache and then roll it back up to first SCN of the query, caused CR read.

In Oracle 9.2.0.5 this function has 4 different places where “cache buffers lru chain” latch to which the appropriate related block can be locked. Execute the following query:

SELECT t1 .ksllasnam “parent_name” , t2 .ksllwnam “location” ,
t2.ksllwlbl “unit to guard”
FROM x$ksllw t2, x$kslwsc t1
WHERE t2 .indx = t1 .indx
AND t2 .ksllwnam LIKE ‘kcbgtcr%’
AND ksllasnam = ‘cache buffers lru chain’

Result of query above will be something like:

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — –
parent_name location unit to guard
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — –
cache buffers lru chain kc bgtcr:CR Scan:KCBRSTOP buffer header
cache buffers lru chain kc bgtcr:CR Scan:KCBRSAVE buffer header
cache buffers lru chain kc bgtcr:CR Scan:KCBRSKIP buffer header
cache buffers lru chain kc bgtcr:CR Scan:best buffer header
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —


*Note: In next topic about buffer cache and buffer cache related latches in the future discussion we will talk about CR blocks and related parameters.

PMON invokes a clean up function for releasing latches if the holding process dies.PMON will initiate the clean up function.

for 10g:

NAME VALUE DESCRIPTION
——————————————— ——————– ———————————————————————-
_max_exponential_sleep 0 max sleep during exponential backoff
_max_sleep_holding_latch 4 max time to sleep while holding a latch
_spin_count 2000 Amount to spin waiting for a latch

The sleep time on latches approximately doubles with each wait until upper limit is reached. This is an exponential backoff.

If Oracle instance is halt and you have some reason to think that is caused by latching problem, then you can use oradebug to dump latch statistics:

1. connect as sysdba
2. In order to dump latch states, perform the following operations:

? oradebug setmypid
? oradebug dump latches 2
? oradebug tracefile_name
Trace file will be generated with name, for example such as /oravl01/oracle/adm/bigsun/udump/bigsun_ora_21039.trc

Open this trace file to see the latch with high and constantly increasing between dumps sleeps count in the case of willing-to-wait latch and failed count in the case of no-wait latch.

For example, if you performed 2 dumps with insignificant interval between and have seen in the first dump for some child redo allocation latch sleeps count of 3. In the second dump sleeps count specific cache buffers lru chain latch is guard for specific part of buffer lru chain and block we are going to use has entry in this hash chain of 13 with “failed first” increased to the same number as sleeps count and “gotten” counts remains the same. All this means that some (and at least one) process is waited constantly for this latch all
the time between two dumps, e.g. another process hold this latch.

Example:


dump (1):

396670178 Child redo allocation level=5 child#=1
Location from where latch is held : kcrfwi : more space: strand #
Context saved from call : 0
state=free
gotten 7776125 times wait, failed first 355 sleeps 3
gotten 0 times nowait, failed : 0


dump (2):

396670178 Child redo allocation level=5 child#=1
Location from where latch is held : kcrfwi : more space: strand #
Context saved from call : 0
state=free
gotten 7776125 times wait, failed first 365 sleeps 13
gotten 0 times nowait, failed : 0

Let’s try and emulate such “bug” example. To do that, we will need two sessions namely sess_1 and sess_2. Our child redo allocation latch address is 0x396670178 that can be converted into 15408234872. sess_1 will be connected as sysdba and sess_2 as oracle dba:

step 1. sess_1: oradebug setmypid
step 2. sess_1: oradebug dump latches 2
step 3. sess_1: oradebug call kslgetl 15408234872 1
step 4. sess_2: create table test1 as select * from dba_objects;
step 5. wait 1 minute
step 6. sess_1: oradebug call kslfre 15408234872
step 7. sess_1: oradebug dump latches 2

On the step 4 session sess_2 was halted. If you will try performing dump latches while some latch is locked, you will have a good chance to receive well known ORA-03113 (end-of-file on communication channel). However this is relates to latch and platform dependent.

Due to the error stacks it is looks like this is Oracle bug since latch state represented in the latch dump. However during all tests we have performed, we never seen in the dump files states other then “state=free”. It’s seems that Oracle waits several seconds until each latch will be freed and then dumps its state. If latch has not been freed during several seconds, ORA-03113 may occur…

396670178 Child redo allocation level=5 child#=1
Location from where latch is held : kcrfwi : more space: strand #
Context saved from call : 0
state=free
gotten 7784997 times wait, failed first 362 sleeps 10
gotten 0 times nowait, failed : 0


396670178 Child redo allocation level=5 child#=1
Location from where latch is held : kcrfwi: before write: strand #
Context saved from call : 0
state =free
gotten 7786065 times wait, failed first 367 sleeps 14
gotten 0 times nowait, failed: 0

In this case “gotten” increased significantly since to perform following statement:

create table test1 as select * from dba_objects;

more then 1000 times redo the space allocations has been performed.

In the general case, even if your instance is not halt but you want to see latch statistics in this format, you can dump latches statistics by running

ALTER SESSION SET EVENTS ‘immediate trace name latches level 2’;

Dump latches have two available levels:
? level 1 when dump just basic latch information (without statistics)
? level 2 when dump just latch information with statistics

and you can also user event 10005 before 9i

eg:

event = “10005 trace name context forever, level 1”

“hcheck.sql” script to check for known problems in Oracle8i, Oracle9i, Oracle10g and Oracle 11g

April 20, 2012 Internals, oracle No comments

Execution Environment:

Access Privileges:
Requires to be run connected as SYS schema

Usage:
$ sqlplus
SQL*Plus: Release 9.2.0.2.0 – Production on Mon Nov 11 12:00:06 2002

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Enter user-name: / as sysdba

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.2.0 – Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.2.0 – Production
SQL> set serveroutput on size unlimited
SQL> spool outputfile
SQL> execute hcheck.full


Instructions:

1. Connect as SYS schema in sqlplus
2. Create package hOut as described in Note:101468.1
3. Create package hcheck in SYS schema (Refer the attachments under SCRIPT to Create package hcheck, this is hcheck2.sql and/or hcheck3.sql
4. spool outputfile
5. execute hcheck.full

6. Output will go to the spool file and the session trace file.
The script will report various dictionary related issues
which may or may not be a problem – Any problems reported
should be reviewed by an experienced support analyst as some
reported “problems” may be normal and expected.

PROOFREAD THE SCRIPT BEFORE USING IT! Due to differences in the way text
editors, e-mail packages, and operating systems handle text formatting (spaces,
tabs, and carriage returns), this script may not be in an executable state
when you first receive it. Check over the script to ensure that errors of
this type are corrected.The script will produce an output file named [outputfile].
This file can be viewed in a browser or uploaded for support analysis.

Description
To provide a single package which looks for common data dictionary
problems.
The script can be used with Oracle versions 9i onwards (release 2.00 includes 8.1)
It checks consistency of certain dictionary relationships and looks for certain
known issues – certain reported “problems” will be normal and expected.

This script is for use mainly under the guidance of Oracle Support.

eg:

SQL> exec hcheck.full
H.Check Version 9i+/hc3.45
—————————————
Catalog Version 11.2.0.3.0 (1102000300)
—————————————

Catalog Fixed
Procedure Name Version Vs Release Run
—————————— … ———- — ———- —
.- SynLastDDLTim … 1102000300 > 1001000200 : n/a
.- LobNotInObj … 1102000300 > 1000000200 : n/a
.- MissingOIDOnObjCol … 1102000300 <= *All Rel* : Ok .- SourceNotInObj ... 1102000300 > 1002000100 : n/a
.- IndIndparMismatch … 1102000300 > 1102000100 : n/a
.- InvCorrAudit … 1102000300 > 1102000100 : n/a
.- OversizedFiles … 1102000300 <= *All Rel* : Ok .- TinyFiles ... 1102000300 > 900010000 : n/a
.- PoorDefaultStorage … 1102000300 <= *All Rel* : Ok .- PoorStorage ... 1102000300 <= *All Rel* : Ok .- MissTabSubPart ... 1102000300 > 900010000 : n/a
.- PartSubPartMismatch … 1102000300 > 1102000100 : n/a
.- TabPartCountMismatch … 1102000300 <= *All Rel* : Ok .- OrphanedTabComPart ... 1102000300 > 900010000 : n/a
.- ZeroTabSubPart … 1102000300 > 902000100 : n/a
.- MissingSum$ … 1102000300 <= *All Rel* : Ok .- MissingDir$ ... 1102000300 <= *All Rel* : Ok .- DuplicateDataobj ... 1102000300 <= *All Rel* : Ok .- ObjSynMissing ... 1102000300 <= *All Rel* : Ok .- ObjSeqMissing ... 1102000300 <= *All Rel* : Ok .- OrphanedUndo ... 1102000300 <= *All Rel* : Ok .- OrphanedIndex ... 1102000300 <= *All Rel* : Ok .- OrphanedIndexPartition ... 1102000300 <= *All Rel* : Ok .- OrphanedIndexSubPartition ... 1102000300 <= *All Rel* : Ok .- OrphanedTable ... 1102000300 <= *All Rel* : Ok .- OrphanedTablePartition ... 1102000300 <= *All Rel* : Ok .- OrphanedTableSubPartition ... 1102000300 <= *All Rel* : Ok .- MissingPartCol ... 1102000300 <= *All Rel* : Ok .- OrphanedSeg$ ... 1102000300 <= *All Rel* : Ok .- OrphanedIndPartObj# ... 1102000300 > 1101000600 : n/a
.- DuplicateBlockUse … 1102000300 <= *All Rel* : Ok .- HighObjectIds ... 1102000300 > 801060000 : n/a
.- PQsequence … 1102000300 > 800060000 : n/a
.- TruncatedCluster … 1102000300 > 801070000 : n/a
.- FetUet … 1102000300 <= *All Rel* : Ok .- Uet0Check ... 1102000300 <= *All Rel* : Ok .- ExtentlessSeg ... 1102000300 <= *All Rel* : Ok .- SeglessUET ... 1102000300 <= *All Rel* : Ok .- BadInd$ ... 1102000300 <= *All Rel* : Ok .- BadTab$ ... 1102000300 <= *All Rel* : Ok .- BadIcolDepCnt ... 1102000300 > 1101000700 : n/a
.- WarnIcolDep … 1102000300 > 1101000700 : n/a
.- OnlineRebuild$ … 1102000300 <= *All Rel* : Ok .- DropForceType ... 1102000300 > 1001000200 : n/a
.- TrgAfterUpgrade … 1102000300 <= *All Rel* : Ok .- FailedInitJVMRun ... 1102000300 <= *All Rel* : Ok .- TypeReusedAfterDrop ... 1102000300 > 900010000 : n/a
.- Idgen1$TTS … 1102000300 > 900010000 : n/a
.- DroppedFuncIdx … 1102000300 > 902000100 : n/a
.- BadOwner … 1102000300 > 900010000 : n/a
.- UpgCheckc0801070 … 1102000300 <= *All Rel* : Ok .- BadPublicObjects ... 1102000300 <= *All Rel* : Ok .- BadSegFreelist ... 1102000300 <= *All Rel* : Ok .- BadCol# ... 1102000300 > 1001000200 : n/a
.- BadDepends … 1102000300 <= *All Rel* : Ok .- CheckDual ... 1102000300 <= *All Rel* : Ok .- ObjectNames ... 1102000300 <= *All Rel* : Ok .- BadCboHiLo ... 1102000300 <= *All Rel* : Ok .- ChkIotTs ... 1102000300 <= *All Rel* : Ok .- NoSegmentIndex ... 1102000300 <= *All Rel* : Ok .- BadNextObject ... 1102000300 <= *All Rel* : Ok .- OrphanIndopt ... 1102000300 > 902000800 : n/a
.- UpgFlgBitTmp … 1102000300 > 1001000100 : n/a
.- RenCharView … 1102000300 > 1001000100 : n/a
.- Upg9iTab$ … 1102000300 > 902000400 : n/a
.- Upg9iTsInd … 1102000300 > 902000500 : n/a
.- Upg10gInd$ … 1102000300 > 1002000000 : n/a
.- DroppedROTS … 1102000300 <= *All Rel* : Ok .- ChrLenSmtcs ... 1102000300 > 1101000600 : n/a
.- FilBlkZero … 1102000300 <= *All Rel* : Ok .- DbmsSchemaCopy ... 1102000300 <= *All Rel* : Ok Found 0 potential problem(s) and 0 warning(s) PL/SQL procedure successfully completed.

创建index之前如何确定其大小

March 8, 2012 Internals, maintain, oracle 1 comment

昨天看了roger的一篇文章,如何在创建index 之前估算它的大小,这里有一种更简单的方法实现:

主要思路就是采用了dbms_space.create_index_cost这个包

–create a testing table named t
SQL> create table t as select * from dba_objects;

Table created.

SQL> declare
begin
for i in 1..10 loop
insert /*+ APPEND */ into t select * from t;
commit;
end loop;
end;
/

PL/SQL procedure successfully completed.

SQL> select count(*) from t;

COUNT(*)
———-
38171648

1 row selected.

–As told by Oracle docs, you have to have updated statistics in order to allow CREATE_INDEX_COST to produce accurate estimation
SQL> exec dbms_stats.gather_table_stats(‘LIU’,’T’);

PL/SQL procedure successfully completed.

SQL> variable used_bytes number
SQL> variable allocated_bytes number
SQL> set autoprint on
SQL> exec dbms_space.create_index_cost( ‘create index t_idx on t(object_id,object_name,object_type) tablespace PURCHASE’, :used_bytes, :allocated_bytes );

PL/SQL procedure successfully completed.

ALLOCATED_BYTES
—————
1946157056

USED_BYTES
———-
1259664384

Verification
SQL> create index t_idx on t(object_id,object_name,object_type);

Index created.

SQL> select segment_name ,sum(bytes) from dba_segments where owner=’LIU’ and segment_name=’T_IDX’ group by segment_name;

SEGMENT_NAME SUM(BYTES)
——————————————————————————— ———-
T_IDX 1930428416

1 row selected.

可以看到预计空间为 1946157056byte 实际空间为 1930428416byte 结果极其的接近。

按照roger的方法 我们再计算一次:

对于8k的block:

block_size(8192)= = kcbh + ktbbh + kdxle + kd_off + DATA + block_size * Pctfree + 4
= 20 + 72 + 32 + 2 * Y + Y * ( 20 + 5 + 8 + rowid ) + 8192 * 0.1 + 4
8192 = 92 + 32 + 45Y + 819.2 + 4

SQL> SELECT (8192-92-32-819.2-4)/45 FROM dual;

(8192-92-32-819.2-4)/45
———————–
160.995

SQL> SELECT 38171648/((8192-92-32-819.2-4)/45) *8192 from dual;

38171648/((8192-92-32-819.2-4)/45)*8192
————————————————-
1942249319.354037

可以看到如果单用leaf block计算 同样很接近真实值,当然如果对一个very big的index而言 仅仅用leaf block去计算 还是会出现一定的偏差

Reference:http://www.killdb.com/2012/02/26/how-to-calculate-indexes-size-when-you-create-index.html?wpc=dlc#comment-805

11gR2 RAC Rebootless Node Fencing

March 6, 2012 Internals, oracle, RAC No comments

Rebootless Node Fencing

In versions before 11.2.0.2 Oracle Clusterware tried to prevent a split-brain with a fast reboot (better: reset) of the server(s) without waiting for ongoing I/O operations or synchronization of the file systems. This mechanism has been changed in version 11.2.0.2 (first 11g Release 2 patch set). After deciding which node to evict, the clusterware:


. attempts to shut down all Oracle resources/processes on the server (especially processes generating I/Os)

. will stop itself on the node

. afterwards Oracle High Availability Service Daemon (OHASD) will try to start the Cluster Ready Services (CRS) stack again. Once the cluster interconnect is back online,all relevant cluster resources on that node will automatically start

. kill the node if stop of resources or processes generating I/O is not possible (hanging in kernel mode, I/O path, etc.)

This behavior change is particularly useful for non-cluster aware applications.

[cssd(3713)]CRS-1610:Network communication with node rac1 (1) missing for 90% of timeout interval.
Removal of this node from cluster in 2.190 seconds

[cssd(3713)]CRS-1652:Starting clean up of CRSD resources.

[cssd(3713)]CRS-1654:Clean up of CRSD resources finished successfully.
[cssd(3713)]CRS-1655:CSSD on node rac2 detected a problem and started to shutdown.

[cssd(5912)]CRS-1713:CSSD daemon is started in clustered mode

与11gR2之前的机制不一样,Oracle不再直接kill掉这个node 而是采取了kill 相关的process 如果尝试kill失败 则会去kill node,这种机制对于big cluster 是一种很好的保护,避免了node reboot 之后 resources re-mastered 导致的资源冻结,下面这段话比较详细的说明了这个观点:


Prior to 11g R2, during voting disk failures the node will be rebooted to protect the integrity of the cluster. But rebooting cannot be necessarily just the communication issue. The node can be hanging or the IO operation can be hanging so potentially the reboot decision can be the incorrect one. So Oracle Clusterware will fence the node without rebooting. This is a big (and big) achievement and changes in the way the cluster is designed.

The reason why we will have to avoid the reboot is that during reboots resources need to re-mastered and the nodes remaining on the cluster should be re-formed. In a big cluster with many numbers of nodes, this can be potentially a very expensive operation so Oracle fences the node by killing the offending process so the cluster will shutdown but the node will not be shutdown. Once the IO path is available or the network heartbeat is available, the cluster will be started again. Be assured the data will be protected but it will be done without any pain rebooting the nodes. But in the cases where the reboot is needed to protect the integrity, the cluster will decide to reboot the node.

reference from :RAC_System_Test_Plan_Outline_11gr2_v2_0

Scripts:lock_internal

February 10, 2012 Internals, oracle No comments


set echo off

————————————————–
— @name: lock_internal
— @author: dion cho
— @description: my own version of dba_lock_internal
—————————————————

define __SID = “&1”
define __NAME = “&2”

col handle format a10
col type format a5

select
*
from (
— enqueue
select
sid,
type,
‘(‘||id1||’,’||id2||’)’ as “handle”,
lmode as “mod”,
request as “req”,
(select object_name from dba_objects where object_id = id1 and rownum=1) as name
from v$lock
where sid in (&__SID)
union all
— library cache lock
select
(select sid from v$session where saddr = kgllkuse) as sid,
‘lock’ as “type”,
kgllkhdl||” as “handle”,
kgllkmod as “mod”,
kgllkreq as “req”,
substr(kglnaobj, 1, 60) as name
from sys.x$kgllk
where kgllkuse in (select saddr from v$session where sid in (&__SID))
and (kgllkmod > 0 or kgllkreq > 0)
union all
— library cache pin
select
(select sid from v$session where saddr = kglpnuse) as sid,
‘pin’ as “type”,
kglpnhdl||” as “handle”,
kglpnmod as “mod”,
kglpnreq as “req”,
(select substr(kglnaobj, 1, 60) from sys.x$kglob where kglhdadr = kglpnhdl and rownum = 1) as name
from sys.x$kglpn
where kglpnuse in (select saddr from v$session where sid in (&__SID))
and (kglpnmod > 0 or kglpnreq > 0)
)
where name like ‘&__NAME’
;

set echo on

example:


SQL> @lock_internal 159 %t2%
SQL> set echo off
old 13: where sid in (&__SID)
new 13: where sid in (159)
old 24: where kgllkuse in (select saddr from v$session where sid in (&__SID))
new 24: where kgllkuse in (select saddr from v$session where sid in (159))
old 36: where kglpnuse in (select saddr from v$session where sid in (&__SID))
new 36: where kglpnuse in (select saddr from v$session where sid in (159))
old 39: where name like ‘&__NAME’
new 39: where name like ‘%t2%’

SID TYPE handle mod req NAME
———- —– ———- ———- ———- ————————————————–
159 lock 3C8C9E1C 1 0 select /*+ session_cache */ * from t2
159 lock 393E4FD0 1 0 select /*+ session_cache */ * from t2

How does “Nologging” attribute

January 13, 2012 Internals, oracle No comments

NOLOGGING is supported in only a subset of the locations that support LOGGING. Only the following operations support the NOLOGGING mode:
DML:
1. Direct-path INSERT (serial or parallel) resulting either from an INSERT or a
MERGE statement. NOLOGGING is not applicable to any UPDATE operations
resulting from the MERGE statement.
2. Direct Loader (SQL*Loader)

DDL:
1. CREATE TABLE … AS SELECT
2. CREATE TABLE … LOB_storage_clause … LOB_parameters … NOCACHE |
CACHE READS
3. ALTER TABLE … LOB_storage_clause … LOB_parameters … NOCACHE |
CACHE READS (to specify logging of newly created LOB columns)
4. ALTER TABLE … modify_LOB_storage_clause … modify_LOB_parameters
… NOCACHE | CACHE READS (to change logging of existing LOB columns)
5. ALTER TABLE … MOVE
6. ALTER TABLE … (all partition operations that involve data movement)
– ALTER TABLE … ADD PARTITION (hash partition only)
– ALTER TABLE … MERGE PARTITIONS
– ALTER TABLE … SPLIT PARTITION
– ALTER TABLE … MOVE PARTITION
– ALTER TABLE … MODIFY PARTITION … ADD SUBPARTITION
– ALTER TABLE … MODIFY PARTITION … COALESCE SUBPARTITION
7. CREATE INDEX
8. ALTER INDEX … REBUILD
9. ALTER INDEX … REBUILD [SUB]PARTITION
10. ALTER INDEX … SPLIT PARTITION
For objects other than LOBs, if you omit this clause, then the logging attribute of the
object defaults to the logging attribute of the tablespace in which it resides.
For LOBs, if you omit this clause:
If you specify CACHE, then LOGGING is used (because you cannot have CACHE
NOLOGGING).
If you specify NOCACHE or CACHE READS, then the logging attribute defaults to the
logging attribute of the tablespace in which it resides.
NOLOGGING does not apply to LOBs that are stored inline with row data. That is, if
you specify NOLOGGING for LOBs with values less than 4000 bytes and you have not
disabled STORAGE IN ROW, then Oracle ignores the NOLOGGING specification and
treats the LOB data the same as other table data.

The NOLOGGING clause also specifies that subsequent direct loads using SQL*Loader
and direct load INSERT operations are not logged. Subsequent DML statements
(UPDATE, DELETE, and conventional path insert) are unaffected by the NOLOGGING
attribute of the table and generate redo.

Nologging Table Still generating Lots Of Archive Logs [ID 976722.1]


Modified 28-NOV-2011 Type HOWTO Status MODERATED
In this Document
Goal
Solution
References

This document is being delivered to you via Oracle Support’s Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review.
Applies to:

Oracle Server – Enterprise Edition – Version: 10.2.0.4 to 10.2.0.4 – Release: 10.2 to 10.2
Information in this document applies to any platform.
***Checked for relevance on 28-Nov-2011***
Goal

Why is a Nologging Table still generating lots of archive log.

Solution

The nologging concept is still the same. Some redo still does get generated for objects regardless if the nologging option is set.

When nologging is set, the amount of redo generated is reduced but it is still generated.

For instance the DDL and DML information regarding the creation of any new objects is logged in the REDO logs for recovery purposes .

The situations where no log is generated is covered by:
Note 188691.1 “How to Avoid Generation of Redolog Entries”

Nologging operations are invoked by any of the following:
* SQL*Loader direct load operations
* Direct load INSERT operations from CREATE TABLE | INDEX or INSERT commands
* Loading into an object containing LOB data when its object’s segment characteristic is NOCACHE NOLOGGING

For databases in ARCHIVELOG mode, nologging operations can only occur for a particular object if and only if:

* Database allows for nologging (ALTER DATABASE NO FORCE LOGGING) and
* Tablespace allows for nologging (ALTER TABLESPACE NO FORCE LOGGING)
* Object allows for nologging (ALTER TABLE NOLOGGING)

NOTE:
The above means that if FORCE LOGGING is enabled at a higher level, the NOLOGGING at a lower level has no effect. That means that even for a nologging table redo information may still be logged.

Another aspect you should keep in mind is that, if the database is in NOARCHIVELOG mode, then the NOLOGGING flag has no effect.
For more information refer to :
Note 290161.1 – The Gains and Pains of Nologging Operations

Take also into account that by default the Oracle Database does not provide any supplemental logging which means that by default LogMiner is not usable/useful.

Supplemental logging should be enabled prior to generating log files that will be analyzed by LogMiner.

When you enable supplemental logging, additional information is recorded in the redo stream that is needed to make the information in the redo log files useful to you. Therefore, at the very least, you must enable minimal supplemental logging, as the following SQL statement shows:

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

To determine whether supplemental logging is enabled, query the V$DATABASE view, as the following SQL statement shows:

SQL> SELECT SUPPLEMENTAL_LOG_DATA_MIN FROM V$DATABASE;

If the query returns a value of YES or IMPLICIT, minimal supplemental logging is enabled.

You can turn off Supplemental Logging by following command.

SQL> ALTER DATABASE DROP SUPPLEMENTAL LOG DATA;

Minimal supplemental logging ensures that Logminer (and any products building on Logminer technology) will have sufficient information to support chained rows and various storage arrangements such as cluster tables.

Minimal Supplemental Logging
o is required for Logminer
o includes additional information in redo stream

In summary, enabling supplemental logging would also increase the amount of information gathered in the log because of the increased granularity.

References

NOTE:188691.1 – How to Avoid Generation of Redolog Entries
NOTE:290161.1 – The Gains and Pains of Nologging Operations
NOTE:750198.1 – Effect of Supplemental Logging on LogMiner with Example