oracle data

Resource Manager (2)

September 9, 2012 maintain, oracle No comments

对于需要使用资源控制的应用,oracle提出了resource manager这一解决方案,针对不同的schema,Oracle提供不同的资源等级,但是这种资源的分配其实本身就存在一定的资源消耗,当大量的session涌入数据库的时候,oracle需要消耗一定数量的cpu去重新分配资源的调级,此时就会伴随的大量resmgr:cpu quantum等待事件,下面总结一下这次的测试:

环境简述,本次测试resource manager采用oracle 11.2.0.3 平台为IBM power750小型机具体配置如下:

System Model: IBM,8233-E8B
Machine Serial Number: 065443R
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Number Of Processors: 32
Processor Clock Speed: 3612 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 1 06-5443R
Memory Size: 256512 MB
Good Memory Size: 256512 MB
Platform Firmware level: AL730_066
Firmware Version: IBM,AL730_066
Console Login: enable
Auto Restart: true
Full Core: false

压力端模拟大量bad sql 这些sql都是典型的消耗buffer get 与CPU的sql 在高并发的情况下 可以瞬间将cpu压至满负荷,参考下图:

可以看到压力测试在开启的几个时段可以直接将系统CPU 资源耗尽,下面的测试也是基于这个基础,使用user_data2跑bad sql ,而我们的test用户跑正常的sql

这套系统中我们使用swingbench正常测试结果为下图:

使用swingbench 100并发session,下面开始这次的测试
我们使用PLAN_NAME=’USER_TEST’ 分别设置3个level (TEST->level 1, USER_DATA2->level 2,Other->level 3):

exec DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
exec DBMS_RESOURCE_MANAGER.CREATE_PLAN(PLAN=>'USER_TEST',COMMENT=>'Resource plan/method for USER');
exec DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(CONSUMER_GROUP=> 'TEST_GROUP',COMMENT => 'Resource plan user group for TEST_GROUP');
exec DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(CONSUMER_GROUP=> 'USER_GROUP',COMMENT => 'Resource plan user group for USER_GROUP');
exec DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(PLAN=>'USER_TEST',GROUP_OR_SUBPLAN => 'TEST_GROUP',COMMENT => 'FIRST',mgmt_p1 => 100,mgmt_p2 => 0,mgmt_p3=>0);
exec DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(PLAN=>'USER_TEST',GROUP_OR_SUBPLAN => 'OTHER_GROUPS',COMMENT => 'OTHER',mgmt_p1 => 0, mgmt_p2 => 0,mgmt_p3=>100);
exec DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(PLAN=>'USER_TEST',GROUP_OR_SUBPLAN => 'USER_GROUP',COMMENT => 'SECECOND',mgmt_p1 => 0, mgmt_p2 => 100,mgmt_p3=>0);
exec DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING (DBMS_RESOURCE_MANAGER.ORACLE_USER,'USER_DATA2', 'USER_GROUP');
exec DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING (DBMS_RESOURCE_MANAGER.ORACLE_USER,'TEST', 'TEST_GROUP');
exec dbms_resource_manager.switch_plan( plan_name => 'USER_TEST', sid => 'USER' )
exec DBMS_RESOURCE_MANAGER.SET_INITIAL_CONSUMER_GROUP(user=>'TEST',consumer_group=>'TEST_GROUP');
exec DBMS_RESOURCE_MANAGER.SET_INITIAL_CONSUMER_GROUP(user=>'USER',consumer_group=>'USER_GROUP');
exec DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
exec DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();

在开启Resource manager之后 保证test用户得到最大的资源利用,也就是说保证正常的应用最大程度的利用CPU资源,user_data2模拟bad sql(由测试部门提供专门的压力机器)

1.3600 bad sql sessions + 500 normal sessions (不开启RM):

2.3600 bad sql sessions + 500 normal sessions (开启RM)

3.128 bad sql sessions + 128 normal sessions (不开启RM)

4.128 bad sql sessions + 128 normal sessions (开启RM)

另外同事也测试了同一条SQL 在不同schema的情况通过AWR 可以发现oracle控制了被我们降级为level2的user_data2的资源使用(仅使用0.01%)但是DB time大量浪费在了cpu quantum

注意以上测试基于schema分级测试,使用不同schema测试得出的结论,在实际生产中同一套系统的不同应用一般不会区别schema,这个在下面的总结里会给到,在我们的final test 中,在同一个schema中开启RM 会影响到系统的总体吞吐量

总结
开启Resource Manager 的情况下 不同 schema 在资源分配level 1的情况下TPS TPM都有所提升 但是总体提升不多 (1200->1700),由于开启了RM 导致出现如下等待: resmgr:cpu quantum 这种等待会消耗大量的CPU 在系统不是很忙的情况下可能会消耗一定的系统资源。并且在不开启RM的情况下 CPU全部被耗尽的情况下:模拟前台的SQL依然可以正常工作TPS下降到1200左右,但在开启RM的情况下 oracle确实做到了schema级别的资源控制,这对特别重要的app起到了一定的保护作用,在最后的final test中开启RM 明显影响到了系统的吞吐量,前台的应用在保证了优先级的情况下SQL执行次数减少了一半之多。
Oracle为我们提供了一种在数据库层面实现资源分级的手段,至于这种方式是否合适,要针对不同的系统来调试不同的方案,正所谓没有完美的技术,只有完美的方案,如果单从个人角度来说不建议在核心系统开启RM。在Exadata系统中 oracle为我们提供了IORM在cell层面控制存储的IO资源分配,是将这个技术从DB层面延伸到了存储层面。此次测试数据仅供参考。

data_object_id

October 24, 2011 Internals, oracle 1 comment

xxx_objects.data_object_id differs from object_id after

* truncate table (or alter table truncate partition), unless table/partition is empty
* alter table move (or alter table move partition)
* alter index rebuild (or alter index rebuild partition)
* alter table exchange partition

Alter table move even without TABLESPACE clause (or specifying the same tablespace) physically
moves the table, based on dba_extents.file_id and block_id. But truncate table doesn’t move the
table. Think of data_object_id as an ID for the segment. If xxx_objects.data_object_id is null, it
must be an object not associated with a physical segment, such as view, sequence, domain index,
partitioned table/index whose segments are in the individual partitions.

Alter table move lob also changes lob segment’s data_object_id.

* How much does data_object_id increment?

Except in case of partition-table exchange, data_object_id is brought up to

select max(data_object_id)+1 from dba_objects

If you just create a new table, it will be assigned an object_id and data_object_id of

select greatest(max(object_id), max(data_object_id)) from dba_objects

If the table has one index, truncate will increment data_object_id by 2 because its index takes the
number one below it. If there’s no index, truncate should increment data_object_id by 1 only.

Exchange of a partition with a table swaps their data_object_id’s. So don’t assume data_object_id’s
always go up.

If the table or its partition is already empty, truncating it does not increment data_object_id.
(Rebuilding an empty index or its partition still increments data_object_id, and in case of
online index rebuild, data_object_id could increment by 3 because of “transient” objects created
and dropped during the rebuild.)

* When to use which ID?

You use data_object_id in dbms_rowid, v$bh and x$bh, but object_id in most other cases, such as
v$locked_object. Note that v$segstat, v$segment_statistics and v$logmnr_contents have both.

reference:http://yong321.freeshell.org/oranotes/DataObjectId.txt

datafile Checkpoint cnt

October 24, 2011 Internals, oracle 2 comments

通过dump datafile header 发现每个Checkpoint cnt不同

大致原因如下:

1.tablespace 建立的时间不同 现在我们新建一个tablespace ->liu datafile->liu01.dbf

DATA FILE #10:
(name #14) /u01/liu01.dbf
creation size=128 block size=8192 status=0xe head=14 tail=14 dup=1
tablespace 8, index=8 krfil=10 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:2 scn: 0x0000.005339bb 10/24/2011 10:19:42
Stop scn: 0xffff.ffffffff 10/24/2011 10:19:42
Creation Checkpointed at scn: 0x0000.005339ba 10/24/2011 10:19:42
thread:1 rba:(0x37.c3a3.10)

DATA FILE #1:
(name #7) /u01/app/oracle/oradata/css/system01.dbf
creation size=0 block size=8192 status=0xe head=7 tail=7 dup=1
tablespace 0, index=1 krfil=1 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:131 scn: 0x0000.0052d2cc 10/22/2011 09:58:13
Stop scn: 0xffff.ffffffff 10/21/2011 17:10:54

2.当tablespace 处于hot backup 或者offline 的时候 发生checkpoint 我们再建一个tablespace ->liu01 datafile ->liu101.dbf

DATA FILE #11:
(name #15) /u01/liu101.dbf
creation size=128 block size=8192 status=0xe head=15 tail=15 dup=1
tablespace 9, index=9 krfil=11 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:2 scn: 0x0000.00533ffe 10/24/2011 10:25:22

alter tablespace liu begin backup;

DATA FILE #10:
(name #14) /u01/liu01.dbf
creation size=128 block size=8192 status=0xe head=14 tail=14 dup=1
tablespace 8, index=8 krfil=10 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:8 scn: 0x0000.00534033 10/24/2011 10:26:56

DATA FILE #11:
(name #15) /u01/liu101.dbf
creation size=128 block size=8192 status=0xe head=15 tail=15 dup=1
tablespace 9, index=9 krfil=11 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:7 scn: 0x0000.0053426f 10/24/2011 10:40:00

alter tablespace liu end backup;

DATA FILE #10:
(name #14) /u01/liu01.dbf
creation size=128 block size=8192 status=0xe head=14 tail=14 dup=1
tablespace 8, index=8 krfil=10 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:10 scn: 0x0000.005342b8 10/24/2011 10:43:15

DATA FILE #11:
(name #15) /u01/liu101.dbf
creation size=128 block size=8192 status=0xe head=15 tail=15 dup=1
tablespace 9, index=9 krfil=11 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:8 scn: 0x0000.005342b8 10/24/2011 10:43:15

alter tablespace liu offline;

DATA FILE #10:
(name #14) /u01/liu01.dbf
creation size=128 block size=8192 status=0x80 head=14 tail=14 dup=1
tablespace 8, index=8 krfil=10 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:11 scn: 0x0000.005342e7 10/24/2011 10:45:15

DATA FILE #11:
(name #15) /u01/liu101.dbf
creation size=128 block size=8192 status=0xe head=15 tail=15 dup=1
tablespace 9, index=9 krfil=11 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:9 scn: 0x0000.005342fd 10/24/2011 10:45:23

可以设置事件查看chk cnt

alter session set events = ‘immediate trace name file_hdrs level 10’