Internals

关于Partial write for all MySQL and Oracle

September 2, 2014 Internals, MYSQL, oracle No comments

很久之前看过这个帖子,当时没有细究,这几天看了下mysql相关的internal 突然想起这个问题,跟三四个朋友讨论了下 想把这个问题展开讲一讲。

源帖子内容见:https://community.oracle.com/thread/1087650?start=0&tstart=0

这个帖子的内容为一个老外问oracle如何避免partial write的,也就是大家所说的半个IO问题,IO 在写到一半的时候发生断裂,(瞬间断线,瞬间crash瞬间存储或者机器故障等) 虽然这个情况极少发生,但是在非常规恢复的时候也是经常遇到的,例如某个block内部出现不一致,也就是我们所谓的逻辑坏块。
我仔细的又过了一遍帖子,发现下面的几个回答完全不准确,或者有点张冠李戴的意思。

首先我们看看什么是mysql 的double write:

—–引自 mysqlperformace blog

为什么要用doublewrite?
目的是为了保证出现部分写失效(partial page write)–即数据页写到一半时就出现故障–时的数据安全性。Innodb并不在日志中记录整个数据页,而是使用一种称之为“生理”日志的技术,即日志项中只包含页号、对数据进行的操作(如更新一行记录)和日志序列号等信息。这一日志结构的优点是能够减少写入到日志的数据量,但这也要求要保持页内部的一致性。页的版本号是没关系的,数据页可以是当前版本(这时Innodb(故障恢复时)会跳过对页的更新操作)若是早期版本(这时Innodb将执行更新操作)。但如果页内部数据不一致,恢复将不能进行。

部分写失效
什么是部分写失效,为什么会出现这一现象呢?部分写失效指的是提交给操作系统的写数据页操作只完成了一部分。例如一个16K的Innodb数据页中只有4K被更新了,其余部分数据还是旧的。大部分的部分写失效都在断电时产生,在操作系统崩溃时也可能出现,这是由于操作系统可能将一个写16K数据的操作分割成多个写操作(这可能由文件碎片导致),而在执行这多个写操作的过程中发出的故障。当使用软件RAID技术时,数据页可能恰好跨越分片(stripe)的边界,这时也需要执行多个IO操作,因此也可能导致部分写失效。当使用硬件RAID又没有使用带电池的缓存时,断电时出现部分写失效也是可能的。当发送到磁盘本身只有一个写操作时,理论上硬件可以保证写操作即使在断电时也可以完成,因为驱动器内部应该积累有足够的电量来完成这一操作。但实话说我并不知道事实是不是这样,这很难检测,而且这也不是部分写失效的唯一原因。我所知道的只是部分写失效可能会出现,当Innodb实现 doublewrite功能前,我遇到过很多由于这一原因而导致数据被破坏。

doublewrite如何工作?
你可以将doublewrite看作是在Innodb表空间内部分配的一个短期的日志文件,这一日志文件包含100个数据页。Innodb在写出缓冲区中的数据页时采用的是一次写多个页的方式,这样多个页就可以先顺序写入到doublewrite缓冲区并调用fsync()保证这些数据被写出到磁盘,然后数据页才被定出到它们实际的存储位置并再次调用fsync()。故障恢复时Innodb检查doublewrite缓冲区与数据页原存储位置的内容,若数据页在doublewrite缓冲区中处于不一致状态将被简单的丢弃,若在原存储位置中不一致则从doublewrite缓冲区中还原。

doublewrite缓冲区对MySQL有何影响?
虽然doublewrite要求每个数据页都要被写二次,其性能开销远远小于二倍。写出到doublewrite缓冲区时是顺序写,因此开销很小。 doublewrite同时还可以降低Innodb执行的fsync()操作,即不需要写每个页时都调用一下fsync(),而可以提交多个写操作最后再调用一次fsync()操作,这使得操作系统可以优化写操作的执行顺序及并行使用多个存储设备。但在不使用doublewrite技术时也可以用这些优化,事实上这些优化是与doublewrite同时实现的。因此总体来说,我预计使用doublewrite技术带来的性能开销不会超过5%到10%。

能否禁用doublewrite?
如果你不关心数据一致性(比如使用了RAID0)或文件系统可以保证不会出现部分写失效,你可以通过将innodb_doublewrite参数设置为0还禁用doublewrite。但通常这可能带来更大的麻烦。

这里暂且不讨论为何mysql不开启double write会容易出现断裂的写操作. 在mysql中数据写的单元是以page为单位 1page=16KB 而在oracle中是以block为单位 block可以指定大小。但是写入OS的时候都是以OS block为单位,也就是说如果写入OS block时发生partial write 同样会出现逻辑问题。

这里我们看一个老外的回答:

It’s an interesting question. I think the clue is in the link you provided: “Such logging structure is geat as it require less data to be written to the log, however it requires pages to be internally consistent.”

What that’s saying (I think!) is that the contents of the innodb transaction log can only be replayed to datafile pages which are ‘clean’ -and that’s true for Oracle, too. You can’t apply Oracle redo to an Oracle database block that is internally corrupted because some of its consituent “os pages” were written at a different time from others. When such partial writes happen, you get what’s called a “fractured block”, warnings in the alert log …and the data file is regarded as corrupt from that point on.

Oracle’s fix to this potential problem, however, is also hinted at in the article you linked to: “Innodb does not log full pages to the log files”. That’s an interesting sentence because , you see, Oracle does write full pages to the logs! I should immediately qualify that: it only does so when you take a “hot backup” using O/S copy commands -because it’s only then that you have to worry about the problem. In other words, you only have to worry about the fact that you can only apply redo to an internally consistent database block if you’re actually in the business of applying redo… and you’re only doing that in the event of a recovery. And complete recoveries in Oracle (as opposed to mere crash recoveries) require you to have restored something from backup. So, it’s only during the backup process that you only have to worry about the problem of fractured blocks -and so it’s only then that Oracle says, ‘if you have put the tablespace into hot backup mode (alter tablespace X begin backup), then the first time a block of data is changed, the entire block should be written in a consistent state into the redo (transaction) logs. Then, if the datafile copy of the block in the backup turns out to be fractured, we’ve got a known good copy in the redo we can restore in its place. And once you have a clean block as a starting point, you can continue to apply redo from that point on’.

Oracle has an alternative (and more up to date) mechanism for achieving this “I know your data block is clean” starting state, though. It’s called RMAN -the Oracle backup and recovery tool. Unlike your OS copy command, it’s an Oracle utility… so it understands the concept of Oracle blocks, and it can therefore check that a block that’s been copied has been copied consistently, with all its constituent OS ‘pages’ written coherently to disk in the same state. It knows how to compare input and output in a way no OS command could ever hope to do. So when RMAN copies a data block hot, it reads the copy, compares it with the original -and if it sees the copy is fractured, it just has another go copying the block again. Repeat until the copy is indeed verified as a good copy of the original. No need to write the block into the transaction log at all, because you know that the backup file itself contains the necessary clean block copy.

So, putting that into practice. Let’s say your server corrupts data on the disk for whatever reason and, in the process, your Oracle instance dies. You try and restart Oracle, but you get told that recovery is needed (you might get a message that file 16, for example, can’t be read). So you restore file 16 from your hot backup taken with OS commands. In that backup, one of the blocks is fractured, because only part of the Oracle block had hit disk at the point the backup was performed. So you restore a fractured block. But that’s not a problem, because as redo is replayed, you’ll find the clean copy of the block in the redo stream, and restore that over the top of the fractured block. The rest of the redo can then be replayed without a problem. Or, you restore file 16 using RMAN… and what it restores cannot be fractured, because it checks for that before it reports the original backup a success. Therefore, you restore a clean copy of file 16, and can apply redo to it without drama. Either way, you get your database recovered.

So, the article you linked to nails the important matter: “It does not matter which page version it is – it could be “current” version in which case Innodb will skip page upate operation or “former” in which case Innodb will perform update. If page is inconsistent recovery can’t proceed.” Absolutely true of Oracle, too. But Oracle has two alternatives for ensuring that a clean version of the block is always available: write a whole block into redo if it’s changed whilst the database is being backed up with OS commands, or make sure you only write clean blocks into the backup if you’re using RMAN -and you achieve that by multiple reads of the block, as many as are necessary to ensure the output is clean.

Oracle’s solutions in these regards are, I think, a lot more efficient than double-writing every block all the time, because the only time you have to worry that what’s on disk isn’t consistent is, as your linked article again points out, when ‘power failure’ or ‘os crash’ happens. That is, during some sort of failure. And the response to failure that involves corruption is always to restore something from backup… so, it’s really only that backup that needs to worry about ‘clean pages’. Instead of writing everything twice to disk during normal running (which sounds like a potentially enormous overhead to me!), therefore, Oracle only has to employ protective measures during the backup process itself (which should, ordinarily, be a mere fraction of ‘normal running’ time). The overhead is therefore only encountered sparingly and not something you need worry about as a potentially-constant performance problem.

In closing, I’ll second Aman’s observation that it is generally and usually the case that any variation away from the default 8K block size is a bad idea. Not always, and there may be justification for it in extremis… but you will certainly be at risk of encountering more and weirder bugs than if you stick to the defaults.

这一段回答可谓一点儿也没讲到重点 主要回答oracle 采用避免partial write的几个方法,注意 这里是在特殊场景下 如何避免partial write.而不是数据库机理. 我们看一下mysql怎么同样避免这个问题 —> ( oracle hot backup/RMAN backup)
传统Oracle的热备,备份读取与DBWR写文件并行执行,因此可能读取到Fractured Block(不一致的块),解决办法是对于备份的文件,DBWR写的Dirty Page同时写入Redo Log,用于Fractured Block的恢复。RMAN备份,会检查每个读取页面的一致性,不一致就重新读取。Percona的XtraBackup,采用类似于RMAN的方案。
如何检查备份读取的页面是否是一致的,其实方法很简单:无论是Oracle/InnoDB的页面,都会在页面的页头与页尾存储相同的SCN /CheckSum。当页头/页尾的SCN/CheckSum相同,说明页面处于一致状态,否则就说明页面损坏(Fractured Block),备份重新读取损坏页面即可。
所以这一段可以理解为当你需要备份的时候 如何避免partial write 因为在这个时候最容易发生断裂的块或者页。而此前别人提问的是oracle是否有类似double write的手法避免常规的partial write.

我们继续看下面的回答:

user646745 wrote:
Thanks HJR for detailed analysis.

But the double write mechanism works without restore from a backup an apply redo: before write the dirty buffer blocks, innodb flush the blocks in the double write buffer to disk which is a sequential disk area (so it’s fast),
————————-
before write the dirty buffer blocks, logwr flush the blocks in the redo buffer to disk which is a sequential disk area (so it’s fast),

so even if partial write happen, the the blocks in double write buffer already persistent in disk,
————————-
so even if partial write happen, the the blocks in redo buffer already persistent in disk,

and when mysql restart, innodb can compare the blocks flushed from double write buffer with the datafile blocks,
————————-
and when mysql restart, smon can compare control file scn with the datafile blocks scn,

if they matched (means no partial write), then everything is fine, if not matched (partial write happen), just overwrite the datafile blocks with the blocks from double write buffer.
————————-
if they matched (means no partial write), then everything is fine, if not matched (partial write happen), just apply the redo from the redo logs.

So no media recover is required.
————————-
sounds like media recovery to me

Based on your anaysis, oracle needs media recover.
————————-
Based on your analysis, so does mysql. It just applies it in a very slightly different fashion, and calls it all something else.

这里的回答又有误点. 他说道 “so even if partial write happen, the the blocks in double write redo buffer already persistent in disk” 这句话存在明显的误导,首先partial write发生的时候,redo是无法恢复
一个块内不一致的块的,redo只能负责recover作用,但这不是针对块内部的恢复 而是trasaction或者media的.

oracle recovery 分为 instance/crash recovery and media recovery 其本质区别在于instance recovery 需要online redo logfile 即apply 增量检查点之后的redolog. 而media recovery又分为 datafile /block media
recovery 其本质为恢复一个旧的datafile或者block 可能需要用到归档日志。注意这里的前提为需要有备份。restore datafile /block 只能在备份中恢复,而之后的recover是要用到archive log。 这里的media recover可以
对应为解决partial write的方法。

对于一个内部不一致的page mysql无法通过redo去恢复 :redo有效应用的前提是data的一致性,当data flush到磁盘时发生故障,比如16K只写了4K并且redo条目是change vector形式,属于逻辑记录,那么在InnoDB recovery时,redo就不知道从哪开始跑起。

对于oracle来说 内部不一致的block.仅仅有 redo也是不够的.我们可以通过dbf检查到逻辑问题的block,去repair这个block,前提是需要有备份。这里用到的方法是media recovery里的block recovery.
我们可以使用bbed 或者一些OS的工具来模拟这些逻辑坏块,并且往往发现数据库是可以正常open的,只有在访问到这些block的时候才会出现逻辑问题。所以说备份重于一切,oracle在处理这些”半个IO”或者一些静默丢失(storage),在没有备份的情况下也是很乏力的。

[转] Autotrace with bind variables

September 27, 2013 Internals, oracle No comments

JL刚刚写的一篇autotrace与bind variables 的行为:Autotrace trick
之前我们也发现过这个问题具体SR在这里:something seems confused

CTAS导致严重BUG一则

September 25, 2013 Internals, oracle No comments

某国内巨头的一起CASE,在系统繁忙情况下大量的CTAS 导致数据字典表的不一致,由原来10.0.2.4版本升级到10.0.2.5之后问题依旧。
具体表现为:
1. 尝试drop这些不一致的tablespace

15:00:18 SQL> @dropts_2012.0319.1500.sql
15:00:18 SQL> drop tablespace ODSCP1003 including contents;
drop tablespace ODSCP1003 including contents
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [ktssdrp1], [11], [0], [0], [], [],
[], []

Elapsed: 00:01:52.44
15:02:11 SQL> drop tablespace ODSCP1004 including contents;
drop tablespace ODSCP1004 including contents
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [ktssdrp1], [12], [0], [0], [], [],
[], []
Elapsed: 00:00:03.51

2.TAB$,seg$数据的不一致 hcheck检查结果

3.无法shutdown database

BUG: Various ORA-600 errors / dictionary inconsistency from CTAS / DROP

目前处理方案为: 针对这个库的大小接近30TB,建议整理出问题的TS,将这些TS的数据整理到新的TS中,将没有问题的TS TTS到新的DB中,再将这个DB升级到11g,由于oracle已经对10g的DB不再提供support(需要购买extended 服务),建议对一些残留问题的10gDB尽快提上升级议程。另外针对本库的ODS应用,不建议采用32K blocksize,11gR2下的32k blocksize依然存在很多问题:

Bug 7605618
Bug 11799496
Bug 12680822

Think of bind value

September 17, 2013 Internals, oracle No comments

一次诡异的缓慢”死亡” CASE

September 13, 2013 Internals, oracle No comments

今天下午遇到的一个诡异CASE,现象为系统诡异的死亡,目前判定为sqluldr2工具导致 (同事使用sqluldr2 用具 执行了sqluldr2_linux64_10204.bin -help 命令)
抛出的错误如下 (/var/log/message)

Sep 11 09:00:54 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 09:07:08 db18 last message repeated 2 times
Sep 11 09:08:47 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 09:09:54 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 09:10:07 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 09:10:46 db18 kernel: tnslsnr[19787]: segfault at 0000000000000000 rip 00002b92c7d5bebe rsp 00007fff0a1fea90 error 4
Sep 11 10:49:59 db18 auditd[8397]: Audit daemon rotating log files
Sep 11 10:53:41 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 10:54:23 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 12:50:44 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 12:54:37 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808
Sep 11 13:56:56 db18 nrpe[15359]: Host 10.4.0.238 is not allowed to talk to us!
Sep 11 13:57:08 db18 nrpe[15425]: Host 10.4.0.238 is not allowed to talk to us!
Sep 11 14:17:49 db18 setroubleshoot: SELinux is preventing sqluldr2_linux6 from loading /home/oracle/product/10205/db1/lib/libnnz10.so which requires text relocation. For complete SELinux messages. run sealert -l 012919a7-837b-42af-9509-367d9d743808

QQ图片20130913002200

当时系统的SELinux为enable,阻止了sqluldr2对于oracle库文件的操作,多次尝试之后在09:10:46系统出现了异常,当时仅仅表现为listener的异常关闭,自此系统开始了”慢性”死忙

Last login: Tue Sep 10 14:59:41 2013 from 10.4.0.232
[oracle@db18 ~]$ lsnrctl status

LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 11-SEP-2013 09:18:47

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=0.0.0.0)(PORT=1522)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
  TNS-00511: No listener
   Linux Error: 111: Connection refused
[oracle@db18 ~]$ 
[oracle@db18 ~]$ 
[oracle@db18 ~]$ ora active
09:18:54 up 702 days, 14:57,  2 users,  load average: 3.06, 2.08, 0.93

no rows selected

[oracle@db18 ~]$ lsnrctl start

LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 11-SEP-2013 09:19:16

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Starting /home/oracle/product/10205/db1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 10.2.0.5.0 - Production
System parameter file is /home/oracle/product/10205/db1/network/admin/listener.ora
Log messages written to /home/oracle/product/10205/db1/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=0.0.0.0)(PORT=1522)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=0.0.0.0)(PORT=1522)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 10.2.0.5.0 - Production
Start Date                11-SEP-2013 09:19:16
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /home/oracle/product/10205/db1/network/admin/listener.ora
Listener Log File         /home/oracle/product/10205/db1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=0.0.0.0)(PORT=1522)))
Services Summary...
Service "misc" has 1 instance(s).
  Instance "misc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
[oracle@db18 ~]$ 
[oracle@db18 ~]$ ora active   
 09:19:22 up 702 days, 14:57,  2 users,  load average: 3.09, 2.18, 1.00

12号下午出现的问题,很多process占用大量的CPU资源,包括background process,多个进程处于running假死阶段,sys占用大量系统资源。

QQ图片20130913000507

TBQTS6(34${(BX$GLS~@PVP

database alert log如下:

Errors in file /data/oracle/admin/misc/udump/misc_ora_28401.trc:
ORA-27300: OS system dependent operation:invalid_process_id failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgpalive1

分析问题,查看ora background process进程状态 发现如下异常

[oracle@db18 arch]$ ps -ef | grep -i ora_
oracle   24019     1  0 May18 ?        01:49:30 ora_pmon_misc
oracle   24023     1  0 May18 ?        00:33:02 ora_psp0_misc
oracle   24027     1  0 May18 ?        00:34:32 ora_mman_misc
oracle   24031     1  0 May18 ?        00:59:23 ora_dbw0_misc
oracle   24033     1  0 May18 ?        01:02:13 ora_dbw1_misc
oracle   24035     1  0 May18 ?        00:58:57 ora_dbw2_misc
oracle   24037     1  0 May18 ?        00:59:24 ora_dbw3_misc
oracle   24039     1  0 May18 ?        00:59:06 ora_dbw4_misc
oracle   24043     1  0 May18 ?        00:59:35 ora_dbw5_misc
oracle   24047     1  0 May18 ?        04:28:36 ora_lgwr_misc
oracle   24049     1  0 May18 ?        09:41:15 ora_ckpt_misc
oracle   24053     1  0 May18 ?        00:36:44 ora_smon_misc
oracle   24057     1  0 May18 ?        00:28:39 ora_reco_misc
oracle   24061     1  1 May18 ?        1-06:42:09 ora_mmon_misc
oracle   24065     1  0 May18 ?        22:12:24 ora_mmnl_misc
oracle   24347     1  0 May18 ?        00:37:10 ora_arc0_misc
oracle   24351     1  0 May18 ?        00:43:10 ora_arc1_misc
oracle   24355     1  0 May18 ?        01:42:59 ora_lns1_misc
oracle   24543     1  0 May18 ?        00:28:31 ora_qmnc_misc
oracle   24780     1  0 May18 ?        00:10:05 ora_q000_misc
oracle   24969     1  0 May18 ?        00:09:25 ora_q001_misc
oracle   26077 25306  0 15:31 pts/8    00:00:00 tail -f misc_ora_21763.trc
oracle   27634 22109  0 15:44 pts/6    00:00:00 grep -i ora_
oracle   27867 24023 98 Sep11 ?        1-06:03:47 ora_psp0_misc   ------------刷新时间为Sep11
oracle   27868 27867 98 Sep11 ?        1-06:03:38 ora_psp0_misc   ------------为衍生进程

PSP进程出现异常状态,对于PSP0进程的解释如下:

PSP stands for Process SPawner and this process has the job of creating and managing other Oracle processes.
The initial set of background processes are spawned by the startup process. After the instance is started, PSP0 is responsible for spawning any
required background processes.

当时系统的状态为无法checkpoint 无法 syn log ,包括alter database set 等命令也出现hang死状态

[oracle@db18 ~]$ ora active
15:57:03 up 703 days, 21:35, 12 users,  load average: 24.11, 26.99, 50.27

   SID USERNAME   MACHINE          EVENT                      PARAM                W   WT SQL                        ST     LT LOGON_TIME
------ ---------- ---------------------- ------------------------------ -------------------- ---- ---- ---------------------------- -- ------ ----------
  3601 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006158/0       0    0 /frw83qw2fvsa1             K        0    1088732
  3607 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006175/0       0    0 /frw83qw2fvsa1             K        0        3857
  3611 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006137/0       0    0 /frw83qw2fvsa1             K        0    1088741
  3616 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006169/0       0    0 /frw83qw2fvsa1             K        0        3895
  3634 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006140/0       0    0 /frw83qw2fvsa1             K        0    1088735
  3635 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006173/0       0    0 /frw83qw2fvsa1             K        0        3889
  3646 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006209/0       0    0 /frw83qw2fvsa1             A        0        1023
  3650 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006174/0       0    0 /frw83qw2fvsa1             K        0        3882
  3661 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006152/0       0    0 /frw83qw2fvsa1             K        0        5346
  3664 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006138/0       0    0 /frw83qw2fvsa1             K        0      877657
  3668 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006200/0       0    0 /frw83qw2fvsa1             A        0        1027
  3671 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006198/0       0    0 /frw83qw2fvsa1             A        0        1027
  3676 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006184/0       0    0 /6p07vqnwqrsyw            A        0        1032
  3691 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006210/0       0    0 /frw83qw2fvsa1             A        0        1011
  3696 LOG_DATA2  JDBC Thin Client       log file sync                     10795/2810006141/0       0    0 /frw83qw2fvsa1             K        0        5346
  3698 SYS     sqlplus@db18         reliable message            70088494424/70158516   0    0 /                         A           0         390

表现为ckpt lgwr dbwn进程全部死亡 .. oradebug dump systemstat 速度极其缓慢 oradebug dump errorstack hang —- sys 占用大量资源

awk -f ass109.awk systemdump.txt > systemdump.log —并没有发现任何异常等待以及blocker信息

[oracle@db18 ~]$ tail -f -n 50 systemdump.log 
     Cmd: Select
296:waiting for 'SQL*Net message from client' 
     Cmd: Select
297:waiting for 'SQL*Net message from client' 
     Cmd: Select
298:waiting for 'SQL*Net message from client' 
     Cmd: PL/SQL Execute
299:waiting for 'SQL*Net message from client' 
     Cmd: PL/SQL Execute
300:waiting for 'SQL*Net message from client' 
     Cmd: PL/SQL Execute
302:waiting for 'SQL*Net message from client' 
     Cmd: Insert
444:waiting for 'SQL*Net message from client' 
     Cmd: Select
532:waiting for 'SQL*Net message from client' 
     Cmd: Select
536:waiting for 'SQL*Net message from client' 
     Cmd: Select
538:waiting for 'SQL*Net message from client' 
     Cmd: Select

NO BLOCKING PROCESSES FOUND

System State 2
~~~~~~~~~~~~~~~~
1:                                      
2:  waiting for 'pmon timer'            
3:  waiting for 'rdbms ipc message'     
4:  last wait for 'rdbms ipc message'   
5:  waiting for 'rdbms ipc message'     
6:  waiting for 'rdbms ipc message'     
7:  waiting for 'rdbms ipc message'     
8:  waiting for 'rdbms ipc message'     
9:  waiting for 'rdbms ipc message'     
10: waiting for 'rdbms ipc message'     
11: waiting for 'rdbms ipc message'     
12: waiting for 'rdbms ipc message'     
13: waiting for 'rdbms ipc message'     
14: waiting for 'smon timer'            
15: last wait for 'os thread startup'   
16: last wait for 'rdbms ipc message'   
17: waiting for 'SQL*Net message from client' 
     Cmd: Select
18:                                     

NO BLOCKING PROCESSES FOUND


171760 Lines Processed.

其现象为系统的一个”缓慢”的死亡行为 (通过SAR 定位到一个折点时间)

sar -f /var/log/sa/sar11   ---- sep 11

09:00:01 AM       all      0.01      0.00      0.01      0.00      0.00     99.97
09:10:01 AM       all      0.03      0.00      0.02      0.00      0.00     99.95   ----->   (09:10:46 db18 kernel: tnslsnr[19787]: segfault at 0000000000000000 rip 00002b92c7d5bebe rsp 00007fff0a1fea90 error 4)
09:20:01 AM       all      2.18      0.00      2.16      0.00      0.00     95.66
09:30:01 AM       all      3.15      0.00      3.13      0.00      0.00     93.71
09:40:01 AM       all      3.15      0.00      3.13      0.00      0.00     93.72
09:50:01 AM       all      3.15      0.00      3.13      0.00      0.00     93.72

03:00:01 PM       all      3.99      0.00      4.39      0.00      0.00     91.62
03:10:01 PM       all      4.04      0.00      4.32      0.00      0.00     91.64
03:20:01 PM       all      4.47      0.00      4.87      0.00      0.00     90.66
03:30:01 PM       all      4.98      0.00      5.47      0.00      0.00     89.55
03:40:01 PM       all      4.99      0.00      5.45      0.00      0.00     89.56
03:50:01 PM       all      4.99      0.00      5.45      0.00      0.00     89.56
04:00:01 PM       all      4.99      0.00      5.45      0.00      0.00     89.56
04:10:01 PM       all      4.94      0.00      5.50      0.00      0.00     89.56
04:20:01 PM       all      4.82      0.00      5.62      0.00      0.00     89.56


10:20:01 PM       all     10.72      0.00     12.22      0.00      0.00     77.06
10:30:01 PM       all     10.73      0.00     12.21      0.00      0.00     77.06
10:40:01 PM       all     10.69      0.00     12.25      0.00      0.00     77.06
10:50:01 PM       all     10.67      0.00     12.27      0.00      0.00     77.06
11:00:01 PM       all     10.52      0.00     12.41      0.00      0.00     77.06
11:10:01 PM       all     10.42      0.00     12.52      0.00      0.00     77.06
11:20:01 PM       all     10.38      0.00     12.56      0.00      0.00     77.06


sar -f /var/log/sa/sar12   ---- sep 12

12:00:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM       all     11.23      0.00     11.72      0.10      0.00     76.95
12:20:02 AM       all     11.29      0.00     11.65      0.10      0.00     76.96
12:30:01 AM       all     11.30      0.00     11.63      0.09      0.00     76.96
12:40:01 AM       all     11.71      0.00     12.36      0.10      0.00     75.83
12:50:01 AM       all     12.14      0.00     12.88      0.10      0.00     74.87
01:00:01 AM       all     12.08      0.00     12.94      0.10      0.00     74.88
01:10:01 AM       all     12.24      0.00     12.79      0.10      0.00     74.88
01:20:01 AM       all     12.25      0.00     12.77      0.10      0.00     74.88

10:00:01 AM       all     12.08      0.00     37.94      0.10      0.00     49.88
10:10:01 AM       all     12.07      0.00     37.95      0.09      0.00     49.88
10:20:01 AM       all     12.09      0.00     37.94      0.11      0.00     49.87
10:30:01 AM       all     12.10      0.00     37.93      0.10      0.00     49.88
10:40:01 AM       all     12.02      0.00     38.00      0.11      0.00     49.87
10:50:01 AM       all     12.06      0.00     37.96      0.10      0.00     49.88
11:00:01 AM       all     12.37      0.00     37.65      0.10      0.00     49.88
11:10:01 AM       all     12.32      0.00     37.70      0.10      0.00     49.88
11:20:01 AM       all     12.50      0.00     37.52      0.10      0.00     49.88
11:30:01 AM       all     12.59      0.00     37.43      0.10      0.00     49.88
11:40:01 AM       all     12.77      0.00     37.25      0.11      0.00     49.87

01:50:01 PM       all     13.41      0.00     40.77      0.10      0.00     45.71
02:00:01 PM       all     13.41      0.00     40.78      0.11      0.00     45.71
02:10:01 PM       all     13.41      0.00     40.78      0.10      0.00     45.71
02:20:01 PM       all     13.17      0.00     63.22      0.05      0.00     23.55
02:30:02 PM       all     13.21      0.00     86.78      0.00      0.00      0.00
02:40:01 PM       all     13.29      0.00     86.71      0.00      0.00      0.00
02:50:01 PM       all     13.42      0.00     86.58      0.00      0.00      0.00
03:00:01 PM       all     13.58      0.00     86.42      0.00      0.00      0.00
03:10:01 PM       all     13.38      0.00     86.62      0.00      0.00      0.00
03:20:01 PM       all     13.25      0.00     86.75      0.00      0.00      0.00
03:30:01 PM       all     13.19      0.00     86.80      0.00      0.00      0.00
03:40:01 PM       all     12.95      0.00     86.25      0.00      0.00      0.79

假想:sqluldr2 与 Selinux的冲突导致了oracle so文件的异常,表现为segfault–listener的异常关闭,进而系统缓慢的”死亡” (建议不要在生产系统直接使用sqluldr2而使用TNS远程连接)

How to control CF calculation

May 14, 2013 Internals, oracle No comments

根据Richard Foote的文章,做了如下验证:

RefClustering Factor Calculation Improvement and provide some control over DBMS_STATS index clustering factor computation

[oracle@db-42 ~]$ unzip p13262857_112030_Generic.zip 
[oracle@db-42 ~]$ cd 13262857/
[oracle@db-42 13262857]$ ora si 

SQL*Plus: Release 11.2.0.3.0 Production on Tue May 14 10:44:32 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options


[oracle@db-42 13262857]$ /data/app1/oracle/product/11.2.0/db_1/OPatch/opatch apply
Invoking OPatch 11.2.0.1.7

Oracle Interim Patch Installer version 11.2.0.1.7
Copyright (c) 2011, Oracle Corporation.  All rights reserved.


Oracle Home       : /data/app1/oracle/product/11.2.0/db_1
Central Inventory : /data/app/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.7
OUI version       : 11.2.0.3.0
Log file location : /data/app1/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2013-05-14_10-46-13AM.log

Applying interim patch '13262857' to OH '/data/app1/oracle/product/11.2.0/db_1'
Verifying environment and performing prerequisite checks...

Do you want to proceed? [y|n]
yes
User Responded with: Y
All checks passed.
Backing up files...

Patching component oracle.rdbms.dbscripts, 11.2.0.3.0...
Patch 13262857 successfully applied
Log file location: /data/app1/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2013-05-14_10-46-13AM.log

OPatch succeeded.

[oracle@db-42 13262857]$ ora si 

SQL*Plus: Release 11.2.0.3.0 Production on Tue May 14 10:46:25 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup upgrade;
ORACLE instance started.

Total System Global Area 8017100800 bytes
Fixed Size		    2244192 bytes
Variable Size		 1828716960 bytes
Database Buffers	 6157238272 bytes
Redo Buffers		   28901376 bytes
Database mounted.
Database opened.
SQL> 
SQL> @?/rdbms/admin/dbmsstat.sql

Package created.

No errors.

Synonym created.


Grant succeeded.


Role created.


Grant succeeded.


Grant succeeded.


Library created.

SQL> 
SQL> @?/rdbms/admin/prvtstat.plb

Package body created.

No errors.
SQL> 
SQL> @?/rdbms/admin/execstat.sql

PL/SQL procedure successfully completed.

SQL> 
SQL> 
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> 
SQL> 
SQL> startup
ORACLE instance started.

Total System Global Area 8017100800 bytes
Fixed Size		    2244192 bytes
Variable Size		 1828716960 bytes
Database Buffers	 6157238272 bytes
Redo Buffers		   28901376 bytes
Database mounted.
Database opened.

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',pname=>'TABLE_CACHED_BLOCKS', pvalue=>42);

PL/SQL procedure successfully completed.


SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i
  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';  2  


TABLE_NAME		       INDEX_NAME			  BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
BOWIE			       BOWIE_ID_I			    1126     300000		 1035

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.


SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i
  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';  2  

TABLE_NAME		       INDEX_NAME			  BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
BOWIE			       BOWIE_ID_I			    1126     300000		 1035

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',pname=>'TABLE_CACHED_BLOCKS', pvalue=>1);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i
  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';  2  

TABLE_NAME		       INDEX_NAME			  BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
BOWIE			       BOWIE_ID_I			    1126     300000	       193228

eg: 重复一遍这个过程

SQL> truncate table bowie;

Table truncated.

SQL> exec bowie_proc 

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',pname=>'TABLE_CACHED_BLOCKS', pvalue=>1);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> set autot off;
SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i
  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';  2  

TABLE_NAME		       INDEX_NAME			  BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
BOWIE			       BOWIE_ID_I			    1126     300000	       357685


SQL> set autot trace ;

SQL> select * from bowie where id between 42 and 429;

388 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation	  | Name  | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	  |   389 |  7780 |   310   (1)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| BOWIE |   389 |  7780 |   310   (1)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"<=429 AND "ID">=42)


Statistics
----------------------------------------------------------
	  9  recursive calls
	  1  db block gets
 1532  consistent gets
	  0  physical reads
	  0  redo size
 8659  bytes sent via SQL*Net to client
	799  bytes received via SQL*Net from client
	 27  SQL*Net roundtrips to/from client
	  4  sorts (memory)
	  0  sorts (disk)
	388  rows processed

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.


SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i
  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';  2  

TABLE_NAME		       INDEX_NAME			  BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
BOWIE			       BOWIE_ID_I			    1126     300000		 1396

SQL> set autot trace ;
SQL> select * from bowie where id between 42 and 429;

Execution Plan
----------------------------------------------------------
Plan hash value: 3472402785

------------------------------------------------------------------------------------------
| Id  | Operation		    | Name	 | Rows  | Bytes | Cost (%CPU)| Time	 |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |		 |   389 |  7780 |     7   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE	 |   389 |  7780 |     7   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN	    | BOWIE_ID_I |   519 |	 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	 57  consistent gets
	  0  physical reads
	  0  redo size
14532  bytes sent via SQL*Net to client
	799  bytes received via SQL*Net from client
	 27  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	388  rows processed

Block split and entry reorder

December 12, 2012 Internals, oracle No comments

今天跟研发的同学讨论到了这个问题,index block split过程中涉及到的重新排序问题.

SQL> create table idx_split (a number, b varchar2(1446), c date);   
 
Table created
 

SQL> create index idx_split_idx on idx_split (a, b) tablespace idx_2k pctfree 10;   
 
Index created
 

 
begin
    for i in 1..1000
    loop
        insert into tx_index_contention (a, b, c) values (i, lpad('A', 10, 'A'), sysdate);
    end loop;
 end;
 /

PL/SQL procedure successfully completed
 
SQL> commit;
 
Commit complete
 

SQL> SELECT height,blocks,lf_blks,br_blks,btree_space,used_space FROM INDEX_STATS WHERE NAME='IDX_SPLIT_IDX';
 
    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE USED_SPACE
---------- ---------- ---------- ---------- ----------- ----------
         1          8          1          0        7996       2500
 
SQL> ANALYZE INDEX idx_split_idx VALIDATE STRUCTURE;
 
Index analyzed
 

 SQL> SELECT height,blocks,lf_blks,br_blks,btree_space,used_space FROM INDEX_STATS WHERE NAME='IDX_SPLIT_IDX';
 
    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE USED_SPACE
---------- ---------- ---------- ------------------- ----------
         2          8          4          1       40012      25924
 
SQL> select extent_id,file_id,block_id,blocks from dba_extents where segment_name='IDX_SPLIT_IDX' AND OWNER='SYS';
 
 EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
---------- ---------- ---------- ----------
         0          6        128          8

插入1000条数据,leaf blocks=4 branch block=1

SQL> alter session set events '10224 trace name context forever,level 1';   
 
Session altered
 
SQL>  SELECT height,blocks,lf_blks,br_blks,btree_space,used_space FROM INDEX_STATS WHERE NAME='IDX_SPLIT_IDX' ;
 
    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE USED_SPACE
---------- ---------- ---------- ---------- ----------- ----------
         2          8          4          1       40012      25924
 
SQL> insert into idx_split (a, b, c) values (1, lpad('A', 10, 'A'), sysdate);
 
1 row inserted
 
SQL> commit;
 
Commit complete
 
SQL> 
SQL> select sysdate from dual;
 
SYSDATE
-----------
2012/12/12
------------------------------------------------------------
*** 2012-12-12 13:44:52.311
*** SESSION ID:(668.2355) 2012-12-12 13:44:52.311
*** CLIENT ID:() 2012-12-12 13:44:52.311
*** SERVICE NAME:(yhdtest) 2012-12-12 13:44:52.311
*** MODULE NAME:(PL/SQL Developer) 2012-12-12 13:44:52.311
*** ACTION NAME:(Command Window - New) 2012-12-12 13:44:52.311
 
splitting leaf,dba 0x01800087,time 13:44:52.310
kdisnew_bseg_srch_cbk reject block -mark full,dba 0x01800086,time 13:44:52.311
kdisnew_bseg_srch_cbk rejecting block ,dba 0x01800086,time 13:44:52.311
kdisnew_bseg_srch_cbk reject block -mark full,dba 0x01800085,time 13:44:52.311
kdisnew_bseg_srch_cbk rejecting block ,dba 0x01800085,time 13:44:52.311
kdisnew_bseg_srch_cbk using block,dba 0x0180008f,time 13:44:52.313

---------------------------------------------------------------

leaf,dba 0x01800087 发生了分裂 (第一个leaf block)

dump 第一个branch block

SQL> alter system dump datafile 6 block 131;
 
System altered

----------------------------------------------------------------
Block header dump:  0x01800083
 Object id on Block? Y
 seg/obj: 0x12ef4  csc: 0x00.17756d  itc: 1  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x1800080 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.01d.00000389  0x00c003a5.00a7.02  C---    0  scn 0x0000.0017756d
Branch block dump
=================
header address 140678466630220=0x7ff24207f24c
kdxcolev 1
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 3
kdxcosdc 1
kdxconro 3
kdxcofbo 34=0x22
kdxcofeo 8029=0x1f5d
kdxcoavs 7995
kdxbrlmc 25165959=0x1800087 -----第一个leaf block
kdxbrsno 2
kdxbrbksz 8056
kdxbr2urrc 0
row#0[8047] dba: 25165956=0x1800084  -----第二个
col 0; len 3; (3):  c2 04 0d
col 1; TERM
row#1[8038] dba: 25165957=0x1800085  ..3
col 0; len 3; (3):  c2 07 14
col 1; TERM
row#2[8029] dba: 25165958=0x1800086  ..4
col 0; len 3; (3):  c2 0a 1b
col 1; TERM
----- end of branch block dump -----
End dump data blocks tsn: 7 file#: 6 minblk 131 maxblk 131

-------------------------------------------------------------------
SQL> variable file# number
variable block# number
execute :file#:=dbms_utility.data_block_address_file(to_number('1800087','xxxxxxx'));
execute :block#:=dbms_utility.data_block_address_block(to_number('1800087','xxxxxxx'));SQL> SQL> 
PL/SQL procedure successfully completed.


PL/SQL procedure successfully completed.

SQL> 
SQL> 
SQL> 
SQL> print file#;

     FILE#
----------
	 6

SQL> print block#;

    BLOCK#
----------
       135

 
SQL> alter system dump datafile 6 block 135;
 
System altered
 
SQL> alter system dump datafile 6 block 132;
 
System altered


SQL> select  display_raw('c2040d','NUMBER') from dual;

DISPLAY_RAW('C2040D','NUMBER')
----------------------------------------------------------------------------------------------------
312

第二个leaf block 的min value = 312

Block header dump:  0x01800084
 Object id on Block? Y
 seg/obj: 0x12ef4  csc: 0x00.177610  itc: 2  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x1800080 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.01e.00000389  0x00c003a7.00a7.02  --U-    1  fsc 0x0000.00177611
0x02   0x000a.006.00000756  0x00c00768.0172.19  C---    0  scn 0x0000.00177571
Leaf block dump
===============
header address 140678466630244=0x7ff24207f264
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 1
kdxcoopc 0x89: opcode=9: iot flags=--- is converted=Y
kdxconco 3
kdxcosdc 2
kdxconro 307
kdxcofbo 650=0x28a
kdxcofeo 667=0x29b
kdxcoavs 17
kdxlespl 0
kdxlende 0
kdxlenxt 25165957=0x1800085
kdxleprv 25165967=0x180008f
kdxledsz 0
kdxlebksz 8032
row#0[667] flag: ------, lock: 0, len=24
col 0; len 3; (3):  c2 04 0d          ------------>312
col 1; len 10; (10):  41 41 41 41 41 41 41 41 41 41
col 2; len 6; (6):  00 41 74 8a 00 30
row#1[691] flag: ------, lock: 0, len=24
.....

第一个leaf block 的range <312

Block header dump:  0x01800087
 Object id on Block? Y
 seg/obj: 0x12ef4  csc: 0x00.177576  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1800080 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.01e.00000389  0x00c003a6.00a7.01  -B–    1  fsc 0x0000.00000000
0x02   0x000a.010.00000758  0x00c00770.0172.28  –U-    1  fsc 0x0000.00177649
Leaf block dump
===============
header address 140678466630244=0x7ff24207f264
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 1
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 3
kdxcosdc 2
kdxconro 163
kdxcofbo 362=0x16a
kdxcofeo 4221=0x107d
kdxcoavs 3859
kdxlespl 0
kdxlende 0
kdxlenxt 25165967=0x180008f
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8032
row#0[4221] flag: ——, lock: 0, len=23
col 0; len 2; (2):  c1 02
col 1; len 10; (10):  41 41 41 41 41 41 41 41 41 41
col 2; len 6; (6):  00 41 74 89 00 00
row#1[4244] flag: ——, lock: 2, len=23
col 0; len 2; (2):  c1 02
col 1; len 10; (10):  41 41 41 41 41 41 41 41 41 41
……

row#0,row#1 的偏移量发生了重组,[4221]->[4244] 按照len=23递增 即物理address 发生了排序
row#0,row#1 被reorder为1,1,2,x 而row#1对应的rowid为max block address

SQL> SELECT height,blocks,lf_blks,br_blks,btree_space,used_space FROM INDEX_STATS WHERE NAME='IDX_SPLIT_IDX';
 
    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE USED_SPACE
---------- ---------- ---------- ---------- ----------- ----------
         2          8          4          1       40012      25924
 
SQL>  ANALYZE INDEX idx_split_idx VALIDATE STRUCTURE;
 
Index analyzed
 
SQL> 
SQL>  SELECT height,blocks,lf_blks,br_blks,btree_space,used_space FROM INDEX_STATS WHERE NAME='IDX_SPLIT_IDX'

  2  ;
 
    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE USED_SPACE
---------- ---------- ---------- ---------- ----------- ----------
         2         16          5          1       48008      25960

leaf block 从4增加到5,第一个leaf block 发生了分裂,具体过程为:

1. Allocate new index block from index freelist
2. Redistribute block so the lower half (by volume) of index entries remain in current block and move the other half into the new block
3. Insert the new index entry into appropriate leaf block
4. Update the previously full block such that its “next leaf block pointer” (kdxlenxt) references the new block
5. Update the leaf block that was the right of the previously full block such that its “previous leaf block pointer”(kdxleprv) also points to the new block
6. Update the branch block that references the full block and add a new entry to point to the new leaf block (effectively the lowest value in the new leaf block)
7. –add this one , split leaf block will reorder by value this is a behavior like rebuild.

Scripts: dbms_backup_restore modify dbid and db_name

December 5, 2012 Internals, oracle 1 comment

Scripts:dbms_backup_restore_dbid

eg:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> @dbid
Database closed.
Database dismounted.
ORACLE instance shut down.
ORACLE instance started.

Total System Global Area 2.0176E+10 bytes
Fixed Size		    2237048 bytes
Variable Size		 2483031432 bytes
Database Buffers	 1.7650E+10 bytes
Redo Buffers		   41488384 bytes
Database mounted.
Database opened.

PL/SQL procedure successfully completed.


OLD_NAME
--------------------------------------------------------------------------------
LOUIS

Enter the new Database Name:louisliu
Enter the new Database ID:100


PL/SQL procedure successfully completed.


PL/SQL procedure successfully completed.

Convert LOUIS(1153888553) to louisliu(100)

PL/SQL procedure successfully completed.

ControlFile:
=> Change Name:1
=> Change DBID:1
DataFile: /data/oracle/oradata/louis/system01.dbf
=> Skipped:0
=> Change Name:1
=> Change DBID:1
DataFile: /data/oracle/oradata/louis/sysaux01.dbf
=> Skipped:0
=> Change Name:1
=> Change DBID:1
DataFile: /data/oracle/oradata/louis/undotbs01.dbf
=> Skipped:0
=> Change Name:1
=> Change DBID:1
DataFile: /data/oracle/oradata/louis/users01.dbf
=> Skipped:0
=> Change Name:1
=> Change DBID:1
DataFile: /data/oracle/oradata/louis/temp01.dbf
=> Skipped:0
=> Change Name:1
=> Change DBID:1

PL/SQL procedure successfully completed.


[oracle@db-2-16 dbs]$ cat initlouis.ora | sed "s/db_name='louis'/db_name='louisliu'/" > initlouisliu.ora
[oracle@db-2-16 dbs]$ export ORACLE_SID=louisliu
[oracle@db-2-16 dbs]$ ora si 

SQL*Plus: Release 11.2.0.3.0 Production on Wed Dec 5 00:17:20 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 2.0176E+10 bytes
Fixed Size		    2237048 bytes
Variable Size		 2483031432 bytes
Database Buffers	 1.7650E+10 bytes
Redo Buffers		   41488384 bytes
Database mounted.
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open


SQL> alter database openresetlogs;
alter database openresetlogs
                           *
ERROR at line 1:
ORA-02231: missing or invalid option to ALTER DATABASE


SQL> alter database open resetlogs;

Database altered.


SQL> select dbid ,name from v$database;

      DBID NAME
---------- ---------------------------
       100 louisliu

SQL> alter system switch logfile;

System altered.

Oracle Bind Graduation 测试

November 28, 2012 Internals, oracle No comments

最近team 讨论 Bind Graduation 比较激烈,详细可以见上一篇文章,基于Bind Graduation 做了以下测试:

主要目的:

测试基于OCI JDBC 等接口的 Bind Graduation行为.针对目前Bind Graduation的行为,以及11.2.0.3出现的purge问题,由于bind graduation导致的child cursor过多问题,暂时没有好的solution(_cursor_obsolete_threshold ?).
建议对问题语句涉及到的表做水平拆分。

测试版本11.2.0.3

[oracle@testdb ~]$ ora si 

SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 28 16:39:30 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> 
SQL> show user
USER is "SYS"
SQL> alter system flush shared_pool;

System altered.

1. sqlplus OCI

SQL> VARIABLE n NUMBER
VARIABLE v VARCHAR2(32)

EXECUTE :n := 1; :v := 'Helicon';

INSERT INTO t (n, v) VALUES (:n, :v);SQL> SQL> SQL> 
PL/SQL procedure successfully completed.

SQL> SQL> 

1 row created.

SQL> commit;

Commit complete.

SQL>  SELECT sql_id, child_number, executions
FROM v$sql
WHERE sql_text = 'INSERT INTO t (n, v) VALUES (:n, :v)';   2    3  

SQL_ID	      CHILD_NUMBER EXECUTIONS
------------- ------------ ----------
6cvmu7dwnvxwj		 0	    1


SQL> VARIABLE v VARCHAR2(33)

EXECUTE :n := 4; :v := 'Terminus';

INSERT INTO t (n, v) VALUES (:n, :v);SQL> SQL> 
PL/SQL procedure successfully completed.

SQL> SQL> 

1 row created.

SQL> commit;

Commit complete.


SQL> VARIABLE v VARCHAR2(129)

EXECUTE :n := 4; :v := 'Terminus';

INSERT INTO t (n, v) VALUES (:n, :v);SQL> SQL> 
PL/SQL procedure successfully completed.

SQL> SQL> 

1 row created.

SQL> commit;

Commit complete.

SQL> /

SQL_ID	      CHILD_NUMBER EXECUTIONS
------------- ------------ ----------
6cvmu7dwnvxwj		 0	    2
6cvmu7dwnvxwj		 1	    1                  -----------------129 产生第一个child cursor

SQL> SELECT s.child_number, m.position, m.max_length, 
       decode(m.datatype,1,'VARCHAR2',2,'NUMBER',m.datatype) aS datatype
FROM v$sql s, v$sql_bind_metadata m
WHERE s.sql_id = '&sql_id'
AND s.child_address = m.address
ORDER BY 1, 2;  2    3    4    5    6  
Enter value for sql_id: 6cvmu7dwnvxwj
old   4: WHERE s.sql_id = '&sql_id'
new   4: WHERE s.sql_id = '6cvmu7dwnvxwj'

CHILD_NUMBER   POSITION MAX_LENGTH DATATYPE
------------ ---------- ---------- ----------------------------------------
	   0	      1 	22 NUMBER
	   0	      2        128 VARCHAR2
	   1	      1 	22 NUMBER
	   1	      2       2000 VARCHAR2     
	   
丢失了 32这个区间

2 OCI pl/sqldeveloper 操作

SQL> VARIABLE n NUMBER
SQL> VARIABLE v VARCHAR2(32)
SQL> EXECUTE :n := 1; :v := 'Helicon';
 
PL/SQL procedure successfully completed
n
---------
1
v
---------
Helicon
SQL> INSERT INTO t (n, v) VALUES (:n, :v);
 
1 row inserted
n
---------
1
v
---------
Helicon
 
SQL> commit;
 
Commit complete
 
SQL>  SELECT sql_id,child_number, executions
FROM v$sql
WHERE sql_text = ' INSERT INTO t (n, v) VALUES (:n, :v) ';  2    3  

SQL_ID	      CHILD_NUMBER EXECUTIONS
------------- ------------ ----------
fp1vwg5jfpk4b		 0	    1


SQL> VARIABLE v VARCHAR2(33)
SQL> EXECUTE :n := 4; :v := 'Terminus';
 
PL/SQL procedure successfully completed
n
---------
4
v
---------
Terminus
SQL> INSERT INTO t (n, v) VALUES (:n, :v);
 
1 row inserted
n
---------
4
v
---------
Terminus
 
SQL> commit;
 
Commit complete


SQL> /

SQL_ID	      CHILD_NUMBER EXECUTIONS
------------- ------------ ----------
fp1vwg5jfpk4b		 0	    2

SQL> VARIABLE v VARCHAR2(129)
SQL> EXECUTE :n := 4; :v := 'Terminus';
 
PL/SQL procedure successfully completed
n
---------
4
v
---------
Terminus
SQL> INSERT INTO t (n, v) VALUES (:n, :v);
 
1 row inserted
n
---------
4
v
---------
Terminus
 
SQL> 
SQL> commit;
 
Commit complete


SQL> /

SQL_ID	      CHILD_NUMBER EXECUTIONS
------------- ------------ ----------
fp1vwg5jfpk4b		 0	    3                  -------------------没有产生child cursor


SQL> SELECT s.child_number, m.position, m.max_length, 
       decode(m.datatype,1,'VARCHAR2',2,'NUMBER',m.datatype) aS datatype
FROM v$sql s, v$sql_bind_metadata m
WHERE s.sql_id = '&sql_id'
AND s.child_address = m.address
ORDER BY 1, 2;    2    3    4    5    6  
Enter value for sql_id: fp1vwg5jfpk4b
old   4: WHERE s.sql_id = '&sql_id'
new   4: WHERE s.sql_id = 'fp1vwg5jfpk4b'

CHILD_NUMBER   POSITION MAX_LENGTH DATATYPE
------------ ---------- ---------- ----------------------------------------
	   0	      1 	22 NUMBER
	   0	      2       4000 VARCHAR2

默认产生了4000的max值区间。

这个测试不具备任何意义,经过确认pl/sql developer经过了封装,导致oracle 端默认为4000的max区间。

3. OCI JAVA -(模拟真实环境)

代码如下:

        oracle_conn = DriverManager.getConnection("jdbc:oracle:oci:@xxx", "xxx", "xxx");
             
            oracle_stmt = oracle_conn.prepareStatement("insert into t values(?, ?)"); 
     
         oracle_stmt.setInt(1, 1);  
         oracle_stmt.setString(2, "Helicon"); 
         oracle_stmt.execute();
        
         oracle_stmt.setInt(1, 2);  
         oracle_stmt.setString(2, "Helicon33333333333333333333333333333"); 
         oracle_stmt.execute();
        
         oracle_stmt.setInt(1, 3);  
         oracle_stmt.setString(2, "Helicon3333333333333333333333333333323sssssssssssssssssssssssssssssssssssssssssssssssssssssssddddddddddddddddddddddddddddddddddssssssssssssssssssssssss"); 
         oracle_stmt.execute();
SQL> SELECT s.child_number, m.position, m.max_length,
  2         decode(m.datatype,1,'VARCHAR2',2,'NUMBER',m.datatype) aS datatype
  3  FROM v$sql s, v$sql_bind_metadata m
  4  WHERE s.sql_id = 'dw481sdb5fkkt'
  5  AND s.child_address = m.address
  6  ORDER BY 1, 2;
 
CHILD_NUMBER   POSITION MAX_LENGTH DATATYPE
------------ ---------- ---------- ----------------------------------------
           0          1         22 NUMBER
           0          2         32 VARCHAR2                  
           1          1         22 NUMBER
           1          2       2000 VARCHAR2                
 

    	SQL_TEXT 	SQL_ID 	EXECUTIONS 	LOADS 	FIRST_LOAD_TIME
 	insert into t values(:1, :2) 	dw481sdb5fkkt 	1 	1 	2012-11-28/20:30:05
	insert into t values(:1, :2) 	dw481sdb5fkkt 	2 	1 	2012-11-28/20:30:05

产生了32,2000的区间, 但是缺少了128的区间。

4 JDBC JAVA (目前使用的场景)

代码如下:

package com;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import oracle.jdbc.driver.*;


public class test {  
  
    public static void main(String[] args) {  
        Connection oracle_conn = null;  
        PreparedStatement oracle_stmt = null;  
        ResultSet oracle_rs = null;  

                  
        try {  
            Class.forName("oracle.jdbc.driver.OracleDriver");  
            oracle.jdbc.driver.OracleDriver a;
            
            oracle_conn = DriverManager.getConnection("jdbc:oracle:thin:@xx.xx.xx.xx:xx:yy", "xx", "xx");  
              
            oracle_stmt = oracle_conn.prepareStatement("insert into t values(?, ?)");  
      
	        oracle_stmt.setInt(1, 1);   
	        oracle_stmt.setString(2, "Helicon");  
	        oracle_stmt.execute();
	        
	        oracle_stmt.setInt(1, 2);   
	        oracle_stmt.setString(2, "Helicon33333333333333333333333333333");  
	        oracle_stmt.execute();
	        
	        oracle_stmt.setInt(1, 3);   
	        oracle_stmt.setString(2, "Helicon3333333333333333333333333333323sssssssssssssssssssssssssssssssssssssssssssssssssssssssddddddddddddddddddddddddddddddddddssssssssssssssssssssssss");  
	        oracle_stmt.execute();		        

        } catch (ClassNotFoundException e) {  
            e.printStackTrace();  
        } catch (SQLException e) {  
            e.printStackTrace();  
        } finally {  
            try {  
                if(oracle_rs != null) {  
                    oracle_rs.close();  
                    oracle_rs = null;  
                }  
                  
                if(oracle_stmt != null) {  
                    oracle_stmt.close();  
                    oracle_stmt = null;  
                }  
                  
                if(oracle_conn != null) {  
                    oracle_conn.close();  
                    oracle_conn = null;  
                }    
            } catch (SQLException e) {  
                e.printStackTrace();  
            }  
        }  
    }  
  
}  
SQL> SELECT s.child_number, m.position, m.max_length,
  2         decode(m.datatype,1,'VARCHAR2',2,'NUMBER',m.datatype) aS datatype
  3  FROM v$sql s, v$sql_bind_metadata m
  4  WHERE s.sql_id = 'fw60v89km14c9'
  5  AND s.child_address = m.address
  6  ORDER BY 1, 2;
  
  


CHILD_NUMBER   POSITION MAX_LENGTH DATATYPE
------------ ---------- ---------- ----------------------------------------
           0          1         22 NUMBER
           0          2         32 VARCHAR2                  
           1          1         22 NUMBER
           1          2        128 VARCHAR2  
           2          1         22 NUMBER
           2          2        2000 VARCHAR2
           
           
    	SQL_TEXT 	SQL_ID 	EXECUTIONS 	LOADS 	FIRST_LOAD_TIME
 	insert into t values(:1, :2) 	fw60v89km14c9 	1 	1 	2012-11-28/16:16:46
	insert into t values(:1, :2) 	fw60v89km14c9 	2 	1 	2012-11-28/16:16:46
	insert into t values(:1, :2) 	fw60v89km14c9 	3 	1 	2012-11-28/16:16:46

JDBC 行为正常 32 128 2000的区间符合默认行为。

总结:

1. pl/sql developer 测试不具备任何价值。(各位同学也不要基于这个去测试了)

2. OCI sqlplus 缺少32区间, JAVA缺少 128区间 这个问题比较疑惑。

3. JDBC 目前正常

Bind Graduation oracle的本意是为了更详细的区分cursor,多次peeking 达到最佳的执行计划。但是对于一些设计很烂的表,将会出现child cursor暴增的可能

Reference:Oracle Call Interface

Exchange outlines

September 14, 2012 Internals, oracle No comments

参见昨天Roger的文章,exchange outline可以同样实现这种需求

参考JL的文章:
Stored Outlines in Oracle 8
Stored Outlines in Oracle 9

测试环境:rhel 5.8 oracle 11.2.0.3 64bit

[root@db-2-15 ~]# su - oracle
-bash-3.2$ sqlplus  'liu/liu'

SQL*Plus: Release 11.2.0.3.0 Production on Fri Sep 14 10:28:51 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> drop table liu;

Table dropped.

SQL> create table liu (id int);

Table created.

SQL> begin
  2  for 
  3  i in 1..1000000 loop 
  4  insert into liu values (i);
  5  commit;
  6  end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> 
SQL> 
SQL>    
SQL> create index idx_test on liu (id);

Index created.

SQL> commit;

Commit complete.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'LIU',TABNAME=>'LIU', CASCADE=>TRUE);

PL/SQL procedure successfully completed.
SQL> EXPLAIN PLAN FOR SELECT * FROM  LIU WHERE ID =7000;

Explained.

SQL> select plan_table_output from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1128569081

-----------------------------------------------------------------------------
| Id  | Operation	 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |	    |	  1 |	  5 |	  3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IDX_TEST |	  1 |	  5 |	  3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

   1 - access("ID"=7000)

13 rows selected.

SQL> EXPLAIN PLAN FOR SELECT /*+full(a) */ * FROM  LIU a WHERE ID =7000;

Explained.

SQL> select plan_table_output from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3882242470

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |     1 |     5 |   459   (4)| 00:00:06 |
|*  1 |  TABLE ACCESS FULL| LIU  |     1 |     5 |   459   (4)| 00:00:06 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

   1 - filter("ID"=7000)

13 rows selected.

创建两个outline(nohint,withhint) -Category TEST

SQL> CREATE OR REPLACE OUTLINE nohint for  category test  ON SELECT * FROM  LIU a WHERE ID =7000;

Outline created.

SQL> create or replace outline withhint for category test on SELECT /*+full(a) */ * FROM  LIU a WHERE ID =7000;

Outline created.


NAME			       CATEGORY 		      USED		 SQL_TEXT
------------------------------ ------------------------------ ------------------ --------------------------------------------------------------------------------
NOHINT			       TEST			      UNUSED		 SELECT * FROM	LIU a WHERE ID =7000
WITHHINT		       TEST			      UNUSED		 SELECT /*+full(a) */ * FROM  LIU a WHERE ID =7000

SQL> alter session set use_stored_outlines = TEST;   

Session altered.

进入outln Schema 的outln相关表, 调换两个存储概要的实际内容.

SQL> update outln.ol$ ol1
set hintcount = (
        select  count(*) 
        from    outln.ol$hints ol2
        where   ol2.ol_name = ol1.ol_name
        )
where
	ol1.ol_name in ('NOHINT','WITHHINT')  2    3    4    5    6    7    8  
  9  ;

2 rows updated.

SQL> commit;

Commit complete.

SQL> UPDATE OUTLN.OL$HINTS
SET OL_NAME=DECODE(OL_NAME, 'NOHINT', 'WITHHINT','WITHHINT','NOHINT')
WHERE OL_NAME IN ('NOHINT','WITHHINT');  2    3  

12 rows updated.

SQL> commit;

Commit complete.

SQL> conn liu/liu
Connected.
SQL> alter session set use_stored_outlines = TEST; 

Session altered.

SQL> alter outline NOHINT  disable;
alter outline NOHINT  enable;
alter outline WITHHINT  disable;
alter outline WITHHINT  enable;
Outline altered.

SQL> 
Outline altered.

SQL> 
Outline altered.

SQL> 


Outline altered.


SQL> EXPLAIN PLAN FOR SELECT /*+full(a) */ * FROM  LIU a WHERE ID =7000;

Explained.

SQL> select plan_table_output from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1128569081

-----------------------------------------------------------------------------
| Id  | Operation	 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |	    |	  1 |	  5 |	  3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IDX_TEST |	  1 |	  5 |	  3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

   1 - access("ID"=7000)

Note
-----
   - outline "WITHHINT" used for this statement

17 rows selected.

SQL> EXPLAIN PLAN FOR SELECT  * FROM  LIU a WHERE ID =7000;

Explained.

SQL> select plan_table_output from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3882242470

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |     1 |     5 |   459   (4)| 00:00:06 |
|*  1 |  TABLE ACCESS FULL| LIU  |     1 |     5 |   459   (4)| 00:00:06 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

   1 - filter("ID"=7000)

Note
-----
   - outline "NOHINT" used for this statement

17 rows selected.

此时已经发生了交换

同理如果没有使用绑定变量同样可以交换,我们假设两个不同的值 这里使用Enkitec提供的脚本

SQL> CREATE OR REPLACE OUTLINE nohint1 for  category test  ON SELECT * FROM  LIU a WHERE ID =6000;
Outline created.

SQL> CREATE OR REPLACE OUTLINE withhint1 for  category test  ON SELECT /*+full(a) */ * FROM  LIU a WHERE ID =500;

Outline created.

SQL> alter session set use_stored_outlines = TEST;

SQL> @exchange_ol
Enter value for from_name: NOHINT1
old   6: fromname := '&from_name';
new   6: fromname := 'NOHINT1';
Enter value for to_name: WITHHINT1
old   7: toname := '&to_name';
new   7: toname := 'WITHHINT1';

PL/SQL procedure successfully completed.


Commit complete.

SQL> 
SQL> 
SQL> 
SQL> EXPLAIN PLAN FOR SELECT /*+full(a) */ * FROM  LIU a WHERE ID =500;

Explained.

SQL> select plan_table_output from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1128569081

-----------------------------------------------------------------------------
| Id  | Operation	 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |	    |	  1 |	  5 |	  3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IDX_TEST |	  1 |	  5 |	  3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

   1 - access("ID"=500)

Note
-----
   - outline "WITHHINT1" used for this statement

17 rows selected.

SQL> EXPLAIN PLAN FOR SELECT  * FROM  LIU a WHERE ID =6000;

Explained.

SQL> select plan_table_output from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3882242470

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |     1 |     5 |   459   (4)| 00:00:06 |
|*  1 |  TABLE ACCESS FULL| LIU  |     1 |     5 |   459   (4)| 00:00:06 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

   1 - filter("ID"=6000)

Note
-----
   - outline "NOHINT1" used for this statement

17 rows selected.

此时同样发生了交换