structure

How to create very big index

December 29, 2011 11g, maintain, oracle No comments

how to create very big index

我们总会碰到各种case 需要建立索引 有些索引大至1T,这些索引的创建是个非常棘手的问题,oracle 11g 在建立partition index时,给出了不错的解决方案


Let’s now re-create the index as a globally partitioned index, with one partition defined to contain all values of interest and another partition defined to contain the vast number of processed values. Initially, the index is created in an unusable state so no segments and no storage is allocated to any of the partitions

在创建partition index 时候使用unusable 可以不立刻分配storage和segment空间,从而解决了空间分配这个问题:

通过Richard Foote提供的例子,我们可以很好的看出效果:

SQL> create table bowie_stuff (id number, processed varchar2(10));

Table created.

SQL> insert into bowie_stuff select rownum, ‘YES’ from dual connect by level <= 1000000; 1000000 rows created. SQL> commit;

Commit complete.

SQL> update bowie_stuff set processed = ‘NO’ where id in (999990, 999992, 999994, 999996, 999998);

5 rows updated.

SQL> commit;

Commit complete.

SQL> create index bowie_stuff_i on bowie_stuff(processed) pctfree 0;

Index created.

SQL> select index_name, leaf_blocks from dba_indexes where index_name = ‘BOWIE_STUFF_I’;

INDEX_NAME LEAF_BLOCKS

—————————— ———–

BOWIE_STUFF_I 1877

SQL> select segment_name, blocks from dba_segments where segment_name = ‘BOWIE_STUFF_I’;

SEGMENT_NAME BLOCKS

——————– ———-

BOWIE_STUFF_I 1920

建立了一个normal index 分配了1920个块

——————————————————


SQL> drop index bowie_stuff_i;

Index dropped.

SQL> create index bowie_stuff_i on bowie_stuff(processed)

2 global partition by range (processed)

3 (partition not_processed_part values less than (‘YES’),

4 partition processed_part values less than (MAXVALUE))

5 unusable;

Index created.

SQL> alter index bowie_stuff_i rebuild partition not_processed_part;

Index altered.

SQL> select index_name, partition_name, leaf_blocks from dba_ind_partitions where index_name = ‘BOWIE_STUFF_I’;

INDEX_NAME PARTITION_NAME LEAF_BLOCKS

——————– ——————– ———–

BOWIE_STUFF_I PROCESSED_PART 0

BOWIE_STUFF_I NOT_PROCESSED_PART 1

SQL> select segment_name, partition_name, blocks from dba_segments where segment_name = ‘BOWIE_STUFF_I’;

SEGMENT_NAME PARTITION_NAME BLOCKS

——————– ——————– ———-

BOWIE_STUFF_I NOT_PROCESSED_PART 8

这里仅仅分配了 8个block

根据Richard Foote 的分析,在index create 完成之后alter index xx unusable 在10g,11g 也是截然不同的:

Starting with a simple Oracle 10g example, we create a table and associated index:

SQL> create table bowie as select rownum id, ‘BOWIE’ name from dual connect by level <= 1000000; Table created. SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’BOWIE’, cascade=> true, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1’);

PL/SQL procedure successfully completed.

If we now make the index unusable:

view sourceprint?

SQL> alter index bowie_id_i unusable;

Index altered.

SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = ‘BOWIE_ID_I’;

INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DRO

———- ———- ———– ———- ——– —

BOWIE_ID_I 2 2226 1000000 UNUSABLE NO

SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = ‘BOWIE_ID_I’;

SEGMENT_NAME BYTES BLOCKS EXTENTS

———— ———- ———- ———-

BOWIE_ID_I 18874368 2304 18

当index 置于unusable时候 blocks 依然存在

Using the same demo as before but running Oracle11g R2:

SQL> create table bowie as select rownum id, ‘BOWIE’ name from dual connect by level <= 1000000; Table created. SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’BOWIE’, cascade=> true, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1’);

PL/SQL procedure successfully completed.

SQL> alter index bowie_id_i unusable;

Index altered.

SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = ‘BOWIE_ID_I’;

INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DRO

———- ———- ———– ———- ——– —

BOWIE_ID_I 2 2226 1000000 UNUSABLE NO

SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = ‘BOWIE_ID_I’;

no rows selected

在11gR2中 alter index unusable 发现index 所占用的blocks 完全消失了

以下是分区表的测试:

SQL> CREATE TABLE big_album_sales(id number, album_id number, country_id number,

release_date date, total_sales number) PARTITION BY RANGE (release_date)

(PARTITION ALBUMS_2007 VALUES LESS THAN (TO_DATE(’01-JAN-2008′, ‘DD-MON-YYYY’)),

PARTITION ALBUMS_2008 VALUES LESS THAN (TO_DATE(’01-JAN-2009′, ‘DD-MON-YYYY’)),

PARTITION ALBUMS_2009 VALUES LESS THAN (TO_DATE(’01-JAN-2010′, ‘DD-MON-YYYY’)),

PARTITION ALBUMS_2010 VALUES LESS THAN (MAXVALUE));

Table created.

SQL> INSERT INTO big_album_sales SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2000), ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 1000000; 1000000 rows created. SQL> commit;

Commit complete.

We first create a Non-Partitioned Index:

SQL> CREATE INDEX big_album_tot_sales_i ON big_album_sales(total_sales);

Index created.

Next a Global Partitioned Index:

SQL> CREATE INDEX big_album_country_id_i ON big_album_sales(country_id)

GLOBAL PARTITION BY RANGE (country_id)

(PARTITION TS1 VALUES LESS THAN (26),

PARTITION TS2 VALUES LESS THAN (51),

PARTITION TS3 VALUES LESS THAN (76),

PARTITION TS4 VALUES LESS THAN (MAXVALUE));

Index created.

Finally, a Local Partitioned index:

SQL> CREATE INDEX big_album_album_id_i ON big_album_sales(album_id) local;

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=> ‘BOWIE’, tabname=> ‘BIG_ALBUM_SALES’, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

If we now split the last table partition, this will effectively make the:

1) Non-Partitioned Unusable
2) All partitions of the Global Partitioned index unusable
3) Just the last 2 partitions of the Local Partitioned Index unusable

SQL> ALTER TABLE big_album_sales SPLIT PARTITION ALBUMS_2010

AT (TO_DATE(’01-JAN-2011’, ‘DD-MON-YYYY’))

INTO (PARTITION ALBUMS_2010, PARTITION ALBUMS_2011);

Table altered.

SQL> select index_name, status from dba_indexes where table_name = ‘BIG_ALBUM_SALES’;

INDEX_NAME STATUS

———————— ——–

BIG_ALBUM_TOT_SALES_I UNUSABLE

BIG_ALBUM_COUNTRY_ID_I N/A

BIG_ALBUM_ALBUM_ID_I N/A

SQL> select index_name, partition_name, status, leaf_blocks from dba_ind_partitions where index_name like ‘BIG_ALBUM_%’;

INDEX_NAME PARTITION_NAME STATUS LEAF_BLOCKS

———————– ————– ——– ———–

BIG_ALBUM_ALBUM_ID_I ALBUMS_2007 USABLE 807

BIG_ALBUM_ALBUM_ID_I ALBUMS_2008 USABLE 381

BIG_ALBUM_ALBUM_ID_I ALBUMS_2009 USABLE 383

BIG_ALBUM_ALBUM_ID_I ALBUMS_2010 UNUSABLE

BIG_ALBUM_ALBUM_ID_I ALBUMS_2011 UNUSABLE

BIG_ALBUM_COUNTRY_ID_I TS1 UNUSABLE 629

BIG_ALBUM_COUNTRY_ID_I TS2 UNUSABLE 629

BIG_ALBUM_COUNTRY_ID_I TS3 UNUSABLE 629

BIG_ALBUM_COUNTRY_ID_I TS4 UNUSABLE 629

SQL> select segment_name, partition_name, bytes, blocks from dba_segments where segment_name like ‘BIG_ALBUM_%’ and segment_type like ‘INDEX%’;

SEGMENT_NAME PARTITION_NAME BYTES BLOCKS

——————— ————– ——– ——

BIG_ALBUM_ALBUM_ID_I ALBUMS_2007 7340032 896

BIG_ALBUM_ALBUM_ID_I ALBUMS_2008 3145728 384

BIG_ALBUM_ALBUM_ID_I ALBUMS_2009 4194304 512

BIG_ALBUM_TOT_SALES_I 23068672 2816

可以发现 BIG_ALBUM_ALBUM_ID_I 状态为UNUSABLE的 blocks已经消失,BIG_ALBUM_COUNTRY_ID_I 由于所有partition都变成了 UNUSABLE 所以整个segment占用的blocks都被取消
BIG_ALBUM_TOT_SALES_I由于是normal index blocks 依然存在,这跟上面的heap表测试出现了矛盾 heap table 在index unusable 时候 blocks 将被drop掉,而在分区表中blocks依然存在!

[转]Local Index Issue With Partitioned PK and Unique Key Constraints

December 27, 2011 maintain, oracle

看到一篇不错的文章 转载下


Nuno Souto (Noons) also asked a really interesting question on my Differences between Unique and Non-Unique Indexes blog entry (comment 4) that I thought it worthy of a separate blog entry to do the answer justice. The question was:

“Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.”

Simplistically, the answer to the first part is Yes it is still the case, even in 11g and the answer to the second part is No, it wouldn’t weigh heavily in favour of non-unique indexing for PK on a table requiring local index partitions. It wouldn’t actually be a consideration at all.

Let me explain why.

Firstly, there is a really really good reason why Oracle doesn’t allow us to create a Unique Index in which the Partition key is not part of a Local Index. It’s called protecting us from ourselves !!

Let’s start by mentioning constraints again.

Remember, the main reason we have indexes policing PK and Unique constraints is so that Oracle can very quickly and efficiently determine whether or not a new value already exists. Do a quick index look-up, is the value there, yes or no, allow the insert (or update), yes or no.

Just imagine for one moment what would happen if Oracle actually allowed us to create a Unique Local index in which the index didn’t include the partitioned column(s).

Lets say a table is Range Partitioned on column ’A’ and we try and create a Unique Local index on just column ‘B’. Let’s assume we have (say) 500 table partitions meaning we must therefore have 500 local index partitions as well. When we insert a new value for our unique index for value B, it will attempt to do so in the corresponding local index partition as governed by the value A for the new row. However Oracle can’t just check this one index partition for uniqueness to ensure value of column B doesn’t already exist, Oracle would need to check all 500 index partitions because it would be possible for our new value of column B to potentially have previously been inserted into any of the other 499 partitions !!

Each and every insert into our partitioned table (partitioned by column A) therefore would require Oracle to check all (say)500 index partitions each and every time to check for duplicates of column B. Again, it’s important to understand that any given value of column B could potentially be in any of the 500 partitions, IF Oracle allowed us to create a Local Partitioned Index just on column B.

Checking all 500 index partitions looking for a specific value of column B would obviously be impractical, inefficient and totally un-scalable. Therefore Oracle doesn’t allow us to do this. It doesn’t allow us to create a Local index in which the indexed columns does’t include the partitioning columns as well.

This is actually a good thing.

If you want to create a Unique index in a partitioned table, you MUST either add all the partitioned columns and make it part of the LOCAL unique index (so that way each and every insert would only have to check the one local partition as this value is known now it’s part of the index) or you must create it as a GLOBAL index (in which again, Oracle only has to check the one index structure).

It actually makes a lot of sense to do this.

Moving onto the second part of the question. Let’s just use a Local Non-Unique index to police our PK constraints then.

Fortunately this isn’t allowed either for exactly the same reasons. You can’t create a Local Non-unique index to police a PK (or Unique) constraint if the Constraint does not also include the partitioned columns. Otherwise again, Oracle would need to check each and every index partition to determine whether the constraint has been violated or not.

If you attempt to use an existing Local Non-Unique index to police a PK or Unique constraint that does not contain the partitioned columns, you will get an error saying it can’t create the (by default Global index) because the useless Local Non-Unique index (from a policing the constraint point of view) already exists.

Again if you want to create a Non-Unique index to police a PK or Unique constraint you must either ensure the constraint includes all the partitioned columns in which case it can be Local or you must use a Global Non-Unique index.

In other words, the rules apply equally to both Unique and Non-Unique indexes.

So it’s not really a case of Oracle not allowing one to create a Local Unique index without including the partitioned columns (although that’s of course true) but really a case of Oracle not allowing a PK or Unique *constraint* to be policed via *any* Local index (whether Unique or Non-Unique), unless the partitioned columns are also included.

reference:http://richardfoote.wordpress.com/2007/12/20/local-index-issue-with-partitioned-pk-and-unique-key-constraints/

BTREE索引浅解

December 22, 2011 maintain, oracle No comments

结构如下图所示:

对于分支节点块(包括根节点块)来说,其所包含的索引条目都是按照顺序排列的(缺省是升序排列,也可以在创建索引时指定为降序排列)。每个索引条目(也可以叫做每条记录)都具有两个字段。第一个字段表示当前该分支节点块下面所链接的索引块中所包含的最小键值;第二个字段为四个字节,表示所链接的索引块的地址,该地址指向下面一个索引块。在一个分支节点块中所能容纳的记录行数由数据块大小以及索引键值的长度决定。比如从上图一可以看到,对于根节点块来说,包含三条记录,分别为(0 B1)、(500 B2)、(1000 B3),它们指向三个分支节点块。其中的0、500和1000分别表示这三个分支节点块所链接的键值的最小值。而B1、B2和B3则表示所指向的三个分支节点块的地址。

从物理位置上来讲,index block 内部是没有顺序的,不重整的话,是按照进入先后顺序排的, 也就是我们通常讲 块间有序,块内无序。
对于叶子节点块来说,其所包含的索引条目与分支节点一样,都是按照顺序排列的(缺省是升序排列,也可以在创建索引时指定为降序排列)。每个索引条目(也可以叫做每条记录)也具有两个字段。第一个字段表示索引的键值,对于单列索引来说是一个值;而对于多列索引来说则是多个值组合在一起的。第二个字段表示键值所对应的记录行的ROWID,该ROWID是记录行在表里的物理地址。如果索引是创建在非分区表上或者索引是分区表上的本地索引的话,则该ROWID占用6个字节;如果索引是创建在分区表上的全局索引的话,则该ROWID占用10个字节。