11g new feature

11g nested loop faster than 10g

February 3, 2012 11g, oracle No comments

在pythian看到了一个测试,是否证明了oracle 11g is x% faster than 10g?(oracle官方的说法)

First, you have to create and fill a table to run your query :

create table gark
(id1 number not null,
id2 number not null,
id3 number not null);

begin
for i in 1..100000 loop
insert into gark(id1, id2, id3)
values (i, i, i);
end loop;
commit;
end;
/

create unique index gark_idx on gark(id1, id3);

begin
dbms_stats.gather_table_stats(
user,
‘GARK’,
cascade=>true,
estimate_percent=>100,
method_opt=>’FOR ALL COLUMNS SIZE 254′,
no_invalidate=> false);
end;
/
Then, run the query below on a 10g database :

set autotrace traceonly

select /*+ use_nl(A B) */count(a.id3)
from gark a, gark b
where a.id1=b.id2;

Execution Plan
—————————-
Plan hash value: 3137705415

——————————————————–
| Id | Operation | Name | Rows | Cost |
——————————————————–
| 0 | SELECT STATEMENT | | 1 | 100K |
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 100K| 100K |
| 3 | TABLE ACCESS FULL| GARK | 100K| 65 |
|* 4 | INDEX RANGE SCAN | GARK_IDX | 1 | 1 |
——————————————————–

Predicate Information (identified by operation id):
—————————————————

4 – access(“A”.”ID1″=”B”.”ID2″)

Statistics
———————————————————-
0 recursive calls
0 db block gets
100556 consistent gets
0 physical reads
0 redo size
415 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Run the same query on 11g (There is no trick here, the plan is the right plan, I’ve check it with dbms_xplan.display_cursor) :

set autotrace traceonly

select /*+ use_nl(A B) */count(a.id3)
from gark a, gark b
where a.id1=b.id2;

Execution Plan
———————————————————-
Plan hash value: 3137705415

——————————————————-
| Id | Operation | Name | Rows | Cost |
——————————————————-
| 0 | SELECT STATEMENT | | 1 | 100K |
| 1 | SORT AGGREGATE | | 1 | |
| 2 | NESTED LOOPS | | 100K| 100K |
| 3 | TABLE ACCESS FULL| GARK | 100K| 105 |
|* 4 | INDEX RANGE SCAN | GARK_IDX | 1 | 11 |
——————————————————-

Predicate Information (identified by operation id):
—————————————————

4 – access(“A”.”ID1″=”B”.”ID2″)

Statistics
———————————————————-
0 recursive calls
0 db block gets
3373 consistent gets
0 physical reads
0 redo size
422 bytes sent via SQL*Net to client
415 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
If you know how the NL algorithm works in 10g, it’s quite easy to understand how it has been modified. The good news with that is that it won’t only speed up queries with optimal plans but also speed up queries with sub optimal plans that are using Nested Loops.

这里所谓的算法是什么算法,我们通过查找oracle文档,发现11g nested loop 与 10g 的区别:


Original and New Implementation for Nested Loop Joins

Oracle Database 11g Release 1 (11.1) introduces a new implementation for nested loop joins. As a result, execution plans that include nested loops might appear different than they did in previous releases of Oracle Database. Both the new implementation and the original implementation for nested loop joins are possible in Oracle Database 11g Release 1 (11.1). So, when analyzing execution plans, it is important to understand that the number of NESTED LOOPS join row sources might be different.

Original Implementation for Nested Loop Joins

Consider the following query:

SELECT e.first_name, e.last_name, e.salary, d.department_name
FROM hr.employees e, hr.departments d
WHERE d.department_name IN (‘Marketing’, ‘Sales’)
AND e.department_id = d.department_id;
before Oracle Database 11g Release 1 (11.1), the execution plan for this query might appear similar to the following execution plan:

————————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————————————-
| 0 | SELECT STATEMENT | | 19 | 722 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 10 | 220 | 1 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 19 | 722 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | DEPARTMENTS | 2 | 32 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)| 00:00:01 |
————————————————————————————————-

Predicate Information (identified by operation id):
—————————————————
3 – filter(“D”.”DEPARTMENT_NAME”=’Marketing’ OR “D”.”DEPARTMENT_NAME”=’Sales’)
4 – access(“E”.”DEPARTMENT_ID”=”D”.”DEPARTMENT_ID”)
In this example, the outer side of the join consists of a scan of the hr.departments table that returns the rows that match the condition department_name IN (‘Marketing’, ‘Sales’). The inner loop retrieves the employees in the hr.employees table that are associated with those departments.

New Implementation for Nested Loop Joins
Oracle Database 11g Release 1 (11.1) introduces a new implementation for nested loop joins to reduce overall latency for physical I/O. When an index or a table block is not in the buffer cache and is needed to process the join, a physical I/O is required. In Oracle Database 11g Release 1 (11.1), Oracle Database can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time. As part of the new implementation for nested loop joins, two NESTED LOOPS join row sources might appear in the execution plan where only one would have appeared in prior releases. In such cases, Oracle Database allocates one NESTED LOOPS join row source to join the values from the table on the outer side of the join with the index on the inner side. A second row source is allocated to join the result of the first join, which includes the rowids stored in the index, with the table on the inner side of the join.

Consider the query in “Original Implementation for Nested Loop Joins”. In Oracle Database 11g Release 1 (11.1), with the new implementation for nested loop joins, the execution plan for this query might appear similar to the following execution plan:

————————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost(%CPU)| Time |
————————————————————————————————-
| 0 | SELECT STATEMENT | | 19 | 722 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 19 | 722 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | DEPARTMENTS | 2 | 32 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 10 | 220 | 1 (0)| 00:00:01 |
————————————————————————————————-

Predicate Information (identified by operation id):
—————————————————
3 – filter(“D”.”DEPARTMENT_NAME”=’Marketing’ OR “D”.”DEPARTMENT_NAME”=’Sales’)
4 – access(“E”.”DEPARTMENT_ID”=”D”.”DEPARTMENT_ID”)
In this case, the rows from the hr.departments table constitute the outer side of the first join. The inner side of the first join is the index emp_department_ix. The results of the first join constitute the outer side of the second join, which has the hr.employees table as its inner side.

There are cases where a second join row source is not allocated, and the execution plan looks the same as it did in prior releases. The following list describes such cases:

All of the columns needed from the inner side of the join are present in the index, and there is no table access required. In this case, Oracle Database allocates only one join row source.

The order of the rows returned might be different than it was in previous releases. Hence, when Oracle Database tries to preserve a specific ordering of the rows, for example to eliminate the need for an ORDER BY sort, Oracle Database might use the original implementation for nested loop joins.

The OPTIMIZER_FEATURES_ENABLE initialization parameter is set to a release before Oracle Database 11g Release 1 (11.1). In this case, Oracle Database uses the original implementation for nested loop joins.


在11g中,oracle通过得到的rowid 批量的从磁盘上得到数据,速度>>10g 时候nested loop 的行为。

再回到上面的测试:在11g中 oracle在取得数据时候采用了不同的方法(Oracle Database can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time)使用multiple physical I/O取代了一次返回一行的行为。

11g partition new features

February 3, 2012 11g, oracle No comments

总结了11g partition new feature
11G Introduced partition extensions:
-Interval partitioning
-REF partitioning
-More Composite Partitioning
-Virtual Column-based partitioning
-System Partitioning
-Introduced Partition Advisor.


Partition Types
Let us discuss each of the above features briefly:

Range partitioning: The data is distributed based on a range of values of the partitioning key. For example, if we choose a date column as the partitioning key, the partition “JAN-2007” will contain all the rows that have the partition key values between 01-JAN-2007 and 31-JAN-2007 (assuming the range of the partition is from first of the month to the last date in the month).

Hash Partitioning: A hash algorithm is applied to the partitioning key to determine the partition for a given row. This provides I/O balancing, but cannot be used for range or inequality queries.

List Partitioning: The data distribution is defined by a list of values of the partitioning key. This is useful for discrete lists. e.g: Regions, States etc.

Composite Partitioning: A combination of 2 data distribution methods are used to create a composite partition. The table is initially partitioned by the first data distribution method and then each partition is sub-partitioned by the second data distribution method. The following composite partitions are available:
Range-Hash, Range-List, Range-Range, List-Range, List-List and List-Hash.

Index organized tables (tables where index and data are stored together) supports partitioning by Range, List or Hash in 10G. However, composite partitions are not supported on Index organized tables.

1.Interval Partitioning: Introduced in 11g, interval partitions are extensions to range partitioning. These provide automation for equi-sized range partitions. Partitions are created as metadata and only the start partition is made persistent. The additional segments are allocated as the data arrives. The additional partitions and local indexes are automatically created.


CREATE TABLE SALES_PART
(TIME_ID NUMBER,
REGION_ID NUMBER,
ORDER_ID NUMBER,
ORDER_DATE DATE,
SALES_QTY NUMBER(10,2),
SALES_AMOUNT NUMBER(12,2)
)

PARTITION BY RANGE (ORDER_DATE)
INTERVAL (NUMTOYMINTERVAL(1,’month’) store in (x,y)
(PARTITION p_first VALUES LESS THAN (’01-JAN-2006′);

The numtoyminterval function converts a number to an INTERVAL YEAR TO MONTH literal (‘YEAR’ or ’MONTH’).
The numtodsinterval function converts a number to an INTERVAL DAY literal (‘x’,’DAY’).
还没有详细的资料表明oracle如何控制分区属性,如自增长分区在tablespace之中的分配,以及大小等等

2.REF Partitioning: This partitioning scheme has been introduced with the assumption that related tables would benefit from same partitioning strategy. The detail table inherits the partitioning strategy of the master table through PK-FK relationship. There is no need for the partitioning key to be stored in the detail table and by specifying “PARTITION BY REFERENCE” keyword, the detail table inherits the partitioning strategy of the master table.


CREATE TABLE orders
( order_id NUMBER(12),
order_date DATE,
order_mode VARCHAR2(8),
customer_id NUMBER(6),
CONSTRAINT orders_pk PRIMARY KEY(order_id)
)
PARTITION BY RANGE(order_date)
(PARTITION Q1_2005 VALUES LESS THAN
(TO_DATE(’2005-4-1′,’yyyy-mm-dd’)),
PARTITION Q2_2005 VALUES LESS THAN
(TO_DATE(’2005-7-1′,’yyyy-mm-dd’)),
PARTITION Q3_2005 VALUES LESS THAN
(TO_DATE(’2005-10-1′,’yyyy-mm-dd’)),
PARTITION Q4_2005 VALUES LESS THAN
(TO_DATE(’2006-1-1′,’yyyy-mm-dd’)))

CREATE TABLE order_items
( order_id NUMBER(12) NOT NULL,
line_item_id NUMBER(3) NOT NULL,
product_id NUMBER(6) NOT NULL,
unit_price NUMBER(8,2),
quantity NUMBER(8),
CONSTRAINT order_items_fk
FOREIGN KEY(order_id) REFERENCES orders(order_id)
)
PARTITION BY REFERENCE(order_items_fk)


父表的分区发生变化,子表分区也会自动适应,而单独修改子表分区则不被允许。

3.Virtual Column Based partitioning: In the previous versions of Oracle, a table could be partitioned only if the partition key physically existed in the table. The new functionality in Oracle 11G, “Virtual columns”, removes this restriction and allows partitioning key to be defined by an expression that uses one or more columns of a table. The virtual columns are stored as metadata only.

借用一个网友的例子:


create table sales
(
sales_id number,
cust_id number,
sales_amt number,
sale_category varchar2(6)
generated always as
(
case
when sales_amt <= 10000 then 'LOW' when sales_amt > 10000
and sales_amt <= 100000 then case when cust_id < 101 then 'LOW' when cust_id between 101 and 200 then 'MEDIUM' else 'MEDIUM' end when sales_amt > 100000
and sales_amt <= 1000000 then case when cust_id < 101 then 'MEDIUM' when cust_id between 101 and 200 then 'HIGH' else 'ULTRA' end else 'ULTRA' end ) virtual ) partition by list (sale_category) ( partition p_low values ('LOW'), partition p_medium values ('MEDIUM'), partition p_high values ('HIGH'), partition p_ultra values ('ULTRA') ) ; Table created. SQL>
SQL> insert into sales (sales_id,cust_id,sales_amt) values (1,1,100);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from sales partition (p_low);

SALES_ID CUST_ID SALES_AMT SALE_C
———- ———- ———- ——
1 1 100 LOW

4.System Partitioning

CREATE TABLE syspar (c1 int, c2 int)
PARTITION BY SYSTEM
(
PARTITION p1 TABLESPACE tbs_1,
PARTITION p2 TABLESPACE tbs_2,
PARTITION p3 TABLESPACE tbs_3,
PARTITION p4 TABLESPACE tbs_4
);


由SQL语句决定插入哪一个分区 like: insert into syspar partition (x) values (1,2); 这种类型的分区将不支持Partition Split操作,也不支持create table as select操作。

5.More Composite Partitioning

现在Range,List,Interval都可以作为Top level分区,而Second level则可以是Range,List,Hash,也就是在11g中可以有3*3=9种复合分区,满足更多的业务需求。

6.Introduced Partition Advisor

The Partition Advisor is part of the SQL Access Advisor of EM in 11g . The Partition Advisor can recommend a partitioning strategy for a table based on a supplied workload of SQL statements which can be supplied by the SQL Cache, a SQL Tuning set, or be defined by the user.