Friday, September 18, 2009

Order of column in index DOES matter

This is example proving this:

SQL> create table t1 pctfree 98 pctused 1 as select
trunc(dbms_random.value(0,10)) COL1,
rpad('DUMMY',dbms_random.value(10,50)) COL2,
trunc(dbms_random.value(0,100)) COL3,
rpad('NOTININDEX',dbms_random.value(10,50)) COL4
from all_objects where rownum < 2001

Table created.

SQL> create index T1_IDX on t1(col1,col2,col3) pctfree 90;

Index created.

SQL>
SQL> exec dbms_stats.gather_table_stats('SYS','T1',cascade=>true,method_opt=>'for all columns size auto');

PL/SQL procedure successfully completed.

SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX';

BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000

SQL> select COLUMN_NAME,NUM_DISTINCT,DENSITY,NUM_NULLS,NUM_BUCKETS from dba_tab_col_statistics where table_name = 'T1';

COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS
------------------------------ ------------ ---------- ---------- -----------
COL1 10 .1 0 1
COL2 40 .025 0 1
COL3 100 .01 0 1
COL4 40 .025 0 1


REMEBER COST CALCULATION:
COST= BLEVEL + ceiling(index_effective_selectivity*LEAF_BLOCKS) + ceiling(table_effective_selectivity*CLUSTERING_FACTOR)

A few examples:
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=5;
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 6800 | 217 (0)| 00:00:03 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 200 | 6800 | 217 (0)| 00:00:03 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 200 | | 17 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST= 1 + ceiling(0.1*154) + ceiling(0.1*2000) = 1+16+200 = OK
=================================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 In (5,6);
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 400 | 13600 | 432 (1)| 00:00:06 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 400 | 13600 | 432 (1)| 00:00:06 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 400 | | 31 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST= 1 + ceiling(0.2*154) + ceiling (0.2*2000) = 1+31+400=432
===================================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=4 and col2='DUMMY ';
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 325 | 8 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 5 | 325 | 8 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 5 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST=1+ceil(0.1*0.025*154)+ceil(0.1*0.025*2000)= 1 + ceil(0.39) + ceil(5) = 1 + 1 + 6

=========================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 19 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 19 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 2 | | 17 (0)| 00:00:01 |
--------------------------------------------------------------------------------------

COST=1 + ceil(0.1*0.01*154) + ceil(0.1*0.01*2000) = 1 + ceil(0.15) + ceil(2) = 2 DOESN'T MATCH!!!
Why? Because COL3 is on 3rd place in index, so cbo cannot limit leafs to scan
COST=1 + ceil(0.1*154) + ceil(0.1*0.01*2000) = 1 + ceil(15.4) + ceil(2) = 1 + 16 + 2


And now let's create index with COL3 on 2nd place and COL2 on 3rd:

create index T1_IDX2 on t1(col1,col3,col2) pctfree 90;

SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX2';

BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000


SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX2 | 2 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST=1+ceil(0.1*0.01*154)+ceil(0.1*0.01*2000) = 1+ceil(0.154)+ceil(2) = 1+1 +2

So two indexes on the same columns, but diffrent order of column generate different costs because different access to index leafs (limited to one columns only or limited to both columns)

No comments: