ORA-0600 Oracle DBA blog: 9/1/09

This is example proving this:

SQL> create table t1 pctfree 98 pctused 1 as select
trunc(dbms_random.value(0,10)) COL1,
rpad('DUMMY',dbms_random.value(10,50)) COL2,
trunc(dbms_random.value(0,100)) COL3,
rpad('NOTININDEX',dbms_random.value(10,50)) COL4
from all_objects where rownum < 2001

Table created.

SQL> create index T1_IDX on t1(col1,col2,col3) pctfree 90;

Index created.

SQL>
SQL> exec dbms_stats.gather_table_stats('SYS','T1',cascade=>true,method_opt=>'for all columns size auto');

PL/SQL procedure successfully completed.

SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX';

BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000

SQL> select COLUMN_NAME,NUM_DISTINCT,DENSITY,NUM_NULLS,NUM_BUCKETS from dba_tab_col_statistics where table_name = 'T1';

COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS
------------------------------ ------------ ---------- ---------- -----------
COL1 10 .1 0 1
COL2 40 .025 0 1
COL3 100 .01 0 1
COL4 40 .025 0 1

REMEBER COST CALCULATION:
COST= BLEVEL + ceiling(index_effective_selectivity*LEAF_BLOCKS) + ceiling(table_effective_selectivity*CLUSTERING_FACTOR)

A few examples:

SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=5;
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 6800 | 217 (0)| 00:00:03 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 200 | 6800 | 217 (0)| 00:00:03 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 200 | | 17 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST= 1 + ceiling(0.1*154) + ceiling(0.1*2000) = 1+16+200 = OK
=================================================================================

SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 In (5,6);
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 400 | 13600 | 432 (1)| 00:00:06 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 400 | 13600 | 432 (1)| 00:00:06 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 400 | | 31 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST= 1 + ceiling(0.2*154) + ceiling (0.2*2000) = 1+31+400=432
===================================================================================

SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=4 and col2='DUMMY ';
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 325 | 8 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 5 | 325 | 8 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 5 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST=1+ceil(0.1*0.025*154)+ceil(0.1*0.025*2000)= 1 + ceil(0.39) + ceil(5) = 1 + 1 + 6

=========================================================================

SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 19 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 19 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 2 | | 17 (0)| 00:00:01 |
--------------------------------------------------------------------------------------

COST=1 + ceil(0.1*0.01*154) + ceil(0.1*0.01*2000) = 1 + ceil(0.15) + ceil(2) = 2 DOESN'T MATCH!!!
Why? Because COL3 is on 3rd place in index, so cbo cannot limit leafs to scan
COST=1 + ceil(0.1*154) + ceil(0.1*0.01*2000) = 1 + ceil(15.4) + ceil(2) = 1 + 16 + 2

And now let's create index with COL3 on 2nd place and COL2 on 3rd:

create index T1_IDX2 on t1(col1,col3,col2) pctfree 90;

SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX2';

BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000

SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX2 | 2 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST=1+ceil(0.1*0.01*154)+ceil(0.1*0.01*2000) = 1+ceil(0.154)+ceil(2) = 1+1 +2

So two indexes on the same columns, but diffrent order of column generate different costs because different access to index leafs (limited to one columns only or limited to both columns)

Both columns are also in ALL_TAB_COLUMNS.

According to definition:
NUM_DISTINCT - number of distinct values in column
DENSITY - fraction of data that will be return by a query of type column=:a

So generally they should store the same data, as
DENSITY = 1 / NUM_DISTINCT

But it is not always the case. Please look at the example:

SQL> create table t1 as
2 select
3 rownum as ID,
4 trunc(dbms_random.value(1,20)) as VAL
5 from all_objects where rownum <= 1000;

Now we gather stats on this table, no histograms data:

SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 1');
3 end;
4 /

PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';

COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .052631579

Looks like it works how described above:

1/0.001 = 1000
1/19 = .052631579

But now let's change statistics - use histograms

SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 100');
3 end;
4 /

PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';

COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .0005

DENSITY for ID hasn't changed, but VAL density changed - now cbo has much more data to estimate density.

So which column cbo use to calculate cardinality (so what percentage of data query returns)? It depends whether you have histograms or no:
1. If there are no histograms, cbo assume that cardinality=num_rows/num_distincts
2. If we have histograms, cardinality=num_rows*density

ORA-0600 Oracle DBA blog

Friday, September 25, 2009

Blog moved

Friday, September 18, 2009

Order of column in index DOES matter

Thursday, September 17, 2009

NUM_DISTINCT and DENSITY in all_tab_col_statistics

Tuesday, September 08, 2009

Library cache lock - find locking session

About Me

Blog Archive

Counter

Labels

Followers

ads