Friday, September 25, 2009

Blog moved

I decided to move this blog to different location. All posts published here are also in new place. There will be no more entries here

And this is link to new blog
http://redikx.wordpress.com/

Friday, September 18, 2009

Order of column in index DOES matter

This is example proving this:

SQL> create table t1 pctfree 98 pctused 1 as select
trunc(dbms_random.value(0,10)) COL1,
rpad('DUMMY',dbms_random.value(10,50)) COL2,
trunc(dbms_random.value(0,100)) COL3,
rpad('NOTININDEX',dbms_random.value(10,50)) COL4
from all_objects where rownum < 2001

Table created.

SQL> create index T1_IDX on t1(col1,col2,col3) pctfree 90;

Index created.

SQL>
SQL> exec dbms_stats.gather_table_stats('SYS','T1',cascade=>true,method_opt=>'for all columns size auto');

PL/SQL procedure successfully completed.

SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX';

BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000

SQL> select COLUMN_NAME,NUM_DISTINCT,DENSITY,NUM_NULLS,NUM_BUCKETS from dba_tab_col_statistics where table_name = 'T1';

COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS
------------------------------ ------------ ---------- ---------- -----------
COL1 10 .1 0 1
COL2 40 .025 0 1
COL3 100 .01 0 1
COL4 40 .025 0 1


REMEBER COST CALCULATION:
COST= BLEVEL + ceiling(index_effective_selectivity*LEAF_BLOCKS) + ceiling(table_effective_selectivity*CLUSTERING_FACTOR)

A few examples:
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=5;
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 6800 | 217 (0)| 00:00:03 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 200 | 6800 | 217 (0)| 00:00:03 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 200 | | 17 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST= 1 + ceiling(0.1*154) + ceiling(0.1*2000) = 1+16+200 = OK
=================================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 In (5,6);
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 400 | 13600 | 432 (1)| 00:00:06 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 400 | 13600 | 432 (1)| 00:00:06 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 400 | | 31 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST= 1 + ceiling(0.2*154) + ceiling (0.2*2000) = 1+31+400=432
===================================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=4 and col2='DUMMY ';
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 325 | 8 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 5 | 325 | 8 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 5 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST=1+ceil(0.1*0.025*154)+ceil(0.1*0.025*2000)= 1 + ceil(0.39) + ceil(5) = 1 + 1 + 6

=========================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 19 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 19 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 2 | | 17 (0)| 00:00:01 |
--------------------------------------------------------------------------------------

COST=1 + ceil(0.1*0.01*154) + ceil(0.1*0.01*2000) = 1 + ceil(0.15) + ceil(2) = 2 DOESN'T MATCH!!!
Why? Because COL3 is on 3rd place in index, so cbo cannot limit leafs to scan
COST=1 + ceil(0.1*154) + ceil(0.1*0.01*2000) = 1 + ceil(15.4) + ceil(2) = 1 + 16 + 2


And now let's create index with COL3 on 2nd place and COL2 on 3rd:

create index T1_IDX2 on t1(col1,col3,col2) pctfree 90;

SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX2';

BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000


SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX2 | 2 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

COST=1+ceil(0.1*0.01*154)+ceil(0.1*0.01*2000) = 1+ceil(0.154)+ceil(2) = 1+1 +2

So two indexes on the same columns, but diffrent order of column generate different costs because different access to index leafs (limited to one columns only or limited to both columns)

Thursday, September 17, 2009

NUM_DISTINCT and DENSITY in all_tab_col_statistics

Both columns are also in ALL_TAB_COLUMNS.

According to definition:
NUM_DISTINCT - number of distinct values in column
DENSITY - fraction of data that will be return by a query of type column=:a

So generally they should store the same data, as
DENSITY = 1 / NUM_DISTINCT

But it is not always the case. Please look at the example:

SQL> create table t1 as
2 select
3 rownum as ID,
4 trunc(dbms_random.value(1,20)) as VAL
5 from all_objects where rownum <= 1000;

Now we gather stats on this table, no histograms data:

SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 1');
3 end;
4 /

PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';

COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .052631579

Looks like it works how described above:

1/0.001 = 1000
1/19 = .052631579

But now let's change statistics - use histograms

SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 100');
3 end;
4 /

PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';

COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .0005

DENSITY for ID hasn't changed, but VAL density changed - now cbo has much more data to estimate density.

So which column cbo use to calculate cardinality (so what percentage of data query returns)? It depends whether you have histograms or no:
1. If there are no histograms, cbo assume that cardinality=num_rows/num_distincts
2. If we have histograms, cardinality=num_rows*density

Tuesday, September 08, 2009

Library cache lock - find locking session

1.
select saddr from v$session where sid in (select sid from v$session_wait where event like 'library cache lock');

2. FIND BLOCKER:
SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
WHERE SADDR in
(SELECT KGLLKSES FROM X$KGLLK LOCK_A
WHERE KGLLKREQ = 0
AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
WHERE KGLLKSES = 'saddr_from_v$session above' /* BLOCKED SESSION */
AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
AND KGLLKREQ > 0)
);

3. FIND BLOCKED:
SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
WHERE SADDR in
(SELECT KGLLKSES FROM X$KGLLK LOCK_A
WHERE KGLLKREQ > 0
AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
WHERE KGLLKSES = 'saddr_from_v$session above' /* BLOCKING SESSION */
AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
AND KGLLKREQ = 0)
);