I decided to move this blog to different location. All posts published here are also in new place. There will be no more entries here
And this is link to new blog
http://redikx.wordpress.com/
Friday, September 25, 2009
Friday, September 18, 2009
Order of column in index DOES matter
This is example proving this:
REMEBER COST CALCULATION:
COST= BLEVEL + ceiling(index_effective_selectivity*LEAF_BLOCKS) + ceiling(table_effective_selectivity*CLUSTERING_FACTOR)
A few examples:
COST= 1 + ceiling(0.1*154) + ceiling(0.1*2000) = 1+16+200 = OK
=================================================================================
COST= 1 + ceiling(0.2*154) + ceiling (0.2*2000) = 1+31+400=432
===================================================================================
COST=1+ceil(0.1*0.025*154)+ceil(0.1*0.025*2000)= 1 + ceil(0.39) + ceil(5) = 1 + 1 + 6
=========================================================================
COST=1 + ceil(0.1*0.01*154) + ceil(0.1*0.01*2000) = 1 + ceil(0.15) + ceil(2) = 2 DOESN'T MATCH!!!
Why? Because COL3 is on 3rd place in index, so cbo cannot limit leafs to scan
COST=1 + ceil(0.1*154) + ceil(0.1*0.01*2000) = 1 + ceil(15.4) + ceil(2) = 1 + 16 + 2
And now let's create index with COL3 on 2nd place and COL2 on 3rd:
COST=1+ceil(0.1*0.01*154)+ceil(0.1*0.01*2000) = 1+ceil(0.154)+ceil(2) = 1+1 +2
So two indexes on the same columns, but diffrent order of column generate different costs because different access to index leafs (limited to one columns only or limited to both columns)
SQL> create table t1 pctfree 98 pctused 1 as select
trunc(dbms_random.value(0,10)) COL1,
rpad('DUMMY',dbms_random.value(10,50)) COL2,
trunc(dbms_random.value(0,100)) COL3,
rpad('NOTININDEX',dbms_random.value(10,50)) COL4
from all_objects where rownum < 2001
Table created.
SQL> create index T1_IDX on t1(col1,col2,col3) pctfree 90;
Index created.
SQL>
SQL> exec dbms_stats.gather_table_stats('SYS','T1',cascade=>true,method_opt=>'for all columns size auto');
PL/SQL procedure successfully completed.
SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX';
BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000
SQL> select COLUMN_NAME,NUM_DISTINCT,DENSITY,NUM_NULLS,NUM_BUCKETS from dba_tab_col_statistics where table_name = 'T1';
COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS
------------------------------ ------------ ---------- ---------- -----------
COL1 10 .1 0 1
COL2 40 .025 0 1
COL3 100 .01 0 1
COL4 40 .025 0 1
REMEBER COST CALCULATION:
COST= BLEVEL + ceiling(index_effective_selectivity*LEAF_BLOCKS) + ceiling(table_effective_selectivity*CLUSTERING_FACTOR)
A few examples:
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=5;
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 200 | 6800 | 217 (0)| 00:00:03 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 200 | 6800 | 217 (0)| 00:00:03 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 200 | | 17 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
COST= 1 + ceiling(0.1*154) + ceiling(0.1*2000) = 1+16+200 = OK
=================================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 In (5,6);
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 400 | 13600 | 432 (1)| 00:00:06 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 400 | 13600 | 432 (1)| 00:00:06 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 400 | | 31 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
COST= 1 + ceiling(0.2*154) + ceiling (0.2*2000) = 1+31+400=432
===================================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1=4 and col2='DUMMY ';
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 325 | 8 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 5 | 325 | 8 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 5 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
COST=1+ceil(0.1*0.025*154)+ceil(0.1*0.025*2000)= 1 + ceil(0.39) + ceil(5) = 1 + 1 + 6
=========================================================================
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 19 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 19 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 2 | | 17 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
COST=1 + ceil(0.1*0.01*154) + ceil(0.1*0.01*2000) = 1 + ceil(0.15) + ceil(2) = 2 DOESN'T MATCH!!!
Why? Because COL3 is on 3rd place in index, so cbo cannot limit leafs to scan
COST=1 + ceil(0.1*154) + ceil(0.1*0.01*2000) = 1 + ceil(15.4) + ceil(2) = 1 + 16 + 2
And now let's create index with COL3 on 2nd place and COL2 on 3rd:
create index T1_IDX2 on t1(col1,col3,col2) pctfree 90;
SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS, CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where index_name='T1_IDX2';
BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
---------- ----------- ------------- ----------------- ----------
1 154 1940 2000 2000
SQL> select /*+ INDEX(t1) */ col4 from t1 where col1 = 1 and col3=97;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 74 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 74 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX2 | 2 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
COST=1+ceil(0.1*0.01*154)+ceil(0.1*0.01*2000) = 1+ceil(0.154)+ceil(2) = 1+1 +2
So two indexes on the same columns, but diffrent order of column generate different costs because different access to index leafs (limited to one columns only or limited to both columns)
Thursday, September 17, 2009
NUM_DISTINCT and DENSITY in all_tab_col_statistics
Both columns are also in ALL_TAB_COLUMNS.
According to definition:
NUM_DISTINCT - number of distinct values in column
DENSITY - fraction of data that will be return by a query of type column=:a
So generally they should store the same data, as
DENSITY = 1 / NUM_DISTINCT
But it is not always the case. Please look at the example:
SQL> create table t1 as
2 select
3 rownum as ID,
4 trunc(dbms_random.value(1,20)) as VAL
5 from all_objects where rownum <= 1000;
Now we gather stats on this table, no histograms data:
SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 1');
3 end;
4 /
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';
COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .052631579
Looks like it works how described above:
1/0.001 = 1000
1/19 = .052631579
But now let's change statistics - use histograms
SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 100');
3 end;
4 /
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';
COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .0005
DENSITY for ID hasn't changed, but VAL density changed - now cbo has much more data to estimate density.
So which column cbo use to calculate cardinality (so what percentage of data query returns)? It depends whether you have histograms or no:
1. If there are no histograms, cbo assume that cardinality=num_rows/num_distincts
2. If we have histograms, cardinality=num_rows*density
According to definition:
NUM_DISTINCT - number of distinct values in column
DENSITY - fraction of data that will be return by a query of type column=:a
So generally they should store the same data, as
DENSITY = 1 / NUM_DISTINCT
But it is not always the case. Please look at the example:
SQL> create table t1 as
2 select
3 rownum as ID,
4 trunc(dbms_random.value(1,20)) as VAL
5 from all_objects where rownum <= 1000;
Now we gather stats on this table, no histograms data:
SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 1');
3 end;
4 /
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';
COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .052631579
Looks like it works how described above:
1/0.001 = 1000
1/19 = .052631579
But now let's change statistics - use histograms
SQL> begin
2 dbms_stats.gather_table_stats(user,'T1',estimate_percent=>null,method_opt=>'for all columns size 100');
3 end;
4 /
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, density from dba_tab_col_statistics where table_name='T1';
COLUMN_NAME NUM_DISTINCT DENSITY
------------------------------ ------------ ----------
ID 1000 .001
VAL 19 .0005
DENSITY for ID hasn't changed, but VAL density changed - now cbo has much more data to estimate density.
So which column cbo use to calculate cardinality (so what percentage of data query returns)? It depends whether you have histograms or no:
1. If there are no histograms, cbo assume that cardinality=num_rows/num_distincts
2. If we have histograms, cardinality=num_rows*density
Tuesday, September 08, 2009
Library cache lock - find locking session
1.
select saddr from v$session where sid in (select sid from v$session_wait where event like 'library cache lock');
2. FIND BLOCKER:
SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
WHERE SADDR in
(SELECT KGLLKSES FROM X$KGLLK LOCK_A
WHERE KGLLKREQ = 0
AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
WHERE KGLLKSES = 'saddr_from_v$session above' /* BLOCKED SESSION */
AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
AND KGLLKREQ > 0)
);
3. FIND BLOCKED:
SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
WHERE SADDR in
(SELECT KGLLKSES FROM X$KGLLK LOCK_A
WHERE KGLLKREQ > 0
AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
WHERE KGLLKSES = 'saddr_from_v$session above' /* BLOCKING SESSION */
AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
AND KGLLKREQ = 0)
);
select saddr from v$session where sid in (select sid from v$session_wait where event like 'library cache lock');
2. FIND BLOCKER:
SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
WHERE SADDR in
(SELECT KGLLKSES FROM X$KGLLK LOCK_A
WHERE KGLLKREQ = 0
AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
WHERE KGLLKSES = 'saddr_from_v$session above' /* BLOCKED SESSION */
AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
AND KGLLKREQ > 0)
);
3. FIND BLOCKED:
SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
WHERE SADDR in
(SELECT KGLLKSES FROM X$KGLLK LOCK_A
WHERE KGLLKREQ > 0
AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
WHERE KGLLKSES = 'saddr_from_v$session above' /* BLOCKING SESSION */
AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
AND KGLLKREQ = 0)
);
Subscribe to:
Posts (Atom)