Learning About Performance
I've found myself at a certain disadvantage lately, specifically in regards to understanding tuning. I don't like being at a disadvantage...I'm
competitive that way.
I understand, from a
very high level of what needs to be done, but I could not prove my theories. Well, I could, sort of, but it wasn't necessarily a reasoned or logical approach. I just tried all possible permutations. There's a certain time constraint with that method and when you're talking about huge volumes of data (i.e. datawarehouses), time is short.
I've decided it's time to change that, so I'll begin to peck away ever so slowly.
In Usage Tracking, there is a table called
S_ETL_TIME_DAY. It's a sister table to S_ETL_DAY which is your everyday
time dimension. S_ETL_TIME_DAY breaks down a single day into hours and minutes which means there are 1440 records (24*60*60).
Somewhere I saw the following SQL:
SELECT DISTINCT hours
FROM s_etl_time_day
which just returns the hours in the day (24). I've pondered on whether
DISTINCT is a bug, but it seems fairly innocuous here.
What about this though?
SELECT rownum - 1 hours
FROM dual
CONNECT BY LEVEL <= 24;
Which one is faster?
Let's try an explain plan
Query 1S_NQ_SCHED@TESTING>EXPLAIN PLAN FOR
2 SELECT rownum - 1 hours
3 FROM dual
4 CONNECT BY LEVEL <= 24;
Explained.
Elapsed: 00:00:00.05
S_NQ_SCHED@TESTING>@EXPLAIN
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------
Plan hash value: 1731520519
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | COUNT | | | | |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=24)
15 rows selected.
Query 2S_NQ_SCHED@TESTING>EXPLAIN PLAN FOR
2 SELECT DISTINCT hours
3 FROM s_etl_time_day;
Explained.
Elapsed: 00:00:00.04
S_NQ_SCHED@TESTING>@explain
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
Plan hash value: 878743397
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 24 | 72 | 5 (20)| 00:00:01 |
| 1 | HASH UNIQUE | | 24 | 72 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS FULL| S_ETL_TIME_DAY | 1440 | 4320 | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
9 rows selected.
Sadly, about the only thing I can usually see in an explain plan is whether an index is being picked up or if there is a nested loop. I'm not going to worry about
all of it now...this is just a start. More to get me in the habit.
Next I set up tracing since "reading" the file is so much
easier now.
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'HOURS';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';
Here's the output:
What does it all mean? I am sure you know or understand better than I. For me, I just need to create the habit. I'll read (more) about the specifics of the explain plan and the output from tkprof. If you want to explain, feel free. If you want to do it in your own post, link it up or you can write it here. I'd be happy to host it.
Labels: dba, performance, tuning