Playing With PARALLEL
I'm trying to learn about PARALLEL and how it affects SELECT statements in a DW/DSS (non OLTP) environment. I can't really say I know what I'm doing
or what I'm looking at, but I'm trying.
I had a DBA strongly discourage me from using PARALLELism on reporting tables (i.e. facts or dimensions). I think he even tried to explain it to me...but it was even more over my head at that time. I think it had something to do with all the processes kicked off then multiplied by the number of users running queries on those tables. "It will bring the database to its knees" he use to say. I never fought much on that because 1, he explained it to me 2, I trusted him and 3, he showed me once all the processes spawned during a typical data load operation. (I thought that was really cool by the way, all those neat little processes doing little chunks of work).
He did however let me (us) use them on staging tables. This is typically where our transformations took place. INSERTs and/or partition swapping.
Anyway, I was reading the 10gR2 docs, the Data Warehousing Guide to be exact on
Using Parallel Execution. Lots of neat pictures...which is good for me.

Then I decided to see if I could see what the run times on a SELECT COUNT(*) would be given different levels of parallelism. If you want to know if a table or index has PARALLEL enabled, you need to look at the DEGREE column in %_TABLES and %_INDEXES. I spent about 20 minutes looking for PARALLEL in those tables before I remembered that.
CREATE TABLE big_table
AS
SELECT *
FROM dba_objects;
I know have 74,431 records to play with. It's not a lot, I know, I just want to see what happens. I then created an anonymous block to:
1. loop through and change the PARALLELism starting with 1 up to 8
2. get the start time from DBMS_UTILITY.GET_TIME
3. loop doing 20 SELECT COUNT(*)s on table table
4. get the end time
5. get the total time
6. print it out
7. put it in a spreadsheet to compare the results side by side
DECLARE
l_start NUMBER;
l_end NUMBER;
l_total NUMBER;
l_dummy INTEGER;
BEGIN
FOR j IN 1..8 LOOP
EXECUTE IMMEDIATE 'ALTER TABLE big_table PARALLEL ' || j;
dbms_output.put_line( 'PARALLEL (DEGREE): ' || j );
FOR i IN 1..20 LOOP
l_start := dbms_utility.get_time;
SELECT COUNT(*)
INTO l_dummy
FROM big_table;
l_end := dbms_utility.get_time;
l_total := l_end - l_start;
dbms_output.put_line( 'Run ' || LPAD( i, 2, '0' ) || ': ' || LPAD( l_total, 3, '0' ) );
END LOOP;
END LOOP;
END;
/
And here are my results:

Sorry that it's an image...I can't seem to get decently formatted text into Blogger lately.
For this silly, unscientific test, PARALLEL set to 1 (i.e. NOPARALLEL) was the fastest all the way down. I don't know what it means or if it's even valid, but hey, I'm trying. I figured that if I posted something way above my head all you nice folks out there would happily correct me if I mistated something. I am avoiding any statements, if you couldn't tell.
Labels: dba, oradb, parallel, performance