Thursday, December 27, 2007

Asynchronous Distributed HotLog - Change Data Capture

I am currently researching the feasibility of getting Asynchronous Distributed HotLog CDC working for a Proof of Concept in our organization. The source database runs on Sun Solaris, and our target database is a 10gR2 database on the same OS.

I have been trying, since I arrived, to get them to use more of the features provided by Oracle. From what I have found so far, all of our Oracle databases are severely underutilized.

I will post my findings here, along with a full How To, when I have completed it, and hopefully whether or not it ends up being implemented.

I did want to post some of the sources I have used so far though:

Oracle Documentation - This is where I started. Most of the other links reference the docs.
Mark Rittman - 10g to 10g. Great example but from what I can tell so far, it's almost identical to the documentation.
Oracle How To - I just found this one and am very interested in reading through it.

If you have any examples, feel free to post the link.

Tuesday, December 11, 2007

SAS: Create SAS Dataset from Oracle

SAS is very prevalent in our environment. When testing, I need to create some datasets for the business users so that they can do their UAT (yes, I shouldn't be involved at this point...but I am).

Here's the syntax to do that (I use SAS EG):

LIBNAME dmart oracle path=<oracle_sid> schema=<schema_name> pw=<password> user=<db_username>;

PROC SQL;
CREATE TABLE sasave.mytable AS
SELECT * FROM dmart.mytable;
QUIT;

You can also use the following syntax:

LIBNAME dmart oracle path=<oracle_sid> schema=<schema_name> pw=<password> user=<db_username>;

DATA sasave.mytable_filtered;
SET dmart.mytable
( WHERE = ( id IN ( 5, 6, 7, 8 ) ) );
RUN;


If you want to use Oracle SQL (SAS guys and gals call it passthrough SQL), here's the syntax for that:


PROC SQL;
CONNECT TO oracle( user=<db_username> pw=<password> path="<oracle_sid>" );

CREATE TABLE sasave.dual_test AS
SELECT * FROM connection to oracle
( SELECT * FROM dual );

DISCONNECT FROM oracle;
QUIT;


Wednesday, December 5, 2007

Instrumentation: DEBUG/LOGGING

In my previous entry on instrumenting code I detailed the way in which you could use the DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS procedure to help monitor your long running code.

In this one, I will detail my home-grown version of Tom Kyte's debug routine. I know that others have published similar code, but I can't seem to find it right now.

You can find the source code here.

2 tables
2 sequences
1 package
1 procedure
1 build file

debug_tab allows you to turn the debugging on and off. debug_details_tab will store each line that you write to the debug routine when turned on.
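To make the shape of the routine concrete, here is a minimal sketch of the debug procedure itself. The column and sequence names (debug_flag, detail_id, line_text, debug_details_seq) are assumptions for illustration; the real DDL is in the source linked above:

```sql
-- Minimal sketch of the debug procedure (column/sequence names assumed).
CREATE OR REPLACE PROCEDURE debug( p_text IN VARCHAR2 )
IS
  PRAGMA AUTONOMOUS_TRANSACTION; -- logging survives a rollback in the caller
  l_on VARCHAR2(1);
BEGIN
  -- debug_tab holds the on/off switch
  SELECT debug_flag
    INTO l_on
    FROM debug_tab
   WHERE ROWNUM = 1;

  IF l_on = 'Y' THEN
    INSERT INTO debug_details_tab ( detail_id, created_on, line_text )
    VALUES ( debug_details_seq.NEXTVAL, SYSTIMESTAMP, p_text );
    COMMIT; -- safe here: this is an autonomous transaction
  END IF;
END debug;
/
```

The AUTONOMOUS_TRANSACTION pragma is the important design choice: the log rows stay put even when the calling transaction rolls back after an error.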

Here's an example of it in practice:

FUNCTION get_age_in_months( p_id IN NUMBER ) RETURN NUMBER
IS
  l_age_in_months INTEGER;
BEGIN
  --instrumentation calls
  debug( 'GET_AGE_IN_MONTHS' );
  debug( 'P_ID: ' || p_id );
  debug( 'select value into variable' );

  SELECT age_in_months
    INTO l_age_in_months
    FROM people_tab
   WHERE id = p_id;

  debug( 'L_AGE_IN_MONTHS: ' || l_age_in_months );

  RETURN l_age_in_months;
EXCEPTION
  WHEN no_data_found THEN
    debug( 'no data found' );
    RETURN NULL;
END get_age_in_months;
/
show errors

I mentioned in the previous article that I had difficulty grasping this concept initially. I think once I related it to DBMS_OUTPUT.PUT_LINE, it became much clearer to me.

This simple debug routine has helped me tremendously over the last year or two that I have used it, especially with nested levels of logic. It gets very hard to keep track of where you are, but with this you can run your procedure or function, then issue a SELECT against debug_details_tab and see what was happening when.
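For example, a query along these lines does the trick (detail_id, created_on, and line_text are assumed column names; adjust to the actual DDL):

```sql
-- Walk the debug trail in the order it was written
SELECT created_on,
       line_text
  FROM debug_details_tab
 ORDER BY detail_id;
```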

I even began using this in my APEX applications. I would preface each line with "APEX: " and then write whatever was necessary so that I could step through the various pieces of code. It became crucial when I was doing validation on a table of records in a collection...oh so much easier.

On large systems this will generate a lot of data. I definitely would not put this inside a loop doing some sort of batch processing, but it is helpful to find where things fail out.

It can certainly be improved on. I basically took the idea of DBMS_OUTPUT.PUT_LINE and wrote it to a table, nothing fancy there. Mr. Kyte mentions writing it to the file system as well. Since I don't have ready access to the database file system, this was the easiest.
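For what it's worth, the file-based variant Mr. Kyte describes is only a few lines with UTL_FILE. This sketch assumes a directory object named DEBUG_DIR that the database can write to:

```sql
DECLARE
  l_file UTL_FILE.FILE_TYPE;
BEGIN
  -- open in append mode; DEBUG_DIR is an assumed directory object
  l_file := UTL_FILE.FOPEN( 'DEBUG_DIR', 'debug.log', 'a' );
  UTL_FILE.PUT_LINE( l_file, TO_CHAR( SYSTIMESTAMP ) || ': my debug text' );
  UTL_FILE.FCLOSE( l_file );
END;
/
```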

Make your life easier and use a debug/logging routine in your code. No longer will you have to comment; the debug statements should do it for you!

Sunday, December 2, 2007

Instrumentation

Instrumentation is something that I have come to rely on fairly heavily. I believe I first read about it on asktom, but the one that really spurred me on was this post on instrumentation on his personal blog.

Initially, I couldn't really wrap my head around instrumentation. I don't know why it was so difficult; I had a similar problem with sessions when I first started my career. I look back now and it just seems so obvious.

Now that I am doing data warehouse work, nothing is fast. Fast to me is now one hour to load 30 or 40 million records. No more split-second queries for me.

We currently use no tools. It's straight PL/SQL. Instrumentation of the code is ideal. Actually, it's more instrumentation to aid monitoring. The tool most easily used is provided by Oracle in the DBMS_APPLICATION_INFO package.

There are three subprograms that I use most: SET_MODULE, SET_ACTION, and most importantly SET_SESSION_LONGOPS. I hadn't started using SET_SESSION_LONGOPS until this year; I had mainly stuck to the first two. It is now part of the procedure/function template I've created in JDeveloper.

What it allows you to do is set a row in the v$session_longops view (I know it's not actually putting the row in the view, it's the underlying table, but I digress). You can then monitor how your job is doing.

Here's an example:

dbms_application_info.set_session_longops
( rindex      => g_index,
  slno        => g_slno,
  op_name     => 'GETTING MEMBER DATA',
  sofar       => 0,
  totalwork   => l_table.COUNT + 1,
  target_desc => 'GETTING MEMBER DATA' );

g_index and g_slno are global variables in the package. l_table is a PL/SQL TABLE OF VARCHAR2.

Now you can monitor the progress of your job in v$session_longops!

Here's the query I use:

SELECT opname,
       TO_CHAR( start_time, 'MM/DD/YYYY HH24:MI:SS' ) start_ti,
       time_remaining rem,
       elapsed_seconds ela,
       ROUND( ( sofar / NULLIF( totalwork, 0 ) ) * 100, 2 ) per,
       sofar,
       totalwork work
  FROM v$session_longops
 WHERE start_time >= SYSDATE - 1
 ORDER BY start_time DESC;
Now you too can sit for hours and watch your job move incrementally forward!

But seriously, it does help tremendously to know where a job is at. You can further use the SET_MODULE and SET_ACTION calls to see a specific point in the processing (inside a loop).
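A quick sketch of that idea (the module and action values here are just examples):

```sql
BEGIN
  -- stamp the session so it is identifiable in V$SESSION
  DBMS_APPLICATION_INFO.SET_MODULE(
    module_name => 'GET_MEMBER_DATA',
    action_name => 'INIT' );

  FOR i IN 1 .. 10 LOOP
    -- flag which slice of the work this session is on
    DBMS_APPLICATION_INFO.SET_ACTION( 'PROCESSING BATCH ' || i );
    NULL; -- real work goes here
  END LOOP;

  DBMS_APPLICATION_INFO.SET_MODULE( NULL, NULL ); -- clear when done
END;
/
```

You can then watch the MODULE and ACTION columns in V$SESSION to see exactly which batch a long-running session is working through.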

Here's the code in context:

PROCEDURE get_member_data
IS
  l_exists INTEGER;
  TYPE table_of_lobs IS TABLE OF VARCHAR2(3);
  l_table  table_of_lobs := table_of_lobs(); --populated elsewhere in the real code
  l_count  INTEGER := 0;
BEGIN
  --check to see if there is enrollment data, if not, move on
  SELECT COUNT(*)
    INTO l_exists
    FROM members
   WHERE rownum < 2;

  IF l_exists = 1 THEN --data exists, truncate and reload

    g_index := dbms_application_info.set_session_longops_nohint;

    g_audit_key := p_audit.begin_load
                   ( p_targettable => 'MEMBER_STG',
                     p_loadsource  => 'MEMBER_SOURCE',
                     p_loadstatus  => 'PRE',
                     p_loadprogram => 'GET_MEMBER_DATA',
                     p_commenttext => 'INSERT' );

    dbms_application_info.set_session_longops
    ( rindex      => g_index,
      slno        => g_slno,
      op_name     => 'GETTING MEMBERS',
      sofar       => 0,
      totalwork   => l_table.COUNT + 1,
      target_desc => 'GETTING MEMBERS' );

    FOR i IN 1..l_table.COUNT LOOP
      l_count := l_count + 1;

      INSERT INTO member_stg
      SELECT *
        FROM members; --the real code restricts this by l_table(i)

      g_total_rows_affected := g_total_rows_affected + sql%rowcount;

      dbms_application_info.set_session_longops
      ( rindex      => g_index,
        slno        => g_slno,
        op_name     => 'GETTING MEMBERS',
        sofar       => l_count,
        totalwork   => l_table.COUNT + 1,
        target_desc => 'GETTING MEMBERS' );

    END LOOP;

    p_audit.end_load --audit helper; name assumed to mirror begin_load
    ( p_auditkey   => g_audit_key,
      p_loadstatus => 'SUC',
      p_rowsuccess => g_total_rows_affected );

    p_stats.gather_table_stats --stats helper; name assumed
    ( p_tablename  => 'MEMBER_STG',
      p_schemaname => 'MYHOME' );

    dbms_application_info.set_session_longops
    ( rindex      => g_index,
      slno        => g_slno,
      op_name     => 'GETTING MEMBERS',
      sofar       => l_count + 1,
      totalwork   => l_table.COUNT + 1,
      target_desc => 'GETTING MEMBERS' );

  END IF;

EXCEPTION
  WHEN others THEN
    p_audit.fail_load --audit helper; name assumed
    ( p_auditkey => g_audit_key,
      p_comments => SQLCODE || ' ' || SQLERRM );
    RAISE;
END get_member_data;