ORACLENERD
 
Sunday, July 27, 2008
  Foot In Mouth Syndrome
On Friday of last week we were doing our first code review (well, my first code review at this job). A fun 2600 line package, 42 pages.

As were going through one piece, there's an exception block, with an explicit ROLLBACK, some logging code, and then a COMMIT. Of course the logging code should be an autonomous transactions, but apparently it's a bit sketchy.

Anyway, I blather on about not needing an explicit ROLLBACK after the exception as everything is rolled back prior...

"no it's not."

Me: "Yes it is."

"No, it's not."

Me: "Yes, it is."

We then moved on. I go back to my desk to write up a quick test proving my point.

Of course I was wrong though. If you have thing one and thing two and the exception occurs on thing two, the data/record/whatever for thing one exists. And, if you are going to issue a COMMIT and you don't want that orphaned record out there, you must issue an explicit ROLLBACK.

I sent a note out to everyone saying that my colleague is right, but I was still irked by it somehow.

It just didn't feel right.

So I left for lunch for a bit, still thinking about if of course.

Wait, I was saying one thing but thinking about something else. I was thinking of a logical transaction. Thing two depends on thing one therefore you would never, ever want to COMMIT after the exception. The logging code through me off for some reason.

I of course had to send out another email explaining my thought process and my mistake, in essence making me seem the hard loser. Why do I have to do that? Just let it lie...

Oh well, I've done it before and I'll probably do it again. Thankfully I work with some good people so they won't hold it over my head for too long...

Labels: ,

 
Wednesday, July 16, 2008
  Fraud Analysis?
So we are pondering how to go about fraud analysis real time.

One thing currently holding us back is that all the key fields are encrypted (obviously). How do you go about doing fraud analysis with such seemingly high overhead?

My question to you is, do you know of any fraud related off-the-shelf tools or libraries out there? Preferably written in PL/SQL, but Java or C will do as well.

Labels: , , ,

 
Monday, June 30, 2008
  Fun With Triggers
I don't care a whole lot for triggers. About the only thing I see of use is for some sort of auditing.

That said, I don't know a whole lot about them either. I don't know the specifics of how they work. I was tested today...

CREATE TABLE t
(
update_date DATE DEFAULT SYSDATE,
update_user VARCHAR2(30)
);

INSERT INTO t ( update_date )
VALUES ( ADD_MONTHS( SYSDATE, 100 ) );

INSERT INTO t ( update_date )
VALUES ( ADD_MONTHS( SYSDATE, -22 ) );

CJUSTICE@XE>SELECT * FROM T;

UPDATE_DA UPDATE_USER
--------- ------------------------------
31-OCT-16
31-AUG-06
Easy enough.

My requirement is that I capture the update_user, specifically in the case where it is not supplied. And this is where I ran into a wall.
CREATE OR REPLACE
TRIGGER biu_t
BEFORE INSERT OR UPDATE
ON t
REFERENCING NEW AS NEW OLD AS OLD
FOR EACH ROW
BEGIN
IF :NEW.update_user IS NULL THEN
:NEW.update_user := USER;
END IF;
END;
/
show errors
Easy enough.
UPDATE t 
SET update_date = ADD_MONTHS( SYSDATE, -12 );

CJUSTICE@XE>SELECT * FROM T;

UPDATE_DA UPDATE_USER
--------- ------------------------------
30-JUN-07 CJUSTICE
30-JUN-07 CJUSTICE

2 rows selected.
OK, not too bad. Let's try specifying the UPDATE_USER.
UPDATE t
SET update_date = ADD_MONTHS( SYSDATE, 12 ),
update_user = 'BOLLOCKS';

UPDATE_DA UPDATE_USER
--------- ------------------------------
30-JUN-09 BOLLOCKS
30-JUN-09 BOLLOCKS
Good, everything's working as expected...or is it? Let's connect as another user:
CJUSTICE@XE>conn hr/testing@xe
Connected.

UPDATE cjustice.t SET update_date = SYSDATE - 100;

HR@XE>SELECT * FROM cjustice.t;

UPDATE_DA UPDATE_USER
--------- ------------------------------
22-MAR-08 BOLLOCKS
22-MAR-08 BOLLOCKS
What? Why didn't the user get updated with HR? Let's add some dbms_output statements to the trigger:
CREATE OR REPLACE
TRIGGER biu_t
BEFORE INSERT OR UPDATE
ON t
REFERENCING NEW AS NEW OLD AS OLD
FOR EACH ROW
BEGIN
dbms_output.put_line( 'New Value: ' || :NEW.update_user );
dbms_output.put_line( 'Old Value: ' || :OLD.update_user );

IF :NEW.update_user IS NULL THEN
:NEW.update_user := USER;
END IF;
END;
/
show errors

HR@XE>UPDATE cjustice.t SET update_date = SYSDATE + 100;
New Value: BOLLOCKS
Old Value: BOLLOCKS
New Value: BOLLOCKS
Old Value: BOLLOCKS

2 rows updated.

Elapsed: 00:00:00.04
HR@XE>SELECT * FROM cjustice.t;

UPDATE_DA UPDATE_USER
--------- ------------------------------
08-OCT-08 BOLLOCKS
08-OCT-08 BOLLOCKS

2 rows selected.
So UPDATE_USER was not updated with the value of HR, which I would expect. The :NEW and :OLD values are exactly the same...I'm guessing that Oracle makes a copy of the record first (puts it into a collection or something?) so UPDATE_USER would never be NULL.

Ultimately, I scrapped the update of UPDATE_USER in the trigger. I did force the UPDATE_DATE = SYSDATE, and that was it.

Ideally, I would specify that in the package call, but we aren't there yet. In my opinion, no other user would have UPDATE, INSERT or DELETE privileges on the table. But that's what I have.

Can someone with more experience with triggers help me out? Or advise me on what's going on?

Labels: , ,

 
Thursday, June 5, 2008
  BULK COLLECT and FORALL
I have a task to tune a little bit of SQL. It's very ETL like, but the target is not (yet) a star schema, it's pretty much a table flattened out for reporting purposes.

As I reviewed it, I noticed it went row by row, with a COMMIT inside the LOOP. That's gotta go.

Can I do this in one SQL statement? No, there's other processing that needs to be done (UPDATE two other tables before and after). Hmmm...could I just return the appropriate records into a collection? I'll have to look at that to see if it's possible.

For now though, I am going to try and use BULK COLLECT with the LIMIT clause and FORALL for the processes that occur before and after.

Let's create some data:

CREATE TABLE t( x NUMBER, y NUMBER );

INSERT INTO t( x, y )
SELECT
TRUNC( dbms_random.value( 1, 99999999 ) ),
TRUNC( dbms_random.value( 1, 100000 ) )
FROM dual
CONNECT BY level < 1001;
OK, now let's create an anonymous block, BULK COLLECTing the data from T into a PL/SQL table and then populated another table with that data:

DECLARE
TYPE t_record IS TABLE OF T%ROWTYPE;
l_table T_RECORD;
CURSOR c
IS
SELECT x, y
FROM t;
BEGIN
OPEN c;

LOOP
FETCH c
BULK COLLECT INTO l_table
LIMIT 100;

FORALL i IN 1..l_table.COUNT
INSERT INTO s ( x, y )
VALUES ( l_table(i).x, l_table(i).y );

EXIT WHEN C%NOTFOUND;
END LOOP;
CLOSE c;
END;
/
And then I run it and I get the following:

ERROR at line 18:
ORA-06550: line 18, column 16:
PLS-00436: implementation restriction: cannot reference
fields of BULK In-BIND table of records
ORA-06550: line 18, column 16:
PLS-00382: expression is of wrong type
ORA-06550: line 18, column 30:
PLS-00436: implementation restriction: cannot reference
fields of BULK In-BIND table of records
ORA-06550: line 18, column 30:
PLS-00382: expression is of wrong type
ORA-06550: line 18, column 16:
PL/SQL: ORA-22806: not an object or REF
ORA-06550: line 17, column 7:
PL/SQL: SQL Statement ignored
A quick google search and I end up here .

So I can bulk bind, but I have to INSERT into the table as a whole. I can't be selective.

I updated my code to this:

DECLARE
TYPE t_record IS TABLE OF T%ROWTYPE;
l_table T_RECORD;
CURSOR c
IS
SELECT x, y
FROM t;
BEGIN
OPEN c;

LOOP
FETCH c
BULK COLLECT INTO l_table
LIMIT 100;

FORALL i IN 1..l_table.COUNT
INSERT INTO s
VALUES l_table(i);

EXIT WHEN C%NOTFOUND;
END LOOP;
CLOSE c;
END;
/
I run it and it completes successfully. This is all on XE, so I wonder (hope) that 11g will allow me to do what I want (I'll be working on an 11g RAC system). I scroll down the list of google results and I find this one which then takes me to AskTom. The first post demonstrates that my first attempt will work on 11g.

BULK COLLECT and FORALL are great tools if you can't do it in a single SQL statement and if you want to avoid the row by row processing.

Labels: , , , ,

 
Thursday, April 17, 2008
  DBMS_CRYPTO: Example
In the comments from my previous post, Tyler and Tom both mentioned the fact that you should never store sensitive data as plain-text in the database. I used a bad example unfortunately, but Tom did mention the DBMS_CRYPTO package which I have used before.

Since I have used it and published a working example to the OTN forums, I figured I'd put it here as well.

The goal is the encrypt a SSN and store the encrypted string (RAW) in the database. In conjunction with VPD, you should be able to easily control who can see the plain-text version of the SSN. To do that you will need to create another function which decrypts the encrypted SSN.

This is a pretty basic example, but it should help to get you started. You can read more about the DBMS_CRYPTO package in the docs.

CREATE OR REPLACE
PACKAGE BODY p_encrypt
AS
--DO NOT FORGET TO WRAP THIS BEFORE LOADING INTO DATABASE
--IF IT IS NOT WRAPPED, THE KEY WILL BE EXPOSED
--THE WRAP UTILITY IS LOCATED IN THE \BIN DIRECTORY (WRAP.EXE)
G_KEY RAW(32) := UTL_I18N.STRING_TO_RAW( 'some_random_string_stuff_goes_here', 'AL32UTF8' );

FUNCTION encrypt_ssn( p_ssn IN VARCHAR2 ) RETURN RAW
IS
l_ssn RAW(32) := UTL_I18N.STRING_TO_RAW( p_ssn, 'AL32UTF8' );
l_encrypted RAW(32);
BEGIN
NULL;
l_encrypted := dbms_crypto.encrypt
( src => l_ssn,
typ => DBMS_CRYPTO.DES_CBC_PKCS5,
key => G_KEY );

RETURN l_encrypted;
END encrypt_ssn;

FUNCTION decrypt_ssn( p_ssn IN RAW ) RETURN VARCHAR2
IS
l_decrypted RAW(32);
BEGIN
l_decrypted := dbms_crypto.decrypt
( src => p_ssn,
typ => DBMS_CRYPTO.DES_CBC_PKCS5,
key => G_KEY );

RETURN UTL_I18N.RAW_TO_CHAR( l_decrypted, 'AL32UTF8' );
END decrypt_ssn;

END p_encrypt;
/
show errors
I mention it in the comments of the code, but do not forget to wrap your PL/SQL before you load it, otherwise someone will be able to easily see your salt/key.

Labels: , , , ,

 
Wednesday, February 27, 2008
  EDI Fun
or Electronic Data Interchange, just a fancy phrase for sending and receiving files. We (IT) do love to complicate things don't we?

I've put the change data capture stuff on hold as my never-ending project goes into it's 12th week past deadline. It's at nine 9s: 99.9999999. It's finance related stuff and nothing less than 100% is acceptable. I'm tired.

Part of my project was to move from an already built in house table to the raw (files) tables. My feisty colleague took on that fun challenge for 6 months or so. He's heading up a new project though so he's had to pass the baton on to me. I've accepted it...reluctantly. ;)

Anyway, we store these inbound files all over the place (seemingly to me). I started writing a little Java application that would scan a directory, read these x12-820 files and tell me the interchange date, control number, total amount and some other useful information.

I plan on either putting this in the database and wrapping it up in PL/SQL or creating a service (Java Service Wrapper) and pushing this useful data to a table. So if you have to deal with the wonderful x12/820 formats, you may want to check back soon for the code. I can't promise it will be good, but it will work!

Labels: , , , , ,

 
Wednesday, December 5, 2007
  Instrumentation: DEBUG/LOGGING
In my previous entry on instrumenting code I detailed the way in which you could use the DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS procedure to help monitor your long running code.

In this one, I will detail my home grown version of Tom Kyte's debug routine. I do know that others have similar code but can't seem to find them right now.

You can find the source code here.

Contents:
2 tables
2 sequences
1 package
1 procedure
1 build file

debug_tab allows you to turn the debugging on and off. debug_details_tab will store each line that you write to the debug routine when turned on.

Here's an example of it in practice:

CREATE OR REPLACE
FUNCTION get_age_in_months( p_id IN NUMBER ) RETURN NUMBER
IS
l_age_in_months INTEGER;
BEGIN
--instrumentation calls
debug( 'GET_AGE_IN_MONTHS' );
debug( 'P_ID: ' || p_id );
debug( 'select value into variable' );

SELECT age_in_months
INTO l_age_in_months
FROM people_tab
WHERE id = p_id;

debug( 'L_AGE_IN_MONTHS: ' || l_age_in_months );

RETURN l_age_in_months;

EXCEPTION
WHEN no_data_found THEN
debug( 'no data found' );
RETURN NULL;
END get_age_in_months;
/
show errors


I mentioned in the previous article that I had had difficulty grasping this concept initially. I think once I related it to DBMS_OUTPUT.PUT_LINE it became much more clear to me.

This simple debug routine has helped me tremendously in the last year or two that I have used it. Especially when you get nested levels of logic. It gets very hard to keep track of where you are, but with this you can run your procedure or function and then issue a SELECT against the debug_details_tab and see what was happening when.

I even began using this in my APEX applications. I would preface each line with "APEX: " and then write whatever was necessary so that I could step through the various pieces of code. It became crucial when I was doing validation on a table of records in a collection...oh so much easier.

On large systems this will generate a lot of data. I definitely would not put this inside a loop doing some sort of batch processing, but it is helpful to find where things fail out.

It can certainly be improved on. I basically took the idea of DBMS_OUTPUT.PUT_LINE and wrote it to a table, nothing fancy there. Mr. Kyte mentions writing it to the file system as well. Since I don't have ready access to the database file system, this was the easiest.

Make your life easier and use a debug/logging routine in your code. No longer will you have to comment, the debug statements should do it for you!

Labels: , , , , , , ,

 
Sunday, December 2, 2007
  Instrumentation: DBMS_APPLICATION_INFO
Instrumentation has something that I have come to rely on fairly heavily. I believe I first read about it on asktom, but the one that really spurred me on was this post on instrumentation on his personal blog.

Initially, I couldn't really wrap my head around instrumentation. I don't know why it was so difficult; I had a similar problem with sessions when I first started my career. I look back now and it just seems so obvious.

Now that I am doing datawarehouse work, nothing is fast. Fast to me is now one hour to load 30 or 40 million records. No more split second queries for me.

We currently use no tools. It's straight PL/SQL. Instrumentation of the code is ideal. Actually, it's more instrumentation to aid monitoring. The tool most easily used is provided by Oracle in the DBMS_APPLICATION_INFO package.

There are three subprograms that I use most, SET_MODULE, SET_ACTION and most importantly SET_SESSION_LONGOPS. I hadn't started using it until this year, I mainly stuck to the first two. SET_SESSION_LONGOPS is now part of my procedure/function template I've created in JDeveloper.

What it allows you to do is set a row in the v$session_longops view (I know it's not actually putting the row in the view...it's the underlying table, but I digress). You can then monitor how your job is doing.

Here's an example:

dbms_application_info.set_session_longops
( rindex => g_index,
slno => g_slno,
op_name => 'GETTING MEMBER DATA',
sofar => 0,
totalwork => l_table.COUNT + 1,
target_desc => 'GETTING MEMBER DATA' );

g_index and g_slno are global variables in the package. l_table is a PL/SQL TABLE OF VARCHAR2.

Now you can monitor the progress of your job in v$session_longops!

Here's the query I use:

SELECT
username,
sid,
serial#,
TO_CHAR( start_time, 'MM/DD/YYYY HH24:MI:SS' ) start_ti,
time_remaining rem,
elapsed_seconds ela,
ROUND( ( sofar / REPLACE( totalwork, 0, 1 ) ) * 100, 2 ) per,
sofar,
totalwork work,
message,
target_desc
FROM v$session_longops
WHERE start_time >= SYSDATE - 1
ORDER BY start_time DESC


Now you too can sit for hours and watch your job move incrementally forward!

But seriously, it does help tremendously to know where a job is at. You can further use the SET_MODULE and SET_ACTION calls to see a specific point in the processing (inside a loop).

Here's the code in context:


PROCEDURE get_member_data
IS
l_exists INTEGER;
TYPE table_of_lobs IS TABLE OF VARCHAR2(3);
l_table TABLE_OF_LOBS := TABLE_OF_LOBS( 'COM', 'ORG' );
l_count INTEGER := 0;
BEGIN
--check to see if there is enrollment data, if not, move on
SELECT COUNT(*)
INTO l_exists
FROM members
WHERE rownum < 2;

IF l_exists = 1 THEN--data exists, truncate and reload

g_index := dbms_application_info.set_session_longops_nohint;

EXECUTE IMMEDIATE 'TRUNCATE TABLE member_stg';

g_audit_key := p_audit.begin_load
( p_targettable => 'MEMBER_STG',
p_loadsource => 'MEMBER_SOURCE',
p_loadstatus => 'PRE',
p_loadprogram => 'GET_MEMBER_DATA',
p_commenttext => 'INSERT' );

dbms_application_info.set_session_longops
( rindex => g_index,
slno => g_slno,
op_name => 'GETTING MEMBERS',
sofar => 0,
totalwork => l_table.COUNT + 1,
target_desc => 'GETTING MEMBERS' );

FOR i IN 1..l_table.COUNT LOOP
l_count := l_count + 1;

INSERT INTO member_stg
SELECT *
FROM members;

g_total_rows_affected := g_total_rows_affected + sql%rowcount;

COMMIT;

dbms_application_info.set_session_longops
( rindex => g_index,
slno => g_slno,
op_name => 'GETTING MEMBERS',
sofar => l_count,
totalwork => l_table.COUNT + 1,
target_desc => 'GETTING MEMBERS' );

END LOOP;

p_audit.end_load
( p_auditkey => g_audit_key,
p_loadstatus => 'SUC',
p_rowsuccess => g_total_rows_affected );

gather_table_stats
( p_tablename => 'MEMBER_STG',
p_schemaname => 'MYHOME' );

dbms_application_info.set_session_longops
( rindex => g_index,
slno => g_slno,
op_name => 'GETTING MEMBERS',
sofar => l_count + 1,
totalwork => l_table.COUNT + 1,
target_desc => 'GETTING MEMBERS' );

END IF;

EXCEPTION
WHEN others THEN
p_audit.failed_load
( p_auditkey => g_audit_key,
p_comments => SQLCODE || ' ' || SQLERRM );
RAISE;

END get_member_data;

Labels: , , , , ,

 
Saturday, November 24, 2007
  Parallel Processing using DBMS_JOB
I found this article through the OraNA feed by ProdLife which talked about running a report that was based on multiple queries. It reminded me of something I did awhile back.

We have this multi-step process which loads data into 2 tables that the business would use to reconcile our money in the door and our membership. Membership is on a month granularity (member month) and our money is transactional (they may have multiple transactions within a given month).






One table stores the transactions joined with our members. Not the correct grain that the business needs but useful for research. The other table summarizes the transactions to the month level and then is joined with our membership so that both are at the same granularity. Currently we're pulling across about 27 million records for members and the same for their transactions.

On the right is a basic diagram of the process.

The process initially took 8 hours to complete. Part of it was the fact that it runs
sequentially. However, not all parts of this process are dependent on one another. It isn't until the final 2 steps (Target Table 1 and Target Table 2, in yellow) that they need to run sequentially.


I wanted to speed this up and began thinking about the ways to do this (assuming as much tuning as possible had already completed).

1. I could use our scheduler or unix shell scripts.

2. Use a table based approach as ProdLife did.

3. Utilize PL/SQL and DBMS_JOB.

I chose number 3 initially and that's the focus of this post. I'll detail why I didn't use this method at the end.

The first thing I had to figure out was how to get PL/SQL to wait. Having read a few
posts on AskTom I remembered the SLEEP procedure. After a quick scan of the site, I found that it was part of the DBMS_LOCK package. I asked the DBAs to give me access so that I could being testing.

I figured that if I could wait long enough, it would be easy to "poll" the USER_JOBS
view to see when it had finished. I'm just going to show code snippets as the whole thing can get quite long.

I first determined that the error returned from Oracle for a job not there is -23241.
That will let me know when it is complete. Next, I declared variables for each job to run.

DECLARE
no_job EXCEPTION;
PRAGMA EXCEPTION_INIT( no_job, -23421 );
l_exists NUMBER;
l_dollars_job NUMBER;
l_members_job NUMBER;

First thing I do in the body is create the jobs using DBMS_JOB.SUBMIT.

BEGIN
dbms_job.submit
( job => l_dollars_job,
what => 'BEGIN p_mypackage.get_dollars; COMMIT; END;',
next_date => SYSDATE );

dbms_job.submit
( job => l_members_job,
what => 'BEGIN p_mypackage.get_members; COMMIT; END;',
next_date => SYSDATE );

COMMIT;

Make sure you issue the COMMIT statement after the jobs have been submitted.

Here's the fun part. I created a loop that would call DBMS_LOCK.SLEEP and wait for 60 seconds. After the wait has ended, I check to see whether that job remains in the USER_JOBS table. This allows the jobs to complete in 100 minutes.

FOR i IN 1..100 LOOP
dbms_lock.sleep( 60 );

IF l_dollars_job IS NOT NULL THEN
BEGIN
SELECT 1
INTO l_exists
FROM user_jobs
WHERE job = l_dollars_job;

l_exists := NULL;
EXCEPTION
WHEN no_data_found THEN
l_dollars_job := NULL;--job is finished
END;
END IF;

IF l_members_job IS NOT NULL THEN
BEGIN
SELECT 1
INTO l_exists
FROM user_jobs
WHERE job = l_members_job;

l_exists := NULL;
EXCEPTION
WHEN no_data_found THEN
l_members_job := NULL;--job is finished
END;
END IF;

The next step is to determine when to exit the loop. Hopefully, the jobs will finish in time and move on to the next, but if not, you want to exit gracefully. Well, semi-gracefully anyway.

IF l_dollars_job IS NULL
AND l_members_job IS NULL
THEN
EXIT;
ELSIF i = 100 THEN
BEGIN
dbms_job.remove( l_dollars_job );
EXCEPTION
WHEN no_job THEN
NULL;
END;

BEGIN
dbms_job.remove( l_members_job );
EXCEPTION
WHEN no_job THEN
NULL;
END;
--abort run, taking too long
raise_application_error( -20001, 'DOLLARS/MEMBERS data from not loaded timely...' );
END IF;
END LOOP;
END;

That's all there is to it.

In the end though, I was convinced not to use this method as restartability would be difficult. Perhaps this method combined with the table-based approach would be the ideal. I'll leave that for another day though.

Labels: , , ,

 
Google



How To
Parallel Processing: DBMS_JOB
SAS: Create Dataset From Oracle Table
Instrumentation: DBMS_APPLICATION_INFO
DBMS_CRYPTO

Popular
AppDev vs DataDev
Code Style Index
Better than Tom Kyte?
Good Day to Worse Day

Archives
August 2007 / September 2007 / October 2007 / November 2007 / December 2007 / January 2008 / February 2008 / March 2008 / April 2008 / May 2008 / June 2008 / July 2008 / August 2008 /


 

Powered by Blogger

Aggregated by OraNA