Thursday, August 16, 2007

Swapping Partitions

Recently I learned a pretty cool trick, swapping partitions.

I'm sure I read it originally on AskTom, but I never had a need for it. Now that I am in data warehousing though, moving millions and millions of records around takes time...a lot of time.

On this recent project, I had a process which brought in the data from a different database. There were some minor transformations on that data including the use of an analytic function to de-dup the data (I would assign a 1 to the first record matching my criteria and that's what I would use for everything downstream).

So I have essentially six tables. One which is the target for the source data, four which handle the transformed data and another one which holds all of that transformed data. Initially I would perform a INSERT INTO...SELECT FROM...but it would take upwards of three hours for this one particular set of data (~26 million records), and I was doing that twice (six hours).

Sitting with a colleague trying to figure out a way to speed up the process, we came across an article on AskTom which mentioned swapping (or exchanging) partitions with a regular table. The final table (of the six) is partitioned by line of business, a perfect candidate.

So I issue the statement found there:

ALTER TABLE target_table
EXCHANGE PARTITION gmd_lob
WITH TABLE source_table;

That didn't work because the table structures did not match. So I went about changing the table structure to match that of the target table.

I issued the statement again:

ALTER TABLE target_table
EXCHANGE PARTITION gmd_lob
WITH TABLE source_table;

Success! I then went on to change the other three tables structure to match that of the target table. What had taken three to four hours now took less than five minutes!

No comments: