Wednesday, April 22, 2015

Is MERGE a bug?

A few years back I pondered whether DISTINCT was a bug.

My premise was that if you are depending on DISTINCT to return a correct result set, something is seriously wrong with your table design. I was reminded of this again recently when I ran across Kent Graziano's post on Better Data Modeling: Are you making these 3 beginner mistakes in your data models?. Specifically:
Instead of that, you should be defining a natural, or business, key for every table in your system. A natural key is a an attribute or set of attributes (that occur naturally in the data set) required to uniquely identify a row in that table. In addition you should define a Unique Key Constraint on those attributes in the database. Then you can be sure you will not get any duplicate data into the tables.

CLARIFICATION: This point has caused a lot of questions and comments. To be clear, the mistake here is to have ONLY defined a surrogate key. i believe that even if using surrogate keys is the best solution for your design, you should ALSO define an alternate unique natural key.
So why MERGE?

I learned about the MERGE statement in 2008. During an interview, Frank Davis asked me about when I would use it. I didn't even know what it was (and admitted that) but I went home that night and...wait...I think he asked me about multi table inserts. Whatever, credit is still going to Mr. Davis. Where was I? OK, so I had been working with Oracle for about 6 years at that point and I didn't know about it. My initial reaction was to use it everywhere (not really)! You know, shiny object and all. Look! Squirrel!

Why am I considering MERGE a bug? Let me be more specific. I was working with a couple of tables and had not written the API for them yet and a developer was writing some PL/SQL to update the records from APEX. In his loop he had a MERGE. I realized at that moment there was 1, no surrogate key and 2, no natural key defined (which ties in with Kent's comments up above). Upon realizing the developer was doing this, I knew immediately what the problem was (besides not using a PL/SQL API to nicely encapsulate the business logic). The table was poorly designed.

Easy fix. Update the table with a surrogate key and define a natural key. I was thankful for the reminder, I hadn't added the unique constraint yet. Of course had I written the API already I probably would have noticed the design error, either way, a win for design.

Now, there are perfectly good occasions to use the MERGE statement. Most of those, to me anyway, relate to legacy systems where you don't have the ability to change the underlying table structures (or it's just cost prohibitive) or ETL, where you want to load/update a dimension table in your data warehouse.

Noons, how's that? First time out in 10 months. Thanks for the push.

10 comments:

Noons said...

Excellent! And thanks for that!
Yes, I agree entirely: using MERGE on tables that really don't follow even the first normal form is asking for trouble. And DISTINCT is another manifestation of that trouble!
I'm having a go at making our developers replace complex "difference/delta" tables in our DW ETL with MERGE, to avoid the usual storm of multiple intermediate or temp tables or long PL/SQL cursor loops in what is essentially an single upsert.
That's what MERGE was created for.
I agree entirely with your prior claim that it is really designed for DW ETL, not OLTP.
Great stuff and thanks heaps for talking and writing about it.

jeffkemponoracle.com said...

Not sure I follow the logic - if the tables all have natural keys, what exactly is wrong with using a MERGE statement to insert/update them? Like any feature, it may be misused; but this doesn't imply it's a bug in itself.

Dominic Brooks said...

I like MERGE.
But here's why I don't like it for some OLTP circumstances:

https://orastory.wordpress.com/2011/10/13/concurrent-merge/

chet justice said...

@jeff,

I never really stated it, but you're right, MERGE in and of itself is not a bug. My point (on both posts actually) is that the use of the features, to me, indicates either a design flaw in your (OLTP) system, a misunderstanding of the design by the developer or simple a lack of how to do things by the developer (a teaching moment?).

jeffkemponoracle.com said...

So you would avoid MERGE in an OLTP system with a well-designed data model? Assuming you've dealt appropriately with the concurrency problem referred to by Domenic, why?

chet justice said...

Never said that Jeff.

"Now, there are perfectly good occasions to use the MERGE statement..."

jeffkemponoracle.com said...

Sorry, perhaps I misinterpreted your qualification of that about it only generally being good for dealing with legacy data models, etc etc.

Not having a go, I just want to make sure I understand where you're coming from.

Michael O'Neill said...

I *just* used a MERGE in an OLTP context and it was far from a bug. It was used in avoidance of an INSERT, trapping DUP_VAL_ON_INDEX, UPDATING instead mess.

Now where the bug may have been in my situation is that my MERGE may have introduced a last-in-wins scenario - but that is he application design (only the latest value matters), so it works well and performs better than trapping the exception.

chet justice said...

@Michael,

Sounds to me like you've caught one of those actual uses. In my experience, they've been rare, especially so in OLTP.

Noons said...

Just got a 3113 on a MERGE using a remote table (dblink) as source.
Nothing in doco says it cannot be.
11.2.0.3 patched up here.
Opened an SR, lessee what MOS comes up with. There used to be a bug with this in 10g, but I'm on 11.2...