Wednesday, April 22, 2015

Is MERGE a bug?

A few years back I pondered whether DISTINCT was a bug.

My premise was that if you are depending on DISTINCT to return a correct result set, something is seriously wrong with your table design. I was reminded of this again recently when I ran across Kent Graziano's post on Better Data Modeling: Are you making these 3 beginner mistakes in your data models?. Specifically:
Instead of that, you should be defining a natural, or business, key for every table in your system. A natural key is a an attribute or set of attributes (that occur naturally in the data set) required to uniquely identify a row in that table. In addition you should define a Unique Key Constraint on those attributes in the database. Then you can be sure you will not get any duplicate data into the tables.

CLARIFICATION: This point has caused a lot of questions and comments. To be clear, the mistake here is to have ONLY defined a surrogate key. i believe that even if using surrogate keys is the best solution for your design, you should ALSO define an alternate unique natural key.
So why MERGE?

I learned about the MERGE statement in 2008. During an interview, Frank Davis asked me about when I would use it. I didn't even know what it was (and admitted that) but I went home that night and...wait...I think he asked me about multi table inserts. Whatever, credit is still going to Mr. Davis. Where was I? OK, so I had been working with Oracle for about 6 years at that point and I didn't know about it. My initial reaction was to use it everywhere (not really)! You know, shiny object and all. Look! Squirrel!

Why am I considering MERGE a bug? Let me be more specific. I was working with a couple of tables and had not written the API for them yet and a developer was writing some PL/SQL to update the records from APEX. In his loop he had a MERGE. I realized at that moment there was 1, no surrogate key and 2, no natural key defined (which ties in with Kent's comments up above). Upon realizing the developer was doing this, I knew immediately what the problem was (besides not using a PL/SQL API to nicely encapsulate the business logic). The table was poorly designed.

Easy fix. Update the table with a surrogate key and define a natural key. I was thankful for the reminder, I hadn't added the unique constraint yet. Of course had I written the API already I probably would have noticed the design error, either way, a win for design.

Now, there are perfectly good occasions to use the MERGE statement. Most of those, to me anyway, relate to legacy systems where you don't have the ability to change the underlying table structures (or it's just cost prohibitive) or ETL, where you want to load/update a dimension table in your data warehouse.

Noons, how's that? First time out in 10 months. Thanks for the push.