Tuesday, May 18, 2010

Exadata and The Apocalypse

Fresh off the heels of my meandering thoughts on Exadata, I now bring you my thoughts on the future state of database development.

In a word, it's gonna suck.

Just as more memory, faster CPUs and the other hardware improvements over the years...actually, that's what Exadata is. Let me try that again.

You think most database development sucks now, just wait until Exadata becomes commonplace. Why worry about design or good coding practices when you have brute force?

In a private conversation the other day, I compared it to the effect Windows has had on the populace (myself included), making us computer dumb. It's not always a bad thing; the computer revolution wouldn't be where it is today without Windows, because it gave idiots like me a low barrier to entry.

I see more frustration in my future, not less, because of Exadata. Toad will reign supreme as hordes of developers write horrible queries...and get away with it.

I see less emphasis on the fundamentals if using THE Database Machine.

I said it above...brute force.

Why bother tuning up front when you have something that powerful?

Why bother tweaking the design when you have something that powerful?

Just drop it in and run. Exadata will cope just fine.

Thankfully, this won't happen soon, because the cost of Exadata is (perceived to be) so high. I will argue that in another rant I'm sure.

As the volume of data grows, there will be a point at which you must start to tune and contemplate a good design.

Just like Windows though, the bar will be lower. That scares me. I've had plenty of problems over the past couple of years, professionally speaking. Plenty of arguments. When oraclue was my DBA, you could get away with murder, because he could figure out a way to tune bad statement n.

Looking at the bright side, my future will be full of excitement. I'll just have new things to bitch about.


Gary Myers said...

Solid-state drives (or cards) will have an impact at the lower price point pretty quick.

Interesting whether that will spark an interest in more compact, departmental databases that neatly fit onto those 64GB - 256GB drives.

Noons said...

"As the volume of data grows, there will be a point at which you must start to tune and contemplate a good design"

Spot-on, Chet. I've heard this same argument many, many times over the last 30 years.

"We won't need proper coding because Cobol hides all that complexity" was the mantra in the 60s, when said language replaced Assembler.
Then it was C, which was supposed to liberate us from the "complexity of Cobol".
Then we had the OO crowd claiming that processing power was infinite therefore it was OK to be complex. That one worked really well...
Then we had Java/j2ee claiming that everything could be broken into individual pools of complexity reacting randomly to external messages... Yikes!

We went through multiple iterations of the same on the data front with Codasyl replacing the "complexity" of flat files, relational replacing the "complexity" of Codasyl, and now flat files replacing the "complexity" of structured data.
Sprinkle in liberal amounts of SCSI versions, fibre channels, SSDs, flash memory, etc., etc.

I wish I had a dime for every marketing announcement that "re-defined" the landscape of IT: I'd be filthy rich now, instead of on the soup queue because of the euro collapse...

The bottom line is "Nerd's law" (tongue in cheek): for every quantum improvement in hardware or software platform performance, there will be an exponential increase in the amount of data and/or processing needed.

Call it a joke if you will. But I reckon we're far from not needing proper design and development. It just changes its nature.
Yet again.

Ah, the smell of IT novelty in the morning!...

Tom said...

That's funny Noons about the latest technology being a panacea or silver bullet to solve world hunger and give us world peace.

I was on the IRC chat for #Cassandra the other day. (As you know, Cassandra is being hailed all over as the thing to break the RDBMS. It is in essence the ULTIMATE NoSQL solution.) The beauty behind Cassandra is that you don't need to know SQL and all that "complex" stuff. So I watched as developers asked, "How do I do a sort on N column? How can I grab the top 10 values?" and so on. The response was "You can't do that, sorry you can't do that either, nope it doesn't do that, nope can't do that". I then saw them say MySQL won't scale, it just does not work, even with sharding. Then I saw developers say, "When I do a write, then a read, then a write, how come it doesn't show up?" The response was "Oh yeah, that is bug #xyz and that will be fixed in version nyp and we will get to it whenever the hell we feel like it".

So to keep the cheap databases that "sort" of work relevant, you will see things like SSDs come along, but you ultimately get to a breaking point. We can say it hides bad code and it does. Do you want to spend a lot of effort to tune a mission critical ETL process that currently runs 24 hours and needs to run in 6? Yes, you most likely will invest time, effort, energy, and money into doing it. Now if that query runs in 4.2 minutes, you will say "I'm good, no need to tune." There will still be tuning that should be done; it's just a matter of whether it is worth the effort or not. Now that 4.2 minute job can make everyone feel really good, but then what happens if the volume of data increases by a factor of 100 or 1000, then you HAVE to tune. It's inevitable that we will get there.
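
To put rough numbers on that last point, here is a minimal sketch of how the 4.2-minute job might grow with data volume. The function name, the two growth models (linear and quadratic), and the reuse of the 6-hour window from the ETL example as a deadline are all assumptions for illustration, not measurements from any real system.

```python
def projected_minutes(base_minutes, growth_factor, exponent):
    """Project runtime when the data volume grows by growth_factor.

    exponent=1 models work that grows linearly with data (e.g. one scan);
    exponent=2 models work that grows quadratically (e.g. a nested-loop
    join of a table against itself). Both are simplifying assumptions.
    """
    return base_minutes * growth_factor ** exponent


BASE = 4.2       # minutes, the figure from the comment above
WINDOW = 6 * 60  # hypothetical deadline: the 6-hour window, in minutes

for factor in (100, 1000):
    for exponent in (1, 2):
        minutes = projected_minutes(BASE, factor, exponent)
        verdict = "fits" if minutes <= WINDOW else "must tune"
        print(f"x{factor} data, O(n^{exponent}): "
              f"{minutes / 60:,.1f} hours -> {verdict}")
```

Under these assumptions even the friendly linear case at 100 times the data lands at about 7 hours, which is already past a 6-hour window.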

Timur Akhmadeev said...

>but then what happens if the volume of data increases by a factor of 100 or 1000, then you HAVE to tune. It's inevitable that we will get there.

Well, Exadata is aimed at the linear scalability of IO, so increasing data volume by a factor of 1000 might pass almost unnoticed for some apps running on Exadata; those would be good apps, in my opinion. But of course that fact doesn't make Exadata tuning-free, because brute force doesn't work well for bad algorithms: if you have a bad approach, no matter what kind of smart HW/SW you have, it won't help in making it fast enough.
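
A small sketch of why linear scaling does not rescue a bad algorithm: the two functions below find the same matching keys, one by comparing every row with every row and one with a hash table. The function names and the toy data are hypothetical and stand in for a badly written join versus a sensible one; nothing here is specific to Exadata or to how Oracle executes joins.

```python
def nested_loop_matches(left, right):
    """Count matching pairs by comparing every row with every row: O(n*m)."""
    comparisons = 0
    matches = 0
    for a in left:
        for b in right:
            comparisons += 1
            if a == b:
                matches += 1
    return matches, comparisons


def hash_matches(left, right):
    """Count matching pairs via a hash table: O(n + m) steps."""
    counts = {}
    steps = 0
    for b in right:  # build phase: one step per right-hand row
        counts[b] = counts.get(b, 0) + 1
        steps += 1
    matches = 0
    for a in left:   # probe phase: one step per left-hand row
        matches += counts.get(a, 0)
        steps += 1
    return matches, steps


for n in (100, 1000):
    keys = list(range(n))
    slow_matches, slow_work = nested_loop_matches(keys, keys)
    fast_matches, fast_work = hash_matches(keys, keys)
    assert slow_matches == fast_matches == n
    print(f"n={n}: nested loop {slow_work:,} comparisons, "
          f"hash {fast_work:,} steps")
```

Growing the data tenfold multiplies the hash version's work by ten but the nested-loop version's by a hundred, so hardware that scales linearly only keeps pace with the first.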

Noons said...

I'm always reminded of the Apple development mantra: "do it the way we say or don't do it at all"!

Along the same lines, we have the definition of database without performance problems: the one without users.

Any IT technology or architecture is largely scalable if all it achieves is one or two functions.

That is the problem with the "Cassandras" of this world: some inexperienced, wet behind the ears "architect" does a "framework" for a project and assumes that said framework will be applicable to everything and be the answer to everything.
Nothing could be further from reality.

@Timur: spot-on! We all recall the past history of "hardware doubles in speed every 12 months: just use faster/bigger/more expensive hardware".
All that mantra ensures is that we'll have the same inefficient designs and code running a LOT faster.
Or a faster CPU running the same infinite loop, but a LOT faster.

And the show goes on...