Wednesday, February 27, 2008

EDI Fun

or Electronic Data Interchange, just a fancy phrase for sending and receiving files. We (IT) do love to complicate things, don't we?

I've put the change data capture stuff on hold as my never-ending project goes into its 12th week past deadline. It's at nine 9s: 99.9999999%. It's finance-related stuff and nothing less than 100% is acceptable. I'm tired.

Part of my project was to move from an already-built in-house table to the raw (files) tables. My feisty colleague took on that fun challenge for 6 months or so. He's heading up a new project though, so he's had to pass the baton to me. I've accepted it...reluctantly. ;)

Anyway, we store these inbound files all over the place (or so it seems to me). I started writing a little Java application that scans a directory, reads these x12-820 files, and tells me the interchange date, control number, total amount and some other useful information.

I plan on either putting this in the database and wrapping it up in PL/SQL or creating a service (Java Service Wrapper) and pushing this useful data to a table. So if you have to deal with the wonderful x12/820 formats, you may want to check back soon for the code. I can't promise it will be good, but it will work!
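Until then, here is a minimal sketch of what such a scanner could look like. It is not the code promised above, just an illustration under a few assumptions: one interchange per file, the interchange date and control number taken from ISA09 and ISA13, and the "total amount" taken to be the sum of BPR02 values. The class and method names (`Edi820Scanner`, `parse`, `Summary`) are made up for the example.

```java
import java.io.IOException;
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper, not the author's code: summarizes X12 820 files in a directory.
class Edi820Scanner {

    // The handful of values the post mentions, taken from one interchange.
    static final class Summary {
        final String interchangeDate;  // ISA09, format YYMMDD
        final String controlNumber;    // ISA13
        final BigDecimal totalAmount;  // sum of BPR02 (assumed meaning of "total amount")

        Summary(String interchangeDate, String controlNumber, BigDecimal totalAmount) {
            this.interchangeDate = interchangeDate;
            this.controlNumber = controlNumber;
            this.totalAmount = totalAmount;
        }
    }

    static Summary parse(String content) {
        // The ISA header is fixed width (106 characters), so the delimiters sit at known offsets.
        if (content.length() < 106 || !content.startsWith("ISA")) {
            throw new IllegalArgumentException("missing or truncated ISA header");
        }
        String elementSeparator = Pattern.quote(String.valueOf(content.charAt(3)));
        String segmentTerminator = Pattern.quote(String.valueOf(content.charAt(105)));

        String date = null;
        String control = null;
        BigDecimal total = BigDecimal.ZERO;
        for (String rawSegment : content.split(segmentTerminator)) {
            String segment = rawSegment.trim();  // drop line breaks between segments
            String[] elements = segment.split(elementSeparator, -1);
            if (elements[0].equals("ISA") && elements.length > 13) {
                date = elements[9];
                control = elements[13];
            } else if (elements[0].equals("BPR") && elements.length > 2
                    && !elements[2].isEmpty()) {
                total = total.add(new BigDecimal(elements[2]));
            }
        }
        return new Summary(date, control, total);
    }

    // Scans the directory given as the first argument (default: current directory).
    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                String content =
                        new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
                try {
                    Summary s = parse(content);
                    System.out.printf("%s date=%s control=%s total=%s%n",
                            file.getFileName(), s.interchangeDate, s.controlNumber, s.totalAmount);
                } catch (IllegalArgumentException e) {
                    // Covers non-EDI files and unparseable amounts (NumberFormatException).
                    System.out.println(file.getFileName() + " skipped: " + e.getMessage());
                }
            }
        }
    }
}
```

Reading the delimiters from the ISA header rather than hard-coding `*` and `~` matters because trading partners are free to pick their own. Pushing the results to a table would replace the `printf` call.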

Wednesday, February 20, 2008

Application Developers vs Database Developers

It started innocently enough with this article. I sent it out to about 20 colleagues.

The best line from the article:
Jerry: "Yeah, databases cause lots of headaches. They crash all the time, corrupt data, etc. Using text files is better."

One of my more recently arrived colleagues (I'll call him Mr. M) replied to everyone with this statement:

"Kind of funny actually, databases are less and less important at the large investment banks, where they basically load everything up into a data grid across a several hundred node cluster. Writing to the db is way too slow."

This started a day long exchange of emails. What follows is the entire thread (up until my last post tonight).

Me:
"I would just argue that they don’t necessarily know how to write to databases. I would however love to see benchmarking done on both methods. Would be an interesting test..."

Mr. M:
"Well, my understanding is they just can’t scale out the db enough. Even something like Oracle RAC won’t work. And outside of the military, these are probably the top 1% of programmers in the world building this stuff."

Me:
"A benchmark would be the only way I would believe it.

If you said the top 1% of database developers tried it and failed, I would be more likely to agree.

My experience is that application developers != database developers. Different type of thinking involved."

Mr. M:
"'A benchmark would be the only way I would believe it.'

Do you need a benchmark before you would believe in-memory retrieval is faster than disk retrieval? Essentially, this is what we’re talking about.

'If you said the top 1% of database developers tried it and failed, I would be more likely to agree. My experience is that application developers != database developers. Different type of thinking involved.'

Why? It’s an issue to do with application performance not simply database performance. Database concerns are a subset of application concerns, essentially a specialization, requiring less encompassing knowledge. ;)

From the article you linked to (http://www.watersonline.com/public/showPage.html?page=432587)

"Better data management is the answer, says Lewis Foti, manager of high-performance computing and grid at The Royal Bank of Scotland (RBS) global banking and markets. "For very large compute arrays, the key issue is data starvation and saturation. This problem requires data grids with high bandwidth and scalable, parallel access,
...
Banks are learning that data management in a distributed grid environment is very different from online transaction processing. "With so many data sources, distribution channels, demands for aggregation and analytics, surges in data volumes and complex dynamics between the flows, we need to manage 'data in motion' and give up the notion that data is somehow stored. It's dynamic, not static," says Michael Di Stefano, vice president and architect for financial services at GemStone Systems
...
There is even some debate over how small a unit of work can be put on today's grids. Di Stefano at GemStone, for example, says, "One client has gone from 200 trades per second in a program trading application to more than 6,000 trades per second. This shows what the technology can do."

Yep, the writing is on the wall. Oracle knows it too.

http://www.google.com/search?hl=en&q=oracle+buys+tangosol&btnG=Google+Search"

Me:
"Good points. If it is in-memory it would be faster. I have not had the pleasure to work on such a system.

I do disagree with the database concerns being a subset of application concerns. The data drives the app. We’re probably getting religious at this point (or am I)."

Mr. M:
"'The data drives the app.'

Exactly, but who’s to say where the data comes from or in what format? My application data may reside completely in xml files, or maybe I get it from some third party web services a la the en vogue “mashup.” Heck, I may not even need to worry about a database anymore…. http://www.amazon.com/gp/browse.html?node=16427261 The database is only one particular concern of the overall application. And it’s the application that matters. Data is useless if it just sits on a disk somewhere. It’s the ways in which the application lets the users view and manipulate the data that adds value to the business.

Yep, definitely a different type of thinking between application developers and database developers."

Me:
"Definitely religious now.

Applications come and go, data stays the same. Think Green Screens, EJBs, Ruby…what’s next?"

Mr. M:
"'Applications come and go'

Exactly. Businesses are not static, nor are the markets they compete in. Changing applications are a function of changing business processes and changing markets.

'data stays the same.'

Nonsense. Otherwise UPDATE would not be an SQL reserved word. If you mean database technology stays the same, well, I’m more inclined to agree with that.

'Think Green Screens, EJBs, Ruby...what’s next?'

Whatever comes along to let the business more effectively respond to current market realities. Application platforms have evolved much faster than database platforms have. They’ve had to, their sphere of operation is much broader than that of databases, this is only natural, they deal with much broader concerns than do databases. Databases in the internet era function in essentially the same role they did in the era of dumb terminals. Clearly application platforms have evolved orders of magnitude more. Hence the statement, database concerns are a subset of application concerns.

Here’s a simple test….if I take some business application and I’m forced to throw away one or the other, either the database or the appl- wait a second, it doesn’t even make sense to finish it, does it? The business can live without the database. I could do all kinds of things with the data, I could stick it anywhere. The business can’t live without the application though. Another way to look at it is, what do the business users look at, test, approve, and use? The database? Of course not, they look at the application. They could care less whether the data sits on disk in an RDBMS, xml, or flat files."

Me:
"We obviously violently disagree.

Without the database (and I use database and data interchangeably), the business could no longer function. The app is meaningless. How would you contact your customer? You couldn’t find it.

'Exactly. Businesses are not static, nor are the markets they compete in. Changing applications are a function of changing business processes and changing markets.'

Poorly designed applications…that is all."

A Feisty Colleague:
"Using data and database interchangeably is incorrect. A database is a mechanism for data storage. XML data sets and flat files are mechanisms for data storage, too. So is a file cabinet, because, the data doesn’t have to be electronic, it could be … gasp! … on paper, and the application to use that data would be hands for holding the paper and a pencil to update and add data to the page."

Me:
"No it isn’t. I take into account xml files, flat files, web services (but not paper, unless it’s scanned) and all that. It would be consumed by the database and then accessed by the application via SQL.

(that’s for Mr. M and the feisty one)"

At which point someone forwarded the home page for Oracle's TimesTen In-Memory Database.

Me:
"A database on/in the mid-tier...Perfect!"

Mr. M:
"Implicit acknowledgment that disk IO operations that come with traditional database access simply can’t match the performance of in-memory data access (a point which you previously were unconvinced of but now seem perfectly accepting of the idea once you see it’s got Oracle’s imprimatur on it).

Of course, why any application developer would want to program against an SQL interface if they weren’t forced to is beyond me. It is orthogonal to the programming model of most application platform languages.

Surely Oracle recognize this fact too or they wouldn’t be buying Tangosol and other data grid technologies. Of course, most of those products are far more technically advanced than TimesTen or anything Oracle has in that space.

Incidentally, it’s illustrative to note that Coherence and other products like it were for the most part designed and built by application programmers. The development of all these products is pretty much driven by the needs of the large investment banks on Wall Street. These trading applications simply had too many concurrent transactions to use an RDBMS (a problem quite a number of public domains now share, most famously google.com, nope, no RDBMS there, yet miraculously there is still data). The database just simply would not scale to such a degree. So the application developers, by necessity, came up with an alternate solution that did work, a fully transactional cache of data replicated across a cluster with node numbers in the thousands, and no relational model whatsoever to speak of. A perfect example of how database concerns are only one, sometimes small, concern amongst many that application developers must be aware of and ready to solve."

Me:
"Like you said initially, the top 1%.

Many of us will never touch a system like this.

I will certainly concede that it is faster (still would love to see benchmarking though), but that still leaves 99% of the applications out there that do not require that kind of performance."

Me (again):
"And don’t forget, I use data and database interchangeably. Applications are nothing without the data right?

As to the object/relational impedance mismatch...well, more people that don’t know how to work in sets. Looping is what they understand. I understand the application side more than you seem to give me credit for.

I’m not saying applications aren’t important, they are. Data (databases) and applications go hand in hand. If the application went away though, they could still access their data via SELECT statements (yes, via an application client tool), however painful that may be. Applications make retrieving data that much easier for our users.

If anyone wants to unsubscribe from this mailing list, just let us know. This is fun for me (I’m guessing Mr. M too)."

Needless to say it was a fun day. It didn't get [too] personal. More than anything I'm happy to have an equally passionate colleague.

Besides, he claims he was just fracking around with me. ;)

Wednesday, February 6, 2008

It's the Data, Stupid!

Search for the phrase on Google and you'll get plenty of results.

After reflecting for a few days on the reaction to MySQL, I think I've realized what is at the heart of it all. Data.

Application developers are not stewards of data. They believe that to be the job of the DBA.

Someone recently asked one of our architects what features of MySQL convinced them to choose this as our new database engine. It's open source!

Of course, that makes perfect sense.

Can it connect to Oracle?

I don't know.

Our architects are made up primarily of former application developers, be it web or client server apps. Data was never that important...

They are currently driving our tool set to favor the application developers, which makes perfect sense to them. It's all about the interface.

But it's not. In the health-care industry, data is king. For any industry really.

I've been trying to convince everyone that this million-dollar piece of software called Oracle is not just a bucket, it's feature-rich. Streams, Queueing, all kinds of really cool tools. According to our DBAs, none of that stuff is used.

No wonder we're moving to MySQL.

So my quest is to convince the powers that be to stop wasting money on our million-dollar buckets and use them to their full capabilities.

If you have any information to help in this fight, links, slideshows, whatever, please send them on to me (myfirstname.mylastname@gmail.com)!

Help me turn the tide back to Oracle, back to the data!

Monday, February 4, 2008

Is It Arrogance?

I wrote on Friday night about my experiences that day.

I am a very opinionated person. I believe, whole-heartedly, that the database is severely under-utilized, especially at my current employer.

I believe that one of the big draws of MySQL is that it's easy for web/application people to pick up. I also believe, in our situation, that it's a way for application developers to skirt the whole "data" problem. They'll just pawn it off on the Production DBAs to keep the database running.

Amusingly, some of our application developers brought down one of our Oracle instances, more than once. Pretty tough thing to do I always thought.

I've read articles on bind variables since the beginning, but since using them had been drilled into me, I found the warnings quaint. Who would skip them?

From a C# app someone passed in hundreds of thousands of unbound INSERT statements. It flooded the shared pool (is that right?) and brought the instance to a screeching halt.
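For what it's worth, that is the classic unbound-SQL problem: to Oracle, statements that differ only in their literal values are different statements, so each one is hard parsed and cached separately in the shared pool. Below is a rough JDBC sketch of the contrast (the offending app was C#; Java is used here purely for illustration). The `claims` table, its columns, and the class and method names are all hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: contrasts literal-concatenated SQL with bind variables.
class BindVariableSketch {

    // One SQL text for every row; the values travel separately as bind variables.
    static final String BOUND_INSERT = "INSERT INTO claims (id, name) VALUES (?, ?)";

    // Anti-pattern: the values are baked into the text, so every row is a new statement.
    // (It is also open to SQL injection, which is a separate reason to avoid it.)
    static String literalInsert(int id, String name) {
        return "INSERT INTO claims (id, name) VALUES (" + id + ", '" + name + "')";
    }

    // Counts how many distinct statement texts the database would have to parse
    // if the rows were sent with literals instead of binds.
    static int distinctLiteralStatements(List<String> names) {
        Set<String> texts = new HashSet<>();
        for (int i = 0; i < names.size(); i++) {
            texts.add(literalInsert(i, names.get(i)));
        }
        return texts.size();
    }

    // Bound version: the statement is prepared once and reused for every row.
    static void insertAll(Connection conn, List<String> names) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(BOUND_INSERT)) {
            for (int i = 0; i < names.size(); i++) {
                ps.setInt(1, i);
                ps.setString(2, names.get(i));
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

With hundreds of thousands of rows, the literal version hands the database hundreds of thousands of distinct statements, while the bound version hands it one.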

Anyway, back to the point.

I've been very vocal lately about MySQL. A few of my friends have begun to warn me that I may be crossing the line towards arrogance. That I will come off as someone resistant to change.

I don't see it. But sometimes we're the last to see our own reflection.

I don't believe that I am resistant to change. I like change. I just want it to be proven, that's all. I embraced ApEx because it made my life easier. That's all I want.

Does this make me arrogant?

Friday, February 1, 2008

MySQL Friday

Each month we have an IT All-Hands meeting.

Last month I was promoted to Senior Vice President (SVP), because of my superior management techniques.

Today I was promoted to CEO! Unfortunately it only lasted for a few minutes. I happen to resemble our new CEO (and I'm always pining for a promotion) and they thought it would be funny (again) to bring me up.

I hugged the guy behind me, shook hands with people next to me and ran up to the front. I wanted to shriek, like the people do on The Price is Right, but I didn't have it in me. You gotta have fun at work right?

Well, after that it got serious. Our new Director (at WellCare, Directors are executives, one step up from managers and one below VPs) who heads our architecture team (and release management) got up to discuss where he would be taking us.

Slide one:
From 3 database engines to 1.
From 4 programming languages to 2.
From 3 OSs to 1.

Wanna guess what question I had?

"So, what database engine are we going to use?"

I knew the answer, but I take every single opportunity I get to make my point.

"MySQL."

Being on the data warehouse team, I was confident that Oracle was not going away.

He went on to explain:

"Legacy applications would be maintained but everything going forward would be done in MySQL."

A flurry of questions came from the crowd so I was unable to follow up immediately. I could feel the room come alive...it was weird (I think I'm still hopped up from the events that took place today).

Our CIO asked if there were any more questions or comments.

I spoke up.

I have two points.
1. If it's about cost, move all of the one-off applications into just a few Oracle instances. From what I can tell, we have somewhere in the neighborhood of 100. Let's say 5 databases, data warehouse, our production OLTP and one for others. All you need to do is assign them different schemas, voila! Cost is much lower and there is a very big chance to reuse code.
2. Actually, I can't remember what my other point was. I think it had something to do with putting the logic in the database, that Java was the fad a few years ago, Ruby was the big thing now, what would it be in 5 years? Will we have to rewrite all of the logic then? (I guess I do sorta remember).

After that, someone asked about the two programming languages. Not a great answer, judging from the crowd's reaction. Then someone asked about the OS.

The crowd was riotous (if that's a word). The CIO had to calm us all down.

I made a remark that he hadn't danced yet (one of our former hazing techniques for new employees) because I didn't want it to be completely personal, or just to ease something that I started.

After the meeting, I spoke with the Director. His view: Oracle will be gone in 20 years because of the open source databases, it's being commoditized (not sure what that means). SOA is the wave of the future.

It was a polite conversation. I told him I look forward to learning from him but that I will probably never be sold on that idea. Fewer moving parts, simplicity, that's what I want.

I then spoke with the CIO, told him that once the decision was made, I would support it and keep my mouth shut (or find a new job).

I sent an email to the VP of the Director's group (after a couple of beers...idiot!) explaining my rationale.

One of the biggest reasons we chose to come to Tampa, to WellCare specifically, was because it was so young and immature. I would have the opportunity, if I could prove myself, to shape the future of IT here.

It's nice to have a voice.

Anyway, it's Friday, I'm prepped to spend all weekend at work to get this project delivered that was due in November. Have a good weekend!