Showing posts with label datawarehouse. Show all posts

Thursday, November 8, 2012

BIWA Summit 2013

I meant to write about this sooner...

If you're looking for a good BI/DW/Analytics focused event, check out the BIWA Summit which takes place in January of 2013. If you're interested in speaking at the event (and you know you are), hurry up and get your abstract in here; it closes tomorrow (November 9th).

Day 1 will give us Tom Kyte who will talk about What's new from Oracle in BI and Data warehousing. Day 2 will feature Vaishnavi Sashikanth, Vice President, Development, Oracle Advanced Analytics at Oracle who will speak on Making Big Data Analytics Accessible.

For more information, go here and here.

Monday, January 31, 2011

Rittman Mead BI Forum 2011, Brighton & Atlanta May 2011

Read more about it here. Call for Papers is here.

When/Where
It depends, where are you? While I would love to go to the UK, doubtful I could swing it. I'll be attending in Atlanta:

Emory Convention Center, Atlanta GA, May 25th – 27th 2011

If I were going to the UK, here are those details:

Hotel Seattle, Brighton, May 18th – 20th 2011

What is it?
Actually, it's quite cool. I would imagine it is something like the Hotsos Symposium for Performance enthusiasts, but for OBIEE.

* Its focus is OBIEE, and related technologies (Essbase, OWB/ODI etc)
* It’s run by OBIEE enthusiasts, for OBIEE enthusiasts
* It’s aimed at an intermediate-to-experienced level audience who already know the basics
* All presentations will be either technical or methodology, with no marketing “fluff”
* Numbers attending are limited to 70 (UK) and 50 (US), with a single stream, giving us the ability to focus the event on just the one topic and with a lot of audience interaction.

It's gonna be small, so there will be quite a lot of interaction amongst the people. I'm looking forward to it, heck, I might even do a presentation myself (not about Twitter, this time).

If you're interested, follow the links above. Submit a paper. Attend. Do it.

Thursday, January 27, 2011

Google Refine

I can't remember exactly where I found this, probably Twitter.

From the website:
Google Refine is a power tool for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase.

I installed it this afternoon and played around with it a little bit. I was hoping to use it more for analysis, but that's not really what it was built for.

Basically, it allows you to clean up sets of data. You know, you get an Excel file from a customer and you want to make sure all instances of State match. Usually, you get a few different variations of it.

- FL
- Fl
- Florida
- FLORIDA
- florida
- FLOR.

I am sure I could go on and on. But you can select all those values and then update them with a single value. Yes, I know you could update those values in a single SQL statement. Perhaps you don't have time to create the table and load the data. This is a simple tool to allow you to do some basic data cleansing.
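For comparison, here's what that cleanup might look like in plain SQL once the spreadsheet is loaded into a table (the table and column names are made up for illustration):

```sql
-- Collapse the common variations of Florida into one canonical value.
-- STG_CUSTOMERS and STATE are hypothetical names.
UPDATE stg_customers
SET    state = 'FL'
WHERE  UPPER( TRIM( state ) ) IN ( 'FL', 'FLORIDA', 'FLOR.' );
```

Refine just saves you the table-creation and load steps.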

Check out the video to get a better example of how to use it. Cool stuff.



Friday, May 28, 2010

SOUG: Stewart Bryson

Tonight we welcomed Mr. Stewart Bryson [blog|twitter|RittmanMead] into the fold.

(Small side note: I met Mr. Bryson in person at COLLABORATE this year. For some strange reason, I had pictured him as this late fifties dude. Turns out he's (slightly) younger than me).

First off though, I walked in and there seemed to be a buzz in the air along with a few new faces (new to me anyway; I can't attend regularly due to travel). Was this buzz all about Mr. Bryson? It seemed so.

Mr. Bryson has been working with RittmanMead (you might know one of the founders, Mark Rittman, Oracle's Person of the Year) since the beginning of last year and he's heading up the RittmanMead America division.

Anyway, off to the presentation. The title was OLTP DBA’s Guide to Delivering a Dimensional Model, and there were many willing DBAs in the crowd wanting to hear his take on the matter. He certainly didn't disappoint. I only realized it was getting dark in the room when I finally looked at my clock: a few minutes after 8. I believe Mr. Bryson began around 6:30, maybe a little later. That was the fastest hour and a half. Usually by then I am bored out of my mind and ready for beer with whomever will join me.

Now I was mad because it was going to get cut short. I'm pretty sure everyone else would have been willing to stay another hour or so as well.

Here's the presentation if you are interested. For more RittmanMead papers and presentations go here.

It took well over an hour for Mr. Bryson to finally make his escape. He stuck around to answer lots of questions and make fun of me...wait, that was someone else...or was it? No matter.

If you get a chance to see Mr. Bryson speak, take it. He knows his sh...err...stuff.

Sunday, April 4, 2010

Job: Data Warehouse Manager

This came across just the other day. I had someone in mind but they start their new position come Monday.

This job is in South Tampa (Florida). Feel free to email me.

Responsibilities:
The Data Warehouse Manager will be responsible for managing the day-to-day operations and on-going development and maintenance activities of the enterprise data warehouse. Other responsibilities include, but are not limited to:
  • Plan and manage the delivery of business intelligence applications and related data warehouse technical solutions
  • Work closely with the business functional areas in defining systems/project priorities, scope, approach, resource requirements, timing deliverables, change management and funding
  • Manage application and systems project teams, including estimation, tracking and management of project deliverables
  • Effectively plan, prioritize and administer work tasks
  • Provide leadership and direction to project teams
  • Effectively manage project budgets and timelines to deliver results on-time and on-budget
  • Oversee team members responsible for:
    • Logical Data Modeling
    • Data Integration/ETL Development
    • BI Architecture and Administration
    • Development and support of BI metadata and standard BI reports
Required Skills/ Experience:
  • Minimum 8-10 years of experience in data management and data warehousing technologies related to data modeling, database administration and performance tuning
  • Expertise in business intelligence and data warehouse (BI/DW) application development life cycle
  • Strong SQL and development skills
  • Experience with Oracle DBMS, preferably Oracle 10g/11g
  • Experience with Informatica Power Center and Data Explorer
  • Solid oral and written communication, problem solving and project management skills
Additional Skills Preferred:
  • Experience with BI tools preferably OBIEE version 10.x
  • Experience with CA Erwin data modeling tool
  • Experience with Data Mining and OLAP technologies
  • Food Service or Retail Industry experience a plus

Tuesday, February 23, 2010

Connect to HP Neoview using Heterogeneous Services

Being the lazy sort that I am and not wanting to pay for a SQL client, I decided to use my local Oracle instance to access an HP Neoview database. Please don't ask me any questions about it, because like much of the world, I don't know either.

Hat(s) off to Tak Tang. I've used his guide a number of times throughout the years. If he blogs, I can't find it, so if you know about it, please link him up. The first time I used it was back in aught (sp?) 5 to connect to a DB2 instance.

Setup
Database (source): Oracle 10gR2
OS: Windows Vista Ultimate running as a VirtualBox Guest on Ubuntu
Database (target): HP Neoview 2.4 (or something)
OS: Doesn't matter

Materials
HP Neoview driver which can be found here. (Thanks Christian)

Steps
Download the HP Neoview driver and install it. Since it's Windows, just click away, accepting all the defaults.

Next, configure an ODBC Datasource. I won't go into the gory details, but it's pretty easy. When you are done, you should see something like this:

ODBC Datasource

Now comes the fun part.

In your <ORACLE_HOME>\hs\admin directory you should see the following:

HS Folder

Open up the inithsodbc.ora file and set the parameters as follows:
HS_FDS_CONNECT_INFO = NEOVIEW
HS_FDS_TRACE_LEVEL = 1
Save the file as initNEOVIEW.ora in the same directory.

Now traverse to <ORACLE_HOME>\network\admin and open up your listener.ora file. Add a SID_DESC so that your file now looks like this:
LISTENER=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521))
      (ADDRESS=(PROTOCOL=ipc)(KEY=extproc))))

SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=testing)
      (ORACLE_HOME=c:\oracle)
      (SID_NAME=testing))
    (SID_DESC=
      (SID_NAME=plsextproc)
      (ORACLE_HOME=c:\oracle)
      (PROGRAM=extproc))
    (SID_DESC=
      (SID_NAME=NEOVIEW)
      (ORACLE_HOME=c:\oracle)
      (PROGRAM=hsodbc)))
Reload or stop and start your listener.

Now open up your tnsnames.ora file and add an entry for NEOVIEW. It should look like this:
NEOVIEW =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=
        (PROTOCOL=TCP)
        (HOST=localhost)
        (PORT=1521)
      )
    )
    (CONNECT_DATA=(SID=NEOVIEW))
    (HS=OK)
  )
The last step is to create a database link.
CREATE DATABASE LINK neoview
CONNECT TO "username"
IDENTIFIED BY "password"
USING 'NEOVIEW';
You should be all set now. You can test it out by issuing a simple
SELECT * FROM dual
If that works, you're done.
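Note that the SELECT above runs against the local DUAL and really only proves your local instance is up; to exercise the link itself, query something that lives on the Neoview side (the table name here is a hypothetical placeholder):

```sql
-- Hypothetical remote table; non-Oracle object names often need to be
-- quoted exactly as they appear on the remote side when going
-- through HSODBC.
SELECT *
FROM   "CUSTOMERS"@neoview
WHERE  ROWNUM <= 10;
```

The ROWNUM filter is applied locally, so this keeps the test cheap even against a big remote table.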

If you have problems, check out Tak's Troubleshooting section.

Thursday, August 13, 2009

Baseball Data Modeling

Anyone out there like baseball? Ever had a desire to model a baseball game?

I do and I have tried a few times in the past. It gets pretty hairy down at the game/inning/player level. If I remember correctly, substitutions tripped me up a bit. Then there's the whole datawarehouse side; I'd like that to be part of the project as well.

I started a project on Google Code here, the name is pretty vanilla, baseball-database. If you join you can suggest a better name.

I'd like to try and talk Oracle into giving a few licenses to the recently released production version of SQL Developer Data Modeler. I've hit up @krisrice on Twitter, but he has no control over licenses, just development. I've also hit up Justin Kestelyn (@oracletechnet) who said he would look into it.

I've had no time lately to bother him; perhaps with a few more people...

If you are interested, just drop me a line chet at oraclenerd or message me through twitter.

Monday, March 16, 2009

What's the Purpose of a Datawarehouse?

I've been meaning to write this since I first began my foray into the world of datawarehousing back in December of 2006.

What is the purpose of a Datawarehouse?

  1. Reporting

    • Key Performance Indicators (KPI)

    • Performance - Data is stored in such a way (facts and dimensions) that it enables higher performance than querying the same relational tables in an OLTP system.

    • Slicing and Dicing - Viewing the data from multiple angles (dimensions) is one of the best aspects of the Datawarehouse. Also included is the ability to drill down into the finer grained details.

  2. Single Source of Truth - Especially important in an enterprise environment which may have many systems of record. The datawarehouse allows you to aggregate all those systems into one environment. Not always completed, it's more of an ideal than anything.

  3. Clean Source Data (Transformation) - The datawarehouse allows you to "fix" the sub-par OLTP systems. Think garbage in, garbage out.

From my point of view as a database developer, #3 is the most important. I believe that the "Transformation" stage of ETL should be easy if only the source system had been designed properly. I've seen instances where a column (VARCHAR2(30) for example) meant one thing at one point in time and another at a different point in time. Or dates were stored in the column as MM/DD/YYYY, DD-MON-YY or Month DD, YYYY. How do you clean that up?
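One way to start untangling a mixed-format date column is to branch on the shape of the string (the staging table and column names are hypothetical; this is a sketch, not a complete solution):

```sql
-- Pick a TO_DATE mask based on the pattern of the string.
-- Anything matching none of the patterns falls through as NULL,
-- so it can be flagged for manual review.
SELECT raw_date,
       CASE
         WHEN REGEXP_LIKE( raw_date, '^[0-9]{2}/[0-9]{2}/[0-9]{4}$' ) THEN
           TO_DATE( raw_date, 'MM/DD/YYYY' )
         WHEN REGEXP_LIKE( raw_date, '^[0-9]{2}-[A-Za-z]{3}-[0-9]{2}$' ) THEN
           TO_DATE( raw_date, 'DD-MON-YY' )
         WHEN REGEXP_LIKE( raw_date, '^[A-Za-z]+ [0-9]{1,2}, [0-9]{4}$' ) THEN
           TO_DATE( raw_date, 'Month DD, YYYY' )
       END clean_date
FROM   stg_source;
```

It doesn't fix the underlying design problem, but it gets the data into real DATEs so the rest of the transformation can be sane.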

I also recognize that in an enterprise environment you can have the best designed OLTP systems but you still have to map the data to one field.

Ultimately, I think #1 should drive the design of all OLTP systems, making concessions where needed for performance. From there, your datawarehouse should flow fairly naturally. The ability to report on the data that you gather allows the business to make decisions for the future. In other words, it's the data, stupid.

Thursday, March 12, 2009

How To Populate Your TIME Dimension

Yesterday I wrote about the Datawarehousing TIMES table. Today I populated it. I modified it for my own purposes naturally, specifically removing the fiscal components (thankfully calendar year equals fiscal year). I think that saved me days of figuring out how to calculate certain fields.

I remember the first time I tried to do this it took about 20 different SQL statements. I generated the key (date, time_id) and then calculated the remaining columns with either a loop or another SQL statement using specific dates out of the table (the last day of the month, for instance). This time I managed to do it in a single SQL statement. Again, I left out some of the more complicated calculations since I did not have to worry about fiscal requirements.

I also got a short refresher course in the date format models which always come in handy and learned a new function, NEXT_DAY.
NEXT_DAY returns the date of the first weekday named by char that is later than the date date. The return type is always DATE, regardless of the datatype of date. The argument char must be a day of the week in the date language of your session, either the full name or the abbreviation. The minimum number of letters required is the number of letters in the abbreviated version. Any characters immediately following the valid abbreviation are ignored. The return value has the same hours, minutes, and seconds component as the argument date.
Here's the SQL to populate your TIME Dimension:
SELECT 
TRUNC( sd + rn ) time_id,
TO_CHAR( sd + rn, 'fmDay' ) day_name,
TO_CHAR( sd + rn, 'D' ) day_number_in_week,
TO_CHAR( sd + rn, 'DD' ) day_number_in_month,
TO_CHAR( sd + rn, 'DDD' ) day_number_in_year,
TO_CHAR( sd + rn, 'WW' ) calendar_week_number,
( CASE
WHEN TO_CHAR( sd + rn, 'D' ) IN ( 1, 2, 3, 4, 5, 6 ) THEN
NEXT_DAY( sd + rn, 'SATURDAY' )
ELSE
( sd + rn )
END ) week_ending_date,
TO_CHAR( sd + rn, 'MM' ) calendar_month_number,
TO_CHAR( LAST_DAY( sd + rn ), 'DD' ) days_in_cal_month,
LAST_DAY( sd + rn ) end_of_cal_month,
TO_CHAR( sd + rn, 'FMMonth' ) calendar_month_name,
( ( CASE
WHEN TO_CHAR( sd + rn, 'Q' ) = 1 THEN
TO_DATE( '03/31/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
WHEN TO_CHAR( sd + rn, 'Q' ) = 2 THEN
TO_DATE( '06/30/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
WHEN TO_CHAR( sd + rn, 'Q' ) = 3 THEN
TO_DATE( '09/30/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
WHEN TO_CHAR( sd + rn, 'Q' ) = 4 THEN
TO_DATE( '12/31/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
END ) - TRUNC( sd + rn, 'Q' ) + 1 ) days_in_cal_quarter,
TRUNC( sd + rn, 'Q' ) beg_of_cal_quarter,
( CASE
WHEN TO_CHAR( sd + rn, 'Q' ) = 1 THEN
TO_DATE( '03/31/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
WHEN TO_CHAR( sd + rn, 'Q' ) = 2 THEN
TO_DATE( '06/30/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
WHEN TO_CHAR( sd + rn, 'Q' ) = 3 THEN
TO_DATE( '09/30/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
WHEN TO_CHAR( sd + rn, 'Q' ) = 4 THEN
TO_DATE( '12/31/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
END ) end_of_cal_quarter,
TO_CHAR( sd + rn, 'Q' ) calendar_quarter_number,
TO_CHAR( sd + rn, 'YYYY' ) calendar_year,
( TO_DATE( '12/31/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' )
- TRUNC( sd + rn, 'YEAR' ) + 1 ) days_in_cal_year,
TRUNC( sd + rn, 'YEAR' ) beg_of_cal_year,
TO_DATE( '12/31/' || TO_CHAR( sd + rn, 'YYYY' ), 'MM/DD/YYYY' ) end_of_cal_year
FROM
(
SELECT
TO_DATE( '12/31/2002', 'MM/DD/YYYY' ) sd,
rownum rn
FROM dual
CONNECT BY level <= 6575
)
/
You can find the table definition and the INSERT statement here as well.

Any feedback good, bad or ugly is welcome.

Wednesday, March 11, 2009

Time

Many years ago (about 6 or 7 really), my boss sent me this diagram from the Oracle Docs. In particular, he wanted me to focus on the times table. After a few emails exchanged I finally figured out what he was asking, well sort of. He had used a table called periods described below:
CREATE TABLE periods
(
  periodid   NUMBER(10),
  datefrom   DATE,
  datethru   DATE,
  periodtype VARCHAR2(30) -- DAY, MONTH, YEAR, ETC
);
This was primarily (solely perhaps) used for reporting. He had actually built this pretty cool page for the business to be able to drill down on certain key aspects of the business. That may have been my first real exposure to Data Warehousing.

The point he was trying to make though was that this TIMES table was a flattened version of his PERIODS table. If you have one date, you can see how that date fell in multiple dimensions. What month it took place. What quarter it took place. What day of the year (1-365) it took place. All with just a single row of data.

The TIMES table, or dimension, is a standard in data warehousing for exactly this reason.
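To make that concrete, here's the kind of one-row lookup the flattened table buys you (the column names follow the TIMES table in the Oracle docs; adjust them to your own definition):

```sql
-- One date, one row, every "dimension" of that date at once.
SELECT time_id,
       day_name,
       calendar_month_name,
       calendar_quarter_number,
       day_number_in_year
FROM   times
WHERE  time_id = TO_DATE( '03/11/2009', 'MM/DD/YYYY' );
```

With the PERIODS approach, each of those answers would have been a separate range lookup against datefrom/datethru.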

I've been working on our "data warehouse" (replicated tables from the OLTP) and had been using the PERIODS methodology. There is a time dimension, but it's not populated at this time.

Since I am by no means a data warehousing expert and I have virtually no experience with OBIEE, I'm trying to come up with a way to help bridge both those gaps. Use OBIEE against the relational tables to help me learn OBIEE. I've created a couple of small subject areas so far and demo'd them to the users and they love it. So do I in fact.

Anyway, back to my topic. As I was thinking about TIME, I began to reminisce about time and how I got started. It's funny how I remember that exact moment sitting there with my feet propped up on my desk in my garage office.

Was there a point to all this? I don't think so. Just me rambling on.

Tuesday, January 13, 2009

OBIEE: Direct Database Request

Of course a couple of our leads asked last week if they could get access to OBIEE to do some reports. Naturally, they didn't want the point and click variety, they had the SQL (our OLTP is replicated to our DW as well).

I had no idea how to do it of course. Click click clicking around I couldn't find anything. Wait, BI Administrator. That's how you add users. OK, good enough.

How do you give them access to write Direct Database Request SQL? I'm still not sure but the answer probably lies in the "Answer" section. From there, you can either select a subject area or select a "Direct Database Request." With the Direct Database Request all you need is the Connection Pool information (still not positive on how this is formatted) and your SQL statement.

The Connection Pool entry looks something like this:
"DEV_BI"."Connection Pool"

You do not need to put the relevant tables into the Physical Layer of the Administration tool. That makes life a little easier.

Monday, January 12, 2009

OBIEE: Copy Reports From Prod to Dev

Yes, I said from Prod to Dev.

As I recently reacquired my job via the DW guy resigning, I'm in charge of all his stuff. Which basically means I get to learn the easy way, fire.

This means that I haven't had any time to "learn." I've just had to do. Fortunately so far, I haven't been asked to do a whole lot with the tool. Until now.

We have a large migration in the pipe right now...massive changes to the core of our OLTP system which require massive retooling of our DW system. Naturally I've been working on the database side of things. I started out with some 150 invalid objects and I've whittled it down to 3.

What about the reports though? That SQL is not stored in the database (at least as I understand it). Last week I began looking for a tool in OBIEE that would allow me to see database dependencies, kind of like Application Express' tool. Alas I couldn't find one, but I didn't have a whole lot of time to search. And I'm still learning the OBIEE lingo so I don't really know what to search for.

Anyway, I chatted with my former colleague and he told me I could just update the catalog in the Dev environment. Talked to the DBA and the Linux admin about getting access but it never materialized (it was Friday I think and I was out sick today). I still don't know what all is involved other than there's some sort of file I can just swap out...which scares me. Chances are I'll write over something.

So, off to Mr. Rittman's site.

First, Migrating OBIEE Projects Between DEV and PROD Environments, which led me to this one, Migrating OBIEE Reports Between Web Catalogs.

I hadn't seen (or heard) about the BI Catalog Manager, so I opened it up. It wants a directory or URL. Hmmm...I have no idea. I'll have to get the admin guys to help me out with this one. Let me just try the login URL and see if that works:

http://bi_server_prod:7777/analytics/

Nope. But there's a hint, URL must be in the form of...

http://bi_server_prod:7777/analytics/saw.dll

Got it. Sweet. No outside intervention necessary. Log in to dev:

http://bi_server_dev:7777/analytics/saw.dll

Voila!

Copy and paste the reports over and I'm done. Now the arduous task of going through each report to make sure it's not broken. Should be fun.

Monday, January 5, 2009

Learning Oracle Business Intelligence (OBIEE)

I learned how to edit a user in OBIEE without breaking anything (thus no need for a new counter Jake).

At first I thought it would be done through the web application only to find no edit button associated with the user. I had to go through the online repository, update the user, and commit those changes. Not really sure what all that means yet, but it's a start.

I had a crash tutorial on this last week as the person I replaced was off to start his own company. OBIEE is a vast, vast world.

Where does one go to learn OBIEE from the bottom? OTN naturally. The Oracle By Example series is excellent so that's where I started.

First up, Creating a Repository Using the Oracle Business Intelligence Administration Tool. I've made it through the Physical Mappings and the Business Mappings so far, one more section to go. I can't say I completely understand all the lingo, but I'll catch on sooner or later. I'm more concerned with the basics of support at this point.

General
- Instrumentation

Official Oracle Documentation
- 11.1.1.5
- 11.1.1.3
- 10.1.3.4.1 and prior
- OBIA 7.9.6.3
- OBIA 7.9.6.2

Update 1/13/2011
I have noticed quite a bit of traffic comes to this page so I am going to continue to update it in regards to OBIEE 11g. I'll keep the 10g stuff here as well as many of the principles are the same, I'll just point to specific posts in this section.

Samples
Sample Applications for both 10g and 11g provided by Oracle are here. Good way to learn as you get to see how others do certain tasks.
11.1.1.5 Sample App (v107) is here.
OBIEE 11.1.1.6.2 BP1 - Sample Application (V207)

OBIEE 11g
- Oracle BI EE 11g – Upgrading from BI EE 10g – Repository & Web Catalog
- An Oracle Fusion Middleware 11g Primer for EBS Sysadmins - Don't let the EBS part fool you. Fusion Middleware is now the underpinning of OBIEE so understanding it is key.
- OBIEE 11g - Demystifying the directory structure - Great overview of the new directory structure. I still get lost trying to figure out what is what. Doesn't help that there are multiple folders under different roots with the same name...
--> Here's my take, Fun With Folders!

Startup/Shutdown Stuff

- Startup/Shutdown Script for Linux Installations
- Starting and Stopping Services on Linux (mine)
- OBIEE: Start/Stop Individual Components (Enterprise Manager)
- OBIEE: Start/Stop Individual Components (Manually)

Customization

- Customizing Oracle Business Intelligence Enterprise Edition 11g
- OBE: Creating and Deploying Custom Styles, Skin, and Messages in Oracle BI EE 11g

Architecture Series
Mr. Rittman.

- So Just What Does WebLogic Server Do Within OBIEE 11g?
- So How Does Enterprise Manager work, Within OBIEE 11g?

Bob Ertl - Bob is the first person I have seen from the BI team to actually blog. What better place to get information than straight from the source?
- Oracle BI Server Modeling, Part 1- Designing a Query Factory
- Oracle BI Server Modeling 2- Dimensional Schema Shapes
- Oracle BI Server Modeling, Part 3- Conformed Logical Dimensions

OBIEE 10g 10.1.3.4
Blogs
- The RittmanMead blog is the premier online resource for all things OBI EE. It can be found here.
- John Minkjan over at OBIEE 101 has quite a bit of information. He posts often and in detail. I've already been able to reference individual posts a number of times.
- gerardnico There are some great resources here.
- Christian Berg - who has helped me quite a bit writes over here.
- rnm1978 - same goes for this guy. Very helpful articles.
- OTN Forums

BI Architecture Series
This just started recently by Mark Rittman, I'm looking forward to future posts in this regard:

Inside the Oracle BI Server Part 1 : The BI Server Architecture
Inside the Oracle BI Server Part 2 : How Is A Query Processed?
Inside the Oracle BI Server Part 3 : BI Server In-Memory Joins

How To
Installation
Miscellaneous
My articles on said subject can be found here.
I'm sure as time goes on my tastes will ebb and flow. I'll update as appropriate.

Wednesday, September 24, 2008

More on Oracle Exadata Storage

I don't know why I'm so excited about this...I guess I'm trying to live up to my name.

See my previous post here full of a few screen shots from Larry's speech.

A great post on most of the stuff I don't understand from Kevin Closson who works more on the datawarehouse side.

Paul Vallee over at the Pythian Group blog ruminates here and here.

Curt Monash talks about it here and here.

Dana Gardner from ZDNet.

Larry Dignan from CNET.com via ZDNet Australia.

Michael Liedtke via AP.

And how could I forget my local Oracle blogger, Mr. Cunningham.

I'm still amazed at how excited I am. It's just a machine right? Hopefully I'll get to use one of those someday...

Oracle Exadata Storage

So the big announcement today...Larry created an appliance with HP as the hardware.

Press Release

Exadata

Here's a picture of the new beast:

In comparison to Larry:

Some of the characteristics:

And finally, it runs OLTP too!

Thursday, August 28, 2008

Jobs: ETL Developer

This position is in Tampa, Florida. If you are interested you can email me at chet dot justice at gmail.

Position Description:
· Develop and implement applications that support the business strategy of measuring and optimizing performance
· Support existing data warehouse applications and users

Mandatory Skills/Experience
· 2+ years experience working on data warehouse/data integration solutions
· 2+ years working with vendor ETL tools to design/build/test (Business Objects Data Integrator XIR2)
· 1+ years working with Oracle database; good PL/SQL knowledge
· Experience with dimensional data models (star schemas)

Preferred Skills/Experience
· Java programming experience
· Data Quality XIR2
· SQL Server 2005/SSIS
· Business Objects Enterprise XIR2
· Crystal Reports XIR2

Wednesday, July 9, 2008

Pseudo Column ROWDATE?

I'm working a little bit in the datawarehouse again (by the way, is it two words or one?). A technical guy from Oracle showed up today to help us decide the best way to move forward on capturing changes. There were essentially 4 methods:

1. SQL and PL/SQL
2. Streams (CDC)
3. OWB/ODI
4. Logical Standby/Data Guard

As someone on the phone was talking, I started to wander...hmmm...what about some kind of pseudo column that stored the last update (either INSERT or UPDATE) of a row?

Off the top of my head, I can think of rowid, rownum, and level. I'm sure there are others (feel free to comment).

Oracle, I'm sure, stores that information somewhere, right?

I hadn't really given a thought as to the feasibility or the impact it might have, but that would make capturing changes a whole lot easier...
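As it turns out, Oracle does expose something close: the ORA_ROWSCN pseudocolumn. By default the SCN is tracked at the block level; create the table with ROWDEPENDENCIES and it's tracked per row, and SCN_TO_TIMESTAMP turns it into an approximate time (approximate being the key word, and only within the SCN-to-time mapping retention window):

```sql
-- ROWDEPENDENCIES must be declared at CREATE time; it can't be
-- added to an existing table with ALTER.
CREATE TABLE change_demo
(
  x NUMBER
) ROWDEPENDENCIES;

INSERT INTO change_demo VALUES ( 1 );
COMMIT;

-- Roughly when was each row last changed?
SELECT x,
       ORA_ROWSCN,
       SCN_TO_TIMESTAMP( ORA_ROWSCN ) approx_change_time
FROM   change_demo;
```

It's not a replacement for CDC or Streams, but for cheap change detection it's worth knowing about.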

Wednesday, May 21, 2008

Multi Table INSERTs

I learned something new today! Multi Table INSERTs.

I was trying to capture the new records into a staging table using RETURNING BULK COLLECT INTO, but that doesn't work (yet).

I remembered reading about Multi Table INSERTs, but never had the need for using them, so never bothered to learn it.


CREATE TABLE t
(
x NUMBER PRIMARY KEY,
y NUMBER,
z NUMBER
);

CREATE TABLE s
(
x NUMBER PRIMARY KEY
);

INSERT ALL
INTO t ( x, y, z )
VALUES ( myrownum, ran1, ran2 )
INTO s ( x )
VALUES ( myrownum )
SELECT
rownum myrownum,
dbms_random.value ran1,
dbms_random.value ran2
FROM dual
CONNECT BY LEVEL < 101;

CJUSTICE@ORA10GR2>SELECT COUNT(*) FROM t;

COUNT(*)
----------
100

1 row selected.

Elapsed: 00:00:00.01
CJUSTICE@ORA10GR2>SELECT COUNT(*) FROM s;

COUNT(*)
----------
100

1 row selected.

Elapsed: 00:00:00.02


Awesome!

Monday, May 19, 2008

DIMENSION Objects

One of the very first things I did when I got my current job was to go through the Datawarehousing guide in the Oracle documentation.

Analytics, check.
Materialized Views, check.
OLAP, interesting, but not applicable yet.
Dimensions...hmm, dimensions.

I looked at all the objects in the database but couldn't find any of the type DIMENSION.

I searched AskTom, nothing (or at least nothing I could find).

I can't remember finding any articles on the DIMENSION objects either. Can that be right? Does anyone out there use them? Mr. Rittman? Mr. Aldridge? Mr. Scott?

From my brief perusal of the documentation, you can set up the DIMENSION like this:

CREATE TABLE test_dim
(
test_key NUMBER(10,0) PRIMARY KEY,
source VARCHAR2(10) NOT NULL,
source_subcategory VARCHAR2(10)
);

INSERT INTO test_dim( test_key, source, source_subcategory )
VALUES ( 1, 'MEDICAID', NULL );
INSERT INTO test_dim( test_key, source, source_subcategory )
VALUES ( 2, 'MEDICARE', NULL );
INSERT INTO test_dim( test_key, source, source_subcategory )
VALUES ( 3, 'SSA', 'MEDICARE' );
INSERT INTO test_dim( test_key, source, source_subcategory )
VALUES ( 4, 'SPAP', 'MEDICARE' );
INSERT INTO test_dim( test_key, source, source_subcategory )
VALUES ( 5, 'MAPD', 'MEDICARE' );

CREATE DIMENSION test_dim
LEVEL source IS ( test_dim.source )
LEVEL source_subcategory IS ( test_dim.source_subcategory );
My best guess is that it's sort of like the hierarchical table format (FK referencing PK from the same table).
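Extending that guess, I think a more complete DIMENSION would declare a HIERARCHY so the database knows subcategory rolls up to source, something like this (untested, and renamed here to avoid any confusion with the table):

```sql
-- Sketch based on my reading of the docs: levels plus a rollup
-- hierarchy, child level first.
CREATE DIMENSION source_dim
  LEVEL source_subcategory IS ( test_dim.source_subcategory )
  LEVEL source             IS ( test_dim.source )
  HIERARCHY source_rollup
  (
    source_subcategory CHILD OF source
  );
```

As I understand it, the optimizer can use a dimension like this for query rewrite against materialized views, which would explain why you never see objects "of type DIMENSION" doing anything on their own.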

Does anyone out there have any experience using them?

Friday, April 18, 2008

Datawarehouse: Testing

Until recently, I have been doing support/enhancements and new development for one particular project.

In our development environment, I have essentially been doing full load testing, using all of the records that are in production.

Needless to say, our development iterations were slow. Make a change to the code and then reload some 100 million records which are bounced against one another. The good thing is that I've picked up some really good tuning skills. The bad thing is my project was recently 2 months late.

Now that I've had a chance to breathe, I've been able to rethink some of the processes. First and foremost being how we test (both in development and QA).

If I had taken the time one year ago to build a "build" and "teardown" script, I probably could have shaved months off the project life cycle.

I used SQLUnit a few years ago and found that to be a pretty good tool. At the time, it consisted of writing out XML files and was quite time consuming, but gave me more confidence in the code I wrote. It also taught me to think like a tester forcing me to write better code as I could predict much easier where the problems might be.

We've used no such tool at WellCare.

I believe database testing to be extremely difficult. Our unit tests have consisted of running the procedures and then pasting that to a Word document...not really my idea of good testing, but it's what we have.
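Even without a framework, a step up from the Word document is a handful of PL/SQL checks that fail loudly (the table name and expected condition are hypothetical):

```sql
-- Minimal smoke test: blow up if the load produced nothing.
DECLARE
  l_count NUMBER;
BEGIN
  SELECT COUNT(*)
  INTO   l_count
  FROM   fact_claims;

  IF l_count = 0 THEN
    raise_application_error( -20001, 'fact_claims is empty after load' );
  END IF;
END;
/
```

String a few of these together in a script and you have a crude regression suite you can run after every load.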

So are there any tools that you use to test database specific items? If you don't use tools, how do you do it? Why does database testing seem so difficult?