Showing posts with label sql. Show all posts

Tuesday, August 23, 2016

Real World SQL and PL/SQL: Advice from the Experts



Because my hero is Cary Millsap, I'm going to do what he did and publish my Preface. All joking aside, I consider myself incredibly fortunate to have been included in this project. I learned...a lot, simply by trying to find the authors' mistakes (and there were not many). There was a lot more work than I expected, as well. (Technical) editing is a lot easier than writing, to be sure.

Brendan Tierney and Heli Helskyaho approached me in March 2015 about being an author on this book, along with Arup Nanda and Alex Nuijten. Soon after, we picked up Martin Widlake. To say that I was honored to be asked would be a gross understatement. Rather quickly though, I realized that I did not have the mental energy to devote to the project and didn’t want to put the other authors at risk. Still wanting to be part of the book, I suggested that I be the Technical Editor and they graciously accepted my new role.

This is my first official role as Technical Editor, but I've been doing it for years through work: checking my work, checking others' work, etc. Having a touch of Obsessive Compulsive Disorder (OCD) helps greatly.

All testing was done with the pre-built Database App Development VM provided by OTN/Oracle which made things easy. Configuration for testing was simple with the instructions provided in those chapters that required it.

One of my biggest challenges was the multi-tenant architecture of Oracle 12c. I haven’t done DBA type work in a few years, so trying to figure out if I should be doing something in the root container (CDB) or the pluggable database (PDB) was fun. Other than that though, the instructions provided by the authors were pretty easy to follow.

Design (data modeling, Edition Based Redefinition, VPD), Security (Redaction/Masking, Encryption/Hashing), Coding (Reg Ex, PL/SQL, SQL), Instrumentation, and "Reporting," or turning that raw data into actionable information (Data Mining, Oracle R, Predictive Queries): these topics are covered in detail throughout the book, everything a developer would need to build an application from scratch.

Probably my favorite part of this endeavor is that I was forced to do more than simply see if it works. Typically, when reading a book or blog entry, I'll grab the technical solution and move on, often skipping the Why, When, and Where. How, to me, is relatively easy. I read AskTom daily for many years; it was my way of taking a break without getting in trouble. At first, it was to see how particular problems were solved, occasionally using it for my own problems. After a year or two, I wanted to understand the Why of doing it a certain way and would look for those responses where Tom provided insight into his approach.

That's what I got reviewing this book. I was allowed into their minds, to not only see How they solved technical problems, but Why. This is invaluable for developers and DBAs. Most of us can figure out How to solve specific technical issues, but to reach that next level we need to understand the Why, When and Where. This book provides that.

Wednesday, November 6, 2013

Fun with SQL - My Birthday

This year is kind of fun, my birthday is on November 12th (next Tuesday, if you want to send gifts). That means it will fall on 11/12/13. Even better perhaps, katezilla's birthday is December 13th. 12/13/14. What does this have to do with SQL?

Someone mentioned to me last night that this wouldn't happen again for 990 years. I was thinking, "wow, I'm super special now (along with the other 1/365 * 6 billion people)!" Or am I? I had to do the math. Since date math is hard, and math is hard, and I'm good at neither, SQL to the rescue.

select 
  to_number( to_char( sysdate + ( rownum - 1 ), 'mm' ) ) month_of,
  to_number( to_char( sysdate + ( rownum - 1 ), 'dd' ) ) day_of,
  to_number( to_char( sysdate + ( rownum - 1 ), 'yy' ) ) year_of,
  sysdate + ( rownum - 1 ) actual
from dual
  connect by level <= 100000
(In case you were wondering, 100,000 days is just shy of 274 years. 273.972602739726027397260273972602739726 to be more precise.)

That query gives me this:
MONTH_OF DAY_OF YEAR_OF ACTUAL   
-------- ------ ------- ----------
11       06     13      2013/11/06 
11       07     13      2013/11/07 
11       08     13      2013/11/08 
11       09     13      2013/11/09 
11       10     13      2013/11/10 
11       11     13      2013/11/11 
...
So how can I figure out where DAY_OF is equal to MONTH_OF + 1 and YEAR_OF is equal to DAY_OF + 1? In my head, I thought it would be far more complicated, but it's not.
select *
from
(
  select 
    to_number( to_char( sysdate + ( rownum - 1 ), 'mm' ) ) month_of,
    to_number( to_char( sysdate + ( rownum - 1 ), 'dd' ) ) day_of,
    to_number( to_char( sysdate + ( rownum - 1 ), 'yy' ) ) year_of,
    sysdate + ( rownum - 1 ) actual
  from dual
    connect by level <= 100000
)
where month_of + 1 = day_of
  and day_of + 1 = year_of
order by actual asc
Which gives me:
MONTH_OF DAY_OF YEAR_OF ACTUAL   
-------- ------ ------- ----------
11       12     13      2013/11/12 
12       13     14      2014/12/13 
01       02     03      2103/01/02 
02       03     04      2104/02/03 
03       04     05      2105/03/04 
04       05     06      2106/04/05 
05       06     07      2107/05/06 
...
OK, so it looks closer to 100 years, not 990. Let's subtract. LAG to the rescue.
select
  actual,
  lag( actual ) over ( order by actual ) previous_actual,
  actual - lag( actual ) over ( order by actual ) time_between
from
(
  select 
    to_number( to_char( sysdate + ( rownum - 1 ), 'mm' ) ) month_of,
    to_number( to_char( sysdate + ( rownum - 1 ), 'dd' ) ) day_of,
    to_number( to_char( sysdate + ( rownum - 1 ), 'yy' ) ) year_of,
    sysdate + ( rownum - 1 ) actual
  from dual
    connect by level <= 100000
)
where month_of + 1 = day_of
  and day_of + 1 = year_of
order by actual asc
Which gives me:
ACTUAL     PREVIOUS_ACTUAL TIME_BETWEEN
---------- --------------- ------------
2013/11/12                              
2014/12/13                          396 
2103/01/02                        32161 
2104/02/03                          397 
2105/03/04                          395 
2106/04/05                          397 
2107/05/06                          396 
2108/06/07                          398 
2109/07/08                          396 
2110/08/09                          397 
2111/09/10                          397 
2112/10/11                          397 
2113/11/12                          397 
2114/12/13                          396 
2203/01/02                        32161
So, it looks like the pattern skips about 88 years and then returns for 12 consecutive years of matching numbers (01/02/03 through 12/13/14). The next time 11/12/13 and 12/13/14 will appear is in 2113 and 2114. Yay for SQL!

Monday, March 25, 2013

Analysis Tools...

I've taken on an effort to port a custom data integration (PL/SQL, Java, etc) application.

In that regard, I'm doing a fair amount of analysis right now. So I need help finding two tools:
1. A tool that will allow me to map (visually or otherwise) a single data point from source to target(s). I typically use Excel. It's easy to use and available everywhere. Where it falls apart, slightly, is that a single data point may have one or more middle steps (i.e. not target) and one or more targets. I think I want something like this:



Keep in mind though, I have potentially hundreds of columns in a system with thousands upon thousands of...

A couple of people have suggested using an ETL tool like Informatica, Pentaho or ODI. Yes. But I don't see it yet. Besides, I don't want the mapping to actually do something...most of the conversion has already been done and I'm picking it up at a particular step (near the beginning). What's missing is the mapping document that I want to create for everything...but that's another story.

2. I want to look at a view and know where those stupid unaliased columns are sourced from. A very, very basic example:
SELECT
  hs.column_1,
  hs.column_2,
  adr.address_line_1,
  var_value_01,
  var_value_02
FROM
  big_table hs,
  address_table adr,
  other_random_table ran
WHERE hs.address_id = adr.address_id
  AND hs.random_id = ran.random_id
VAR_VALUE_01 and VAR_VALUE_02, why don't you have aliases? Why did your developer neglect you so? Why can't every single developer just remember that someone, someday, will have to look at their code? Please? Pretty please? Or did you know it would be me and thus you did it on purpose? If so, I'm not talking to you again.

Anyway, it doesn't take me very long to figure where those columns are sourced from. What if there are 10's of those in a view with 100's of columns? Yes, not enjoyable. What if there are many views just like this that you have to analyze? Yes!

Data Dictionary?!

Not yet. DBA_TAB_COLUMNS? Nope. Come on! It's got to be there somewhere...when you compile a view, Oracle checks to make sure everything is a-ok, right? Doesn't it store that information somewhere? It must! I took to Twitter, naturally, and Steve Karam, aka @OracleAlchemist, found this possible gem:


I'm also requesting a feature in SQL Developer...or, trying to anyway. Back channels of course.

I've done this kind of analysis in the past, but it is usually a one off, so there never seemed to be a need to make it repeatable. Now, there is a need. A giant need. If you've got any ideas for me, let me know...
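For the view question, one lead worth mentioning: later Oracle releases (12c) added DBMS_UTILITY.EXPAND_SQL_TEXT, which expands a view reference into the full query against its base tables. It won't map columns for you, but it at least surfaces where those unaliased columns could be coming from. A sketch (the view name is a placeholder):

```sql
SET SERVEROUTPUT ON

-- Expand a view into the text of its underlying query (12c and later).
DECLARE
  l_expanded CLOB;
BEGIN
  DBMS_UTILITY.expand_sql_text(
    input_sql_text  => 'SELECT * FROM my_view',
    output_sql_text => l_expanded );
  DBMS_OUTPUT.put_line( l_expanded );
END;
/
```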

Sunday, February 10, 2013

Fun with Date Math

(First off, sorry Mike, I'm hoping this will break my writer's block...)

On Friday I was asked to look at a report that wasn't returning all of the data. Sample:
Year/Month  Total Sales Total Sales (YAGO)
------------------------------------------
01/31/2013   $1,000,000           $900,000                
03/31/2013                        $950,000
For reference, YAGO is "Year Ago."

Notice anything funny there?

Yeah, February is missing. The (OBIEE) report has a filter on Jan, Feb and Mar of 2013. But it wasn't showing up. I confirmed via manual SQL (hah!) that there was (YAGO) data in there for February. Any ideas?

I immediately suspected one of two things:
- If the Date (month) dimension had a "year ago" column it was wrong.
- The join in OBIEE was doing it wrong.

I checked the date dimension first. It was fine. It didn't even have a YAGO column, so nothing to see there. I looked at the join between the date dimension and the fact table...
(YEAR ("DW".""."DW"."My_Month_Dim"."MONTHEND_DATE" ) - 1 ) * 10000 
+  MONTH ("DW".""."DW"."My_Month_Dim"."MONTHEND_DATE" )  * 100
+ CASE WHEN DayOfMonth("DW".""."DW"."My_Month_Dim"."MONTHEND_DATE") = 29 THEN 28 ELSE
DayOfMonth("DW".""."DW"."My_Month_Dim"."MONTHEND_DATE")  END 
= "DW".""."DW"."My_Fact_Table"."MONTH_DIM_KEY"
I want to tear my eyes out when I see stuff like that. I don't even want to know what it does. * 10000? * 100? Shoot me.

OK, so the MONTH_DIM_KEY is in YYYYMMDD format. MONTHEND_DATE is a date data-type that corresponds to the last day of the month. For February 2013, it's 20130228; for February 2012, it should be 20120229. <<< Leap Year!!! I'm going to make a wild guess and say that the formula up there isn't working. How to test it though? That's logical SQL (OBIEE); it doesn't run in the database. I just ran the report and grabbed the SQL submitted to the database. This is what it looked like:
          AND ( TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'yyyy' ), '9999' ) - 1 ) * 10000 +
          TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'MM' ), '99' ) * 100 +
          CASE
            WHEN TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'dd' ), '99' ) = 29
            THEN 28
            ELSE TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'dd' ), '99' )
          END = MONTH_DIM_KEY
  AND( MONTHEND_DATE IN( TO_DATE( '2013-01-31', 'YYYY-MM-DD' ), TO_DATE(
  '2013-02-28', 'YYYY-MM-DD' ), TO_DATE( '2013-03-31', 'YYYY-MM-DD' ) ) ) 
Eyes are burning again. This is also the "prettified" SQL after I hit Ctrl + F7 in SQL Developer. The very first thing I do with OBIEE generated SQL.

One part of that wouldn't be so bad, but it's three formulas adding up to some mysterious number (presumably the last day of the month, for the previous year, in YYYYMMDD format). So I moved all those formulas up into the SELECT part of the statement. Let's see what they are doing.
SELECT
  ( TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'yyyy' ), '9999' ) - 1 ) * 10000 part_1,
  TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'MM' ), '99' ) * 100 part_2,
  CASE
    WHEN TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'dd' ), '99' ) = 29
    THEN 28
    ELSE TO_NUMBER( TO_CHAR( MONTHEND_DATE, 'dd' ), '99' )
  END part_3
FROM my_month_dim
WHERE MONTHEND_DATE IN ( TO_DATE( '2013-01-31', 'YYYY-MM-DD' ), 
                         TO_DATE( '2013-02-28', 'YYYY-MM-DD' ), 
                         TO_DATE( '2013-03-31', 'YYYY-MM-DD' ) )
That resulted in this:
  PART_1   PART_2   PART_3
--------   ------   ------
20120000      100       31
20120000      200       28
20120000      300       31
So PART_3 is definitely incorrect. Am I going to bother to figure out why? I have some serious issues inside of my brain which simply do not allow me to do date math. I avoid it at all costs...instead choosing to use whatever the system provides me.

One of my favorites, especially when dealing with leap years, is ADD_MONTHS.

If date is the last day of the month or if the resulting month has fewer days than the day component of date, then the result is the last day of the resulting month

That's why. Add -12 months to February 28, 2013. You don't get back February 28, 2012, you get back the 29th, as it should be. Do the same thing starting with February 29th, 2012. Add 12 months, subtract 12 months. It's always right. Yay for someone figuring this out so I don't ever have to do so.
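A quick check against DUAL shows the leap-year handling in both directions:

```sql
-- ADD_MONTHS snaps to the last day of the resulting month, so
-- Feb 28, 2013 minus 12 months lands on Feb 29, 2012, and
-- Feb 29, 2012 plus 12 months lands on Feb 28, 2013.
SELECT ADD_MONTHS( DATE '2013-02-28', -12 ) AS twelve_back,
       ADD_MONTHS( DATE '2012-02-29',  12 ) AS twelve_forward
FROM dual;
```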

Sadly, OBIEE doesn't have the equivalent of ADD_MONTHS (or LAST_DAY), you have to build out a string and then concatenate it all together, not pleasant. So I cheated, I used EVALUATE. Here's my working solution.
TO_NUMBER( TO_CHAR( ADD_MONTHS( MONTHEND_DATE, -12 ), 'YYYYMMDD' ) ) = DW_MONTH_DIM_KEY
Oops, that's the physical SQL. How about the OBIEE SQL:
CAST( EVALUATE( 'TO_NUMBER( TO_CHAR( ADD_MONTHS( %1, %2 ), %3 ) )',
 "EDW".""."DW"."Dim_DW_MONTH_DIM_CBM"."MONTHEND_DATE", 
-12, 'YYYYMMDD' ) AS INTEGER ) = "DW".""."DW"."My_Fact_Table"."MONTH_DIM_KEY"

Thursday, July 12, 2012

Fun with OBIEE SQL

I posted a couple of tweets yesterday about this crazy query I had to work with. I won't Storify you to death, so I'll sum it up quickly. 1269 lines (formatted, of course). 13 WITH clauses. 8 base tables.

The error that was occurring was an Oracle one: ORA-00937, "not a single-group group function."

That's interesting because OBIEE, the BI Server, generates/builds the SQL based on the logical model you've created. It's doing it wrong, in this instance. I'm willing to wager that it's a bug, in one way or another.

Now, if the logical model doesn't support this particular combination of columns, grain, or whatever, it will tell me so immediately. It will usually tell me while working in the RPD, so I won't even get this far (presentation layer).

Anyway, I found the offending piece of SQL, right at the bottom in the SELECT statement. There were 2 aggregate functions, MIN and SUM, and no GROUP BY. So I took the non-aggregated columns, created a GROUP BY, and added them in. It worked.
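For anyone who hasn't hit ORA-00937 before, a minimal illustration (assuming the classic SCOTT.EMP demo table):

```sql
-- An aggregate mixed with a plain column and no GROUP BY raises
-- ORA-00937: not a single-group group function.
SELECT deptno, SUM( sal )
FROM emp;

-- Adding the non-aggregated column to a GROUP BY fixes it.
SELECT deptno, SUM( sal )
FROM emp
GROUP BY deptno;
```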

That didn't solve my problem though. Sure, I could run it in SQL Developer, but that doesn't do the end-user much good.

I tried to trace those 2 columns back up through the SQL...and then my eyes bled. With help from the user, I identified the column (measure) that was breaking the report. Now, could I trace this back through the SQL and figure out where? Not likely, but I tried.

I needed a visual representation, so I tried out FreeMind, a mind mapping software. This is what I came up with:



Nodes in Yellow share the same base tables. Ditto for the other color coded nodes. I couldn't figure out a way to get them to connect or share those child nodes.

Ultimately a fun little exercise (in futility?). Anyone else tried to do something like this?

Thursday, June 14, 2012

TRUNC Time Off Date in OBIEE

"How do I get a prompt to display the day, minus 3 hours?" was the question.

This was a dashboard prompt (calendar widget) with a default value that used the following SQL:
SELECT CURRENT_DATE
FROM "Subject Area"
OK, easy enough.

We just needed to add a predicate or WHERE clause. What would I compare it with?

OK, I have a date column (with no time component), so I can use that. I also have to use OBIEE (Logical) SQL. Gah. I wrote it in Oracle SQL first: day_column = sysdate - ( 3 / 24 ). Wait, I have to TRUNC the right-hand side, or the time component means I won't get anything back: day_column = TRUNC( sysdate - ( 3 / 24 ) ). Better. I test it out and it does what I need it to do.
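A quick look at what TRUNC is doing to the comparison value:

```sql
-- SYSDATE carries the current time of day; TRUNC resets it to midnight,
-- which is what a time-free date column actually stores.
SELECT SYSDATE - ( 3 / 24 )           AS three_hours_ago,
       TRUNC( SYSDATE - ( 3 / 24 ) ) AS truncated_to_midnight
FROM dual;
```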

Now, to convert it to OBIEE SQL.

Instead of SYSDATE, I'll use CURRENT_DATE. Now I need to subtract 3 hours. TIMESTAMPADD for that.
[nQSError: 22025] Function TimestampAdd is called with an incompatible type. (HY000)
Oops. How about CURRENT_TIMESTAMP? Yeah, that works. Here's what I have so far:
TIMESTAMPADD( SQL_TSI_HOUR, -3, CURRENT_TIMESTAMP )
That doesn't work either. Ah, the time component, need to get rid of that. TRUNCATE? Nope, it's for numbers. Hmmm...wait, CAST! I'll just cast it to a date which should remove the time component.
CAST( TIMESTAMPADD( SQL_TSI_HOUR, -3, CURRENT_TIMESTAMP ) AS DATE )
Awesome. The final logical SQL looks like this:
SELECT CURRENT_DATE
FROM "Subject Area"
WHERE
( ( "Date"."Date" IN ( CAST( TIMESTAMPADD( SQL_TSI_HOUR, -3, CURRENT_TIMESTAMP ) AS DATE ) ) ) )
Easy.

Monday, March 28, 2011

Decoding DECODE

I've never been a fan of DECODE. I find it difficult to read, especially past the (EXPRESSION, MATCH_TO_EXPRESSION, RESULT_IF_MATCHED, ELSE_RETURN_VALUE), i.e. 4 spots there. A friend of mine was trying to...decode a long DECODE expression, so I put on my Cary Millsap hat and tried to break it down.

Despite never having seen Cary speak, I love that he always includes Teacher in his bio. Those that present are in fact teachers, but with that inclusion, I think Cary takes it a step further.



(I plan on changing that streak this year, see Tools, Tools, Tools!)

So here's what I said; maybe it can help some newbie out there get started. Me, I encourage people to avoid DECODE, suggesting the CASE expression instead...mostly for its readability (naturally I have seen that abused as well, oh well).

Here's a mock statement:
DECODE(my_expression,0,0,NULL,NULL,(another_expression/my_expression))


First, format the dang thing. Put some white space in there.
DECODE( my_expression, 0, 0, NULL, NULL, ( another_expression / my_expression ) )
We're not exactly tuning, but readability goes a long way.

Now, let's look at the picture defined in the docs:



Just for simplicity's sake, I'm going to spell it out:
DECODE( POSITION_1, POSITION_2, POSITION_3, POSITION_4, POSITION_5, POSITION_6 )
At least to me, that helps. Up to 4 places, this probably isn't necessary, but as it gets long (and I've seen some whoppers), it's good to be able to draw it out.

What does that mean? If POSITION_1 = POSITION_2, then use the value from POSITION_3. If POSITION_1 = POSITION_4, use the value from POSITION_5. The default is POSITION_6. (One quirk worth knowing: DECODE treats two NULLs as equal, so the NULL in POSITION_4 really does match a NULL expression, something a plain equality comparison would not do.)
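A one-liner against DUAL shows that NULL-matching behavior:

```sql
-- DECODE considers NULL equal to NULL, unlike the "=" operator.
SELECT DECODE( NULL, NULL, 'matched', 'no match' ) AS decode_result
FROM dual;
```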

Written out in CASE format, it would look like this:
CASE
  WHEN my_expression /* POSITION_1 */ = 0 /* POSITION_2 */
  THEN 0 /* POSITION_3 */
  WHEN my_expression /* POSITION_1 */ IS NULL /* POSITION_4 */
  THEN NULL /* POSITION_5 */
  ELSE another_expression / my_expression /* POSITION_6 */
END
I write this up because I got the "You should be a teacher" comment at the end, which I took as a great compliment.

Wednesday, March 23, 2011

Fun with Tuning!

It's been an interesting and fun night.

Started, innocently enough, with a tweet (go figure, reason #1042 why Twitter rocks) about my CTAS operation completing.



That resulted in a flurry of activity and some actual learning on my part.

Of course you have the smart-ass (tweet), Matt Topper [@topperge]



I quickly did the math, it was only 4,919 times faster than mine. Though after tonight, I would have a hard time believing anything I say.

My CTAS operation created 102 GB of data in a shade over 7 hours. He did 70 TB per hour. Whatever. Showoff.

I need to back up a little actually. I have been posting these numbers over the last few days. Yesterday, Martin Berger [@martinberx] sees one of those tweets (a cry for help?), and follows up via email (he rocks anyway; he sent katezilla a postcard from Vienna last year too).

We've exchanged a few emails, mostly me telling him I have no idea what he's talking about and then me trying to explain what I am talking about. Or something.

Tonight (yesterday?) he asked for an AWR report. I told him I disabled everything via DBCA. He told me I probably didn't, nicely. Then he pointed me to Tim Hall's post on running awrrpt.sql. Following those easy instructions, I ran the report. I guess I didn't turn it off.

So far, Matt's a smart-ass, and Martin rocks.

Then Greg Rahn (@gregrahn) joins.



So I have cruddy disks? (read it again Justice).

"Seems likely that the disk writes are the slow side of the execution. The read side probably faster. Got SQL Monitor report?"

I'm thinking, "...", actually, I wasn't. I was stuck on disk. But I could get him a SQL Dev report.

With that, he gets the SQL ID and tells me to do this:

Oracle 11g: Real-Time SQL Monitoring Using DBMS_SQLTUNE.REPORT_SQL_MONITOR

That's 3 years old. Wow.

BTW, here's the SQL statement I was running:
CREATE TABLE claim
COMPRESS BASIC
NOLOGGING
AS
SELECT /*+ PARALLEL( c, 8 ) */
  date_of_service,
  date_of_payment,
  claim_count,
  units,
  amount,
  ...
  ...
  ( CASE
      WHEN application_year IS NULL THEN '9'
      ELSE application_year
    END ) application_year,
  ( CASE
      WHEN accounting_code IS NULL THEN '9'
      ELSE accounting_code
    END ) accounting_code,
  ( CASE
      WHEN claim_form_type IS NULL THEN '9'
      ELSE claim_form_type
    END ) claim_form_type,
  ( CASE
      WHEN diagnosis_code_1 IS NULL THEN '-999'
      ELSE diagnosis_code_1
    END ) diagnosis_code_1
  ...
  ...
FROM claims c;
The night gets even crazier, old friend, Tom Roach (@tomroachoracle) has now ssh'd into my VM. He's doing all kinds of craziness. He walked me through sar, iostat, and a few other tricks. At least now I know how to use the history command so I can replicate.

Meanwhile, Matt is still heckling me and I get an email from Greg after reviewing the SQL Monitor report.

(I'm paraphrasing here)

"Take a look at your CTAS again, anything jump out at you?"

Me: (To myself): "No"
Me: (staring at the email)

"Perhaps you have a parallel hint on the select but not on the table, like this"
CREATE TABLE claim
COMPRESS BASIC
NOLOGGING
PARALLEL 8
AS
SELECT /*+ PARALLEL( c, 8 ) */
  date_of_service,
  date_of_payment,
  claim_count,
  units,
  amount,
  ...
  ...
Wow. Really? How did he do that? Impressive. Cool!

I admit to Greg that I thought he was talking about hardware in his original tweet. He said something pithy. I bowed.

So that information (kinda important, huh?), coupled with Mr. Roach's look at what was happening using sar, told me something...I remember this: CPU wasn't being fully utilized. I can't remember the exact numbers, so let's call it 50%. I told him I was about to rerun (it's about 1:30 AM at this point); he suggested upping the DOP to 16 from 8. Sure. I'll do what I'm told.

I reran the statement with the missing PARALLEL clause added and the DOP upped on both the table and the SELECT. As I was putting this together, it finished. 2.5 hours. Waaaay better than 7.5 hours. Tolerable for me since I'm not as cool as Matt (who was only on a 1/4 rack).

I learned stuff...mostly about how little I do know. I'll try to write up more notes in the future so I don't forget everything and so no one who helped me will have wasted their time.

Thanks so much to Tom, Martin and Greg, your help is greatly appreciated.

Monday, March 14, 2011

Fun with SQL - Part 578

by Enrique Aviles [twitter]
The first guest post in a while. I met Mr. Aviles via Twitter (yeah for Twitter!) some time back. We got to meet in real life at our local SOUG meeting when Tanel Põder came to town. He's also graced the front page for a while wearing his oraclenerd t-shirt (which is now his avatar on Twitter).

Part of my job consists of tuning SQL queries. Opportunities abound and present themselves in many different scenarios. A few weeks ago I was assigned a trouble ticket. I was the lucky winner because, after some investigation by the developers, the culprit was identified as an SQL issue that was causing a stack overflow in Hibernate. For those unfamiliar with it, Hibernate is an Object Relational Mapping library for Java (more info here). The stack overflow occurred during string manipulations probably during the code generation phase. I didn’t have to do much investigation of my own because the developers indicated the offending query was composed of 60 IN clauses, each clause containing 1000 elements. The query resembled something like this:
SELECT COL1, COL2, COL3
FROM SOME_TABLE
WHERE UNIQUE_ID IN (1, 2, 3,..., 1000)
   OR UNIQUE_ID IN (1001, 1002, 1003,..., 2000)
   OR UNIQUE_ID IN (2001, 2002, 2003,..., 3000)
...
   OR UNIQUE_ID IN (59001, 59002, 59003,..., 60000)
I was happy to see the query generation failed before it reached the database. It is usually better to have the application fail with a glaring error during unit tests than to allow a potential failure to slip into production. Fortunately, we test against a database that's a slightly older copy of the production database, so we execute all our queries using real data volumes, not just against artificial tiny development databases. Had we only tested on a development database, this issue would have made it to production with two possible outcomes:
- The application fails with the Hibernate stack overflow error that triggered the trouble ticket.
- Given a smaller set of unique ids, Hibernate successfully generates the query and sends it to the database.

Since the query never reached the database, it is hard to tell how well it would have performed.

For a handful of IN clauses, it is safe to assume the query would have run in a reasonable amount of time but 60 IN clauses could have possibly taken an unacceptable amount of time. Even if the database is able to process 60 IN clauses with no impact to the user, generating such statements as
UNIQUE_ID IN (1001, 1002,..., 2000)
UNIQUE_ID IN (2001, 2002,..., 3000)
UNIQUE_ID IN (59001, 59002, 59003,..., 60000)
in a query is not a good idea for various reasons. In the absence of bind variables, a lot of CPU will be consumed hard parsing SQL statements, since potentially every generated query becomes a brand new query sent to the database even if the number of IN clauses remains static. Using bind variables lessens the load but doesn't eliminate hard parsing as much as it should, because queries with a different number of IN clauses become unique, thus unshareable. Both cases fill the shared pool with unshareable SQL, although the absence of bind variables would fill it faster. This is a waste of CPU cycles and memory. We can significantly reduce hard parsing and eliminate the risk of generating a huge SQL statement by implementing a temporary table.

Instead of selecting, and sending to the application server, all the values that eventually ended up in a series of IN clauses, we could simply insert those values into a temporary table and modify the original query to join to it. The following is a sample temporary table:
CREATE GLOBAL TEMPORARY TABLE IDS_TMP 
(
ID NUMBER
)
ON COMMIT PRESERVE ROWS;
The table was loaded, via INSERT, with the same query that had been used to fetch the IDs for the IN clauses. Once the temporary table contained all desired IDs, the application generated the following query:
SELECT COL1, COL2, COL3
FROM SOME_TABLE TAB
WHERE EXISTS
  (SELECT TMP.ID
   FROM IDS_TMP TMP
   WHERE TMP.ID = TAB.UNIQUE_ID);
Of course, we could have also generated a query with an INNER JOIN or with a single IN clause, but EXISTS worked fine for our case. This is obviously a much shorter SQL text. It remains static regardless of how many IDs are selected since they are hidden in the temporary table. Additionally, there is no need to send all IDs to the application server, spend time building a long character string, and send them back as part of a SQL query. This translates into one hard parse and many executions, so the risk of wasting shared pool and CPU is eliminated.
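For reference, the "single IN clause" variant usually gets written with a bound collection rather than literals; a sketch using the built-in SYS.ODCINUMBERLIST type (table and column names are placeholders):

```sql
-- One bound collection replaces thousands of literals, so the SQL text
-- stays constant no matter how many IDs are passed in.
SELECT col1, col2, col3
FROM some_table tab
WHERE tab.unique_id IN
      ( SELECT column_value
        FROM TABLE( SYS.ODCINUMBERLIST( 1001, 1002, 1003 ) ) );
```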

The “magic” of temporary tables makes this a safe solution since data inserted is only available per session. User A cannot select data inserted by user B and vice versa. Temporary data disappears automatically after the session disconnects so there is no need to explicitly DELETE or TRUNCATE the table. At the end of the day, this proved to be a suitable solution.

The fact that IN clauses only support 1000 elements was seen as a limitation. If the database imposes a limit, chances are that defeating it with apparently clever workarounds will result in slow-performing queries and increase the risk of wasting system resources. When faced with similar limitations, we should ask ourselves if there are other means of achieving the same result that don't require circumventing what was considered a database shortcoming.

Thursday, November 4, 2010

CyanogenMod 6.0.2 - Email Force Close

This week I managed to brick my phone, then, with help, unbrick it, and now, like an ID10T, I'm messing with it again.

After getting it fixed yesterday, I reinstalled CyanogenMod 6.0.2. Lo and behold, those Email Force Close errors showed up again. Great.

Fortunately I use GMail, so it doesn't really affect me, but it is very annoying.

With a trick I learned yesterday, I decided to see what was going on.

The trick is the Android Debug Bridge, or ADB. Among other things, it allows you to run shell commands on your phone. Since I have had Ubuntu for close to 2 years now, I'm naturally a Linux expert (quit laughing).

The specific command I am talking about is logcat. You plug your phone in, run adb logcat and you can see what's going on...it's a log file.

With it turned on, I went to applications and selected Email.



Sorry! The application Email (process com.android.email) has stopped unexpectedly. Please try again.

I then checked the log output...and what could it be? I'm thinking it's some sort of java error, but after reading more closely...
I/Database(19754): sqlite returned: error code = 1, msg = no such column: accountColor
D/AndroidRuntime(19754): Shutting down VM
W/dalvikvm(19754): threadid=1: thread exiting with uncaught exception (group=0x4001d7e0)
E/AndroidRuntime(19754): FATAL EXCEPTION: main
What's that? Database? Really?

A little further down:
E/AndroidRuntime(19754): java.lang.RuntimeException: Unable to create service 
com.android.exchange.SyncManager: android.database.sqlite.SQLiteException: no such column:
accountColor: , while compiling: SELECT
_id,
displayName,
emailAddress,
syncKey,
syncLookback,
syncInterval,
hostAuthKeyRecv,
hostAuthKeySend,
flags,
isDefault,
compatibilityUuid,
senderName,
ringtoneUri,
protocolVersion,
newMessageCount,
securityFlags,
securitySyncKey,
signature,
accountColor
FROM Account
(formatted for better readability)

So someone forgot to add a column? Nice.

Now I know I can put a database on there, I'll be more likely to write something for it. :)

Monday, August 23, 2010

Fun with SQL - Part 59

I got an email today titled, "Your Favorite Movie."

It claims to successfully calculate your favorite movie by picking a number, from 1 to 9, then doing some calculations and applying the result to a list of movies.

Years ago, my first boss got a similar email and began working through it via SQL. I was very impressed, both by the idea of using SQL and that he could do it in SQL. He was (is) a smart, creative guy.

I'm sure you've seen this one before:
Your Favorite Movie

Try this test. Scroll down and do the quiz as it instructs and find out what movie is your favorite. This amazing math quiz can likely predict which of 18 films you would enjoy the most. Don't ask me how but it really works!

Pick a number from 1-9.

Multiply by 3.

Add 3.

Multiply by 3 again.

Now add the two digits together to find your predicted favorite movie in the list of 18 movies below.

Mine was "Gone with the Wind" - exactly right! So be honest, and do it before you scroll down to see the list below. It's easy and it works.
How do you do this in SQL? It's not that hard, really. First, generate the numbers 1 through 9; a CONNECT BY against DUAL will accomplish that.
SELECT rownum
FROM dual
CONNECT BY LEVEL <= 9
Then you just need to apply the calculations. First go, I put the calcs in the outer query:
SELECT rn, ( ( ( rn * 3 ) + 3 ) * 3 ) calc_#
FROM
(
SELECT rownum rn
FROM dual
CONNECT BY LEVEL <= 9
)
/

    RN     CALC_#
------ ----------
     1         18
     2         27
     3         36
     4         45
     5         54
     6         63
     7         72
     8         81
     9         90
A pattern jumps out at me immediately...

So I moved the calculation to the inner query and used SUBSTR to pull out each digit, then added them together. I am sure there are better ways; this was quick and dirty.
SELECT
rn,
SUBSTR( t, 1, 1 ) + SUBSTR( t, 2, 1 ) total
FROM
(
SELECT
rownum rn,
( ( ( rownum * 3 ) + 3 ) * 3 ) t
FROM dual
CONNECT BY LEVEL <= 9
)
/
The results reveal the pattern rather quickly.
    RN      TOTAL
------ ----------
     1          9
     2          9
     3          9
     4          9
     5          9
     6          9
     7          9
     8          9
     9          9
Of course, #9 is the movie that you would never pick. No magic here: ((n * 3) + 3) * 3 = 9(n + 1), and the digits of every two-digit multiple of 9 sum to 9. It's funny though:

1. Gone With The Wind
2. E.T.
3. Beverly Hills Cop
4. Star Wars
5. Forrest Gump
6. The Good, The Bad, and the Ugly
7. Jaws
8. Grease
9. The Joy of Anal Sex With A Goat
10. Casablanca
11. Jurassic Park
12. Shrek
13. Pirates of the Caribbean
14. Titanic
15. Raiders Of The Lost Ark
16. Home Alone
17. Mrs. Doubtfire
18. Toy Story

Tuesday, May 11, 2010

OBIEE: Gotcha #1.1

Another, funny in hindsight, continuation on Gotcha #1.

Sadly I can't speak to the specifics of what I was doing but it did involve the coolest thing ever, Exadata.

We ran into some problems when our reports weren't returning the correct results. Specifically, the year-over-year (last year's) data was wrong.

I suspected one thing, someone else suspected another and a third person got it right.

Turns out, we had added a "blank" logical column to the RPD. Not sure of the exact reason why or what was trying to be accomplished, but there it was.

As I dug through the physical SQL, I noticed a bunch of '' (that's back-to-back single quotes), essentially an empty string. I have read about how Oracle handles empty strings and how various vendors treat NULL values differently, so I suspected this was the cause. My first thought wasn't correct, though.

First a sample of how Oracle handles both empty strings and NULL values in a WHERE clause.
CJUSTICE@TESTING>SELECT * FROM dual WHERE '' = '';

no rows selected
Oracle treats the empty string as NULL, so comparing it to itself evaluates to NULL (unknown), not TRUE, and no rows are returned.
CJUSTICE@TESTING>SELECT * FROM dual WHERE NULL = NULL;

no rows selected
Of course everyone knows that NULL is the absence of a value, so comparing it to itself likewise evaluates to NULL (unknown), not TRUE.
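If you genuinely need two NULLs (and therefore two empty strings) to compare as equal in Oracle, you have to say so explicitly. Two common sketches, the second using DECODE, which is documented to treat two NULLs as equal:

```sql
-- Explicit pattern, for hypothetical columns a and b:
--   WHERE a = b OR ( a IS NULL AND b IS NULL )

-- DECODE treats two NULLs as equal, so this returns the row:
SELECT * FROM dual WHERE DECODE( NULL, NULL, 1, 0 ) = 1;
```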

My original thought was that somehow the '' (empty string) was being used in the GROUP BY clause...but that didn't make much sense as even if it was used, it would be the same "value" as in the SELECT.

Then I saw it...after digging through 500 lines of OBIEE generated SQL, there it was.

In the first Gotcha, I had been capturing physical SQL not tuned specifically for Oracle...it was just using a generic ODBC connection. I'm glad I made that mistake though, because I had all that SQL.

For this specific report using the generic ODBC connection, 4 queries were sent to the database and stitched together in OBIEE to display the reports.

When the driver was switched to an Oracle-specific one, it created one giant WITH statement. Actually, it was about 8 different WITH clauses, which made it hard to analyze because of the dependencies on previous WITH clauses in the same statement.

I found it by running the first 2 SELECT statements individually. I got the results from this year and the results from last year. Perfect, but it added to the "Why?" of what was going on.

Further down in this massive SQL statement, I found 2 FULL OUTER JOINs. Could it be? Could it really be joining on that '' column?

Yes it could and it would explain exactly why incorrect results were being returned.

Per the example above, comparing an empty string against itself never evaluates to TRUE, so the table being joined to would never return results.

We decided to add a character to it, a dot (.), and sure enough, the correct results returned.
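The shape of the problem, reduced to a sketch (hypothetical tables standing in for the generated SQL):

```sql
-- The '' literal is NULL in Oracle, so this join predicate is never TRUE
-- and the FULL OUTER JOIN never matches rows across the two sides:
SELECT a.amt, b.amt
FROM this_year a
FULL OUTER JOIN last_year b
  ON a.blank_col = b.blank_col;  -- '' = '' -> NULL, never a match

-- With the RPD column changed to '.', the predicate becomes '.' = '.',
-- which is TRUE, and the rows line up again.
```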

Another realization after the fact (made by my colleague): the Oracle-specific driver also decreases the number of queries sent to the database, which means more available connections and less work for the BI Server. Super win!

Wednesday, April 28, 2010

ORA-01820: format code cannot appear in date input format

I learned something new today...you can't use week date format qualifiers in TO_DATE.

I was trying to transform a week to a month like this:
CJUSTICE@TESTING>SELECT TO_DATE( 200803, 'YYYYWW' ) FROM DUAL;
SELECT TO_DATE( 200803, 'YYYYWW' ) FROM DUAL
*
ERROR at line 1:
ORA-01820: format code cannot appear in date input format
Then tried one "W"
  1* SELECT TO_DATE( 200803, 'YYYYW' ) FROM DUAL
CJUSTICE@TESTING>/
SELECT TO_DATE( 200803, 'YYYYW' ) FROM DUAL
*
ERROR at line 1:
ORA-01820: format code cannot appear in date input format
Then "IW"
  1* SELECT TO_DATE( 200803, 'YYYYIW' ) FROM DUAL
CJUSTICE@TESTING>/
SELECT TO_DATE( 200803, 'YYYYIW' ) FROM DUAL
*
ERROR at line 1:
ORA-01820: format code cannot appear in date input format
Datetime Format models in 10gR2

Lo and behold, the Error Messages manual didn't have it (at least not where I would expect it).



Put it up on twitter and Gary Myers responded:



That's good to know.

Now I just need to figure out how to transform that week into a month.

I'm trying to generate some test data for a quick...test.

I switched to using SYSDATE, as that would allow more flexibility to transform the other way.
SELECT DISTINCT
TO_NUMBER( TO_CHAR( r, 'YYYYWW' ) ) fiscal_year_week,
TO_NUMBER( TO_CHAR( r, 'YYYYMM' ) ) fiscal_year_month
FROM
(
SELECT SYSDATE - rownum r
FROM dual
CONNECT BY LEVEL <= SYSDATE - TRUNC( SYSDATE, 'YEAR' )
)
Which gave me exactly what I needed (note that a week straddling a month boundary, like 201009, shows up under both months):
FISCAL_YEAR_WEEK FISCAL_YEAR_MONTH
---------------- -----------------
          201015            201004
          201014            201004
          201012            201003
          201011            201003
          201007            201002
          201006            201002
          201002            201001
          201016            201004
          201010            201003
          201003            201001
          201017            201004
          201013            201003
          201009            201003
          201009            201002
          201005            201002
          201004            201001
          201013            201004
          201008            201002
          201005            201001
          201001            201001

Monday, April 26, 2010

Analytics in Views

The question:

Does embedding an analytic function into a view cause a WINDOW SORT every time the view is called?

I say no: the WINDOW SORT happens only if the analytic column is actually referenced.
SH@TESTING>SELECT * FROM v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Prod
PL/SQL Release 10.2.0.3.0 - Production
CORE 10.2.0.3.0 Production
TNS for 32-bit Windows: Version 10.2.0.3.0 - Production
NLSRTL Version 10.2.0.3.0 - Production
The view definition:
CREATE OR REPLACE
VIEW vw_sales
AS
SELECT
prod_id,
cust_id,
time_id,
channel_id,
promo_id,
quantity_sold,
amount_sold,
SUM( amount_sold ) OVER
( PARTITION BY
prod_id,
cust_id
ORDER BY time_id ) amount_sold_running
FROM sales;
First, I run an explain plan for SELECT *:
SH@TESTING>EXPLAIN PLAN FOR SELECT * FROM VW_SALES;

Explained.

Elapsed: 00:00:02.19
SH@TESTING>@EXPLAIN

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 2700574370

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 918K| 87M| | 7841 (3)| 00:01:35 | | |
| 1 | VIEW | VW_SALES | 918K| 87M| | 7841 (3)| 00:01:35 | | |
| 2 | WINDOW SORT | | 918K| 25M| 84M| 7841 (3)| 00:01:35 | | |
| 3 | PARTITION RANGE ALL| | 918K| 25M| | 389 (12)| 00:00:05 | 1 | 28 |
| 4 | TABLE ACCESS FULL | SALES | 918K| 25M| | 389 (12)| 00:00:05 | 1 | 28 |
----------------------------------------------------------------------------------------------------
Now, if I select only the columns I want/need, will Oracle still perform the WINDOW SORT?
SH@TESTING>EXPLAIN PLAN FOR
2 SELECT QUANTITY_SOLD, AMOUNT_SOLD
3 FROM vw_sales;

Explained.

Elapsed: 00:00:00.02
SH@TESTING>@explain

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 1550251865

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 918K| 7178K| 389 (12)| 00:00:05 | | |
| 1 | PARTITION RANGE ALL| | 918K| 7178K| 389 (12)| 00:00:05 | 1 | 28 |
| 2 | TABLE ACCESS FULL | SALES | 918K| 7178K| 389 (12)| 00:00:05 | 1 | 28 |
---------------------------------------------------------------------------------------------
Oracle is smart enough not to perform the WINDOW SORT if the column is not selected.

Nice.

Thursday, April 15, 2010

Fun with OBIEE SQL - Part I

I suspect this will be the first of many in the series.

I was grabbing physical SQL for some tests and noticed this little gem.

I have prettified it for your reading pleasure.
WITH 
SAWITH0 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH1 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') ) ),
SAWITH2 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH3 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH4 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH5 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH6 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH7 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH8 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH9 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH10 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH11 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
),
SAWITH12 AS
(
select T351236.TIME_ID as c2
from TIME T351236
where ( T351236.DAY = TO_DATE('2010-04-14' , 'YYYY-MM-DD') )
)
(
(
(
(
(
(
(
(
(
(
(
(
select distinct '201002' as c1
from SAWITH0
union
select distinct '201003' as c1
from SAWITH1
)
union
select distinct '201004' as c1
from SAWITH2
)
union
select distinct '201005' as c1
from SAWITH3
)
union
select distinct '201006' as c1
from SAWITH4
)
union
select distinct '201007' as c1
from SAWITH5
)
union
select distinct '201008' as c1
from SAWITH6
)
union
select distinct '201009' as c1
from SAWITH7
)
union
select distinct '201010' as c1
from SAWITH8
)
union
select distinct '201011' as c1
from SAWITH9
)
union
select distinct '201012' as c1
from SAWITH10
)
union
select distinct '201013' as c1
from SAWITH11
)
union
select distinct '201014' as c1
from SAWITH12
)
First off, I had never thought to use the WITH clause in such a way. Note there are 13 WITH statements above.

Second, WTF is up with all those UNIONs?

Third (and this isn't OBIEE's fault), what's up with the DISTINCT clause?

All the WITH clauses are doing is getting the TIME_ID (in other words, a single record) for the current day, 13 times over, to produce 13 records. If this had been written against Oracle originally, I probably would have used DUAL to generate the 13 rows. Since those values are periods (weeks, actually), you have a TIME dimension to work with, so maybe something like this:
SELECT DISTINCT week
FROM time
WHERE day <= SYSDATE
AND day >= SYSDATE - ( 13 /*weeks*/ * 7 )
ORDER BY week DESC;
Looks like some low hanging fruit to me.

Wednesday, April 7, 2010

Fun with SQL - Part 39

I've been busy helping to get an environment set up for an Exadata POC. w00t.

What this means, is getting all of these disparate data structures into a single Oracle database. OBIEE is run on Windows and thankfully, 11gR2 was released for 64bit and 32bit this week. I downloaded it, installed the software (4 minutes) then used the DBCA to install the database (3 minutes). I really need to get a better computer because these activities take too long on my itty bitty laptop.

Anyway, there are 5 schemas across 3 databases. I have to 1) get access to each environment and 2) copy over the data. I've used a mixture of COPY, Data Pump, import/export from SQL Developer, and database links to move all this stuff around.

Finally I get to my last table, a table full of usernames and passwords.

Only the password column is of the LONG variety.
ERROR at line 1:
ORA-00997: illegal use of LONG datatype
Bollocks.

I then went to AskTom, because I knew there was an easy way to convert a LONG to a VARCHAR2. Nope; the source database doesn't recognize ROWIDs in this context.

Hmmm...how about TO_LOB? Nope.

I gave up due to a time constraint.

I then asked one of the other developers if he could dump the table for me. While I was explaining it to him...I realized I didn't need the stupid password column. Why? Because this was just for a POC/demo, the passwords would all be the same. CREATE TABLE t AS SELECT username... would work just fine. And it did.

Finally, to the fun part, at least for me.

All of these users need database accounts as well. How about scripting that out?

What would I need? CREATE USER statements and GRANT create session TO user; statements.

I started with, what would be, my inner query:
SELECT
username,
'CREATE USER ' || username || ' IDENTIFIED BY password DEFAULT TABLESPACE USERS;' create_s,
'GRANT create session TO ' || username || ';' grant_s
FROM t
ORDER BY username
That's all well and good, but I need these to run in order, and a column-by-column approach wouldn't work (without manual intervention, anyway). Now that I think about it, I could have solved this a lot more easily, but I'll save that for the end. It was too much "fun" trying to figure out this solution. Can you say Complicator's Gloves?

Indulge me for a moment.

Basically, with 3 columns (really only 2, I don't need to display the username), I need to pivot these 2 columns into 2 rows. I can use dual for that.
SELECT rownum my_rownum
FROM dual
CONNECT BY LEVEL <= 2;
Now I hook them up:
SELECT
username,
( CASE WHEN MOD( rownum, 2 ) = 0 THEN 1 ELSE 2 END ) c,
( CASE
WHEN MOD( rownum, 2 ) = 0 THEN create_s
ELSE grant_s
END ) t
FROM
(
SELECT
username,
'CREATE USER ' || username || ' IDENTIFIED BY password DEFAULT TABLESPACE USERS;' create_s,
'GRANT create session TO ' || username || ';' grant_s
FROM t
ORDER BY username
) a,
dual
CONNECT BY LEVEL <= 2
ORDER BY 1, 2;
This works, sort of. Adding a DISTINCT clause helps...but then I only have 1 row for each user. Hmmm...

I add the rownum into the outer query to see what it's doing.

OK, using rownum was the wrong approach. How about ROW_NUMBER in the inner query? I can set its ORDER BY clause to order by username.
SELECT
username,
( CASE WHEN MOD( my_rownum, 2 ) = 0 THEN 1 ELSE 2 END ) c,
( CASE
WHEN MOD( my_rownum, 2 ) = 0 THEN create_s
ELSE grant_s
END ) t
FROM
(
SELECT
ROW_NUMBER() OVER ( ORDER BY username ) my_rownum,
username,
'CREATE USER ' || username || ' IDENTIFIED BY password DEFAULT TABLESPACE USERS;' create_s,
'GRANT create session TO ' || username || ';' grant_s
FROM t
ORDER BY username
) a,
dual
CONNECT BY LEVEL <= 2
ORDER BY 1, 2;
Closer, but not quite right. I'm not getting the order I want (CREATE USER, then GRANT). Let's use dual for that row number.

Here's my final query (sorry, I'm tired and crazy):
SELECT DISTINCT
username,
( CASE WHEN MOD( b.m, 2 ) = 0 THEN 1 ELSE 2 END ) c,
( CASE
WHEN MOD( b.m, 2 ) = 0 THEN create_s
ELSE grant_s
END ) t
FROM
(
SELECT
username,
'CREATE USER ' || username || ' IDENTIFIED BY password DEFAULT TABLESPACE USERS;' create_s,
'GRANT create session TO ' || username || ';' grant_s
FROM pass_sync
ORDER BY username
) a,
(
SELECT rownum m
FROM dual
CONNECT BY LEVEL <= 2
) b
ORDER BY 1;
So what's the easy way?
SELECT
username,
'CREATE USER ' || username || ' IDENTIFIED BY password DEFAULT TABLESPACE USERS;' || chr(10) ||
'GRANT create session TO ' || username || ';' s
FROM p
Yeah, that || CHR(10) || would have worked just fine. Oh well.
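For completeness, another route I could have taken (just a sketch against the same table): generate each statement as its own row with UNION ALL and an explicit ordering column, which avoids both the dual trick and the CHR(10) concatenation:

```sql
SELECT s
FROM
(
  SELECT
    username,
    1 ord,
    'CREATE USER ' || username
      || ' IDENTIFIED BY password DEFAULT TABLESPACE USERS;' s
  FROM t
  UNION ALL
  SELECT
    username,
    2,
    'GRANT create session TO ' || username || ';'
  FROM t
)
ORDER BY username, ord;
```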

Wednesday, February 3, 2010

PRIMARY KEY and NOT NULL

I've seen this far too often: a table with a primary key (good) and a separate NOT NULL check constraint on the same column. The primary key already enforces NOT NULL.

Stop doing it. Watch.
CREATE TABLE t
(
id NUMBER
CONSTRAINT pk_id PRIMARY KEY
);

SH@I_HAVE_NO_IDEA>INSERT INTO t ( id ) VALUES ( 1 );

1 row created.

Elapsed: 00:00:00.33
SH@I_HAVE_NO_IDEA>INSERT INTO t ( id ) VALUES ( NULL );
INSERT INTO t ( id ) VALUES ( NULL )
*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("SH"."T"."ID")
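For contrast, the redundant pattern I keep running into looks something like this (a sketch):

```sql
-- The NOT NULL here is redundant: the primary key constraint
-- already guarantees the column cannot be NULL.
CREATE TABLE t2
(
  id NUMBER NOT NULL
     CONSTRAINT pk_t2_id PRIMARY KEY
);
```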
As HillbillyToad said,


It is better than no constraint, that's for sure. The heart was in the right place...

Monday, February 1, 2010

Database Table Size

I've always wondered how big a table is...up until recently I depended on the DBAs to retrieve such information for me.

Thanks to my good, and very helpful, friend, Mr. Thomas Roach, I no longer have to wait or bother the DBAs.

The %_SEGMENTS views (USER_SEGMENTS, DBA_SEGMENTS) contain a column called BYTES. Use this column to determine the size of your segments, with just a little math.
SELECT segment_name, SUM( bytes ) / 1024 / 1024 mb
FROM user_segments
GROUP BY segment_name
ORDER BY 1
/

SEGMENT_NAME                           MB
------------------------------ ----------
BMP_DIVNBR_CUST                    13.375
BMP_DIVNBR_JOINFACT               37.0625
BMP_DIVNBR_PROD                    13.375
BMP_DIVNBR_SALES                  78.6875
CUST                                  940
DIV                                  .125
IDX_CUSTSKDIVNBR_SALES          2676.6875
IDX_PRODSKDIVNBR_SALES          2701.6875
JOIN_FACT                        4298.125
PIM                                   312
PK_DIVSK                             .125
PK_PIMSK                                8
PK_TMSK                               .25
PROD                            2274.4375
SALES                          116122.375
TIME                                 .875
UQ_CUSTSKDIVNBR_CUST               40.125
UQ_PRODSKDIVNBR_PROD                  144
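A small variation on the same query (an untested sketch, using the same USER_SEGMENTS columns) separates the tables from their indexes:

```sql
SELECT
  segment_type,
  segment_name,
  SUM( bytes ) / 1024 / 1024 mb
FROM user_segments
GROUP BY segment_type, segment_name
ORDER BY mb DESC;
```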
There's a plethora of these scripts out in the wild...but I was originally inspired by helping Mr. Neil Kodner out back in November (which I refer to as the "missing" month). Read his take on it here.

Sunday, January 10, 2010

Dynamic Single Row Query

Once in a while, while doing analysis, I'd like to see the row count from each table, just to get an idea of how much data I am working with. Of course this doesn't measure width, but it is a metric.

Of course I could go through and write a SELECT COUNT(*) against every table; that works if there are, like, 4 tables. Anything more and...well, it sucks.

Yes, I could gather stats on the schema and then reference NUM_ROWS, but this is an occasion where I don't have the necessary privileges.

I've been trying to do this for years, with no luck, until yesterday. Typically I would do something like this:
DECLARE
l_count INTEGER;
l_sql VARCHAR2(200);
BEGIN
FOR i IN ( SELECT table_name
FROM dba_tables
WHERE owner = 'SYSTEM' )
LOOP
l_sql := 'SELECT COUNT(*) INTO :1 FROM SYSTEM.' || i.table_name;
EXECUTE IMMEDIATE l_sql USING l_count, i.table_name;
dbms_output.put_line( i.table_name || '-' || l_count );
END LOOP;
END;
/
Which would, of course, give me this:
DECLARE
*
ERROR at line 1:
ORA-01745: invalid host/bind variable name
ORA-06512: at line 10
Only now did I realize the fatal flaw with that: the INTO clause can't live inside the dynamic SQL string; it belongs to EXECUTE IMMEDIATE itself, so binding the output that way would never work.

For whatever reason, this time I decided to research it. I came across Flavio Cassetta's site and this post: SQL Error: ORA-01745: invalid host/bind variable name. I'll skip a few iterations and get to the final product (because I went through this exercise on Friday night and, quite honestly, don't remember all the permutations).
SET SERVEROUTPUT ON

DECLARE
l_count INTEGER;
l_table VARCHAR2(61);
BEGIN
FOR i IN ( SELECT table_name
FROM dba_tables
WHERE owner = 'SYSTEM' )
LOOP
l_table := 'SYSTEM.' || i.table_name;
EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM ' || l_table INTO l_count;
dbms_output.put_line( l_table || '-' || l_count );
END LOOP;
END;
/

...snip
SYSTEM.LOGMNR_USER$-0
SYSTEM.LOGMNR_OBJ$-0
SYSTEM.LOGMNR_DICTIONARY$-0
SYSTEM.LOGMNR_DICTSTATE$-0
SYSTEM.OL$NODES-0
SYSTEM.OL$HINTS-0
SYSTEM.OL$-0

PL/SQL procedure successfully completed.
Done.
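For what it's worth, there's also a SQL-only idiom for this that avoids PL/SQL entirely, using DBMS_XMLGEN to run the COUNT(*) per row of the dictionary (a sketch; I haven't timed it against the loop):

```sql
SELECT
  table_name,
  TO_NUMBER(
    EXTRACTVALUE(
      XMLTYPE( DBMS_XMLGEN.GETXML(
        'SELECT COUNT(*) c FROM SYSTEM.' || table_name ) ),
      '/ROWSET/ROW/C' ) ) row_count
FROM dba_tables
WHERE owner = 'SYSTEM';
```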

Thursday, September 17, 2009

SQL: Calculate Wasted Time

Inspired by this today:



I answered with this:



I started to do it in SQL, but then I got impatient and finished it in Excel just to get it done. Some time later, I decided to do it in SQL anyway...just because it's fairly easy.

I know there are better/different ways to do it, so share please.


SELECT
ROUND( d * 10 * 14 * sal_per_minute, 2 ) yearly_cost,
ROUND( ( d / 4 ) * 10 * 14 * sal_per_minute, 2 ) quarterly_cost,
ROUND( ( d / 12 ) * 10 * 14 * sal_per_minute, 2 ) monthly_cost,
ROUND( 5 * 10 * 14 * sal_per_minute, 2 ) weekly_cost,
ROUND( 10 * 14 * sal_per_minute, 2 ) daily_cost
FROM
(
SELECT
COUNT( CASE
WHEN TO_CHAR( s, 'D' ) NOT IN ( 1, 7 )
THEN 1
ELSE NULL
END ) - 10 d,
MAX( sal ) sal_per_minute
FROM
(
SELECT
TO_DATE( '31-DEC-2008', 'DD-MON-YYYY' ) + rownum s,
( 100000 / 2080 ) / 60 sal
FROM dual
CONNECT BY LEVEL <= 365
)
);

YEARLY_COST QUARTERLY_COST MONTHLY_COST WEEKLY_COST DAILY_COST
----------- -------------- ------------ ----------- ----------
28157.05 7039.26 2346.42 560.9 112.18
Those minutes are precious...so don't waste them.