ORACLENERD: constraints


Wednesday, February 3, 2010

PRIMARY KEY and NOT NULL

I've seen this far too often. A table with a primary key (good) and a check constraint (NOT NULL) on the same column.

Stop doing it. Watch.

CREATE TABLE t
(
  id NUMBER
    CONSTRAINT pk_id PRIMARY KEY
);

SH@I_HAVE_NO_IDEA>INSERT INTO t ( id ) VALUES ( 1 );

1 row created.

Elapsed: 00:00:00.33
SH@I_HAVE_NO_IDEA>INSERT INTO t ( id ) VALUES ( NULL );
INSERT INTO t ( id ) VALUES ( NULL )
                              *
ERROR at line 1:
ORA-01400: cannot insert NULL into ("SH"."T"."ID")
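The data dictionary tells the same story. As a small sketch, assuming the table T above exists in your own schema, USER_TAB_COLUMNS should report the primary key column as not nullable even though no NOT NULL constraint was declared:

```sql
-- Assumes table T from above exists in the current schema.
-- NULLABLE is 'N' when the column has a NOT NULL constraint
-- or is part of a primary key.
SELECT column_name,
       nullable
FROM user_tab_columns
WHERE table_name = 'T';
```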

As HillbillyToad said,

It is better than no constraint, that's for sure. The heart was in the right place...

Thursday, September 17, 2009

T-Shirt: Constraints

Just in case you ever forget what the 5 constraints in an Oracle database are...

Monday, July 27, 2009

IDs (NUMBER) vs. Codes (VARCHAR2)

A slightly different way of putting that is Surrogate vs. Natural keys.

Not that I really want to get into the entire discussion...but I will...a little.

On Friday I was having a discussion with a developer friend after I showed him my Party model. It's all about boozing it up. OK, wishful thinking on my part, it's just a database design based on the Party (or Entity) Model. I've pondered it here and here.

I'm not planning on debating the merits of the Party Model. I do like it though, it seems to me a natural end point. It does not take into account testing of the design, nor does it take into account places where you might denormalize...and many choose to denormalize by keeping addresses in line with people or organizations.

On to my point.

To maintain data integrity without using Check constraints, I'll create a few small lookup tables:

ADDRESS_CODES

PHONE_CODES

EMAIL_CODES

STATE_CODES

Perhaps a few others. ADDRESS_CODES, as the name suggests, would use a code as the key, a VARCHAR2. HOME, WORK, OTHER, etc. Same goes for the others.

I posted my query on twitter Friday morning and received a pretty solid response from 4 people.

The response was swift:

Which was exactly what my conversation with my friend had led to.

In this case, why would Natural keys, VARCHAR2s, be bad?

1. If a user enters them in, they could misspell something, CELL would be CLEL or something. Fair enough.
2. If #1 happens and you have a surrogate key, you just fix the name field instead of the key.

So I started to come around to it, until I got to STATES. States have a 2 letter abbreviation. Why not use that? It's not like they'll change (hah!) right?

Perhaps a distinction is in order then. If it's user entered content, then use a surrogate key. If not, use the supplied code.
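To make that distinction concrete, here is a rough sketch of both flavors. The table names STATE_CODES and PHONE_CODES come from the list above; every column and constraint name is made up for illustration.

```sql
-- Supplied code: the two-letter abbreviation is the key itself.
CREATE TABLE state_codes
(
  state_code VARCHAR2(2)
    CONSTRAINT pk_statecode PRIMARY KEY,
  state_name VARCHAR2(50)
    CONSTRAINT nn_statename_statecodes NOT NULL
);

-- User-entered content: a surrogate key, with the code kept
-- as a unique attribute (hypothetical column names).
CREATE TABLE phone_codes
(
  phone_code_id NUMBER(10)
    CONSTRAINT pk_phonecodeid PRIMARY KEY,
  phone_code VARCHAR2(30)
    CONSTRAINT nn_phonecode_phonecodes NOT NULL
    CONSTRAINT uq_phonecode_phonecodes UNIQUE
);

-- Fixing a misspelling touches one row in the lookup table;
-- child rows that reference phone_code_id are not affected.
UPDATE phone_codes
SET phone_code = 'CELL'
WHERE phone_code = 'CLEL';
```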

I don't know. Is this just my personal preference? Is there a standard? What do you do?

Tuesday, July 21, 2009

Constraints to the Max!

I ran across this question today on the Oracle-l list:

Hi List,

I have read the following but I am looking for a way to create a
case-insensitive database. Is it even possible?

http://askanantha.blogspot.com/2007/07/making-oracle-case-insensitive.html

Thanks,

Roger Xu

Later in the thread, Niall Litchfield, replied with the following:

Assuming that you mean you want to make all string data case insensitive and that the requirement has come from developers who don't want to check their inputs for case errors (though I bet they want to say business logic is an application function) then a check constraint on each column that the inserted/updated value is equal to its uppercase representation is a start. Then they'll have to either check for the constraint failing or start discussions with you about database input validation aka constraints.

That gave me an idea.

CREATE TABLE t
(
  x VARCHAR2(30)
    CONSTRAINT pk_x PRIMARY KEY
    CONSTRAINT ck_upper_x_t CHECK ( x = UPPER( x ) )
);

What does that do? It ensures that the value put in X is always uppercase.

PARTY@TESTING>INSERT INTO t ( x ) VALUES ( 'home' );
INSERT INTO t ( x ) VALUES ( 'home' )
*
ERROR at line 1:
ORA-02290: check constraint (PARTY.CK_UPPER_X_T) violated

PARTY@TESTING>INSERT INTO t ( x ) VALUES ( 'HOME' );

1 row created.

Cool. Now the Application Developers have to deal with it. No wiggle room there.

How far can you go?
Let's see.

DROP TABLE t PURGE;

CREATE TABLE t
(
  x VARCHAR2(30)
    CONSTRAINT pk_x PRIMARY KEY
    CONSTRAINT ck_upper_x_t CHECK ( x = UPPER( x ) )
    CONSTRAINT ck_nospaces_x_t CHECK ( INSTR( x, ' ' ) = 0 )
    CONSTRAINT ck_charonly_x_t CHECK ( REGEXP_INSTR( x, '[[:digit:]]' ) = 0 )
);

I'm sure I could go on and on...but it's kind of fun.

PARTY@TESTING>INSERT INTO t ( x ) VALUES ( 'HOME SCHOOL' );
INSERT INTO t ( x ) VALUES ( 'HOME SCHOOL' )
*
ERROR at line 1:
ORA-02290: check constraint (PARTY.CK_NOSPACES_X_T) violated

No spaces allowed.

PARTY@TESTING>INSERT INTO t ( x ) VALUES ( '12345' );
INSERT INTO t ( x ) VALUES ( '12345' )
*
ERROR at line 1:
ORA-02290: check constraint (PARTY.CK_CHARONLY_X_T) violated

No digits allowed!

Why?
Constraints are awesome. The very definition of constraints is awesome. Some of the best ideas come to you when you are constrained by something, usually time. OK, maybe not time in software development. Time in writing I've heard is pretty cool. Limit the colors an artist has available and see what they come up with. Fewer choices sometimes means better.

Typically I'll constrain the crap out of a data model. With development and testing, some of those will be relaxed. Some will be added.

The point is, don't be afraid of them. Constraints are a very good thing indeed.

Monday, June 29, 2009

Constraints: ENABLE NOVALIDATE

Yesterday while perusing the Concepts Guide, I stumbled across the ENABLE NOVALIDATE keywords for the definition of a Foreign Key constraint. I've always known it was there, just never used it, or thought to use it.

It can be a big benefit while working on a legacy system.

Suppose you have a table, T_CHILD:

CREATE TABLE t_child
(
  child_id NUMBER(10)
    CONSTRAINT pk_childid PRIMARY KEY,
  soon_to_be_parent_id NUMBER(10)
);

INSERT INTO t_child
  ( child_id,
    soon_to_be_parent_id )
SELECT
    rownum,
    TRUNC( dbms_random.value( -9999, -1 ) )
FROM dual
  CONNECT BY LEVEL <= 10;

This table has been around for quite some time. You decide that you would like to constrain the values in the SOON_TO_BE_PARENT_ID column. First, here's the data that exists:

CJUSTICE@TESTING>SELECT * FROM t_child;

  CHILD_ID SOON_TO_BE_PARENT_ID
---------- --------------------
         1                -5560
         2                -1822
         3                -2499
         4                -7039
         5                -8718
         6                -1019
         7                -9997
         8                -9553
         9                -4477
        10                -1458

Now I'll create a table that will contain the values I want to constrain SOON_TO_BE_PARENT_ID to, call it a lookup or reference table.

CREATE TABLE t_parent
(
  parent_id NUMBER(10)
    CONSTRAINT pk_parentid PRIMARY KEY
);

I'll populate it with some data:

INSERT INTO t_parent( parent_id )
SELECT rownum
FROM dual
  CONNECT BY LEVEL <= 10;

CJUSTICE@TESTING>SELECT * FROM T_PARENT;

 PARENT_ID
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

Now I'll add the constraint that references the PARENT_ID column of T_PARENT:

ALTER TABLE t_child
  ADD CONSTRAINT fk_parentid
  FOREIGN KEY ( soon_to_be_parent_id )
  REFERENCES t_parent( parent_id )
  ENABLE
  NOVALIDATE;

and rename the column to PARENT_ID:

ALTER TABLE t_child RENAME COLUMN soon_to_be_parent_id TO parent_id;

What will this do? I should no longer be able to enter a value into T_CHILD.PARENT_ID that does not exist in T_PARENT, but it will ignore anything that already exists.

INSERT INTO t_child
  ( child_id,
    parent_id )
VALUES
  ( 11,
    11 );

INSERT INTO t_child
*
ERROR at line 1:
ORA-02291: integrity constraint (CJUSTICE.FK_PARENTID) violated - parent key not found

Perfect! Now I'll add a value that does exist in T_PARENT.

INSERT INTO t_child
  ( child_id,
    parent_id )
VALUES
  ( 11,
    10 );

1 row created.

Win!
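The old rows are still sitting there, of course. As a sketch of a possible follow-up (not something I ran above), you can list the rows the constraint skipped, check how the data dictionary describes the constraint, and validate it once the data has been cleaned up:

```sql
-- Existing rows with no matching parent; NOVALIDATE left these alone.
SELECT c.child_id,
       c.parent_id
FROM t_child c
WHERE c.parent_id IS NOT NULL
  AND NOT EXISTS ( SELECT NULL
                   FROM t_parent p
                   WHERE p.parent_id = c.parent_id );

-- STATUS and VALIDATED are separate columns in USER_CONSTRAINTS.
SELECT constraint_name,
       status,
       validated
FROM user_constraints
WHERE constraint_name = 'FK_PARENTID';

-- After fixing or removing the orphaned rows, have Oracle check
-- the existing data as well.
ALTER TABLE t_child MODIFY CONSTRAINT fk_parentid VALIDATE;
```

If orphaned rows remain, the VALIDATE step should fail rather than quietly succeed, which is exactly what you want.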

This is just another reminder why you must read the Concepts Guide. By the way, I found the quote I was looking for from Mr. Kyte (h/t @boneist)

"...if you simply read the Concepts Guide...and retain just 10%..., you’ll already know 90% more than most people do"

Tuesday, December 2, 2008

Index those Foreign Keys

I've been reading about this phenomenon for years over on asktom, but I had never actually encountered the problem, until today that is.

I'm helping out doing a small piece on another project. Trying to get back into the habit of "good" unit-testing, I have created some test data. One build script that runs the whole thing, a few user scripts and finally a teardown script that deletes all the data I have created.

Naturally, I run it via SQL*Plus; turning feedback and echo off and using a liberal number of PROMPT directives (very similar to how the ApEx install goes coincidentally). This is what my teardown script reports:

...deleting from child table 1
...deleting from child table 2
...deleting from child table 3
...etc, etc

Nothing fancy. Then I realized on the final DELETE it was hanging up. Any locked objects? Nope. Maybe it's the trigger? I checked, only saw ON INSERT OR UPDATE, confirmed that with another developer. He suggested unindexed foreign keys. Huh? Really? That's impossible you say.

It wasn't impossible. It was true. So I ran my copy of Tom's "find unindexed foreign keys."

Quite a few "****" which is not good.

So I went searching for them myself and came up with this little query (for my specific table):

SELECT 
  b.table_owner,
  b.table_name,
  b.constraint_name,
  b.column_name,
  a.index_name
FROM
  dba_ind_columns a,
  (
    SELECT 
      a.owner table_owner,
      a.table_name, 
      a.constraint_name, 
      b.column_name
    FROM 
      dba_constraints a,
      dba_cons_columns b
    WHERE a.constraint_name = b.constraint_name
      AND a.owner = b.owner
      AND a.table_name = b.table_name
      AND a.r_constraint_name = ( SELECT constraint_name
                                FROM DBA_CONSTRAINTS
                                WHERE TABLE_NAME = 'MY_TABLE'
                                  AND owner = 'SCHEMA_NAME'
                                  AND constraint_type = 'P' )
  ) b
WHERE b.table_owner = a.table_owner (+)
  AND b.column_name = a.column_name (+)
  AND b.table_name = a.table_name (+)
ORDER BY
  b.table_name,
  b.constraint_name;

That gave me a list of all the columns referencing the primary key (some 37 different tables). If index_name was NULL, then I knew I had to add one. Since I have this obsession lately with generating code, I just used this query and added the following:

  ( CASE
      WHEN a.index_name IS NULL THEN
        'CREATE INDEX schema_name.' || b.table_name || '_' || b.column_name || '_idx
        ON ' || b.table_owner || '.' || b.table_name || '( ' || b.column_name || ' )
        TABLESPACE my_tablespace;' 
    END ) sql_text

Now I have my CREATE INDEX statements and all is well.

I run the teardown script again and it finishes like it should.
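For anyone who has not run into this, here is a stripped-down sketch of the situation. The tables P_DEMO and C_DEMO and all names in them are hypothetical, not the ones from the project.

```sql
-- Hypothetical parent and child tables, for illustration only.
CREATE TABLE p_demo
(
  p_id NUMBER(10)
    CONSTRAINT pk_pid_pdemo PRIMARY KEY
);

CREATE TABLE c_demo
(
  c_id NUMBER(10)
    CONSTRAINT pk_cid_cdemo PRIMARY KEY,
  p_id
    CONSTRAINT fk_pid_cdemo REFERENCES p_demo ( p_id )
);

-- Without an index on c_demo.p_id, a DELETE against p_demo has to
-- search c_demo for child rows of every parent row it removes.
-- With the index, that check becomes an index lookup.
CREATE INDEX c_demo_p_id_idx
  ON c_demo ( p_id );
```

Multiply that search by 37 child tables and a teardown script that deletes a lot of parent rows, and the "hang" makes sense.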

Tuesday, July 1, 2008

Lookup Tables

Also known as reference, crosswalk and a few other names.

I love 'em.

I'm not afraid to use them. It certainly makes that table count go up, but you know what you're getting.

I have ADDRESS_TYPES, PHONE_TYPES, PERSON_TYPES (in an intersection table of course) and any other kind of TYPE you can imagine.

I could use CHECK constraints I guess, but if it's anything other than Y or N, I typically create a lookup table to go with the table.

Let's take an ADDRESS table. ADDRESSTYPECODE becomes an attribute of an address. It gets a Foreign Key to the ADDRESS_TYPES table and also (many seem to leave this one out), a NOT NULL constraint. Every address has to have a type.

To make it somewhat easier, I use codes (as opposed to IDs which I tend to associate with numbers) so a join isn't absolutely necessary. If 'HOME' is the ADDRESSTYPECODE, you would rarely need to join as it's self evident what that means. If the lookup table is large, I'd typically use ID (or numbers) for the key.
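A minimal sketch of what that could look like. ADDRESS_TYPES and ADDRESSTYPECODE are the names used above; the ADDRESSES table, the other columns and the constraint names are assumptions for the example.

```sql
CREATE TABLE address_types
(
  addresstypecode VARCHAR2(10)
    CONSTRAINT pk_addresstypecode PRIMARY KEY,
  description VARCHAR2(100)
);

-- Hypothetical ADDRESSES table: the type code is both a foreign key
-- and NOT NULL, so every address has exactly one valid type.
CREATE TABLE addresses
(
  addressid NUMBER(10)
    CONSTRAINT pk_addressid PRIMARY KEY,
  addresstypecode
    CONSTRAINT fk_addresstypecode_addresses
      REFERENCES address_types ( addresstypecode )
    CONSTRAINT nn_addresstypecode_addresses NOT NULL,
  line1 VARCHAR2(100)
);

-- Because the key is a readable code, no join is needed here.
SELECT addressid,
       line1
FROM addresses
WHERE addresstypecode = 'HOME';
```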

Like I said, it bumps up that table count and makes things look a bit "messy," but you know exactly what belongs in what column. And if you're using ApEx, administrative screens are a snap!

Just don't ask Duke Ganote whether type is a good name or not!

Thursday, March 27, 2008

How Does Oracle Make Development Easier?

Continuing on the theme of late, what are the basic things that you can do to reduce the amount of code that needs to be written?

In the post linked above, I mentioned Constraints as probably the easiest way to reduce the amount of coding. For example:


CREATE TABLE t
(
  id NUMBER(10)
    CONSTRAINT pk_id_t PRIMARY KEY,
  first_name VARCHAR2(30)
    CONSTRAINT nn_firstname_t NOT NULL,
  middle_name VARCHAR2(30),
  last_name VARCHAR2(40)
    CONSTRAINT nn_lastname_t NOT NULL,
  gender VARCHAR2(1)
    CONSTRAINT nn_gender_t NOT NULL
    CONSTRAINT ck_morf_gender_t CHECK ( gender IN ( 'M', 'F' ) ),
  ssn VARCHAR2(9)
    CONSTRAINT nn_ssn_t NOT NULL
    CONSTRAINT ck_9_ssn_t CHECK ( LENGTH( ssn ) = 9 )
    CONSTRAINT ck_numeric_ssn_t CHECK ( REGEXP_INSTR( ssn, ?, ?, ? ) )
    CONSTRAINT uq_ssn_t UNIQUE  
);

ID - is just a sequence generated key, no big deal there.
FIRST_NAME - is not optional, hence the NOT NULL constraint.
MIDDLE_NAME - is optional (no constraint).
LAST_NAME - is not optional (NOT NULL).
GENDER - is not optional (NOT NULL). Also, you want to exclude everything but 'M' or 'F', thus the CHECK constraint.
SSN - is not optional (NOT NULL). The length of the value must be 9 characters (CHECK). The characters may only be numeric (CHECK). Unfortunately I don't yet know the REGEXP_INSTR function well enough to truly demonstrate. Finally, there is the UNIQUE constraint on SSN, since they shouldn't duplicate across people.
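One possible way to write the numeric check, offered as a sketch rather than the definitive answer: ask REGEXP_INSTR for the position of the first character that is not a digit and require that there is none. The table name T_SSN_DEMO and its constraint names are made up so the example stands on its own.

```sql
-- Hypothetical stand-alone table showing only the SSN column.
CREATE TABLE t_ssn_demo
(
  ssn VARCHAR2(9)
    CONSTRAINT nn_ssn_demo NOT NULL
    CONSTRAINT ck_9_ssn_demo CHECK ( LENGTH( ssn ) = 9 )
    -- REGEXP_INSTR returns 0 when no non-digit character is found.
    CONSTRAINT ck_numeric_ssn_demo
      CHECK ( REGEXP_INSTR( ssn, '[^[:digit:]]' ) = 0 )
    CONSTRAINT uq_ssn_demo UNIQUE
);
```

REGEXP_LIKE( ssn, '^[[:digit:]]{9}$' ) would be another option, folding the length and the digits-only rule into a single check at the cost of a less specific constraint name in the error.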

This is a simple demonstration of how you can potentially use constraints to reduce the amount of code necessary. Though I would probably check/validate these as well in code, because the error that is generated will not be specific enough for a user to tell what went wrong. The point is, if you make a mistake in your validation code it will be easily caught by the constraints, forcing you to fix it.

This will give you much more reliable data, which as we all know, is the most important thing.

I'd like to do more posts pointing out the easiest methods to reduce the amount of code you have to write by using Oracle.

What kind of solutions do you have or do you use?

Tuesday, March 25, 2008

Use the [Oracle] Database dammit!

Dom Brooks recently posted an article about the Dea(r)th of the Oracle RDBMS. It seems to have struck a chord.

I've written about MySQL Friday or Application Developers vs. Database Developers, which were similar in thought; the database is a bucket.

Ultimately, my take is that application developers don't know and don't want to learn how to use a database. PL/SQL specifically, is a platform in and of itself. You can do so much in the database now that you essentially need an application only for display, to determine the row color if you will.

The usual caveat follows:
If you are building applications that are supposed to be database independent, then the logic belongs in the application. The database is a bucket.

If you are building business applications specific to Oracle though, use the damn thing. Application/web developers are then forced to work on the design and user interface, not application/transaction logic.

Easy steps to actually utilize your database:
1. Use as many constraints as humanly possible - This will reduce the amount of code you have to write and you'll have the security of knowing the data will be what you constrain it to be.
2. DEFAULT columns in table definitions - create_date or load_date can default to SYSDATE and thus be left out of any application code. I've gone so far as to use SYS_CONTEXT( 'MY_CONTEXT', 'USERID' ) as the DEFAULT value for the create_user column. That, along with a NOT NULL (or CHECK) constraint, makes life that much easier.
3. Did I mention constraints? Primary Key and Foreign Key constraints are very important to maintain data integrity (ensure you have the data you expect). Don't forget to index those foreign keys.
4. Security - VPD (Virtual Private Database) or Fine Grained Access Control. No longer do you need to maintain two separate schemas (or databases), just add a column and only allow those with the value set see that data. If you are using ApEx, this is incredibly easy to do.
5. Security (Roles and Privileges) - No more table based authorization, let the database do it through roles and privileges. GRANT EXECUTE ON my_package TO some_user
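
A quick sketch of step 2, with a hypothetical table T_AUDIT_DEMO. MY_CONTEXT is an application context you would have to create and populate yourself, so this example assumes the built-in USERENV namespace instead.

```sql
-- Hypothetical table; column and constraint names are illustrative.
CREATE TABLE t_audit_demo
(
  id NUMBER(10)
    CONSTRAINT pk_id_auditdemo PRIMARY KEY,
  create_date DATE DEFAULT SYSDATE
    CONSTRAINT nn_createdate_auditdemo NOT NULL,
  -- Assumption: USERENV/SESSION_USER stands in for MY_CONTEXT/USERID.
  create_user VARCHAR2(30) DEFAULT SYS_CONTEXT( 'USERENV', 'SESSION_USER' )
    CONSTRAINT nn_createuser_auditdemo NOT NULL
);

-- The application only supplies the business column; the database
-- fills in the audit columns.
INSERT INTO t_audit_demo ( id ) VALUES ( 1 );
```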

That's my short list for today. Like Dom, this makes me angry. If there were some rational logic behind it, great, convince me. I haven't seen it yet though.

Thursday, October 11, 2007

Code Style: Tables

Tables are easy.


CREATE TABLE t
(
  col1 NUMBER(10,0)
    CONSTRAINT pk_col1 PRIMARY KEY,
  col2 VARCHAR2(32)
    CONSTRAINT nn_col2_t NOT NULL
    CONSTRAINT uq_col2_t UNIQUE,
  col3 VARCHAR2(400),
  col4 VARCHAR2(1) DEFAULT 'N'
    CONSTRAINT ck_yorn_col4_t CHECK ( col4 IN ( 'Y', 'N' ) )
    CONSTRAINT nn_col4_t NOT NULL
);

Remember to always name your constraints. While I am at it, use constraints as much as humanly possible, at least in your OLTP systems. You'll be able to reduce the amount of code you need to write and actually let the database do its job. I'd much rather let the database do it than rely on code to maintain my data integrity.

For the data warehouse, you'll need to think about constraints a bit more, as they may slow down load times. I'm still all for constraints, but I would never say always use them.

For child tables:


CREATE TABLE s
(
  col5 NUMBER(10,0)
    CONSTRAINT pk_col5 PRIMARY KEY,
  col1
    CONSTRAINT fk_col1_s REFERENCES t( col1 )
    CONSTRAINT nn_col1_s NOT NULL,
  col6 VARCHAR2(30)
);

For Foreign Key constraints, you do not have to declare the type as it will be inherited from the parent table.

This would be helpful if someone up and decided to change the NUMBER(10,0) to a VARCHAR2(10) or something (please don't ever do that!).
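
If you want to see the inherited type for yourself, a query along these lines should do it, assuming tables T and S from above were created in your own schema:

```sql
-- Assumes tables T and S from above exist in the current schema.
-- COL1 in S was declared without a type; it should show the same
-- type, precision and scale as COL1 in T.
SELECT table_name,
       column_name,
       data_type,
       data_precision,
       data_scale
FROM user_tab_columns
WHERE table_name IN ( 'S', 'T' )
  AND column_name = 'COL1'
ORDER BY table_name;
```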

As for STORAGE or other table options, I typically leave that up to the DBA or work with them to add them. They may have a particular setup for certain tables that you can't possibly know (if you don't talk to them).

To recap:

Use constraints as much as possible

Always name your constraints

Work with your DBA for table options

Always name your constraints

Tuesday, September 11, 2007

Inline vs. Out-of-line Constraints?

Today one of my more feisty colleagues and I had a discussion about constraints. I had asked him if he wouldn't mind naming the NOT NULL constraints in a particular table definition. With this guy, nothing is easy. ;-)

So he came over and we talked about inline vs. out-of-line constraints. I asked him the advantage of out-of-line. He asked me the advantage of inline. I'm sure his answer was better than mine as he is much more articulate.

For me, it's mostly a style thing, except for FOREIGN KEY constraints which, if defined inline, will inherit the data type of the parent column. That makes life easier if you ever need to change the data type of a parent key (not a recommended best practice mind you). I just think it looks prettier (factual based evidence). It's all in one file and I like to see how many constraints I can put on a single table. I'm shooting for at least one per column to save myself time down the road of coding exceptions. Let the database do its job!

Here's a sample of inline constraints:

DROP TABLE s;
DROP TABLE t;

CREATE TABLE t
(
  id NUMBER(10)
    CONSTRAINT pk_id PRIMARY KEY,
  name VARCHAR2(30)
    CONSTRAINT nn_name_t NOT NULL,
  age NUMBER(2,0)
    CONSTRAINT nn_age_t NOT NULL
    CONSTRAINT ck_gtzero_age_t CHECK ( age >= 0 )
);

CREATE TABLE s
(
  sid NUMBER(10)
    CONSTRAINT pk_sid PRIMARY KEY,
  id
    CONSTRAINT fk_id_s REFERENCES t ( id ),
  something_unique VARCHAR2(30)
    CONSTRAINT uq_somethingunique_s UNIQUE
);

And out-of-line Constraints:

DROP TABLE s;
DROP TABLE t;

CREATE TABLE t
(
  id NUMBER(10),
  name VARCHAR2(30),
  age NUMBER(2,0)
);

ALTER TABLE t ADD CONSTRAINT pk_id PRIMARY KEY ( id );

ALTER TABLE t ADD CONSTRAINT nn_name_t CHECK ( name IS NOT NULL );

ALTER TABLE t ADD CONSTRAINT nn_age_t CHECK ( age IS NOT NULL );

ALTER TABLE t ADD CONSTRAINT ck_gtzero_age_t CHECK ( age >= 0 );

CREATE TABLE s
(
  sid NUMBER(10),
  id NUMBER(10),
  something_unique VARCHAR2(30)
);

ALTER TABLE s ADD CONSTRAINT pk_sid PRIMARY KEY ( sid );

ALTER TABLE s
  ADD CONSTRAINT fk_id_s FOREIGN KEY ( id ) REFERENCES t ( id );

ALTER TABLE s
  ADD CONSTRAINT uq_somethingunique_s UNIQUE ( something_unique );
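
One wrinkle worth a sketch: NOT NULL is the odd one out. It cannot be added with ALTER TABLE ... ADD CONSTRAINT, which is why the out-of-line version above falls back to CHECK ( name IS NOT NULL ). To get a true NOT NULL constraint after the table exists, MODIFY the column instead. This is an alternative to the CHECK form, not an addition to it, since both use the name NN_NAME_T.

```sql
-- Run this instead of the CHECK ( name IS NOT NULL ) statement above;
-- the constraint name is the same, so only one of them can exist.
ALTER TABLE t MODIFY ( name CONSTRAINT nn_name_t NOT NULL );

-- A real NOT NULL constraint shows up as NULLABLE = 'N' here;
-- the CHECK version leaves the column listed as nullable.
SELECT column_name,
       nullable
FROM user_tab_columns
WHERE table_name = 'T'
  AND column_name = 'NAME';
```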

So, what are the advantages of either method? I couldn’t find much via Google or asktom, but I probably just didn't search on the right terms. I find it hard to believe that this topic hasn't come up before.

* Note that I did use 5 constraint types