Design: The Entity = The Party
I've discussed the merits of design and the entity
here and
here.
In a recent interview, the prospective employer mentioned the "party" model. I had no idea what they were talking about at the time only that it was similar to my entity model.
In another interview, I was asked about subtyping. I didn't know the vernacular as it pertained to database modeling, but I went on to explain the entity model. He told me they were one and the same! Now I have a name for it
and I came to something that others had already "invented." While it would have been easier to read one of the books on modeling that discussed the Party Model, but I can't seem to read technical books (online is a different story for some reason). I also think it's pretty cool I came to the same conclusion as others outside of their influence. I do have to wonder though if I took it in at some point of time but don't explicitly recall it.
Anyway, it's definitely nice to have an idea validated.
Below are some general links on data modeling and specific ones on the Party Model:
Data Modeling on Wikipedia
A Universal Person and Organization Data Model
Siebel/Oracle -
Party Data ModelParty Information Framework
Labels: database, design
Design: The Entity?
Last month, I
posted my preference
for maintaining addresses (and other entity attributes) via an ENTITY table. I got some good feedback, but I would like more.
Option 1

Option 2

In short, I prefer Option 1 as it requires no further modifications (if adding another entity) and you won't have to write funky SQL to get all the addresses (though you would have to UNION the people and organizations table and any subsequent entity). The one bad thing perhaps is that you would have to keep a close eye on your code to prevent someone from subverting your process...I think anyway.
So please vote and sound off in comments as to the "Why" of your choice.
(If you choose neither, please, please, please explain!)
Labels: database, design
Lookup Tables
Also known as reference, crosswalk and a few other names.
I love 'em.
I'm not afraid to use them. It certainly makes that table count go up, but you know what you're getting.
I have ADDRESS_TYPES, PHONE_TYPES, PERSON_TYPES (in an intersection table of course) and any other kind of TYPE you can imagine.
I could use CHECK constraints I guess, but if it's anything other than Y or N, I typically create a lookup table to go with the table.
Let's take an ADDRESS table. ADDRESSTYPECODE becomes an attribute of an address. It gets a Foreign Key to the ADDRESS_TYPES table and also (many seem to leave this one out), a NOT NULL constraint. Every address
has to have an type.
To make it somewhat easier, I use codes (as opposed to IDs which I tend to associate with numbers) so a join isn't absolutely necessary. If 'HOME' is the ADDRESSTYPECODE, you would rarely need to join as it's self evident what that means. If the lookup table is large, I'd typically use ID (or numbers) for the key.
Like I said, it bumps up that table count and makes things look a bit "messy," but you know exactly what belongs in what column. And if you're using ApEx, administrative screens are a snap!
Just don't ask
Duke Ganote whether type is a good name or not!
Labels: apex, constraints, database, design, development
Design - The Entity
I love designing databases. Specifically, data modeling. I love trying to figure out how the data fits together. Is this an attribute of that? Is it shared with something else? What's the relationship? One to many? One to one? Many to many? Hierarchical?
Let's start with people and addresses. A fairly easy mapping. One person can have one or more addresses, or one to many. The basics of that are drawn out below:
The primary key (whether natural or surrogate, defined here as a surrogate) for PEOPLE is stored as a Foreign Key in the ADDRESSES table. To guarantee that a PEOPLE key is supplied for each address, you should add a NOT NULL constraint along with the FK constraint.
Not too difficult.
What if down the road you need to add organizations? Do you shove them into the people table or do you create a new table for them? My initial thought is to create a new table called ORGANIZATIONS.
Easy enough right?
Now I want to track addresses for the ORGANIZATION table. Hmmm...how to do that? My first thought was to just create another address table.
Hmmm...now I have addresses in two different tables. I've introduced the chance of having non-standardized addresses (two entry points, or maybe I don't have the same attributes in both tables). Writing a query to get all the addresses (for a mailing perhaps) will be a bit more difficult.
How about just adding another column to the original ADDRESS table?
That works...sort of. I must remove the NOT NULL constraint for PEOPLE and I definitely can't add one for ORGANIZATION. But they're in one table and I can have one entry point (it would have to be conditional though). What if in the future I add another table? Do I just add another column (FKd) to ADDRESS.
It starts to become a bit unwieldy.
So, enter the ENTITY.
The ENTITY just becomes a placeholder, providing keys for PEOPLE (one to one) and ORGANIZATIONS (one to one). ENTITYID is stored in the ADDRESS table and voila! Perhaps this is called a supertype (not real sure on my terminology here). You now have one entry to the ADDRESS table (well...hopefully). To pull all addresses is easy.
You can use the surrogate key from ENTITY as your primary key in PEOPLE and ORGANIZATIONS, so if you want to bypass the ENTITY table, you can do so easily.
That's my solution anyway. How about you? Same, different? Am I crazy (I'm starting to think so...)?
Labels: database, design
How Do You Audit?
I'm not necessarily talking about system auditing, which I understand to be pretty much like throwing a switch, I'm talking about table level auditing.
Given the following table:
CREATE TABLE t
(
id NUMBER(15) PRIMARY KEY,
ssn VARCHAR2(9) UNIQUE,
last_name VARCHAR2(40),
first_name VARCHAR2(30),
dob DATE
);
The requirement is to track any changes to everything but the SSN. ID is just a surrogate key.
I typically do something like this:
CREATE TABLE t
(
id NUMBER(15) PRIMARY KEY,
ssn VARCHAR2(9) UNIQUE,
last_name VARCHAR2(40),
first_name VARCHAR2(30),
dob DATE,
create_date DATE,
create_user VARCHAR2(30),
end_date DATE,
end_user VARCHAR2(30)
);
I added the create_ and end_ columns to see who did what when. If someone comes in and decides to change the record, the end_date and end_user are populated (thereby "terminating" the record) and a new record is created with the updated values. That then becomes the "current" record.
I know there are other ways, but I'd like to hear some of your ideas/methods.
UpdatedI realized that I should not have put ID as the primary key, I should have used a unique key (SSN) instead but generated a surrogate key (ID). I've updated it to use SSN as the unique key, so the primary key can and will change (it's a surrogate), but the SSN cannot.
Updated IIThen it would fail when you create a new record because the SSN would no longer be unique. Oy, I usually read and re-read anything technically related.
So I'll remove the UNIQUE constraint from SSN and that should do the trick. SSN will be used to find the record of the appropriate person, along with END_DATE IS NOT NULL (a "current" row).
Labels: database, design, development