SQLTeam.com | Weblogs | Forums

Keys in Super-Type/Sub-Type

Here is an interesting take on creating supertypes/subtypes in MySQL...

https://stackoverflow.com/questions/17405702/mysql-supertype-subtype-design

Is it really necessary to use composite keys?

About halfway down the page, someone proposes this design...

create table vehicles (
  veh_id integer not null,
  veh_type_code char(1) not null,
  other_columns char(1) default 'x',
  primary key (veh_id),
  unique (veh_id, veh_type_code),
  foreign key (veh_type_code) references vehicle_types (veh_type_code)
);

He says...

And then...

create table cars (
  veh_id integer not null,
  veh_type_code char(1) not null default 'c',
  car_type char(3) not null,
  other_columns char(1) not null default 'x',
  primary key (veh_id ),
  unique (veh_id, veh_type_code, car_type),
  foreign key (veh_id, veh_type_code) references vehicles (veh_id, veh_type_code),
  foreign key (car_type) references car_types (car_type),
  foreign key (veh_type_code) references veh_type_is_car (veh_type_car)
);

Comments/Questions:
Why would the author combine veh_id and veh_type_code in a unique index and use it as a foreign key?

By definition, a supertype should be generic and apply to all subtypes, so there should be no reference to cars or semis in the vehicle supertype.

Other examples I have seen online just use the same key between supertype and subtype.

This entire topic is still very confusing to me... :frowning:

This is a perfectly good design, in my opinion. Having a narrow composite key as in this case is not a disadvantage. Having the vehicle type code in the vehicle key makes it useful in scenarios, for example, to get the number of different types of vehicles, one needs to query only the main vehicles table.

Seeing vehicle type code in the sub tables -where the value will be the same in each sub table - makes it seem a bit unnatural, but that is really more of a question of aesthetics.

I would even take it further with energySourceType. Right now this model assumes that only cars consume certain type of energy source and trucks run on air :grin:
Then if you ever need some analytics or reports by energysourceType you have that as a distinct entity.

Based on my reading, "veh_type_code" is often referred to as a "Discriminator", but I haven't seen people make it part of the key.

Furthermore, having the discriminator in the subtypes is redundant and serve no purpose linking the supertype and subtype as they are already linked via the PK.

Since reading more about super/subtypes it seems like the link above is bad design.

If youw ant to see a HUGE rant on this topic, see this thread...

https://stackoverflow.com/questions/4896831/how-to-implement-referential-integrity-in-subtypes/4898324#4898314

Since veh_id is a primary key, it is by definition already unique. It makes no sense, and has zero positive impact on performance, to include it in an index.

Indeed, the reason not to include it is in case somehow the veh_type changes in the future.

Subtypes can be attribute (column) in the supertype -- not as common, but it can be done -- but it shouldn't be part of any identifying key.

To confirm this, fill in some of the "other_columns". Then apply 3NF and BCNF rules: do all of the 'other columns' directly depend on the "whole key and nothing but the key?" With the veh_type in there, my guess is that they do not, as the veh_type is not needed for dependency.

To be clear, we are talking about someone else's example from StackExchange, so I can't speak to their entire intent. But my point was that from what I have read, you only need an ID in the supertype as PK, and in the subtype as a PK and FK.

VEHICLE

  • id (pk)

SEMI

  • id (pk)(fk)

AUTO

  • id (pk)(fk)

Again, I'm just commenting on the link above and not on my actual application.

That's ridiculous. There's no reason to restrict it to just a single ID. It may be that some other type of value is more natural for that data and thus better.

Get away from the myth that "every table should be keyed an artificial identity value." That's not only false, but it's also extremely damaging to overall db performance.

I thought I included the original thread from StackOverflow...

If a Supertype is supposed to be generic, then why would you create a composite key with "veh_id" and "veh_type_code" and then have the same as the foreign key in the subtype?

In the thread a contributor says,

If the Vehicle is generic, then how could it be considered a "car" row or a "semi" row?

Again, not my application, just asking questions about someone else's work...

Apparently in another thread (listed below) one person seems to think the person in the first thread is a fool...

https://stackoverflow.com/questions/4896831/how-to-implement-referential-integrity-in-subtypes/4898324#4898314

And to your last comment, how is it "ridiculous" to use ID as the PK in the supertype table and ID as the PK/FK in the subtype tables when either wouldn't have more than a few hundred records?

I might agree with your comment on choosing a PK if we were talking about some very large table with millions of records and an obvious "natural" key, but that doesn't apply to my situation.

Generally/automatically restricting an ID to an int is ridiculous. I've seen table "designs" where every table is automatically given a $IDENTITY pk. That's horrible.

Of course in certain situations an identity as key may make perfect sense.

Now, in this specific case, it would be somewhat ridiculous to use an int for a few hundred rows, since a smallint would be better in that case.

That means a "car" row can't possibly reference a "semi" row, even by mistake.

Again, since veh_id by itself is unique, no reference based on veh_id can be to the wrong row, regardless of type.

I appreciate the feedback, but this thread isn't about tuning tables or queries.

First I need to understand how to properly model my business and I believe that I need to use a super/subtype, but I'm still not entirely sure.

If you could answer my unanswered questions in my other threads related to this topic it would be much more helpful to me than debating the merits of artificial keys. :wink:

(Later on, I'd love to learn more of why your way is better on this separate topic...)

So it sounds you agree with me that the design in the thread I posted is wrong?

If you look at the ERD in this thread, I think it is filled with superfluous fields and keys...

https://stackoverflow.com/questions/17405702/mysql-supertype-subtype-design#17407959

At any rate, it appears that we're getting farther into the weeds discussing someone else's database and design.

I sent you a PM last night. If you have time to help me figure out me earlier two threads, that is probably the best way to help me get this supertype/subtype issue resolved.

Thanks. :slight_smile: