SQLTeam.com | Weblogs | Forums

Many-to-many, how do I handle IDs?


#1

I'm creating a database with three tables from VB.Net. One table has over 100 000 rows, one has like 200. The third one is just a link table. I would like to know what's the smartest way to insert data into these so that the IDs would match.

So table one has 100k rows, which are like
ID Data
1 something
2 somethingelse
...
100000 andyetanotherline

And table two has 200 rows, which are like
ID Data
1 anything
2 anythingelse
...
200 yesthisonetoo

And table three is then just a link:
ID tableoneID tabletwoID
1 1 1
2 1 2
3 2 1
4 2 2
...

So it is a many-to-many reference split into two one-to-many references.

Now when I insert the data to table one and table two, it is critical to be able to track the IDs so that I can create the table three also - obviously the links have to point to the right things.

In .net side I have datatables that have the data, but since SQL Server does the addition of new ID, how should I handle the situation to keep the referential integrity under control? Should I manually add the IDs to the data and then alter the tables to transform those ID columns to actual identify columns?


#2

We are the DB guys, so we do not really care how the tables got into the schema, or their row counts. We do care about is that there is no such thing as a “link” table in RDBMS. That term is used in network databases, to refer to pointer structures that set up the traversals in those tools.

In RDBMs, we have relationship tables that have references. This is a totally different data model. Furthermore, there is no such crap as a generic, universal “id' in RDBMS. Since the relational model is based on logic and sets, columns have to be “id”, which is unique to each set of entities. Look at the ISO-11179 standards for “” naming rules.

This is called “The Law of Identity” and it has absolutely nothing to do with the proprietary IDENTITY table property from Sybase/T-SQL. If you did not take a logic course in freshman Philosophy 101, you can look it up; it gets summarized as “A is A” in the literature.

This request makes no sense. There is no magic “id” to match! You discover the keys, you do not invent them in RDBMS. They are attributes that are innate properties if the entity or relationship.

Let's give a real example, with real table names, if only in a skeleton form:
CREATE TABLE Books
(isbn CHAR(13) NOT NULL PRIMARY KEY, -- look it up if you do not know ISBN
book_title VARCHAR(125) NOT NULL,
..);

CREATE TABLE Authors
(author_tax_nbr CHAR(10) NOT NULL PRIMARY KEY,
author_name VARCHAR(35) NOT NULL,
..);

Now the m:n relationship, with its own table:

CREATE TABLE Authorship
(isbn CHAR(13) NOT NULL
REFERENCES Books(isbn),
author_tax_nbr CHAR(10) NOT NULL
REFERENCES(author_tax_nbr),
PRIMARY KEY (isbn, author_tax_nbr)
);

It would be a good idea to add “ON [DELETE | UPDATE] ” clauses to the Authorship table.

No. Many-to-many relationships cannot be split in general. Do you know what a Normal Form is? Look up the classic 5NF example of (lender, seller, buyer) which creates false data when split into (lender, buyer), (seller, lender) and (buyer, seller).

The old Sybase proprietary IDENTITY table property is not a column. A key has to be a column, not a table property. This is straight, basic RDBMS. This is how noobs try to fake pointer chains; I see it in the first weeks of an intro RDBMS class with older mainframe programmers who used IMS, IDMS, Total, etc. back in the 1970's.