Question on design

Student

StudentID BIGINT IDENTITY(1,1) NOT NULL,--Primary Key
Name Varchar(50) NOT NULL,
DOB Datetime2(3) NOT NULL,
SubjectID Char(4) NULL,
Createddate Datetime2(3) NOT NULL,
Updatedate Datetime2(3) NOT NULL

StudentSubjects

                                 ---Is it required to create a identity column with primary key for  StudentSubjects table..

StudentID BIGINT Not Null,--Foriegn Key on student table for StudentID Column
SubjectID Char(4) NULL,--Foreign Key on Subjects table for SUbjectID Column -Note- the column SubjectID needs a bit operator.

Subjects

SUbjectID Char(4) NOT NULL,--Primary Key
SubName Varchar(100) NOT NULL

subjects table would have

1 langArts
2 Science
3 Maths
4 Social
5 History

Please let me know above design is fine and anything needs to be add/modified and the subjectID the select query would be BIt type is true or false 1 or 0.

Thanks for your help in advance..

OK so this is homework. why do you have the subject id in the student table? Isn't that what the StudentSubjects table is for?

Don't know what you mean by this. Post your select query.

Personally I would keep the columns as "narrow" as possible - it saves disk space, and means you get more entries in each Index page, so things run faster.

On that basis:

I would not use BIGINT for StudentID. Even INT might be generous and the school I went to would fit in TINYINT !! It seems to be the "norm" to use INT for IDENTITY / ID unless you anticipate an absolutely massive number of rows

Similarly I would use DATETIME not DATETIME2 for the Createddate / Updatedate and for DOB I would use DATE - unless you actually plan to store the Birth TIME - then i would use DATETIME (or even SmallDateTime) and not DATETIME2

Looks like that needs to be INT rather than CHAR?

Why do you think that? (It doesn't seem right to me, but it would help me explain WHY it is not right if I knew why you thought that was the right choice :slight_smile: )

Thanks for the responses..

On that basis:

I would not use BIGINT for StudentID. Even INT might be generous and
the school I went to would fit in TINYINT !! It seems to be the "norm"
to use INT for IDENTITY / ID unless you anticipate an absolutely
massive number of rows

The records will be massive 20 millions expected

Similarly I would use DATETIME not DATETIME2 for the Createddate /
Updatedate and for DOB I would use DATE - unless you actually plan to
store the Birth TIME - then i would use DATETIME (or even SmallDateTime)
and not DATETIME2

I agree

SUbjectID Char(4)

Looks like that needs to be INT rather than CHAR?

The SubjectID has values with

01LA langArts
02SC Science
03MT Maths
04SS Social
05HS History

That's why i have created the char(4)..

The subjectID the select query would be BIt type

Why do you think that? (It doesn't seem right to me, but it would
help me explain WHY it is not right if I knew why you thought that was
the right choice )

I may be wrong..but the data provided to load into tables with a format consists of students with subjectsID true or false indicator..

Is it good design to have identity column with primary key for the table StudentSubjects

Thanks for your help again..

You are still safe with INT :slight_smile: - max value for INT is 2,147,483,647 ... whereas the max value for BigInt is 9,223,372,036,854,775,807

OK - although that sort of data normally raises a debate. You have the combination of a surrogate key here ("01", "02" ... "05") plus a "Human Readable key" ("LA", "SC", ... "HS") and the debate rages over whether that is good, or bad. Users like to overload the Key with other information - so that, at a glance, they can see what it refers to. "01" means nothing to them, "LA" means Languages & Arts. So either use "01" (which the computer likes) or "LA" (the User's preference) but not the combination. Generally I prefer "01" - it will never change, whereas "LA" might - say they decide to rename the department to "Modern Languages" or somesuch ... IF the key is "01" you can freely change the Mnemonic or the Name, they are just data fields, but changing the Key is a nightmare - all associated records have to be changed to match.

OK, got it. Your StudentSubjects table will store the combination of StudentID and SubjectID. The import file may have StudentID and then True/False for each of the Subjects 01...05 - so you just need to convert the TRUE ones into a row in StudentSubjects table. You don't need any BIT columns in your database, the True/False thing is just how the import data is presented.

I don't have a problem with it, but it isn't perfect and there are other contenders.

For me Primary Key should ideally be:

  • Never, ever, changing
  • As short as possible (maximises the number of entries in each Index Page)

An INT storing an IDENTITY is short, and does not need to change ever.

We used to commonly use something like "First 4 letters of the family name and then a tie-break number 01, 02, ...". Back then that helped because names were guessable-ish. For Smith you could type SMIT01 and if that wasn't it you could try SMIT02. Nowadays Applications provide easy means of searching for names, so you don't actually need to know that SMIT01 has the ID 18376297 - which is definitely not memorable!

However, in the old days if you mis-typed SMIT01 as SMIY01 the chances were good that that did not exist, and that would prevent you saving the fault data. Now, if you mistype 18376297 as 18376298 then that record is likely to exist, so you get no warning.

Another problem with IDENTITY is that you don't know what it is going to be UNTIL you INSERT the new record into the table. If you need to insert additional rows, at the same time - e.g. into the StudentSubjects table - then you have to INSERT the Student record first, and THEN get the allocated ID back again, and THEN you have the StudentID to use in the rows you want to insert into StudentSubjects. If you are inserting students IN BULK then it is harder to get the associated Identities back again and synchronise them to then insert the StudentSubjects (in bulk).

If you were to use, for example, a GUID instead pf an INT/IDENTITY then you could assign the GUID in your APP and then you know the StudentGUID before you insert the new record, and you can insert in bulk into Students and into StudentSubjects without having to wait / get-back the newly assigned ID from the server.

But GUIDs have lots of other problems - they are big (compared to INT) and are random, so cause lots of index page splits during bulk inserts - not a good candidate for a Primary Key if you have lots of inserts.

So in answer to "Is it good design to have identity column with primary key for the table StudentSubjects" my answer would be "it depends" :slight_smile: but, in general, it is rarely a bad choice - but sometimes may be a better choice.

Student Table Comments:
StudentID should be int, as noted
There should NOT be just one "Name" column. You should have, instead, at a minimum:
FirstName
MiddleName
LastName
SuffixName --Jr.,Sr.,etc.
Believe me, you don't want to be forced to try to parse this out later!
DOB should be date rather than datetime: do you really care what time they were born??
As noted, SubjectID should not be in this table.

Subject Table Comments:
Needs completely rethought out.
Forget "char(4)". Forget putting any meaningful letters in the name. Use some combination of small integers. People will memorize small numbers about as well.
And probably forget just "langArts". What, there is only ever one language arts class? No, for example, create writing vs business writing vs journalism?

StudentSubject Table Comments:
Yes, this table will eventually inherit the major keys from the other two tables.
But there is lots of data still missing from this table:
Semester code | Start date & end date
audited vs fully attended
etc.