Best DB design - 1 table, 2 items with same attributes

Johnse · July 2, 2019, 11:07pm

I have a table (entity) called fruit. In this fruit table there are Apple and Orange. They have exactly the same attributes.

However, they get updated in different monthly periods. Say this month I can update Apple because there is new information for it, but there is no new information for Orange. Or sometimes Orange gets updated and not Apple. An update is done by posting the month together with the new information.

The question I have is: as a best practice, do I keep Apple and Orange in one table/entity, or break them into two separate tables/entities because they don't both get updated in the same monthly period?

Please advise ! Thanks

yosiasz · July 3, 2019, 6:27am

here is one way to separate concerns

create table itemtypes(itemTypeId int not null identity(1,1), 
itemtypename nvarchar(150), itemtypeCode nvarchar(10));

create table dbo.items(itemId int not null identity(1,1), 
ItemName nvarchar(150), itemTypeId int not null);

create table attributes(attributeId int not null identity(1,1), 
attribute nvarchar(150));

create table itemAttributes(ItemId int, attributeId int, 
atrributedOn datetime)


insert into itemtypes(itemtypename, itemtypeCode)
select 'Fruit', 'FR'

;with src
as
(
select itemTypeId, ItemName
  from itemtypes it
  cross apply (select 'Apple' as ItemName 
                union select 'Banana') f 
  where itemtypeCode = 'FR'
)
insert into items(ItemName, itemTypeId)
select ITemname, itemTypeId
  from src
  where not exists(select 1 
                      from items tgt 
					  where tgt.ItemName = src.ItemName 
					  and tgt.itemTypeId = src.itemTypeId)

  insert into attributes(attribute)
  select 'Sweet' union
  select 'Sour' union
  select 'Seasonal' union
  select 'Hard' union
  select 'Soft' 


  insert into itemAttributes
  select itemId , attributeId, getdate()
    from items i
	cross apply (select attributeId 
	               From attributes 
				   where attribute in ('Sweet',  'Seasonal') ) a

select i.ItemName, a.attribute, month(atrributedOn) as _month
from items i
join itemAttributes ia on i.itemId = ia.ItemId
join attributes a on a.attributeId = ia.attributeId

drop table itemtypes
drop table dbo.items
drop table attributes
drop table itemAttributes

ScottPletcher · July 3, 2019, 1:10pm

When data gets updated has nothing to do with its logical structure or design. You should never split apart a table based just on that. You can have a separate table that provides info on updating, if you want to.
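
As a rough illustration of that idea, here is a minimal sketch with one table for the fruit themselves and a separate table recording each monthly update. All table and column names (dbo.fruit, dbo.fruit_update, price) are made up for the example and are not from the original question.

```sql
-- Hypothetical names throughout; adjust to the real schema.
CREATE TABLE dbo.fruit (
    fruit_id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    fruit_name nvarchar(150) NOT NULL UNIQUE
);

-- One row per fruit per month in which new information was posted.
CREATE TABLE dbo.fruit_update (
    fruit_id     int  NOT NULL REFERENCES dbo.fruit (fruit_id),
    update_month date NOT NULL,        -- first day of the month, e.g. '20190701'
    price        decimal(10,2) NULL,   -- stand-in for the shared attributes
    PRIMARY KEY (fruit_id, update_month)
);

INSERT INTO dbo.fruit (fruit_name) VALUES (N'Apple'), (N'Orange');

-- Apple has new information for July 2019; Orange does not, so it gets no row.
INSERT INTO dbo.fruit_update (fruit_id, update_month, price)
SELECT fruit_id, '20190701', 1.25
FROM dbo.fruit
WHERE fruit_name = N'Apple';
```

Apple and Orange stay in the same table because they share the same attributes. Only the update rows differ from month to month.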

Johnse · July 8, 2019, 1:11am

yosiasz, thanks for your code and advice. It might just work. It looks interesting.
I have to look at it closely.

ScottPletcher, thanks for your answer, that makes sense. It does affect my query, though (not a problem, it can be done easily), depending on whether the one table has different dates or just one date. I have to find the other date for the other category, example depicted.

How about this situation, if you can picture it.
I have a list of items, let's say 2000 of them. Some items in this list get updated monthly and some don't. The ones that are updated appear in this table under the month in which they were updated. So let's say in Feb 2018, 500 of the 2000 items get updated. Those 500 items get tagged with Feb 2018, so if someone queries for Feb 2018 they don't see the whole list of 2000 items, only the 500. This table design is based on convenience of import: less work and no redundancy.

However, we need to see the complete list every time someone queries for a month, even if an item was not updated (this is useful for application development purposes). My answer to the table design was to create a stored procedure that populates a new table holding the complete list. For example, say we are in December 2018: the stored procedure goes through all 2000 items, takes any item it can find for December 2018 and puts it in the new table. For whatever it can't find, it looks for the item's most recent earlier update and puts that in the new table, and it keeps stepping back through earlier update dates until all 2000 items are listed or there are no more update dates. Hope I make sense.

Do you think this is a good database design, or should we just import all 2000 items even though some of them have not been updated? (I think that is redundant work, but it is easier on the stored procedure and the queries if we ever need the data to create applications.)
Please advise !
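
The "fall back to the most recent earlier update" logic described above could also be written as a single query instead of a loop. This is only a sketch under assumed names: dbo.item_master, dbo.item_updates, and some_value are hypothetical and would need to be mapped to the real tables.

```sql
-- Hypothetical tables: the full item list, and one row per item per month it was updated.
CREATE TABLE dbo.item_master (item_id int NOT NULL PRIMARY KEY);

CREATE TABLE dbo.item_updates (
    item_id      int  NOT NULL REFERENCES dbo.item_master (item_id),
    update_month date NOT NULL,   -- first day of the month
    some_value   int  NULL,       -- stand-in for the imported data
    PRIMARY KEY (item_id, update_month)
);

DECLARE @as_of_month date = '20181201';  -- month being queried

-- Every item appears once. Each item shows its latest update at or before
-- @as_of_month; items with no update that early come back with NULLs.
SELECT i.item_id,
       u.update_month AS source_month,   -- month the values actually come from
       u.some_value
FROM dbo.item_master AS i
OUTER APPLY (
    SELECT TOP (1) iu.update_month, iu.some_value
    FROM dbo.item_updates AS iu
    WHERE iu.item_id = i.item_id
      AND iu.update_month <= @as_of_month
    ORDER BY iu.update_month DESC
) AS u;
```

With a query like this, only the updated rows need to be imported each month, and the complete list is produced at read time rather than stored in a second table. Whether that performs well enough depends on the real data and indexes.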

harishgg1 · July 8, 2019, 7:24am

hi

please help me understand

Your comment:
However we need to see the complete list for every time someone does a query for a month even if the item is not updated (the reason is because is useful for application development purpose)

one parameter that could affect your decision ... one table or two tables
is the amount of data

if it is in millions (crores) ... or just in hundreds and thousands

millions = one table or two tables
hundreds, thousands = one table or two tables (might not make a difference here)

ScottPletcher · July 8, 2019, 4:48pm

Separate tables for each update month will be a real headache to maintain.

I think you can use a single table and use a column to keep track of the last time it was updated, something like this. But keep in mind that I don't know other details of your table:

supply varchar(100) NOT NULL,
niche_category char(1) NOT NULL,
last_updated_date date NULL,
...other_columns...
PRIMARY KEY ( supply, niche_category )

When the other columns get updated, you update the last_updated_date to the current day.

If you need to see only the items updated in Feb, you can do:
SELECT ...
FROM dbo.table_name
WHERE last_updated_date >= '20190201' AND last_updated_date < '20190301'
and so on for any other month.

Leave off the WHERE if you need to see all rows.
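
For reference, the column list above could be fleshed out along these lines. This is a sketch only: the quantity column is a hypothetical stand-in for the unspecified other columns, and the sample key values are invented.

```sql
CREATE TABLE dbo.table_name (
    supply            varchar(100) NOT NULL,
    niche_category    char(1)      NOT NULL,
    last_updated_date date         NULL,
    quantity          int          NULL,   -- hypothetical stand-in for the other columns
    PRIMARY KEY (supply, niche_category)
);

-- Stamp the row with today's date whenever its data changes.
UPDATE dbo.table_name
SET quantity = 42,
    last_updated_date = CAST(GETDATE() AS date)
WHERE supply = 'Apple'
  AND niche_category = 'F';
```

Note that this layout keeps only the most recent update date per row. If the values from earlier months must be kept as well, a separate history table would be needed.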

Johnse · July 10, 2019, 3:25am

Thanks for the code example. You're right !

Johnse · July 10, 2019, 3:36am

At the rate it's going, it will take 20 years to get to 2 million records.
I don't know what you think about this volume and whether a stored procedure would work.

I did a graphical presentation that I think will help you see and picture this better.
The table design on the far right is the best for application development.
However, it is redundant and takes longer to import.

The one on the far left is not good for application development, but it is not redundant and takes the least time to import. It is harder on the stored procedure, though, to get from it to the right hand side table.

The middle one is not good for application development, is moderately redundant, and takes a medium amount of time to import. It needs a stored procedure to convert it to the right hand side table.

Which table design should I use for the import, and should I use a stored procedure or not? Ideally we want the right hand side table, to make application development easier, with no stored procedure required to produce it.
But there are the cons stated above.
Please advise.

Johnse · July 11, 2019, 5:18pm

Currently what I have in place is the middle (second) table design. It's a bit redundant: I don't need to import every month, only the months with updates, but when I do, I import the complete list. I was thinking of using a stored procedure to create another table, so that the end result looks like table 3.

Table 1 on the left hand side is not redundant; it holds only the updated data.