ORDER table questions

CatB · September 4, 2019, 1:31am

I am building an ecommerce site and have a classic ORDER -||----|<- ORDER_DETAILS ->|----||- PRODUCT set up.

In my ORDER table, I would like to include the following...

- id
- order_no
- customer_id

- order_subtotal
- order_shipping
- order_sales_tax
- order_total
- payment_amount

In the real world, any order would have the above fields, but I'm unsure if it is a mortal sin to store calculated values in the table?

I understand the downside of this, but here it seems more practical to be able to easily reference these values in one table versus having to do fancy queries and what not.

Thoughts?

graz · September 4, 2019, 12:01pm

Well, maybe not a mortal sin...

The downside is you have to maintain them in two places now. Have you considered creating a view that generates those values? You could write the logic to compute them once and just reuse it.

CatB · September 4, 2019, 5:56pm

So for my ecommerce site it is better to run a query or code to calculate things like the "order_subtotal" every time I need it?

If I have this relationship: ORDER -||-----|<- ORDER_DETAILS

How would I calculate the "order_subtotal"?

ScottPletcher · September 4, 2019, 6:09pm

No, mostly not a sin at all. Order headers almost invariably directly contain the order total without having to re-calculate it.

I do have some concerns though, in rough order of importance, most to least:

payment_amount should definitely be stored in a separate table, since there always can be many payments to one order
I don't see why you'd need an id and an order_no: isn't the order_no unique already?
If the order_total is simply a sum of other values in the same row, then it should be a computed column, not a column with a data type. That is, like this : order_total AS order_subtotal + order_shipping + order_sales_tax rather than order_total decimal(...,...)

CatB · September 4, 2019, 7:25pm

I have been away from doing database design and development for some time and trying to get my head back into things. Am working on my data model currently.

I can share more if you'd like, since the above is just a snippet of the table.

Valid point, but for right now I am only offering one form of payment.

The id is autogenerated and I was thinking of maybe having a more user-friendly order_no too.

For instance, wouldn't it look weird to have "Order Number" = "1"?!

Maybe I should share more of what I have in my data model?

Here is what I have so far...

ORDER

id (pk)
order_no (uk)
member_id (fk)
created_on (order date)
subtotal
order_discount
shipping_method
shipping_total
sales_tax
order_total
ship_to
shipping_address1
shipping_address2
shipping_city
shipping_state
shipping_zip

ORDER_DETAILS

id (pk)
order_id (uk)(fk)
product_id (uk)(fk)
quantity
purchase_price
item_discount
taxable_yn
item_sales_tax

PRODUCT

id
name
description
unit_price
and so on...

ScottPletcher · September 4, 2019, 7:47pm

Data types would help. I'll assume id is an int data type. I'll also assume SQL Server as the dbms but these points are valid for other dbmses as well. [My first major programming job was in an Order Entry, Shipping and Credits system, so I'm very familiar with related data structures.]

If payment is just the main method that customer uses to pay, it should be in the CUST table. If it's payment(s) for a specific order, it/they should be in an ORDER_PAYMENTS table. There can be multiple payments and/or credits/adjustments, etc., for a single order.

The ORDER_DETAILS key should not be just ( id ) but rather ( order_id, id ). This will cluster the details rows by parent key first and the SQL engine will "know" they are clustered that way. ( * )

Personally I think street addresses should always be stored in a separate table, for efficiency and for ease of scrubbing the data. Thus, the ORDERS table would have a shipping_address_id rather than the entire address. Since a given customer tends to ship to the same address over and over, this can cut down a lot on storage, while making data easier to scrub. This approach also makes it easy to pull up all the addresses used by a single customer.

( * ) I'm a DBA, trust me on this one. Best clustering will allow an extremely efficient MERGE join between the tables, even if a lot of orders are being joined. Btw, the "default" approach of automatically clustering every table on an id column is terrible overall for performance. It's the most damaging myth in physical db design. The clustering key is the single most important factor for table performance. Thus, it should always be chosen based on specific table structure and usage, not on an extremely generic (supposed) "rule".

CatB · September 4, 2019, 8:12pm

@ScottPletcher,

Thanks for taking time to help!

I would be using the autoincrement feature and the datatype would be some type of int

(This will be done in mySQL actually - don't hate me)

Then you're the person to talk to for help!

The payment_amount would pertain to an order and not a customer. I could create an ORDER --||------|<-- ORDER_PAYMENTS

I think you misunderstand what I typed.

I was taught that every table has an "id" and that it uses auto-increment.

Then I also picked up that on a "junction" table, you use the primary keys from the parent tables to form a second unique key that is a composite (?) key.

ORDER_DETAIL

id (pk)
order_id (uk-part1)(fk)
product_id (uk-part2)(fk)

So I believe I have what you were describing above...

Yes, I could have an ADDRESS table as well.

Not sure what a MERGE join is...

Didn't completely follow you on that last paragraph.

ScottPletcher · September 4, 2019, 9:26pm

The auto-increment part is OK, as long as you do not make it a Primary Key (PK) by default, because in MySQL that makes it the clustering key.

The single most-damaging myth in physical db design is that auto-increment should always alone be the PK. The clustering key, which will be the PK for MySQL, is the single most important factor for table performance. Thus, it should always be chosen based on specific table structure and usage, not on an extremely generic (supposed) "rule".

For your ORDER_PRODUCTS table, you should insure that the combination of ( order_id, id ) is the primary key. Id can still be an auto-increment column, just be careful not to declare that as a PK, since by definition a table can have only one PK.

For MySQL, the CREATE TABLE will look roughly like this:

CREATE TABLE ORDER_PRODUCTS (
    order_item_id ... NOT NULL AUTO_INCREMENT,
    order_id ... NOT NULL,
    ...other_columns,some may be NULL...,
    CONSTRAINT ORDER_PRODUCTS__PK PRIMARY KEY ( order_id, order_item_id )
) Engine=InnoDB;

CatB · September 4, 2019, 10:34pm

@ScottPletcher,

Thanks for the thoughts on the key - when I am farther along maybe we can revisit this.

For this thread, I am most interested in everyone's thoughts on the columns I have in my ORDER, ORDER_DETAILS, and PRODUCT tables...

Specifically, what about things like...

ORDER

subtotal
sales_tax
order_total

ORDER_DETAILS

purchase_price
item_discount
taxable_yn
item_sales_tax