Modeling: Company

CatB · June 17, 2019, 7:46pm

I could use some advice on how best to model a "Company" in an expense database...

This system will keep track of all travel expenses a person incurs. For each "Order" there will be one and only one "Company".

The problem is that most Companies have multiple locations.

For instance, let's say that Sally is on a roadtrip and she stops and buys food and office supplies at 3 different Walmart locations:

Walmart
1100 Belt Drive
Albany, NY

Walmart
2700 Tech Way
Trenton, NJ

Walmart
1000 Constitution Drive
Philadelphia, PA

Originally, I was just going to do this...

COMPANY -||------0<- ORDER

That would work, but it wouldn't be normalized, and maybe it would be harder to roll up the data for reporting, say when you want to see how much we spent at Walmart in 2019?

So I am wondering if it makes sense to break things out like this...

COMPANY -||------|<- COMPANY_LOCATION -||------0<- ORDER

From a database standpoint, I suppose that is more correct, but then it probably will make coding the front-end more complex, and maybe unnecessarily so?!

What do you think?

yosiasz · June 18, 2019, 1:02am

One way to do it
Create a company table
Then a company locations table

1 to many.
So the complication could happen on back end or the fe.
But this is not complicated at all imo

CatB · June 18, 2019, 1:21am

@yosiasz,

Repeating what I already said adds no value to this thread...

yosiasz · June 18, 2019, 1:37am

I am agreeing with your proposed design approach by repating one of the options

CatB · June 18, 2019, 2:03am

The way you worded things, it sound like you were proposing something new.

WHY do you think have a "Company" and "Company_Location" table makes more sense?

One strike against that design is that the "Company" table wouldn't really add a whole lot more value, unless you plan on interacting with the corporate headquarters a lot or had lots of fields related to the headquarters.

Having two tables feels better from an academic standpoint, but I'm uncertain if it makes sense in practice.

What do others think?

yosiasz · June 18, 2019, 2:26am

Always ask WHY indeed!

One can do whatever one wants but there are consequences to the design approach we use.

Imagine we went with the one table approach where we record all of the locations for the same company. In that scenario you will get repeated data on CompanyName
Walmart 1201 SW 5th, Queens, NY 10023
Walmart 5500 N 32, Queens, NY 10023
Walmart 56333 Dumbo St, Manhattan, NY 10023

This is one approach but let us say you have thousands of these rows for Walmart, the minute Walmart changes to ShopSmart, you will have to update all of these rows. And you could choose to do that. Here is another problem. Since these are individual entries there could be someone that does a bad entry and instead of Walmart they enter Wamart, or Wallmart. These are supposedly, for the persons that entered the data, all Walmart.Let us say most of the purchases made by employee #1 was at Wallmart, now all of your reports will be messed up because of different entries for the same company.

To mention a few things. Nothing academic here, in practice you just done messed up on your reports.

CatB · June 18, 2019, 3:25am

@yosiasz,

You make some valid points, but this is actually just a personal travel-expense database for me as the sole user.

Obviously I see the value is have a "Company' and "Company_Location" tables since I brought it up. But then I also wonder if the extra work provide extra benefit.

Not sure if you have any development experience, but let's think about this from a GUi standpoint...

When I travel, I would enter in a new "Receipt" (form) and then the details of the receipt in "Receipt_Details" (subform).

And in the main form, I would like the ability to easily either enter in a new Company which doesn't exist, or pull up a Company and Company location which does exist.

How would you expect this to work?

For an existing "Company", would you prefer to select a "Company" first (e.g. dropdown box), and then have another "Company Location" dropdown box?

Maybe something else?

Really I think HOW the user uses the system should determine which approach I take in this case.

Thoughts?

yosiasz · June 18, 2019, 3:57am

There will be extra work whichever way you slice your design approach.

I dont try to push any specific approach but I will lay out clearly to people the pros and cons. then again there are best practices that many people have found to be effective. In your case I would keep it as simple as you can.
You are welcome to eat soup with your fingers (some cultures do)
but using utensils sure makes it much easier.

yosiasz · June 18, 2019, 12:53pm

For display purposes would be easier to have only one drop down where both company name and location concatinated together. That is if you take the one to many design approach

ScottPletcher · June 18, 2019, 6:48pm

No, "how the user uses the system" has 100% nothing to do with a proper data design.

You need to start by modeling the data, not database tables.

That is, a logical design first, not a physical design. This should be done at the business level, not at a technical level. Do not worry about keys or indexes yet (indexes don't exist in a logical model anyway).

Determine all the entities you need to keep data about. Btw, keep consistent with business terminology: an "expense" and and an "order" are two completely different things.

How to decide "entities"? [Entities, btw, will each become a table(s) during the physical design.] Go thru your cases / scenarios and note the main objects.

You mention "Sally". Therefore you have an "employees" entity.
You mention "roadtrip". You likely have some type of "employee_activities" entity.
You mention "expenses". You have an "expenses" entity.
You mention "company". You have some type of entity there. "Companies" is a rather generic name, but it will do for now.
A company has different locations. A location is definitely a separate entity from the company, because it has its own attributes (specific data, which will become columns), just as company does. Needing to keep data about that specific thing is what identifies it as needing its own entity (table/tables), rather than trying to not make a table for it or to combine it with another entity.

Work out and document the details on the relationships between entities. For example, an "expense" would presumably require an employee relation, an activity relation and a company relation. In business terms, the company only approves expenses for: approved activities of approved employees using approved companies.

Continue fleshing out these details until you feel you have everything at least reasonably fully captured. Then, and only then, should you convert it to a physical design.

The application design and "feel" is completely separate from the data used in the app.

CatB · June 20, 2019, 5:42am

Sure it does. Final solutions are often DE-normalized because how things work in the real world doesn't match theoretical logical modeling.

If you read my OP, you'll see I have done that on this point.

I am building a database to keep track of all my expenditures (aka "expenses").

When I performed logical data modeling, I chose the textbook ORDER - ORDER_DETAILS - PRODUCT constructs for convenience.

I have pined over whether I should be calling things "expenses" or "receipts" or "orders".

In this case, they are really all the same.

If I buy a book online on Data Modeling, to me it can be seen as an "expenses" and an "order" and I have an email "receipt".

If I go to lunch at McDonalds, to me that is a lunch "expense" and I also have a "receipt" where I can enter in the lunch "order" that I placed at McDonalds.

For now, I think ORDER - ORDER_DETAILS - PRODUCT seems most natural.

Nope. Contrived example. See above.

Do how practical is it to break out a "Company" table to keep track of the fact that I bought Post-It notes at Target?

Is having an extra table with two fields (i.e "ID" and "Company_Name") really worth it?

See above. This is just a way for me to keep a registry of here all of my money goes.

I have done all of that, and I am unsure of how to handle my company/merchant data...

ScottPletcher · June 20, 2019, 1:56pm

By proper data design, I meant logical design. Code is irrelevant to the logical design ("model"). You're talking about the physical design, which will indeed be adjusted to allow for the details of app and coding.

Almost all developers have an extremely hard time separating logical from physical, i.e., ignoring any physical considerations while creating a logical design.

I've been data modeling professionally for 30 years, but you seem to know much, much better, so good luck.

CatB · June 26, 2019, 1:58am

I did model things in the "logical model" the proper way.

So maybe I used the wrong term.

I am asking how I should implement the "logical_model" into a "physical_model"...