Efficient SQL for Sub Headings in a report

Kristen · November 24, 2017, 3:35pm

ABC Adam Churchill
Col1 Col2 Col3
123 XXX YYY
456 AAA BBB

DEF David Frost
Col1 Col2 Col3
567 MMM NNN
890 OOO PPP

The name, and repeat of Column Headings, is displayed when it changes.

In my "stuff" I can include a "TemplateID" for the Sub Heading, on a row when it changes, to cause the sub heading to be displayed. A NULL TemplateID causes nothing to happen all (but the first) Detail Rows.

But I have to also provide the Name Details (columns) in the resultset. That means that that (redundant) data is present in every row.

I suppose I could:

Output one resultset of all the Heading Details
Output a second resultset for all the Detail data
Have the report writer "walk" both data sets in sync., and output the Heading data each time it changes.

But it would be quite a challenge to have two queries, one for headings, one for detail, which were a) in sync (i.e. no Bugs!) and b) that was not inefficient because of 2x the number of queries.

Only practical thought I have had is to use a CASE statement so that the Name (Sub-Heading Columns) details is NULL except on the row when it changes, so that the least the amount of data [but not the number of columns] flows from SQL to APP is reduced

I've not considered that the Report Write will do single-row queries to get the Sub heading data - each time. Seems like way to many RBAR queries!

One approach that we do use is to only output just the Headings, and then a "Drill down" button that the user can press (which does an AJAX query for just the child-data for that section of the report). This has a place, but I don't think its the solution for all reports.

For example we may show:

Product-ABC
    [+] Sales Jan 2017
    [+] Sales Feb 2017
...
Product-XYZ
    [+] Sales Jan 2017
    [+] Sales Feb 2017
...

The assumption here is that the User will only want to see the Sales for some months, and some products, and when they click on [+] the short delay is acceptable (in "user time") whereas shipping all the data to the Client, at the outset, would be the sort of delay that the user feels is "thumb twiddling time"

Hopefully I'm not missing something obvious?

yosiasz · November 24, 2017, 5:25pm

Are you querying a flattened datawarehouse-ish data set or the oltp?

Kristen · November 24, 2017, 5:41pm

OLTP

(actually a bit of both, so I could perhaps go either way)

We have OLTP and then Intranet DB.

Purpose of Intranet is to provide APP with consistent interface for all APPs and Databases in the company.

So the Accounts staff know how to operate the Accounts APP, but everyone else makes Accounts Enquiries using Intranet (rather than learning how to use Accounts APP itself).

That same Intranet interface is available for all the other databases that we have in the company (we have a SQL database for each significant 3rd party APP that we use, and there are several of those).

Intranet DB also provides some storage - e.g. if we need "a bit of extra data" related to one of the main APPs, but there is nothing within that APP, itself, that solves that problem.

The Intranet DB also warehouses some data - where query speed is poor if we query the actual DB direct. That Warehouse data is refreshed overnight, but is stale within the working day.

However, mostly what people are doing needs "latest data", so in the main the Intranet reports show data from the actual OLTP database(s).

But insofar as it makes a difference I could consider using either the Warehouse OR OLTP source.

yosiasz · November 27, 2017, 5:48pm

I am confused. there is something fishy here.
warehouse is stale? why?

Kristen · November 27, 2017, 6:26pm

Warehouse is refreshed overnight

(Feels like I'm answering the wrong question though?)

yosiasz · November 27, 2017, 6:28pm

wonder why only overnight why not every 15 minutes? does it seem to me your environment has different data sources? this is not a code review

Kristen · November 27, 2017, 6:38pm

The "overnight" is so that the data is static during the day.

It is critical that the data is accurate. Data entry involves "Person A enters it" and "Person B checks it", that may manifest as "Persons A & B do data entry all morning, and then cross-check each other's work all afternoon"

But even if not there is a continuous stream of Data Entry and Data Checking all day long ... the data entry crew are required to complete all checking by close-of-play.

So, basically,a clean copy of yesterday's data is more accurate (in this sense!) than looking at the live database

yosiasz · November 27, 2017, 6:50pm

data entry sounds like Federal Gov job or some very sensitive stuff, or involves $$. there must be a reason for this vetting of the accuracy of data? not sure why whatever data entry tool they are using does not do the accuracy check. I am sorry but it sounds a bit outdated process from first glance. which manifests itself in how the db design and process is done. and sounds like you do not have much control/say on the way they do their stuff. just thinking out loud here, not criticizing.
Actually it is constructive criticism.

Kristen · November 27, 2017, 7:24pm

Its data relating to the Matters that we handle. There is an enormous amount of formality (in each territory that we deal with) compliance, plus our Legal bods give Opinions based on the data. If the data is wrong we miss critical formality deadlines and/or give duff opinions. It just cannot be allowed to happen, period, so the data entry is a "two eyes" process.

We do that even for accounting data on the basis that the whole company operating on "Source data is accurate" means we can live and die by the data downstream.

We're using an off-the-shelf Matter package (which is specific to our vertical market). An operator can call up a record and change it, how each user-company handles validating it is up to them. Might seem absurd not to have some sort of batch process (which vets the data before applying it), but the international formalities requirements are an all-over-the-place type deal (huge variation/exceptions from country to country), so realistic "data entry" can require calling up numerous records, change them as appropriate, and saving them in order to achieve "Data entry of Document-X". Data entry is highly skilled (well - "Requires a hugely broad knowledge")

There is a full audit trail and so on, but data entry is record-card based, rather than the way that electricity meter readings used to be done by key-to-disk by two separate operators

The Warehouse imports into tables which broadly follow the original schema. (We aren't consolidating for trends / cumulative totals, we still need all the detail.)

The Warehouse does need to process the data somewhat differently, so we need different indexes for efficient queries and so on.

Plus I don't approve of some of the design choices the Matter system vendors made. I probably should have left well alone, but umpteen-part-PKeys give us all sorts of headaches in our APP, so some changes are made to have narrower-keys. Of course there is a cascade impact of making that design decision ...

We have some "add on data" which is stored in our Intranet APP/DB - e.g. data which the Matter APP doesn't have. For example we have an Invoice Production component which does an "instead of" job for invoicing (because we cannot generate invoices to suit OUR client's requirements using the Matter Vendors solutions). We have to upload invoices electronically to our Clients Accounts system (i.e. do their blinking data entry job for them <spit>) so our Intranet DB stores reference numbers appropriate for our client accounting systems (as one example). That is not something that the 3rd party Matter system is capable of AT ALL ... its a typical example of why we use our custom Intranet solution, we get the business from those Clients BECAUSE we can accommodate their requirements.

Tangent: Latest one is that several companies (all at the same time, with no warning, and retrospectively - must have been a Consult they employees, or some upgrade to whatever accounts system they are all using) have told us that we have to invoice time in 6 minute increments. We have had to credit all unpaid invoices (several months back) that are already issued but do not conform. I have no idea what plonker thought this was a good idea ... they are quite happy that we round-up all our time recordal to 6 minutes. Our timesheets are recorded in real-time, we don't do "15 minutes minimum billing units", it we are on the phone to you for 30 seconds that's what gets logged - push a button for "Start job", push a button for "End job". Except that now we are going to bill you for 6 minutes ...

Yeah, I know, clearly they are going to compare billing rates from different suppliers ... but they are going to pay an average of +3 minutes on every time record we log.

The 3rd party Matter system may handle that at some future date. But we need that RIGHT NOW, this month, retrospectively for all open invoices for those clients. So that's what we have done, because we can ... just some hours MOD'ing the invoice package.

The Matter Management system is intended to be used by Private Practitioners. Our firm is just that, but the service we offer to our clients is more along the lines of an outsource in-house counsel ... so we have all the Manage Reports that in-house folk would expect to see, but we also have all the formality management side (which the external practitioner is normally responsible for). Our Intranet provides any middle-ware and associated data to achieve that.

I have VIEWS for Internal and External presentation of a table.

If the Matter Database has a table called NAMES with columns FirstName and LastName, I have two views IN_NAMES and EX_NAMES. They have the same columns, indeed the EX_NAMES view would (in many/most cases) be used to "pull" the data to refresh the Intranet Warehouse Table (and then IN_NAMES is basically a SELECT * from that ...)

So in my query if I need current, live, data then just using EX_NAMES instead of IN_NAMES does that job ...

The EX_NAMES view, because it presents the same columns as IN_NAMES, may have some horrible inefficiency - e.g. for some top-up columns that are calculated, and flattened, during the overnight refresh of the warehouse. So generally not good news if an APP needs to be built using EX_Views direct onto the Live database.

For Invoice production all the data is direct from the Live database (timesheets and disbursements and the like).

For Attorneys to look at Matters they are looking at the "yesterday" warehouse data, so that nothing is changing under foot.

And very welcome. I have very little opprotunity for peer-critique, so it is most welcome.

Shall I send you a plane ticket to come over and do an inspection / audit / review?

yosiasz · November 27, 2017, 7:32pm

I can pack an oldtimey extra gun powder whiskey flask I keep under my desk for situations just like this.

Kristen · November 28, 2017, 9:13am

Is it for your musket, or does it have Whisky in it?