The AWS instance has 32 Gb RAM and a Quad-core CPU. It has 4 SQL instances running on it. The instances are memory-capped as follows:
2048 Mb (our problem child)
5000 Mb
7500 Mb
16000 mb
for a total max 30548 Mb usable by SQL, leaving a minimum of 1.5 Gb RAM for the OS.
Believe me, I am thinking long and hard about moving off SQL Server, at least for this particular DB. Unfortunately, that is a major architectural and development project. All my instances work with each other, they've been SQL Server since before SQL 2000. Never had a problem with this.
I am not aware of any throttling on AWS end, besides, both the server and the client are in the same AWS subnet, so their data exchange is internal to that AWS subnet.
1.5 is a minimum, that's if all my instances use 100% of their memory. Which pretty much never happens.
All instances are healthy. In fact, this very same instance has no problems whatsoever executing all other queries (no timeouts) except for this one, which consistently times out. Actually, there's one more, on a totally different table, that exhibits identical behavior. I have about 150 SPs in this database, 75-80% of them are being constantly used, most of them are a lot more complicated than "select top 1".
The only thing that's setting this one apart is that it's executing a lot more often than anything else.
mmm... About 10 years, with no changes in code or the underlying table structure.
Not the same app, but from a different app in a different AWS instance, on the same subnet
LOL yeah, it got decrepit in its old age. No, nothing changed since I moved the whole thing to AWS about 3 years ago. And it's been working OK after that for a while, too. I'm beginning to see this in the past couple of months, and it looks to be getting worse. Yes, it's being called from an SP. I recompiled the SP, I updated stats, I even rebuilt the clustered index as per above suggestions.
Here's the kicker... "select * from stack with(nolock)" consistently returns in about 420 ms. As soon as I introduce the WHERE clause - consistency goes right out the window.
Sorry, the site blocked any new posts for me for another 21 hours
The DB has a lot more going on. This one TABLE keeps getting inserts, updates and deletes with the speed of a machine gun, but it never grows beyond about 20 rows.
They say after 3 years AWS slows down so they can make you pay more. Net neutrality type of thing. ha! Just kidding...I think.
Can you move the db or restore the db to a healthy instance and change connection string to app and see what happens?
Also why is the db not simple recovery for 20 rows full seems over the top
Thought about it. Will try next. I'm just worried about the fact that this IS a stack, really. Multiple async processes are hitting it to pop off the top record and begin processing it. If two or more processes grab the same record I may end up with some very unusual situation on the phone.
Besides, immediately after getting that record the SP does an update on it to make sure it's "popped" off the stack.
And, thinking about it, it just looks like such a band-aid to do that
Ok, did some debugs. It appears that I may have an insufficient memory problem, or an overloaded CPU. Even following your advice, running a select into a temp table, then adding indexes — I get hung up on either adding the indexes or selecting from temp table with filters (with or without indexes). Straight data selects are fine, as soon as I add any kind of a computational element (cast/convert) or sorting — I get hung. That either means I don't have enough memory or enough CPU to perform the sort or conversion.
Yousiaz, gentlemen, thank you for your time and assistance, I greatly appreciate it!
AWS has various different type of server instances, some more suited to SQL.
Its sounds like you may have to review what you using and determine if its still suitable.
Lucky it's easy enough to change for a day and see if things get better.
Does not matter. The results are still just as unpredictable. I ran the select from a temp table 5 times in a row, the first 4 times returned in 420 ms, the 5th try took 25500 ms. Debug from the final try below.
If I create indexes AFTER I populate the table — the bottleneck is index creation, and the select flies.
If I create indexes BEFORE I populate the table — the index creation is immediate, the populate is immediate, but the select with the WHERE clause takes forever.
By the way, that last try — the temp table was populated with 10 or 12 records, none of which matched the WHERE conditions, so the final resultset was empty, the debug screenshot shows a set of literal values being returned when RS is empty.
Yes, clustered index on all 4 columns being used in the WHERE, another non-clustered descending on Priority and another non-clustered ascending on time