I made some changes in my VM which worked well, but did little on their server. I don't have remote access, so I've scheduled an on-site visit for next week. I will report back again afterwards.
Mike.Sheen wrote:I've downloaded and run these stored procs against a copy of a customer database I have - 1,988 records in CR_Main and 168,727 records in CR_Trans.
Running the stored procs with the parameters for all creditors, as at todays date - both ordered by AccountNo and Name and BOTH stored procs return in 3 seconds - 42 rows returned.
Your experience is what I would expect - neglible difference regardless of which way you run it.
I've got a test script set up so that I can choose between "one", "some", and "all" creditors just by un/commenting the right lines.
My "some" returns 105 records in about 10 seconds. It takes 3 minutes on their server.
As an aside, I've noticed that many of our customers' servers seem to run slower than a VM running on my desktop PC or laptop. I suspect it's because they don't listen to their consultants and have spinning rust in a RAID configuration instead of a mirrored SSDs. But I digress...
Mike.Sheen wrote:
So, It's not a general issue - it's something about the nature of your data or environment which is impacting things. I think you'll find the problem will simply go away after a while, once things grow a bit more and indexes and trees change.
The problem has been there pretty much since they upgraded to 7 (18 months back now). It's now just gotten bad enough that they want it fixed.
I notice your test returns just 42 records - the full set for me is 271. So while I think that you're probably right about it being something with this specific database, I'm not ready to rule out any possibilities. This is the same customer who bumped into the transaction detail performance problem in their creditor maintenance screen - they have some creditors with very large transaction sets. Their top 2 have 52,000 and 30,000 records respectively.
Mike.Sheen wrote:Limiting to just a smaller range of creditors would be enough to reveal where the bulk of the time is spent.
I picked just one to start with.
The two insert queries in the creditor-balance-as-at procedure are what take up all the time.
The most expensive part of both of those queries was the sort needed on the CR_TransAlloc version.
There are gains to be had at several layers -
1) the transaction period calculation; my customer only uses PeriodType == 2, so an inline TVF can replace the complex scalar function.
2) the multi-statement gain/loss scalar function could possibly benefit from conversion to an inline TVF (less of an issue on new SQL Server versions, but still worth pursuing)
3) the creditor balance-as-at procedure has four inline sub-queries that can be reduced to two "outer apply" blocks.
4) some initial testing suggests that the creditor balance-as-at may work well as a TVF when the dataset is big enough - the difference was immaterial on smaller sets.
5) a TVF for the creditor balance-as-at would allow the removal of the loop in the summary sproc.
Apart from adding an index for the TransAlloc version (which helped), I've not looked too closely at the indexes yet.
Of course, none of that explains why the order of execution (including or excluding "order by creditorid") would make the slightest difference to the performance of whatever "creditor balance as at" code is running inside the timing blocks.