We all know that the common way to test whether approach "a" is better than "b" is to do the two things in a CFLOOP a million times or some such and compare the time spent. I've shown that if one instead tests the two approaches by using a load testing tool (perhaps running the request thousands of times), the differences between the two coding choices are wiped out in the noise of starting/stopping a request, etc. In such a case, it makes me wonder if such tweaks are worth the bother, unless indeed you are doing the given thing a million times in a loop.
This is how I started a discussion on a closed list of high-level CFML developers some months back, where a discussion had arisen about choosing one coding technique over another based on such loop testing. The subject has come up today on another list so I wanted to share the proposal here to point to on that thread (and for others to consider in general).
I had asked for feedback to that posting on that private list and got back nearly unanimous confirmation from some of the leading lights in the CFML industry. I've obtained permission from each to repeat their replies here.
Sean Corfield responded:
I've been arguing this exact same thing for years so I totally agree with you. Anyone who bases a code choice on the results of running some code fragment in a loop on their own workstation is living in a dream world!
Then Rob Munn wrote:
I like best practices, but I don't try to squeeze every millisecond out of every piece of code. I would rather build an app that can scale horizontally. I would rather focus my performance tuning on areas of code that are most widely used in an application....
To which Simon Horwith replied:
I have to agree that looking at syntax for ways to shave extra milliseconds is a low priority. That's not to say it isn't important, though. As was said, the database, SQL, and database connection settings are the first things I'd look at. CF server settings in general are another biggie - having debugging turned on or off, for example, can make a huge difference in performance (as can trusted cache and many other settings). I also pay close attention to memory usage - particularly in a high traffic site. Too much data in the session scope can bring a server to its knees when a search engine begins crawling it. I also look at response sizes. I use CFSCRIPT a lot - not only because I find the syntax cleaner but also because of its whitespace suppression. Pages with huge blocks of whitespace don't return to the client as fast. Trying to shave milliseconds is a last resort in performance tuning, in my opinion.
Dave Watts concurred with my original proposition:
This has absolutely been my experience. First, the "loop a thousand times" testing process doesn't appear to correspond with load-testing results at all, in many cases. Second, these things tend to vary across versions of CF, so one way might be marginally faster with, say, CF 5 while another way might be faster with CFMX 6.1 and so on. Finally, and most important, these sorts of "premature optimization" things don't seem to do anything except distract the developer's focus from the real bottlenecks in an application.
I haven't encountered an application yet which couldn't benefit from further examination of the database schema and application caching mechanisms.
Finally, Pete Freitag wrote:
One thing to keep in mind in the load test vs monster loop debate is that since CF engines use a JVM now, the hotspot compiler comes into play. The hotspot compiler optimizes the Java bytecode based on the frequency of execution. So when you're doing a loop test you are almost always going to invoke the hotspot compiler, and you are really just testing which code the hotspot compiler can optimize better - but in real world execution the function call may never end up in the hotspot.
So there you have it. Next time someone proposes that you "throw some code in a loop" to test its performance, consider sending them to this posting to let them ruminate on these thoughts. Granted, neither I nor they are offering any demonstrated proofs of our assertions.
Someone should, at some point, put together a real documented test case to prove it for the nay-sayers. Still, these conclusions come from what is probably a combined total of nearly 100 years of CFML experience. I think they deserve serious consideration.
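This is not that documented test case, but the shape of the argument can at least be sketched with arithmetic. The Python below uses made-up numbers purely as assumptions (a 0.001 ms vs 0.002 ms operation and a 50 ms per-request overhead; none of these are measurements from any CFML engine), and the `relative_gap` helper is a hypothetical name introduced only for illustration.

```python
def relative_gap(cost_a_ms, cost_b_ms, repetitions, overhead_ms=0.0):
    """Return how much slower approach "a" looks than "b", as a fraction of b's total time.

    All inputs are illustrative assumptions, not measurements.
    """
    total_a = overhead_ms + cost_a_ms * repetitions
    total_b = overhead_ms + cost_b_ms * repetitions
    return (total_a - total_b) / total_b


# Loop test: the operation runs a million times and nothing else is counted.
loop_gap = relative_gap(0.002, 0.001, repetitions=1_000_000)

# Request-style test: the operation runs once inside a request that carries
# an assumed 50 ms of fixed overhead (startup, database, rendering, and so on).
request_gap = relative_gap(0.002, 0.001, repetitions=1, overhead_ms=50.0)

print(f"gap seen by the loop test:    {loop_gap:.1%}")
print(f"gap seen per request:         {request_gap:.4%}")
```

Under these assumed numbers the loop makes one approach look twice as slow, while the per-request difference is a tiny fraction of a percent. Real figures would have to come from an actual load test.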
Even so, I'm sure some will want to argue against this conclusion. That's what blog comments are for, right? :-) Fire away.
You and the others are absolutely correct. If squeezing milliseconds out
of an app is that important, there are better places to spend your time.
These include database tuning, SQL tuning, JVM configuration, web server
tuning, and even hardware choices.
Great point about the other "bigger hitter" opportunities for improvement.
I should have elaborated on those and more. Thanks for doing so. As for the
query within a loop, I'll note that it's also often a sign of a missed
opportunity for a join, which would generally perform better. (Like so many
things, "it depends", but we're talking in generalities here.)
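For anyone who hasn't seen the pattern, here is a minimal sketch of a query in a loop next to the equivalent join. It is written in Python with an in-memory SQLite database rather than CFML so that it is self-contained; the `authors` and `posts` tables and both function names are hypothetical, invented only for this illustration.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the pattern.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO posts VALUES (1, 1, 'First'), (2, 2, 'Second'), (3, 1, 'Third');
""")


def titles_with_query_in_loop(conn):
    """One query for the posts, then one extra query per post for its author."""
    rows = []
    posts = conn.execute(
        "SELECT author_id, title FROM posts ORDER BY id"
    ).fetchall()
    for author_id, title in posts:
        (name,) = conn.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()
        rows.append((title, name))
    return rows


def titles_with_join(conn):
    """The same result from a single query that joins the two tables."""
    return conn.execute(
        """
        SELECT p.title, a.name
        FROM posts p
        JOIN authors a ON a.id = p.author_id
        ORDER BY p.id
        """
    ).fetchall()
```

Both functions return the same rows; the first issues one query per post plus one for the list, while the second issues a single query. Whether the join is actually faster in a given app still "depends", as noted above.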
That may be true, but if you run analogous code in a loop between two
languages, if one language is considerably faster than the other then I
think this is still telling you something. This was the case when I ran
BlueDragon.Net vs. native C# .Net doing the same exact thing.
Tester, I'd still argue that any test of code between two languages using a
loop for comparison would be similarly contrived and lacking in validity.
If a typical app once deployed would never do such a monster loop, then
using this as a point of comparison is no better than the whole assertion
made above. A load test would be the far more effective and valuable
comparison.
Let me play devil's advocate here, despite the big names I'm arguing
against. I can totally agree with the sentiment that over-optimizing CF
code is likely not worth the effort. But to argue against looping over code
to determine which method is fastest (for your given version of CF
presumably) doesn't seem to make sense to me beyond that initial issue of
"why bother".
Doug (don't know which Doug in the CFML world this is), I'm sorry to hear
you argue for the conventional wisdom on this. It doesn't seem that you're
giving credibility to the very heart of the proposal: that a loop just DOES
NOT equate to the same behavior/performance cost as a real load test.
Well sure, I'd like to see proof then. I understand there are instances
where the test would not be valid, but that doesn't mean it doesn't have
valid uses. From what you're saying, it sounds like the test is not valid
period, so I'd like to know exactly why.
I know it's been a few months since I last commented on this, and I didn't
mean to leave Doug hanging. I just couldn't then (or before now) give time
to putting together a test suite to document my assertions. I do plan to
have that time next week, finally, for anyone still following along.
"Premature optimization is the root of all evil." ~Donald Knuth
Well sure, I'd like to see proof then. I understand there are instances
where the test would not be valid, but that doesn't mean it doesn't have
valid uses. From what you're saying, it sounds like the test is not valid
period, so I'd like to know exactly why.
Fee, I do hope to oblige you and others at some point with a documented
test case, but until then, can I remind you that this is not just my
opinion but one shared by several leading lights in the CFML tuning world.
Do you really want to argue against them all so readily?