Blog

1Dec
2006
ColdFusion Is Not A DBMS!

I just spent some time helping a very irate customer. You know, the "ColdFusion sucks" "we're going to dump CF" "how can you sell this ****" brand of irate. The cause of all of this anger and frustration was very serious ColdFusion performance issues. And I am sure the individual has been under a lot of pressure and so the ramblings are somewhat understandable.

But, at the end of the day, as is so often the case, the bottleneck was in database access and query processing. No, there was nothing wrong with the DBMS. And no, there was nothing wrong with the SQL used. And no, the connection and data source settings were not the problem.

So what was the problem? Simply, ColdFusion was being made to do the DBMS' job! I found dozens of queries that were returning data that was being used solely to feed other queries. I saw queries being returned and then filtered client-side, within ColdFusion. I even saw queries (some that were quite long running) that were never being used once executed (leftover code from previous edits, I assume). And more.

I've been saying this for years, but, let's go through this again. ColdFusion is NOT a DBMS. Whatever database back-end you are using, chances are that it is a big powerful application that is designed to do one thing and do it well, manage and manipulate data. It makes absolutely no sense to give your DBMS less work to do while making ColdFusion work hard doing what the DBMS is supposed to do, what it is designed to do, what it is optimized to do.

So, a few rules:

  • Don't ever retrieve data you don't need.
  • Don't ever retrieve too much data, only to filter the results within ColdFusion (via querying a query, or excluding data while looping).
  • Don't ever retrieve data if that data is only going to be used to drive additional queries. If you have a query, and then are using the results (perhaps in a <cfloop>, or via a ValueList()) in subsequent queries, then you need to find a way to consolidate those queries and do the work on the DBMS. And yes, that may mean writing JOINs or using stored procedures.
  • Don't ever perform calculations and aggregations within ColdFusion, unless the data being used is already retrieved for some other purpose. Yes, the SQL aggregate functions (and having to use GROUP BY and figuring out how to make that work while using WHERE or JOINs) can be a little tricky. But do it anyway.
  • Don't ever underestimate the important of caching query results. If you use the data frequently, and it is not changing as frequently as it is used, then cache it!
  • There's more, but you get the idea.

Keep the following in mind:

  • ColdFusion is never going to be able to query/sort/filter/count/sum/average/etc. as fast as your DBMS will. That is not what ColdFusion is optimized to do, and it is exactly what your DBMS is optimized to do.
  • Sending unnecessary data from your DBMS to ColdFusion creates unnecessary network traffic, puts unnecessary load on the ColdFusion server hardware, unnecessarily increases server memory load, and unnecessarily slows down your and other requests.
  • More often that not, your ColdFusion box is going to be under heavy load while your DBMS box sits with cycles to spare. It makes no sense to not leverage that power and potential.
  • And as a side benefit, by moving data processing and manipulation back where it belongs, on the DBMS, you significantly increase the likelihood that you'll be able to reuse that work and effort.

And a couple of side notes:

  • None of these comments are ColdFusion specific, and are just as applicable to applications developed in PHP, ASP, JSP, etc.
  • You MUST learn how to use your DBMS. In the situation that triggered this post, one particularly problematic query was taking over 10 seconds to execute. Adding an index to the table used dropped that down to a couple of milliseconds!
  • "My DBA does not let me write stored procedures" is not a valid argument, and is all the justification you need to get a new DBMS or a new DBA!

Considering how data centric ColdFusion applications tend to be, all of this is critical!

Bottom line, you have a DBMS for a reason, use it!

Comments (35)



  • Wim

    One of the things I used to do (years ago) is select * from table just to get a query.recordcount... should be select 'x' from table which doesn't return the ENTIRE table... perhaps a "performance hints and tips" chapter could be a nice addition to a new CFWACK?

    #1Posted by Wim | Dec 1, 2006, 11:32 AM
  • Jim Wright

    This is an excellent post, and something that should be read by everyone starting out in CF development (and perhaps any DB driven web development). If I was going to add one thing to it, it would be to look at the GROUP attribute of CFOUTPUT and really understand what it can do for you...understanding that can lead the way to eliminating a lot of extra trips to your DB.

    #2Posted by Jim Wright | Dec 1, 2006, 11:57 AM
  • Tuyen

    Great post. We need to remember to let database does MORE work... CF like a presentation layer (may be some business logic, I think)

    #3Posted by Tuyen | Dec 1, 2006, 12:21 PM
  • Jose Lora

    This is a great post. I’ve seen some .Net developers bringing entire tables to variables just to iterate and calculate some results. It’s amazing that we still need to remind developers that they have to use the RDBMS at its full potential instead of trying to re-code it on their apps.

    #4Posted by Jose Lora | Dec 1, 2006, 12:38 PM
  • Peter Bell

    Agreed 100%. One small detail is you MAY want to bring back more data that you need as part of a caching strategy as if you have lots of similar queries against relatively static data it may make sense to return a little more data and to cache that in memory. Perfect example? If you only have 200 pages in the db, no reason to requery the database for every page view just to create a personalized sidebar - you can do that quicker in CF.

    Also, CF query caching is fine, but especially if your app "owns" the db (no third party apps change the db without your apps knowledge) consider an IsDirty caching strategy. If there are third party apps, try to implement some kind of distributed IsDirty notification - time based db caching really isn't an optimal approach for most use cases.

    http://www.pbell.com/index.cfm/2006/6/26/isDirty--...


    Ben - don't know if you ever say Charlies article:
    http://carehart.org/blog/client/index.cfm/2006/10/...

    The easier captcha is really nice and still seems to block the spam . . .

  • Jeff Houser

    There are exceptions to every rule...

    I once had a project where data is stored like this:

    Users (Userid, mydata1, Mydata2, mydata3, mydata4, etc.. )

    instead of like this:

    Users (UserID )

    Users_Data (userID, dataID)

    Data (DataID, MyData)

    The design violated some form of relational DB design for sure. The business decision (above me) was made to "deal with it" instead of fixing it. When we had to perform aggregate functions off the data, we ended up retrieving the data, creating a query inside CF along the lines of:

    ComputedQuery: (UserID, DataID, MyData)

    And then using CFs QOQ's aggregate functions to get the totals we needed.

    I wish we could have fixed it, but...

  • Peter Bell

    Depending on the DB, you *could* actually solve it using stored procedures. The SP could generate temp normalized tables and then operate on them, or you could keep two separate representations of the data - one default and the other normalized with triggers to handle updated and synchronization. To be honest I'm not enough of a SQL person to do this, but I know there are people who live for this stuff.

  • Joe Zack

    Long Time Listener, First Time Caller

    Excellent post, SQL was created for the sole purpose of managing data. Let it do it's job!

    #8Posted by Joe Zack | Dec 1, 2006, 02:06 PM
  • Greg Nilsen

    As a DBA at my current job (to go along with my CFMX 7 Certification), I have to agree 100%. Let the database do it's work, and if you're current DBA can't get it done, find one who can! We're out there!

  • Jeff Houser

    Peter,

    Right you are.

    Today, I can think of alternate "more-SQL" oriented approaches to the problem.

    If I come across this again, I'll probably create a view which contains normalized data (using the Union operator) and the can just run the aggregate functions from the view.

    At the time, we chose to address this in CF. I don't remember if there were other circumstances surround that decision.

  • Ben Forta

    Jeff, as Peter suggested, SPs would be a good way to solve that one. Also, creative use of VIEWs could maybe be a solution too. I am not disagreeing with you, every rule does indeed have exceptions, but when there is a DBMS side solution I'd much prefer that.

    --- Ben

    #11Posted by Ben Forta | Dec 1, 2006, 02:24 PM
  • Ben Forta

    Hey, look at the comment time stamp, we both suggested VIEWs as an option at the same moment. :-)

    --- Ben

    #12Posted by Ben Forta | Dec 1, 2006, 02:25 PM
  • O?uz Demirkap?

    You have a great point Ben!

    This is very important if you have a project that you are sure that it will work on a platform that you know.

    But we had some difficulties when I was working for a company which has a big content management system. Depending on customer needs we would like to suggest as much as possible different DB platform as solution and we had decided to do minimal work on DB level and we wre using DBs such as storage platform with some minimal work such as indexing etc.

    Our system that we had created is still working on MSSQL, MySQL, PostgreSQL, Oracle, DB2 and Sybase. There are some DAOs for every database platform and they deals with SQL stuff depending on customer needs.

    I mean it also depends on your project target.

    I think the main issue is caching for application performance. I am working on a customer project now where my main problems are caching queries and objects. I have some pages that they gets same lookups from DB again and again or some pages where I have same object creation lots of time. We need to educate developers to learn how they can create optimized codes.

  • Jose Lora

    > But we had some difficulties when I was working for a company which has a big content
    > management system. Depending on customer needs we would like to suggest as much as possible
    > different DB platform as solution and we had decided to do minimal work on DB level and we were
    > using DBs such as storage platform with some minimal work such as indexing etc.

    To quickly comment on this, there is no reason why we should lower DB load process to the least common denominator. If we are going to support multiple DBs, let’s do it the right way by providing specialized interfaces that make good use of the different DB capabilities.

    #14Posted by Jose Lora | Dec 1, 2006, 03:14 PM
  • O?uz Demirkap?

    Yes. This is a reason of using DAOs. I also suggested to use Stored Procedures and other DB solutions on DAOs/ But then there was maintance problems because of having different solutions depending on your DB platforms.

  • Peter Tilbrook

    This is the sort of issue that does more hurt to ColdFusion than good. Any language should not try to do what a dedicated server (be it DBMS, file repository, email, etc). Use ColdFusion as a way to better, more efficiently, interface with those systems - not as the all-in-one fix-every-issue solution.

    The same issue would occur with PHP, .NET or any other language. ColdFusion is good (well after 11 years using it fantastic). But as Ben says, let it do what it does best and let your DEDICATED support systems integrate nicely and not hamper it.

    If you are doing anything serious with DBMS or email it is always better to share the load across servers rather than hulk it onto a single box running everything. Hardware is cheap. Patience is a virtue few of us have these days.

    If running a decent ColdFusion shop you would at the least have a seperate server for ColdFusion, Email and DBMS, decent HDD space and RAM, and good bandwidth. With close monitoring you should rarely have major issues.

  • Tony Petruzzi

    From the duh department.

    I can't believe that you even had to write something like this up. Are people out there still that stupid that they don't understand how to write queries from a database. Have people out there never heard of views and indexed views?

  • Anj

    Provide developers a language that allows a user to get work done without any real database knowledge... shield them from the most basic SQL statements (<CFINPUT>, for example)... and you get developers who take the DB side completely for granted.

    #18Posted by Anj | Dec 2, 2006, 09:14 PM
  • Jeff Houser

    Anj,

    CFINPUT is used in conjunction w/ CFFORM, and has nothing to do with database queries. You probably meant CFINSERT and/or CFUPDATE?

    Tony,

    I don't think you can get very far in any web application w/o knowing how to write a query from the database. I would expect that anyone who can write a SQL query in a CF app should be able to write a view at the database level. The problem here is that people aren't taking advantage of the database. It may very well be that they haven't heard of views.

    Perhaps it's also because "It works, and the boss / customer is happy, so why change it."

  • Gareth Edwards

    One of the issues about using your database to take care of certain things is that those things can be engine specific. Although I agree that you should make the db engine do most of the work, there are plenty of things you can to do optmize performance when using Coldfusion to do data specific things.

    Altough when you start talking thousands of records, things can get a tad slow.

    1) Good Database design helps write fast queries that return the data you require.
    2) INNER JOIN (try and get everything you need in one go, with out going crazy with outer joins,etc) - don't loop over a master query to do 100+ seperate queries.
    3) CREATE FAKE INDEXES (Use coldfusion to create structures with keys that allow you to "link" back to master records, this helps reduce amount of queries required, and also the amount of looping required.)

    These are just a couple of things that help keep things under control.

    Cheers
    Gareth.

    #20Posted by Gareth Edwards | Dec 4, 2006, 06:02 AM
  • Stephen Cassady

    While clients like these are frustrating, it's really good to be able to come in and actually resolve their problems. The added bonus is being able to demonstrate or explain what the issues are and identify where the bottlenecks are.

    Not only does it feel good to see clients honestly smiling about the performance improvements through the database, but I've been able to generate that with code overhauls as well (rewriting a system from spagetti code to Fusebox, there was dramatic performance issues).

    It's sad though that clients get to the point where they're so frustrated by the performance of their application, and in many ways IT/IS is to blame because we fail to use clear and simple language to communicate what their application does, how, and the process it interats with other components. We are terrible at talking shop and making systems "incomprehesable" to our clients, and even people or other departments as required.

    Yeah, I often find that clients who have gotten to the stage of rage that you describe Ben have had IT or support individuals who never communicated in a manner that encouraged trust, faith, or belief in the system as well - leaving clients with the impression they're stuck with a "very expensive black box that's too expensive to fix" and needs to be replaced like a commodity - all of it in total.

  • Craig M. Rosenblum

    I agree strongly with this.

    Sometimes we focus so much on oo, flex, ria's that we forget the basics of making sure the applications are scalable.

    I am somewhat knowledgeable about databases, sql server in particular. But I can't imagine a company that has a big and complex database, to not want to hire a dba, to help manage and maintain it.

    Some areas do require special knowledge, skills and experience.

  • Brian Swartzfager

    I disagree with all of the commenters who think this issue is a “no-brainer.” The benefit of writing queries within the DBMS may be obvious to folks who write database-intensive enterprise-class applications, but it is not obvious to developers with small intranet applications who may not have access to the DBMS system for one reason or another.

    I work at a university where many of the colleges and departments have web sites and web applications hosted by the central IT office. More than a few of them have ColdFusion web applications that access data from one of our Oracle servers. The developers of these various applications range in experience from full-time developers to enlightened undergraduate students, but none of them are allowed direct access to the database system: the only way they can access their data is through ColdFusion calls. They have no line of communication to the DBAs, and even if they did, there’s no way the few DBAs we have would be able to handle their query requests in addition to their current workload, certainly not in a timely fashion anyway.

    Let’s not forget that one of the wonderful aspects of ColdFusion is that is it very easy to learn how to build basic web applications with just a little time and effort. As a result, a number of ColdFusion developers (I won’t say many, I really don’t know) start out with very little formal training, working on small projects that utilize shared back-end resources they themselves have little control over. When they need to write a new query to retrieve data, they probably refer to whatever ColdFusion book they have at hand for guidance, and while that book might mention stored procedures, it’s not necessarily going to give them the details they need to write such a procedure in their DBMS.

    I’m all for promoting good coding and development practices and encouraging people to follow such practices, but we shouldn’t characterize the developers who don’t follow those practices as being deliberately ignorant or “stupid”: they’ll just tune us out, and we’ll have missed a chance to have a positive impact on our development community as a whole.

  • Ben Forta

    Brian,

    Very valid point, there are exceptions to every rule. Check out another post I made a short while ago at http://www.forta.com/blog/index.cfm/2006/10/3/Use-..., I think we are in agreement. But having said that, I'd like to think that we can raise the bar somewhat (albeit slowly and as needed), failing to do so comes back and bites CF (and the CF developers) in the butt.

    --- Ben

    #24Posted by Ben Forta | Dec 5, 2006, 06:45 PM
  • Russ Michaels

    Brain,

    Such situations wouldn't arise if developers spent a little time learning how to correctly design and use a database first. If you are taking the time to learn CFML then why not everything else that goes with it? Really its common sense.
    You wouldn't try and build a house just by reading a book on how to lay bricks.

    Unfortunately the seeming simplicity of CFML probably does promote more poorly built apps than other languages.

  • Chris Velevitch

    Ben,

    Does this mean Scorpio now returns a query object for insert, update and delete and I can test the number (RecordCount) of rows that were actually inserted, updated and deleted? Or do I still need to write extra code (database or otherwise) to determine that number?

    #26Posted by Chris Velevitch | Dec 8, 2006, 01:28 AM
  • since1968

    Brian, you make a good point that it's not always possible to adopt best practices when accessing data. I've also had clients who only allow access to data via CF. One way to mitigate this problem is to make sure all of your data access is in CFCs--i.e., your CFM pages really shouldn't care where they're getting their data. This doesn't speed up page performance, but it DOES mean that when you eventually get access to the DBMS itself you can rewrite your CFCs without having to rework your presentation pages.

    Also, if many people @ your workplace are having the same problems with limited access to data, you should get together and petition for your DBA to create "views." SQL Server and Oracle both provide views for your exact use/case: your DBA can define a view and say precisely what it contains and who has permission to use it. As long as your DBA is willing to manually index the views your performance will be almost (but not quite) as good as using stored procedures.

    Anyway, great post Ben.

    #27Posted by since1968 | Dec 8, 2006, 10:05 AM
  • Dave

    I am trying to use Query of Queries to get publication stats for authors. The problem is that the output is needed in grid format from fist year to highest year. My output data only has values for years with data. I dont know how to get the SQL to give the same results. The QofQ works, but yes-- you can see how slow it is.
    Please see this thread and comment if you have any better ideas.

    http://www.adobe.com/cfusion/webforums/forum/messa...

    Thanks, Dave :+)

    #28Posted by Dave | Dec 8, 2006, 10:21 AM
  • Peter Bell

    Hi Dave,

    Just one general comment. I am just good enough to be dangerous with SQL, but whenever I have a question I can't answer I go to Experts Exchange. For the price of a cup of diner coffee I get $90,000 a year SQL experts climbing all over themselves to help me out. They won't get all the column names right and test the code for you, but I can't remember the last time I didn't get what I needed to figure out a given SQL problem in under an hour.

  • Russ Michaels

    you can help for FREE on cf-talk

  • Ben Forta

    Chris, actually, CF 7.0.2 will return the number of rows affected by INSERT UPDATE or DELETE in the RESULT structure.

    --- Ben

    #31Posted by Ben Forta | Dec 8, 2006, 02:01 PM
  • Kevin Sargent

    "One of the things I used to do (years ago) is select * from table just to get a query.recordcount... should be select 'x' from table which doesn't return the ENTIRE table... perhaps a "performance hints and tips" chapter could be a nice addition to a new CFWACK?"

    Shouldn't that be select count(x) ? ? you're still returning many many rows with select x, just not as many columns.

    #32Posted by Kevin Sargent | Dec 15, 2006, 07:33 PM
  • Joe

    Also remember that turning on Unicode in the CF Administrator while using SQL Server and then having fields in your database that are varchar instead of nvarchar, etc. causes the ColdFusion/SQL Server connector to do a conversion on the field and makes it terribly slow. Simply calling a stored procedure eliminates this issue. Or you can change your field type to the Unicode equivalent.

    #33Posted by Joe | Feb 13, 2007, 03:37 PM
  • Rodney Johnson

    How is it that people who know nothing about databases are writing database applications? I mean 'programmers' that are writing procedural code in CF because they don't know a simple inner JOIN?!?!? When I first started doing CF work in '96 (ver. 1.5) SQL was new to me and one of the first things I did was call Jeremy Allaire and ask what SQL book he suggested! Ever since I always noticed that the database execution was the chief bottleneck to all DB applications and taught myself how to tune queries and write indexes and often ended up being the defacto-DBA for many of the projects I worked on! Come on people, if you are using CF then you are probably making a DATABASE application and you need to learn how relational DATABASES work before you write one line of procedural (or fake-OOP, i.e. procedural code that looks like OOP) code. Stop using features just because they are there! Usually new features are afterthoughts (and thus should be viewed as 'Last Resorts') to be used by experienced programmers to code around bad initial design (i.e. they'd never INITIALLY use these features). Learn the difference! I just can't believe how much such an elegant mix of SQL and server-side include scripting within web-pages could be so FUBAR'd by a multitude of newbs that are essentially so scared of Databases that they don't bother themselves with learning how to actually use them.

    P.S. Ben, I don't know how you do it, I really don't... the few hairs on your head you have left must be terribly grey! I know I'd have pulled out all of mine years ago if I was you!!!

  • passager

    coldfusion is not a DBMS,but I think it should have some sqls function.Because now some company need more and more data management.coldfusion will and should be strong.

    #35Posted by passager | Sep 25, 2007, 11:54 PM