
May 11, 2007
Scorpio File I/O Enhancements

At one of the user group sessions this week someone asked if there was an easy way to get file information (size, date and time, etc.) using a function. I said they should use <CFDIRECTORY>, but afterwards remembered that we did indeed add a new function to Scorpio, GetFileInfo(), which returns a structure containing:

  • canread
  • canwrite
  • ishidden
  • lastmodified
  • name
  • parent
  • path
  • size
  • type
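As a rough sketch of how that structure might be used (the fileName variable is an assumption here and is taken to hold the absolute path of an existing file; the key names are the ones listed above):

```cfml
<!--- Assumes fileName holds the absolute path to an existing file --->
<cfset info = GetFileInfo(fileName)>

<cfoutput>
    #info.name# is #info.size# bytes,
    last modified #DateFormat(info.lastModified, "mm/dd/yyyy")#
    at #TimeFormat(info.lastModified, "HH:mm:ss")#.
    <cfif NOT info.canWrite>
        (read-only)
    </cfif>
</cfoutput>
```

Because the result is an ordinary structure, it can also be passed straight to <CFDUMP> to see every key at once.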

All of which makes this a good opportunity to review some of the file I/O changes coming in Scorpio.

For starters, if you have ever had to work with large text files in ColdFusion (maybe parsing a large CSV file) you'll know that doing so is very inefficient. You probably use code like this:

<!--- Read entire file --->
<cffile action="read"
    file="#fileName#"
    variable="myFile">

<!--- Loop through file variable one line at a time --->
<cfloop list="#myFile#"
    index="line"
    delimiters="#chr(10)##chr(13)#">

    <!--- Do stuff with line here --->
    ...
</cfloop>

This is slow for two reasons. First, ColdFusion reads the entire file into a variable in memory all at once. Second, looping through that variable means treating it as a list, and all of that list parsing is resource intensive too.

Well, inefficient no more. In ColdFusion Scorpio you'll be able to replace the above code block with this:

<!--- Loop through file one line at a time --->
<cfloop file="#fileName#"
    index="line">

    <!--- Do stuff with line here --->
    ...
</cfloop>

This code block opens the file, reads one line at a time, and closes it when done. I actually used this myself recently in a ColdFusion code snippet that had to parse a massive tab-delimited file, turning each line into a query row. Replacing the old <CFFILE> and <CFLOOP LIST=> combination with the new <CFLOOP FILE=> cut processing time from several minutes to under 10 seconds.
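That kind of line-to-query-row parsing might look something like the sketch below. This is not the original snippet: the query name, the three column names, and the file layout are all made up for illustration, and it assumes every line has three non-empty tab-separated fields (ColdFusion list functions skip empty elements, so real data with blank fields would need more care).

```cfml
<!--- Sketch only: query name, column names and file layout are assumed --->
<cfset people = QueryNew("id,name,city")>

<cfloop file="#fileName#" index="line">
    <!--- Skip blank lines; split the rest on the tab character --->
    <cfif Len(Trim(line))>
        <cfset QueryAddRow(people)>
        <!--- With no row argument, QuerySetCell writes to the last row --->
        <cfset QuerySetCell(people, "id", ListGetAt(line, 1, Chr(9)))>
        <cfset QuerySetCell(people, "name", ListGetAt(line, 2, Chr(9)))>
        <cfset QuerySetCell(people, "city", ListGetAt(line, 3, Chr(9)))>
    </cfif>
</cfloop>
```

Only the current line and the growing query are held in memory, which is where the savings over reading the whole file into a variable come from.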

Oh, and although reading files line by line is the more common use case, you can also read n characters at a time, like this:

<!--- Loop through file 100 characters at a time --->
<cfloop file="#fileName#"
    index="chars"
    characters="100">

    <!--- Do stuff with chars here --->
    ...
</cfloop>

In addition to the <CFLOOP> enhancements, we've also added lots of new file i/o functions that you can use to access and manipulate files directly. The new functions include:

  • FileClose()
  • FileCopy()
  • FileDelete()
  • FileIsEOF()
  • FileMove()
  • FileOpen()
  • FileRead()
  • FileReadBinary()
  • FileReadLine()
  • FileSetAccessMode()
  • FileSetAttribute()
  • FileSetLastModified()
  • FileWrite()
  • GetFileInfo()
  • IsImageFile()
  • IsPDFFile()
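To give a feel for how a few of these fit together, here is a minimal sketch that reads a file line by line from inside <CFSCRIPT>. The function names come from the list above, but the arguments shown (a path and a "read" mode for FileOpen(), and the file object for the other three) are assumptions, as is the fileName variable.

```cfml
<cfscript>
    // Assumes fileName holds the absolute path to an existing text file
    lineCount = 0;
    fileObj = FileOpen(fileName, "read");

    // Read one line at a time until the end of the file
    while (NOT FileIsEOF(fileObj)) {
        line = FileReadLine(fileObj);
        lineCount = lineCount + 1;
        // Do stuff with line here
    }

    // Close the file object when done
    FileClose(fileObj);
</cfscript>
```

This does roughly what the <CFLOOP FILE=> example does, but leaves opening and closing the file up to you, which may be handy when the reading logic needs to live in script.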

Comments (11)



  • Rob Wilkerson

    Am I safe in assuming that the file loop releases each line from memory when its processing completes? Otherwise the memory issues will persist, but you didn't mention it explicitly so I thought I'd verify.

  • Ben Forta

    Rob, yes, it reads a line at a time, it does not keep previously read lines in memory, and so it does indeed solve that memory issue.

    --- Ben

    #2 Posted by Ben Forta | May 11, 2007, 01:16 PM
  • Rick Root

    Ben, this is awesome. I especially like being able to do file/directory operations in cfscript without having a bunch of UDFs!

    #3 Posted by Rick Root | May 11, 2007, 01:19 PM
  • tony of the weeg clan

    you have just made my day! thanks for the info ben.

  • Damien Jorgensen

    I can't wait to see the performance increase when dealing with CSV style files

  • Stephen Cassady

    Arrgh. So just release the @#$%@$#%^ thing already! :-)
    Seriously, looks like we're getting some very nice robust enhancements.
    Some which I can use like now! :-0

    Anyways, I do look forward to the release and your teaser points are nice to read. It's comforting to know that there are advantages to an owned and fiscally supported product - ie: features, and that Adobe's first full release under their own name (I think the current version has one of Adobe's longest official product names, something like "Macromedia ColdFusion MX 7 Standard Edition By Adobe", or something like that) should be a substantial re-launch for the product.

  • James Holmes

    In the words of Dr Who, "Fantastic!"

  • Marko

    Oh boy, this is going to solve so many problems I've had in the past using cffile/cfloop.
    Fantastic new feature.

    #8 Posted by Marko | May 13, 2007, 09:38 AM
  • Shuns

    I am wondering though if there are any functions for directory manipulation to complement the file I/O functions? That would be great also.

    #9 Posted by Shuns | May 13, 2007, 09:45 PM
  • Abhi

    Hello Ben
    This is just great work. Thanks for all you did.

    #10 Posted by Abhi | May 16, 2007, 05:44 PM
  • Downs

    This is an old thread but I am parsing a large CSV file with data that might contain commas in each quoted string. For instance, one record might look like this: "454454545","Last Name, First Name","City, State" It's the comma INSIDE the quotes that is throwing everything off. How can I allow for this?

    #11 Posted by Downs | Mar 10, 2008, 06:35 PM