A ColdFusion user asked me for a way to programmatically determine if a URL exists, so I threw together this UDF. It uses
Updated as per Steve’s suggestion and Gus’ important feedback.
May I "steal" this for cflib?
You might want to use CFHTTP method="head" instead of the default method="get". It’s faster because it retrieves only the headers, not the full page content, although it does depend on the web server being configured to accept the HEAD method.
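A minimal sketch of what that suggestion could look like, assuming the target server answers HEAD requests. The function name URLExistsHead is hypothetical and used only for illustration; the missing-status-code guard is included so the sketch does not error on hosts that fail to resolve.

```cfml
<!--- Sketch only: URLExistsHead is a hypothetical name, not part of the original UDF --->
<cffunction name="URLExistsHead" output="no" returntype="boolean">
	<!--- Accepts a URL --->
	<cfargument name="u" type="string" required="yes">
	<!--- Initialize result --->
	<cfset var result=true>
	<!--- HEAD asks the server for headers only, so no page body is transferred --->
	<cfhttp url="#ARGUMENTS.u#" method="head" resolveurl="no" throwonerror="no" />
	<!--- No status code (e.g. the host did not resolve) or a 404 both count as "does not exist" --->
	<cfif NOT isDefined("cfhttp.responseheader.status_code")>
		<cfset result=false>
	<cfelseif cfhttp.responseheader.status_code EQ "404">
		<cfset result=false>
	</cfif>
	<cfreturn result>
</cffunction>
```

If a server rejects HEAD (for example with a 405), this sketch still reports the URL as existing, because only a 404 or a missing status code is treated as failure.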
Ben,
This function is not quite correct. You need to first check that a status code is actually returned. If you run your code against a domain that doesn’t resolve, you won’t get a status code to check, and the function will throw an error. Try it with http://www.benfrta.com
The corrected code should be:
<!--- Does a URL exist? Checks for 404 status code. --->
<cffunction name="URLExists" output="no" returntype="boolean">
	<!--- Accepts a URL --->
	<cfargument name="u" type="string" required="yes">
	<!--- Initialize result --->
	<cfset var result=true>
	<!--- Attempt to retrieve the URL --->
	<cfhttp url="#ARGUMENTS.u#" resolveurl="no" throwonerror="no" />
	<!--- Check that a status code is returned --->
	<cfif isDefined('cfhttp.responseheader.status_code')>
		<cfif cfhttp.responseheader.status_code EQ "404">
			<!--- If 404, return FALSE --->
			<cfset result=false>
		</cfif>
	<cfelse>
		<!--- No status code returned --->
		<cfset result=false>
	</cfif>
	<cfreturn result>
</cffunction>
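For anyone trying this out, here is a short usage sketch, assuming the URLExists function above is defined or included on the same page. The variable name testURL is an illustrative assumption.

```cfml
<!--- Usage sketch: testURL is a hypothetical variable name --->
<cfset testURL = "http://www.benfrta.com">
<cfif URLExists(testURL)>
	<cfoutput>#testURL# returned a status code other than 404.</cfoutput>
<cfelse>
	<cfoutput>#testURL# returned a 404 or no status code at all.</cfoutput>
</cfif>
```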
Steve, Gus, good points, updated.
Ray, sure, go for it.
Posted.
May I borrow this code for a short time for use in Geonosis(tm) ? I promise to return it just as soon as it’s been enhanced, optimized or otherwise improved. Thx.
I realize this post is a few years old, but I still found it and tried to use it. I wanted to make note of a couple of issues I found for future users.
I think the ‘NOT IsDefined("cfhttp.responseheader.statuscode") ‘ should be in parentheses. Maybe I am just a paranoid programmer (I haven’t actually tested it), but I think that having the ‘NOT’ at the beginning of the statement without putting it into parentheses would make the NOT work against the whole statement. For instance, if the status_code actually returns 404, the ‘NOT’ would make the whole statement false and the URL would be deemed valid, even though it is not.
…We’ll see if this makes it past the spam filter… Hopefully my second half will as well…
…Second half…
I ran into another issue on my server. Its DNS provider still sends a 200 status code if the URL does not exist. Maybe it is a bug in their code, but I added a check for it. Other DNS providers may have similar issues, so check for a known bad URL first and adapt as necessary. I checked cfhttp.responseheader.server for an "OpenDNS Guide" value.
For some reason my comment is not showing up, but the second issue was with OpenDNS not returning 404.
<cfif (NOT IsDefined("cfhttp.responseheader.status_code")) OR cfhttp.responseheader.status_code EQ "404" OR (IsDefined("cfhttp.responseheader.server") AND cfhttp.responseheader.server EQ "OpenDNS Guide")>
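To show where that condition would sit, here is a sketch of the whole function with the check folded in. The name URLExistsStrict is hypothetical, and the "OpenDNS Guide" server header is taken from the comment above as one provider's behavior; other resolvers may need a different test.

```cfml
<!--- Sketch only: URLExistsStrict is a hypothetical name, not from the original post --->
<cffunction name="URLExistsStrict" output="no" returntype="boolean">
	<!--- Accepts a URL --->
	<cfargument name="u" type="string" required="yes">
	<!--- Initialize result --->
	<cfset var result=true>
	<!--- Attempt to retrieve the URL --->
	<cfhttp url="#ARGUMENTS.u#" resolveurl="no" throwonerror="no" />
	<!--- Fail on: no status code, a 404, or a DNS provider's placeholder page --->
	<cfif (NOT IsDefined("cfhttp.responseheader.status_code"))
		OR cfhttp.responseheader.status_code EQ "404"
		OR (IsDefined("cfhttp.responseheader.server") AND cfhttp.responseheader.server EQ "OpenDNS Guide")>
		<cfset result=false>
	</cfif>
	<cfreturn result>
</cffunction>
```

The order of the tests matters: the IsDefined check comes first so that the status_code comparison is only reached when the key exists.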