The Errors of Error Pages
Over the past few months, the number of sites infected with malscripts has increased dramatically. Many of these injection infections are difficult to track. Unbeknownst to many site operators, “error pages” can actually complicate the detection process. This blog posting discusses what we call “The Errors of Error Pages”.
Frequently, if you mistype a word in a URL, the “Page Not Found” error page is displayed. The very plain, non-descriptive message is not terribly user friendly in that it gives minimal information. The error code produced by a “Page Not Found” is a 404.
If you request a non-existent page on a Microsoft IIS webserver, you might see the server’s plain default “Page Not Found” page.
Much has been written about preventing the typical “Page Not Found” error page from scaring away potential buyers. However, most of these marketing articles omit the critical discussion of how cybercriminals use these error pages to distribute their malware. This posting focuses on that topic.
The General Problem
When Google flags a site with its warning, “This site may harm your computer”, the site owner’s or host’s first response is usually to scan the website with anti-virus programs, which rarely find the malscripts. Since Google prevents the site from appearing as a normal search result while the warning is in place, the owner wants to find the injection infection quickly. Once it is found and removed, the site can ask Google to lift the warning. We’ve handled many cases where everyone from the hosting provider, to friends, to the web developer, has checked “every file” and found nothing malicious on the site in question. However, they routinely fail to investigate the error pages, and cybercriminals know this. Often, the error page is the source of the problem.
To understand the criminal mind, one must first understand the various response codes generated by different requests. For example, when one uses their browser to request http://www.wewatchyourwebsite.com, the page actually exists. Therefore, the response code the browser receives is a 200. These codes don’t appear on the screen, but the browser sees them.
On the other hand, if one types in http://www.wewatchyourwebsite.com/fredflintstone.php, the server would return a 404 (Page Not Found) response code to the browser because there is no page with that name on the site.
To avoid a user receiving a 404 response, and the resulting “ugly” Page Not Found page, a website can be configured to generate a different response for those requests which would typically result in a 404 response code. Instead of a 404 response, you would see a page that’s been created to replace the “Page Not Found” response, or some substitute page that informs the visitor that the page they’ve requested has either moved or does not exist.
Use of Security Tools
In our work, we’ve tested various tools, vulnerability scanners, exploit engines, etc. that look for a vulnerable script file or software exploit. We found that if such a tool sends a request to a website and gets back a response of any kind, the tool often considers the exploit successful. So if the website being tested is set up to return a custom error page rather than the basic “Page Not Found” page, the security tool will record the attempted exploit as successful, producing a false positive.
For example, a security tool may be used to check for a vulnerable version of some shopping cart software. If the website being checked is set up to return a customized 404 error page, the security tool will see that the site generated a webpage in response to its request for the vulnerable shopping cart URL. Having detected a webpage in response to its check, the tool will assume that the site must have the vulnerable version of the shopping cart software, which is potentially a false positive.
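One way a tool could avoid this kind of false positive is to first request a page that certainly does not exist, and then compare each later probe response against that baseline. The following is only a minimal sketch in Python. The function name, the sample pages and the 0.9 similarity threshold are illustrative assumptions, not taken from any particular security tool.

```python
from difflib import SequenceMatcher


def looks_like_error_page(probe_status, probe_body, baseline_body, threshold=0.9):
    """Return True if a probe response is probably just the site's error page.

    baseline_body is the body the site returned for a URL known not to exist.
    threshold is an arbitrary illustrative cut-off, not a tuned value.
    """
    # A real 404 status means the probed path is not there.
    if probe_status == 404:
        return True
    # Otherwise compare the probe body with the known "not found" body.
    similarity = SequenceMatcher(None, probe_body, baseline_body).ratio()
    return similarity >= threshold


# Hypothetical sample responses, for illustration only.
baseline = "<html><body><h1>Sorry, we could not find that page.</h1></body></html>"
cart_page = "<html><body><form action='/cart/checkout'>Your cart</form></body></html>"

# Custom error page served in place of the probed URL.
print(looks_like_error_page(200, baseline, baseline))
# A genuinely different page.
print(looks_like_error_page(200, cart_page, baseline))
```

A probe that merely receives the custom error page is then treated as “not found” rather than as a successful exploit.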
Since hackers know that false positives arise under these circumstances, when they infect a website, they inject their infectious code into the default error pages. As cybercriminals also know, frequently, these pages are neglected by those working to detect infections on websites.
Clues to Find and Methods for Searching
Knowing all of this, during a search for infections, we always check for fredflintstone.php. (When we start seeing websites with a webpage with this name, we might switch to betty.php, wilma.html, barney.cfm or dino.asp.) By checking for pages that we know don’t exist, we can be confident that we have covered this obvious point of infection and detected any cybercriminal activity hiding there.
Further, many shared hosting services use a folder off of the root folder named something like “error_docs”. Often, the hosting provider will fill that folder with basic webpages that a site uses as responses when visitors request webpages that they aren’t allowed to see or that simply don’t exist. Sometimes these files will be named with the response code, e.g. for a “Page Not Found” error the resulting webpage might be called 404.html. Other times, the webpage will be named after the error that produces it, like “page_not_found.html” for a 404 response code.
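A quick first pass over such a folder can be scripted. The sketch below, in Python, flags a few kinds of markup that rarely belong in a plain error page. The patterns, the function names and the sample HTML are assumptions made for illustration. They are nowhere near a complete set of malscript signatures, and a clean result does not prove a file is safe.

```python
import re
from pathlib import Path

# Illustrative patterns only; real malscripts are often obfuscated in other ways.
SUSPICIOUS_PATTERNS = {
    "iframe": re.compile(r"<iframe\b", re.IGNORECASE),
    "external script": re.compile(r'''<script[^>]*\bsrc\s*=\s*["']?(?:https?:)?//''', re.IGNORECASE),
    "eval/unescape": re.compile(r"\b(?:eval|unescape)\s*\(", re.IGNORECASE),
}


def find_suspicious(html):
    """Return the sorted names of the patterns that match the given HTML text."""
    return sorted(name for name, rx in SUSPICIOUS_PATTERNS.items() if rx.search(html))


def scan_folder(folder):
    """Map each file under folder (e.g. error_docs) to the patterns it matched."""
    results = {}
    for path in Path(folder).rglob("*"):
        if path.is_file():
            hits = find_suspicious(path.read_text(errors="replace"))
            if hits:
                results[str(path)] = hits
    return results


# Hypothetical page contents, for illustration only.
clean = "<html><body><h1>404 Page Not Found</h1></body></html>"
infected = clean.replace(
    "</body>",
    "<iframe src='http://malicious.example/x' width=0 height=0></iframe></body>",
)

print(find_suspicious(clean))
print(find_suspicious(infected))
```

Anything the scan reports still needs a human look, since legitimate error pages sometimes load scripts too.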
Every host or site owner should determine how their site handles these different responses and check those files for any malscripts. At the end of this article, we suggest a valuable tool to conduct such checks.
In the course of our work, we recently discovered a rather ingenious way of delivering malscripts through the use of 404 error pages. Apache Web server software can be configured to respond in different ways to a request for a webpage that doesn’t exist.
One basic response is set in the configuration file, httpd.conf, and it would look like this:
- ErrorDocument 404 /404.html
If you’re on a shared hosting plan (you’ll know if you’re not), you probably (hopefully) don’t have access to this file. But you will have access to .htaccess (yes there is a period in front of that file name). This file might also have the same entry for ErrorDocument listed in there.
How do hackers use this to infect visitors to one of their distributional assets?
One of two ways.
First, they can see what file is used for the 404 (or other such response codes) and inject their malscript into that page. This can be found during a scan of the files residing on the webserver.
Or, they can instead insert their own malicious URL replacing the /404.html in the line ErrorDocument…
Instead of this: ErrorDocument 404 /404.html
They would put: ErrorDocument 404 http://hackerswebsiteinsertedhere
That way when someone scans all the files with a search tool, it won’t find the malscript because the malscript isn’t in any of the files located on that server. It’s located on a server miles away.
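Because a file scan for malscripts misses this trick, it helps to inspect the ErrorDocument lines themselves. Here is a minimal sketch in Python that lists every ErrorDocument entry whose target is a full URL rather than a local path. The function name and the sample file contents are hypothetical. A full URL is not always malicious, but each one deserves a look.

```python
import re

# Matches Apache "ErrorDocument <code> <target>" lines; commented-out lines are skipped.
ERROR_DOCUMENT = re.compile(
    r"^[ \t]*ErrorDocument[ \t]+(\d{3})[ \t]+(.+?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def external_error_documents(config_text):
    """Return (status_code, target) pairs whose target is a full http(s) URL."""
    hits = []
    for code, target in ERROR_DOCUMENT.findall(config_text):
        target = target.strip()
        if re.match(r"https?://", target, re.IGNORECASE):
            hits.append((int(code), target))
    return hits


# Hypothetical .htaccess contents, for illustration only.
sample = """
# custom error pages
ErrorDocument 403 /error_docs/forbidden.html
ErrorDocument 404 http://hackerswebsiteinsertedhere
"""

print(external_error_documents(sample))
```

The same function can be run against the text of httpd.conf or of any .htaccess file on the site.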
This is why it’s always important to know how a site is handling 404’s and other errors. The specific method used by the hosting provider must be checked. Any suspicious-looking entry should be checked and verified.
As hackers become more sophisticated, website owners and developers must as well. Therefore, while the hackers increase their attempts to infect websites, so too, must we all increase our efforts to detect and to block them.
How can you check your site?
I recommend a tool I learned about from Kaleh (a moderator on www.badwarebusters.org and a frequent contributor on Google’s Webmaster forum). The tool is a website: http://web-sniffer.net. Simply enter a URL in the box at the top, add “/fredflintstone.php” (no quotes) to the end of it, and hit “Submit”.
Scroll down to the bottom of the screen to see what HTML/code the site sends to a visitor’s browser when they request a page that doesn’t exist (404 error).
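If you prefer to run the same check from your own machine, it can be approximated with a few lines of Python. This is only a sketch using the standard library. The function names and the User-Agent value are assumptions of this example, and it does not reproduce everything the website above shows.

```python
import urllib.error
import urllib.request


def probe_url(site_url, name="fredflintstone.php"):
    """Build a URL for a page that should not exist on the site."""
    return site_url.rstrip("/") + "/" + name


def fetch_error_page(site_url, name="fredflintstone.php", timeout=10):
    """Return (status_code, final_url, body) for a request to a non-existent page."""
    request = urllib.request.Request(
        probe_url(site_url, name),
        headers={"User-Agent": "Mozilla/5.0"},  # illustrative value
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.geturl(), response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx, but the error object still carries the page body.
        return error.code, error.geturl(), error.read().decode("utf-8", "replace")


print(probe_url("http://www.wewatchyourwebsite.com/"))
```

Calling fetch_error_page("http://www.wewatchyourwebsite.com") returns the status code, the final URL and the HTML that a visitor would receive. If the final URL is on a different site than the one you requested, the error handling is redirecting visitors elsewhere and should be investigated.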
If you see something that looks out of place, you should suspect that code, research it and possibly remove it. If you ever have any doubts, please contact me and I’ll review it for you. We have deobfuscation tools available and can usually determine what a piece of obfuscated script is really doing.
Should you have any questions or wish to continue this discussion, please post your comments below or contact me directly at firstname.lastname@example.org