
The Errors of Error Pages

Over the past few months, the number of sites infected with malscripts has increased dramatically. Many of these injection infections are difficult to track. Unbeknownst to many site operators, “error pages” can actually complicate the detection process. This blog posting discusses what we call “The Errors of Error Pages”.

Frequently, if you mistype a word in a URL, the “Page Not Found” error page is displayed. The very plain, non-descriptive message is not terribly user friendly in that it gives minimal information. The error code produced by a “Page Not Found” is a 404.

If you request a non-existent page on a Microsoft IIS webserver you might see something like this:

[Image: default Microsoft IIS 404 error page]

Much has been written about preventing the typical “Page Not Found” error page from scaring away potential buyers. However, most of these marketing articles omit the critical discussion of how cybercriminals use these error pages to distribute their malware. This posting focuses on that topic.

The General Problem

When Google flags a site with its warning, “This site may harm your computer”, the site owner’s or host’s first response is usually to scan the website with anti-virus programs, which rarely finds the malscripts. Because Google stops listing the site as a normal search result while the warning is in place, the owner wants to find the injection infection quickly, remove it, and then ask Google to let the site reappear. We’ve handled many cases where everyone from the hosting provider, to friends, to the web developer, has checked “every file” and found nothing malicious on the site in question. Often, the error page is the source of the problem. Yet these same people routinely fail to investigate the error pages, and cybercriminals know this.

Relevant Codes

To understand the criminal mind, one must first understand the various response codes generated by different requests. For example, when one uses their browser to request http://www.wewatchyourwebsite.com, the page actually exists. Therefore, the response code the browser receives is a 200. These codes don’t appear on the screen, but the browser sees them.

On the other hand, if one types in http://www.wewatchyourwebsite.com/fredflintstone.php, the server would return a 404 (Not Found) response code to the browser because there is no page with that name on the site.

To avoid a user receiving a 404 response, and the resulting “ugly” Page Not Found page, a website can be configured to generate a different response for those requests which would typically result in a 404 response code. Instead of the default “Page Not Found” page, you would see a page that’s been created to replace it, or some substitute page that informs the visitor that the page they’ve requested has either moved or does not exist.
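Since the browser never shows these codes, it can help to look at them directly. The following is a minimal sketch using only Python's standard library; the function names `describe_status` and `fetch_status` are ours, chosen for illustration.

```python
from http import HTTPStatus
import urllib.error
import urllib.request


def describe_status(code: int) -> str:
    """Return the numeric code together with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Request a URL and return the status code the server sent back.

    Note: urlopen follows redirects, so this is the code of the final response.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still available.
        return err.code
```

Calling `describe_status(fetch_status(url))` for a page that exists and then for a made-up page such as /fredflintstone.php shows which code each request really produces, whatever the page looks like on screen.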

Use of Security Tools

In our work, we’ve tested various tools (vulnerability scanners, exploit engines, etc.) that look for a vulnerable script file or software exploit. We found that if a tool sends a request to a website and gets back a response of any kind, the tool often considers the exploit successful. So if the website being tested is set up to return a custom error page rather than the basic “Page Not Found” page, the security tool will record the attempted exploit as successful, producing a false positive.

For example, a security tool may be used to check for a vulnerable version of some shopping cart software. If the website being checked is set up to return a customized 404 error page, the security tool will see that the site returned a webpage in response to its request for the vulnerable shopping cart URL. If the tool detects a webpage in response to its check, the tool will assume that the site must have the vulnerable version of the shopping cart software, which is potentially a false positive.
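One way a tool could avoid this trap is to compare each probe against a baseline: the response the site gives for a URL that is known not to exist. The sketch below is our own illustration of that idea, not how any particular scanner works; `is_real_hit` and its parameters are hypothetical names.

```python
def is_real_hit(status: int, body: str,
                baseline_status: int, baseline_body: str) -> bool:
    """Decide whether a probe response suggests the probed file really exists.

    baseline_status and baseline_body come from requesting a URL on the
    same site that is known not to exist.
    """
    if status != 200:
        # The server reported an error code, so the file is not there,
        # however friendly the returned page looks.
        return False
    if baseline_status == 200 and body.strip() == baseline_body.strip():
        # The site answers missing pages with 200 and this same page:
        # the probe got the custom error page, not the vulnerable file.
        return False
    return True
```

The exact-match comparison is deliberately crude. Error pages that include the requested URL or a timestamp will differ from the baseline on every request, so a real tool would need a looser similarity check.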

Since hackers know that false positives arise under these circumstances, when they infect a website, they inject their infectious code into the default error pages. As cybercriminals also know, frequently, these pages are neglected by those working to detect infections on websites.

Clues to Find and Methods for Searching

Knowing all of this, during a search for infections, we always check for fredflintstone.php. (When we start seeing websites with a webpage with this name, we might switch to betty.php, wilma.html, barney.cfm or dino.asp.) By checking for pages that we know don’t exist, we are confident that we have covered this obvious point of infection and can detect possible cybercriminal activity there.
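If a fixed name like fredflintstone.php ever becomes too well known, a random name does the same job. This is a small sketch of that variation; the helper names `random_probe_path` and `probe_url` are made up for this example.

```python
import secrets
from urllib.parse import urljoin


def random_probe_path(extension: str = "php") -> str:
    """Build a path that almost certainly does not exist on any site."""
    return f"/{secrets.token_hex(8)}.{extension}"


def probe_url(site: str, extension: str = "php") -> str:
    """Join the random path onto the site's base URL."""
    return urljoin(site, random_probe_path(extension))
```

Requesting `probe_url("http://www.wewatchyourwebsite.com")` would then exercise the site's 404 handling with a name nobody could have planted in advance.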

Further, many shared hosting services use a folder off of the root folder named something like “error_docs”. Often, the hosting provider will fill that folder with basic webpages that a site uses as responses when visitors request webpages that they aren’t allowed to see or that simply don’t exist. Sometimes these files will be named with the response code, e.g. for a “Page Not Found” error the resulting webpage might be called 404.html. Other times, the webpage will be named after the error that produces it, like “page_not_found.html” for a 404 response code.

Every host or site owner should determine how their site handles these different responses and check those files for any malscripts. At the end of this article, we suggest a valuable tool to conduct such checks.
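For owners who have file access, a first pass over the error-page folder can be automated. The sketch below assumes the pages sit in a local folder (for example a downloaded copy of “error_docs”) and simply lists lines containing markers that often accompany injected code. The marker list is our own rough guess, not a complete signature set.

```python
import re
from pathlib import Path
from typing import Dict, List

# Markers that often show up in injected code. Illustrative only.
SUSPICIOUS = re.compile(
    r"<script|<iframe|eval\s*\(|unescape\s*\(|document\.write",
    re.IGNORECASE,
)


def suspicious_lines(text: str) -> List[str]:
    """Return the lines of a page that contain any suspicious marker."""
    return [line.strip() for line in text.splitlines() if SUSPICIOUS.search(line)]


def scan_folder(folder: str) -> Dict[str, List[str]]:
    """Map each file under the folder to its suspicious lines, if it has any."""
    findings: Dict[str, List[str]] = {}
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            hits = suspicious_lines(path.read_text(errors="replace"))
            if hits:
                findings[str(path)] = hits
    return findings
```

A hit is only a lead for manual review. Plenty of legitimate error pages contain scripts, and a default error page that suddenly has one is what deserves attention.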

More Examples

In the course of our work, we recently discovered a rather ingenious way of delivering malscripts through the use of 404 error pages. Apache web server software can be configured to respond in different ways to a request for a webpage that doesn’t exist.

One basic response is set in the configuration file, httpd.conf, and it would look like this:

  • ErrorDocument   404   /404.html

If you’re on a shared hosting plan (you’ll know if you’re not), you probably (hopefully) don’t have access to this file. But you will have access to .htaccess (yes there is a period in front of that file name). This file might also have the same entry for ErrorDocument listed in there.

How do hackers use this to infect visitors to one of their distributional assets?

One of two ways.

First, they can see what file is used for the 404 (or other such response codes) and inject their malscript into that page. This kind of injection can be found during a scan of the files residing on the webserver.

Or, they can instead insert their own malicious URL replacing the /404.html in the line ErrorDocument…

Instead of this: ErrorDocument    404    /404.html

They would put: ErrorDocument    404   http://hackerswebsiteinsertedhere

That way when someone scans all the files with a search tool, it won’t find the malscript because the malscript isn’t in any of the files located on that server. It’s located on a server miles away.
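Because this trick lives in the configuration rather than in a webpage, the configuration is what has to be checked. The sketch below reads the text of an httpd.conf or .htaccess file and reports any ErrorDocument line whose target is a full URL. The function names are ours, and the pattern only covers the simple one-line form shown above.

```python
import re
from typing import List, Tuple

# Matches lines such as: ErrorDocument 404 /404.html
ERROR_DOCUMENT = re.compile(
    r"^\s*ErrorDocument\s+(\d{3})\s+(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def error_documents(config_text: str) -> List[Tuple[int, str]]:
    """Return (status code, target) for every active ErrorDocument line."""
    return [(int(code), target) for code, target in ERROR_DOCUMENT.findall(config_text)]


def external_error_documents(config_text: str) -> List[Tuple[int, str]]:
    """Return only the entries whose target is a full http(s) URL."""
    return [
        (code, target)
        for code, target in error_documents(config_text)
        if re.match(r"https?://", target, re.IGNORECASE)
    ]
```

An external target is not automatically malicious, but if the site owner did not put it there, it deserves a very close look. Lines commented out with # are ignored.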

This is why it’s always important to know how a site is handling 404’s and other errors. The specific method used by the hosting provider must be checked. Anything suspicious looking should be checked and verified.

As hackers become more sophisticated, website owners and developers must as well. Therefore, while the hackers increase their attempts to infect websites, so too, must we all increase our efforts to detect and to block them.

How can you check your site?

I recommend a tool I learned about from Kaleh (a moderator on www.badwarebusters.org and a frequent contributor on Google’s Webmaster forum). The tool is a website: http://web-sniffer.net. Simply enter a URL in the box at the top, add “/fredflintstone.php” (no quotes) to the end of it, and hit “Submit”.

Scroll down to the bottom of the screen to see what HTML/code the site sends to a visitor’s browser when they request a page that doesn’t exist (404 error).
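Once you have the HTML of the error page in hand, one quick check is to list every script or iframe that loads from a different host than the site itself. The sketch below does that with Python's built-in HTML parser. `external_sources` is a hypothetical helper name, and `site_host` is assumed to be given in lowercase.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlparse


class _SourceCollector(HTMLParser):
    """Collect the src attribute of every script and iframe tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "iframe"):
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def external_sources(html: str, site_host: str) -> List[str]:
    """Return script/iframe sources that point at a host other than site_host."""
    collector = _SourceCollector()
    collector.feed(html)
    return [
        src for src in collector.sources
        if urlparse(src).hostname not in (None, site_host)
    ]
```

This only catches references written out in plain HTML. An obfuscated inline script that builds its URL at run time will not appear here, which is where deobfuscation comes in.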

If you see something that looks out of place, you should suspect that code, research it and possibly remove it. If you ever have any doubts, please contact me and I’ll review it for you. We have deobfuscation tools available and can usually determine what a piece of obfuscated script is really doing.

Should you have any questions or wish to continue this discussion, please post your comments below or contact me directly at traef@wewatchyourwebsite.com

Thank you.


New Website Infection Method

Working with a website owner recently, we came across a new method of delivering infectious code (drive-by downloads) – at least it’s a method we’ve never seen before.

The scenario: The website owner gets the email from Google telling them their website is serving up malscripts to visitors, and Google adds “This website can harm your computer” to all their SERPs. The website owner can’t find the malscript anywhere.

We scan their site and find nothing. Our scanning spiders their site, all links and even spiders the sites they link to.

Someone from another vendor says they found malware on a webpage that we didn’t even see. I start screaming “Why didn’t we find this page?” We try to manually download the page and we get a 404 error – page not found.

Turns out, the page didn’t even exist. We try to access the non-existent webpage with a sandboxed browser (sandboxed means it’s a system that can’t be infected due to all the security measures we’ve taken. It also records any attempted file changes, registry changes, etc.).

Bam! We see in the 404 error page that there’s some redirect code in there trying to access martuz.cn. Interesting.

We look at the address bar in our browser and see that it didn’t redirect to a custom 404 error page, it still shows the URL we typed in with the john_doe.html page at the end. We know from our scan that this client is running their website on an Apache 2.0 server.

Our research showed that in the Apache installation folder under a sub-folder of “error”, the HTTP_NOT_FOUND file had been modified and the malscript added.

Which raises the question: why would a cybercriminal go through all that trouble to only deliver the martuz.cn malscript to people who type in a non-existent webpage?

Not sure on that one.

We also found these files had been added to the default directory on the webserver:

  • bad_gateway.html
  • bad_request.html
  • forbidden.html
  • internal_server_error.html
  • method_not_allowed.html
  • not_acceptable.html
  • not_found.html
  • not_implemented.html
  • precondition_failed.html
  • proxy_authentication_required.html
  • request-uri_too_long.html
  • unauthorized.html
  • unsupported_media_type.html

Each of these pages looked like the default Apache error pages but with the martuz.cn malscript inserted between the closing HEAD tag and the opening BODY tag.
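In a normal page nothing but whitespace sits between the closing HEAD tag and the opening BODY tag, so anything found there stands out. The following is a minimal sketch of a check for that spot; the function name is ours, and the regular expression is a rough approximation rather than a full HTML parser.

```python
import re

# Capture whatever sits between </head> and the opening <body ...> tag.
BETWEEN_HEAD_AND_BODY = re.compile(
    r"</head\s*>(.*?)<body\b",
    re.IGNORECASE | re.DOTALL,
)


def content_between_head_and_body(html: str) -> str:
    """Return any non-whitespace content between </head> and <body>, else ''."""
    match = BETWEEN_HEAD_AND_BODY.search(html)
    return match.group(1).strip() if match else ""
```

An empty result means that particular spot is clean. It says nothing about the rest of the page.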

We found that Apache uses one of 4 options when handling error responses:

  1. output a simple hardcoded error message
  2. output a customized message
  3. redirect to a local URL-path to handle the problem/error
  4. redirect to an external URL to handle the problem/error

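It didn’t appear to be redirecting, as the URL in the address bar was still what we had entered. That ruled out option 4. Option 3 was less clear-cut, since Apache serves a local URL-path internally without changing the address bar, so it had to be checked in the configuration.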
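The address-bar check can also be done without a browser by requesting the page and refusing to follow redirects. The sketch below uses Python's `http.client` for plain HTTP. The function names and the three labels are our own invention for illustration.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlparse


def fetch_without_following(host: str, path: str,
                            timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header) without following any redirect."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def classify_error_response(status: int, location: Optional[str],
                            requested_host: str) -> str:
    """Label how the server answered a request for a missing page."""
    if 300 <= status < 400 and location:
        target_host = urlparse(location).hostname
        if target_host and target_host != requested_host.lower():
            return "external redirect"
        return "local redirect"
    return "answered in place"
```

This only shows what the client sees. Anything the server does internally stays invisible from the outside, so the configuration files still need to be read.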

At first when we saw the malscript only being delivered with 404 responses, we thought that maybe there must be some line in the httpd.conf file like:

ErrorDocument 404 /404.html

But there was no such entry in the httpd.conf file. It was definitely the default Apache error page with the martuz malscript inserted.

Further investigation confirmed this: the default error page itself had been modified.

Lesson: When trying to find all the infectious pages on your site, don’t overlook the non-existent webpages as well. In this particular case, those were the only files serving infectious code.


Website used by Federal Government Hacked!

It was discovered that GovTrip.com, a website used by federal government employees for booking travel reservations, was hacked and was serving up malicious code through redirects.

The site is currently unavailable as they perform their forensic investigation and clean up the mess.

According to reports, “sometime” before February 11th, cybercriminals compromised the site and inserted redirect code that sent visitors to a website serving up malicious code. The site is used by such government agencies as the US Environmental Protection Agency and the departments of Agriculture, Energy, Health and Human Services, Interior, Transportation and Treasury.

The website is also used to reimburse employees for travel expenses, so all sorts of information is stored there. However, it is not yet known what information was compromised during this breach. I personally don’t think the cybercriminals would have done both: insert redirect code and steal the data available. If the cybercriminals thought the data was valuable, they probably wouldn’t have risked inserting the redirect code as this could have, and did, alert others to the compromise.

The GovTrip.com website is managed by defense contractor Northrop Grumman.

The site had been blocked when the proper authorities were notified. Government agencies using the website were issuing warnings, which could only have exacerbated the situation due to human curiosity. Frequently, when you tell a large number of people not to do something, you’re going to get a large percentage of those people to do exactly what they were told not to do.

Cybercriminals know this and use it all the time.