Friday, April 27, 2012

How A Single Missing File (404) Can Saturate Your Bandwidth Limits

Many modern web applications use custom 404 pages to help reduce bounce rates. A custom 404 page can recommend alternative places to go when the file a visitor is looking for is missing. This makes perfect sense from a UX perspective. However, a small mishap, such as someone misspelling an image reference, can easily turn it into a recipe for disaster and consume ALL your bandwidth.




To summarize, the conditions for this disaster are:

1. You have a custom 404 page applied across your web application.
2. You have at least one missing reference (image, video, CSS, JS, etc.) in the custom 404 page.
3. You have sufficient normal traffic (10 req/sec or so) spread throughout the web application.

When these conditions are met, a single request to your web application could theoretically take it down. Why? Because the missing reference triggers a request for the custom 404 page, and that custom 404 page in turn requests another instance of itself, and so on. In short, one request creates an endless chain of custom 404 calls. In practice, one request won't be enough, because browsers and servers halt the chain through default request nesting limits, request throttling, and timeouts. That's where the third condition comes in: if your normal traffic volume is high enough, the repeated chains will consume all of your bandwidth. In the case I've seen, about 100 rpm (requests per minute), or ~2 req/s, was enough to consume all 50 Mbps of bandwidth.
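To get a feel for how much amplification those numbers imply, here is a rough back-of-the-envelope sketch. The 50 Mbps and 100 rpm figures come from the observation above; the 50 KB size of the custom 404 page is purely an assumption for illustration, and Mbps is taken as decimal megabits (10^6 bits per second).

```python
def bytes_per_request(bandwidth_bps: float, requests_per_minute: float) -> float:
    """Bytes of traffic each incoming request must generate to fill the link."""
    bytes_per_minute = bandwidth_bps / 8 * 60
    return bytes_per_minute / requests_per_minute


def implied_page_renders(bytes_needed: float, page_bytes: float) -> float:
    """How many times a page of the given size must be served to reach bytes_needed."""
    return bytes_needed / page_bytes


LINK_BPS = 50_000_000       # 50 Mbps, from the observation above
TRAFFIC_RPM = 100           # 100 requests per minute, from the observation above
ASSUMED_404_BYTES = 50_000  # hypothetical size of the custom 404 page (assumption)

per_request = bytes_per_request(LINK_BPS, TRAFFIC_RPM)
renders = implied_page_renders(per_request, ASSUMED_404_BYTES)
print(f"{per_request / 1e6:.2f} MB per request, ~{renders:.0f} renders of the 404 page")
# prints "3.75 MB per request, ~75 renders of the 404 page"
```

In other words, under these assumptions each ordinary request would have to fan out into dozens of custom 404 responses before the link fills up, which is the kind of multiplication the chain described above would produce.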

So how do you fix this?

1. Do not apply the same custom 404 page to page types or object types that are natural components of a page and, most importantly, to the components of the custom 404 page itself. You can have a default 404 image for images and a default 404 response for static assets like CSS and JavaScript, and even for embedded content loaded through AJAX. Reserve the custom 404 page for the "real" pages.

2. Put sound quality control checks in place to ensure that the custom 404 page's components are always present.
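Both fixes can be sketched in a few lines of Python. This is only a minimal illustration, not a drop-in implementation: the function names, the extension list, the sample 404 markup and its asset paths are all hypothetical, and a real application would wire the first function into its framework's 404 handler and feed the second one from its actual list of deployed files.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath

# Hypothetical custom 404 page and asset extensions, for illustration only.
CUSTOM_404_HTML = (
    '<html><head><link rel="stylesheet" href="/static/404.css"></head>'
    '<body><img src="/static/lost.png"><p>Page not found.</p></body></html>'
)
ASSET_EXTENSIONS = {".png", ".jpg", ".gif", ".css", ".js", ".ico", ".mp4"}


def not_found_body(path: str, is_ajax: bool = False) -> str:
    """Fix 1: full custom page only for "real" pages, a tiny body otherwise."""
    suffix = PurePosixPath(path).suffix.lower()
    if is_ajax or suffix in ASSET_EXTENSIONS:
        return "Not Found"
    return CUSTOM_404_HTML


class _RefCollector(HTMLParser):
    """Collect src/href values from tags that load sub-resources."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        if tag in {"img", "script", "link", "video", "source", "iframe"}:
            for name, value in attrs:
                if name in {"src", "href"} and value:
                    self.refs.append(value)


def missing_references(html: str, available: set) -> list:
    """Fix 2: list the references in the page that are not among the available paths."""
    collector = _RefCollector()
    collector.feed(html)
    collector.close()
    return [ref for ref in collector.refs if ref not in available]
```

For example, `not_found_body("/img/typo.png")` returns the short `"Not Found"` body, while `not_found_body("/articles/old-post")` returns the full custom page. Running `missing_references(CUSTOM_404_HTML, {"/static/404.css"})` as part of a deployment check would flag `/static/lost.png` before it ever reaches production.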

This web application disaster scenario is reminiscent of an exception handler that itself throws a fatal error or exception, whether because of a wiring bug or overly complex logic, and ends up consuming all resources and causing system errors.