When URLs Point to Missing Content


Websites need to handle requests for missing content and cope with misspelled or malformed URLs. But how would you handle this in code? In ASP.NET Web Forms, the typical solution supported by the framework maps common HTTP status codes such as 404 and 403 to custom error pages. Whenever users type, or follow, a badly formatted URL, they are redirected to another page that informs them of the problem. The trick works just fine, and there's nothing to complain about from a purely functional perspective. The rub lies in how such applications behave when their URLs are requested by search engines.

Imagine a search engine requesting a URL that doesn't exist in an application that implements custom error routing. The application first issues an HTTP 302 code, which tells the caller that the resource has been temporarily moved to another location. The caller makes another attempt and finally lands on the error page. This approach is great for humans, who ultimately get a friendly message; it is wrong for search engines, which conclude that the content is not missing at all, just harder than usual to retrieve. Worse, the error page is cataloged as regular content and related to similar content.

The redirect approach is good from a coding perspective in that it doesn't require much work. One alternative is to write a runtime filter (an HTTP module) that validates the incoming URL and returns HTTP 404 right away. Starting with .NET 3.5 Service Pack 1, Microsoft incorporated the routing module into the ASP.NET platform. The routing module is just an HTTP module that can be used for many different purposes, including filtering incoming requests based on routes. Honestly, when it comes to handling missing content, the routing module is more helpful in ASP.NET MVC than in Web Forms. In Web Forms, it has to redirect to another resource. In ASP.NET MVC, it can serve a 404 status code right away with no additional roundtrip.
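As a rough illustration of the HTTP module alternative mentioned above, here is a minimal sketch. The class name MissingContentModule and the KnownPaths list are hypothetical; a real application would validate URLs against its own content store and would also have to let static resources through.

```csharp
using System;
using System.Collections.Generic;
using System.Web;

// Hypothetical module: answers 404 directly for URLs the application doesn't know.
public class MissingContentModule : IHttpModule
{
    // Assumption: a fixed list of valid paths stands in for real URL validation.
    private static readonly HashSet<string> KnownPaths =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "/", "/default.aspx", "/products.aspx"
        };

    public void Init(HttpApplication application)
    {
        // Inspect every request before ASP.NET picks a handler for it.
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        var application = (HttpApplication)sender;
        var context = application.Context;

        if (!KnownPaths.Contains(context.Request.Path))
        {
            // No redirect: the caller sees the 404 on the original URL.
            context.Response.StatusCode = 404;
            context.Response.Write("The requested page was not found.");
            application.CompleteRequest();
        }
    }

    public void Dispose()
    {
        // Nothing to release in this sketch.
    }
}
```

The module still has to be registered in web.config (under httpModules, or under system.webServer/modules when running in IIS 7 integrated mode) before it sees any requests.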

What's the best practice in ASP.NET MVC? I recommend completing the route collection in global.asax with a catch-all route that traps any URL sent to your application that hasn't been captured by any of the existing routes. The catch-all route must be placed at the very bottom of the list because routes are evaluated from top to bottom and evaluation stops at the first match. The catch-all route simply maps the request to an Error controller. The controller, in turn, looks at content and headers and decides which HTTP status code to return.
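The route registration described above might look like the following sketch. The route names, the "Default" route, and the {*anything} parameter name are assumptions made for illustration, and UrlParameter.Optional assumes ASP.NET MVC 2 or later.

```csharp
using System.Web;
using System.Web.Mvc;
using System.Web.Routing;

public class MvcApplication : HttpApplication
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

        // Regular application routes come first (assumed here for illustration).
        routes.MapRoute(
            "Default",
            "{controller}/{action}/{id}",
            new { controller = "Home", action = "Index", id = UrlParameter.Optional });

        // Catch-all route: registered last, so it only sees URLs that
        // no earlier route has matched.
        routes.MapRoute(
            "CatchAll",
            "{*anything}",
            new { controller = "Error", action = "Missing" });
    }

    protected void Application_Start()
    {
        RegisterRoutes(RouteTable.Routes);
    }
}
```

One caveat with this layout: a URL that fits the pattern of an earlier route but names a controller that doesn't exist is matched by that earlier route, so it never reaches the catch-all and has to be handled separately.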

The Error controller must be added to the project manually, and its structure is, to some extent, a matter of preference. The Error controller code here is a minimal implementation:


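using System.Web.Mvc;

public class ErrorController : Controller
{
    // Target of the catch-all route: reports missing content.
    public ActionResult Missing()
    {
        // Set the status code so that crawlers see a real 404, not a redirect.
        HttpContext.Response.StatusCode = 404;

        // Render the friendly "Missing" view for human visitors.
        return View("Missing");
    }
}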

The Missing.aspx view contains any UI you want to show to users. This approach weds the best of SEO with coding effectiveness. Search engines get a 404 right away and can treat the URL accordingly. Users get a friendly message and can check whether they did something wrong. For developers, it's not a huge amount of work.


Dino Esposito is a consultant and author who specializes in Windows development.

