Keyvan Nayyeri

God breathing through me

Simplify Your URLs

In my opinion, the Nth rule of simplicity would be the simplicity of URLs. It's one of the things I always pay attention to in any new web application, and for me it's very important to have a site with simple and readable URLs.

If we look back at the history of the web, we can see that URLs have been getting simpler over time. We've come from the days of long URLs with many nested folders and long queries to days of shorter URLs, where sites rewrite their public URLs to hide queries and simplify them.

There are many parameters of simplicity for URLs, such as length, exclusion of queries, exclusion of page extensions and so on. In this post I want to outline some parameters that make your URLs simpler and talk about implementing them for ASP.NET sites in general. I use some of these methods on my site to simplify URLs.

Make URLs Shorter

The most obvious way to simplify a URL is to make it shorter. We're past the days of long URLs and long queries. There are some techniques that help you make your URLs shorter, and as a site developer you can choose a good strategy for designing your URLs to keep them as short as possible. Some of the following items are also techniques for making URLs shorter, but I'd prefer to talk about them separately.

One common problem with many ASP.NET sites and web applications is that they put unnecessary items in their URLs that should be eliminated. For example, since many ASP.NET blogging tools are inspired by the original .Text engine, they include year, month and day numbers in post permalinks, which is unnecessary and just makes them longer. This is the default behavior in .Text, Community Server and Subtext, while the newer BlogEngine doesn't use this pattern. You should avoid such unnecessary URL items.
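As a rough sketch of what dropping the date items could look like, the snippet below maps an old dated permalink to a shorter one that a permanent redirect could point to. It is written in Python only to keep it short and language neutral. The `/archive/<year>/<month>/<day>/<slug>.aspx` layout, the `DATED_PERMALINK` pattern and the `shorten_permalink` function are assumptions made for this example, not the actual scheme of any particular blog engine.

```python
import re
from typing import Optional

# Assumed legacy layout: /archive/<year>/<month>/<day>/<slug>.aspx
DATED_PERMALINK = re.compile(
    r"^/archive/\d{4}/\d{1,2}/\d{1,2}/([^/]+)\.aspx$", re.IGNORECASE
)


def shorten_permalink(path: str) -> Optional[str]:
    """Return the short permalink for a dated one, or None if it is not dated."""
    match = DATED_PERMALINK.match(path)
    if match is None:
        return None
    # Keep only the readable slug and lowercase it for consistency.
    return "/" + match.group(1).lower()


print(shorten_permalink("/archive/2007/10/24/Simplify-Your-URLs.aspx"))
```

Note that this only works if post slugs are unique across the whole site, since the date no longer disambiguates them.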

On the other hand, readability and SEO friendliness shouldn't become victims of shorter URLs. You could simply use a numeric identifier rather than a page name or item name to make a URL shorter, but this isn't recommended at all, because the result tells neither readers nor search engines what the page is about.

A very good example of a very bad URL design is DotNetNuke. Looking at DotNetNuke URLs, you quickly notice that they are not only unreadable but also long, on many pages of many sites.

In general, however, making URLs shorter is left to you and the strategies you choose for your application, and it is easy to handle if you consider it from the early stages.

Use Subdomains

One common way to simplify URLs is using subdomains. You can easily move a part of a URL to a subdomain. This has some benefits:

  • With subdomains you can categorize your site URLs and choose a different subdomain for each category. For example, you can move the blogs application to its own subdomain, the forum application to its own, and so on.
  • This makes your URLs more readable and easier for people to remember.
  • Besides these benefits, subdomains can give you shorter URLs.

Implementing subdomains in ASP.NET is easy: there are ready-to-use components available, or you can write your own code.
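To make the idea concrete, here is a minimal, language-neutral sketch (in Python, not ASP.NET) of the mapping a subdomain component has to perform: take the host and path of the public URL and work out which internal application path should serve it. The domain `example.com`, the `SUBDOMAIN_APPS` table and the `resolve_internal_path` function are hypothetical names chosen for this example.

```python
from urllib.parse import urlsplit

# Hypothetical mapping: subdomain label -> internal application folder.
SUBDOMAIN_APPS = {
    "blogs": "/blogs",
    "forums": "/forums",
}

BASE_DOMAIN = "example.com"  # assumed site domain for this example


def resolve_internal_path(url: str) -> str:
    """Return the internal path that should serve a public URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit already lowercases the host
    path = parts.path or "/"
    suffix = "." + BASE_DOMAIN
    if host.endswith(suffix):
        label = host[: -len(suffix)]
        app_root = SUBDOMAIN_APPS.get(label)
        if app_root is not None:
            # blogs.example.com/foo is served by /blogs/foo internally.
            return app_root + path
    # Unknown subdomain or the bare domain: leave the path untouched.
    return path


print(resolve_internal_path("http://blogs.example.com/simplify-your-urls"))
```

In an ASP.NET application the same decision would typically be made early in the request pipeline, before the page handler is chosen.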

Rewrite Queries

Queries should be avoided in public URLs as much as possible. Queries make a URL hard to read and remember, and sometimes make it longer. Rewriting queries gives you two benefits:

  • It improves the readability of your URLs and makes them easier to remember.
  • It makes your URLs friendlier to search engines.

While these are two positive points about rewriting queries, note that it can make your URLs shorter or longer. There is no general rule about this and it depends on your case, but it is worth it and you should always rewrite your queries.

One well-known open source component for URL rewriting in ASP.NET is UrlRewriter.NET, which lets you easily rewrite URLs based on regular expressions.
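The general technique behind regular expression based rewriting can be sketched as follows. This is a Python illustration of the idea only, not UrlRewriter.NET's configuration syntax; the rule table, the page names `tags.aspx` and `post.aspx`, and the `rewrite` function are all made up for the example.

```python
import re

# Hypothetical rules: (public URL pattern, internal URL with a query string).
REWRITE_RULES = [
    (re.compile(r"^/tags/([a-z0-9-]+)/?$", re.IGNORECASE), r"/tags.aspx?name=\1"),
    (re.compile(r"^/posts/([a-z0-9-]+)/?$", re.IGNORECASE), r"/post.aspx?slug=\1"),
]


def rewrite(path: str) -> str:
    """Map a readable public path to the internal query-based URL."""
    for pattern, target in REWRITE_RULES:
        if pattern.match(path):
            # The first matching rule wins; \1 is the captured name or slug.
            return pattern.sub(target, path)
    # No rule matched: serve the path as requested.
    return path


print(rewrite("/tags/aspnet"))
```

Visitors and search engines only ever see the readable form, while the application keeps receiving the query it already understands.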

Eliminate Some Characters

Sometimes special characters appear in URLs that aren't human readable or beautiful and just make the URLs longer. These can be eliminated.

For example, an encoding technique may replace "C#" with "C_2300_" in your URLs. That "_2300_" is something that should be eliminated to simplify the URL and make it shorter.

As a general guideline, I prefer to strip any non-alphanumeric character (except the dash) from URLs, which can be done simply with the string manipulation methods in .NET. For consistency, I also prefer to lowercase all my URLs before making them public.
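A minimal sketch of this guideline, written in Python rather than .NET, could look like the following. The `slugify` function and the `READABLE` substitution table are hypothetical. Turning each run of stripped characters into a single dash is also one possible choice among several, not something the guideline requires.

```python
import re

# Hypothetical substitutions applied before stripping, so that "C#" stays
# recognizable instead of collapsing to "c".
READABLE = {"c#": "csharp"}


def slugify(title: str) -> str:
    """Lowercase a title and keep only ASCII letters, digits and dashes."""
    slug = title.lower()
    for raw, readable in READABLE.items():
        slug = slug.replace(raw, readable)
    # Replace every run of other characters with a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    # Drop dashes left over at either end.
    return slug.strip("-")


print(slugify("What's New in C# 3.0?"))
```

The equivalent in .NET would be a few calls to `ToLowerInvariant`, `Replace` and a regular expression.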

Remove "WWW" From URLs

Having the "WWW" prefix at the start of URLs makes them longer and in some cases can split the link popularity of your pages by providing two different URLs for each page. By removing this part of the URL, you not only get shorter URLs but also a single link for each page on your site and, in my opinion, more beautiful URLs.

I have talked about this in detail before and outlined the negative and positive points of having or not having "WWW" in URLs. I also gave a simple solution that removes "WWW" with an ASP.NET HttpModule. Note that you can achieve this with an ISAPI filter as well; I talk about ISAPI filters in the next section.
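The decision such a module has to make is small: if the requested host starts with "www.", answer with a permanent (301) redirect to the same URL without the prefix. The sketch below shows that logic in Python as an illustration only. It is not the HttpModule from the earlier post, and `canonical_redirect` is a hypothetical name.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the URL is already canonical."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit already lowercases the host
    if not host.startswith("www."):
        return None
    netloc = host[len("www."):]
    if parts.port is not None:
        # Preserve an explicit port such as :8080.
        netloc += f":{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(canonical_redirect("http://www.example.com/blog/?page=2"))
```

The path and query string are carried over unchanged, so every old "WWW" link still lands on the matching page.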

Remove Default Pages From URLs

Default pages are another parameter that makes URLs long. With ASP.NET on IIS 6.0 (and earlier), it's technically not possible to exclude default pages from URLs unless a default page is physically located at the path or you have access to an ISAPI filter. However, in my experience the default page can often be dropped from the end of a URL without any ISAPI filter, because a default page is physically located there.

Default pages have another drawback: they may split your link popularity by providing two different access points to the same page, so for professional websites this should be a big concern.

The good news is that this is easier to achieve in IIS 7.0, but you can do it in earlier versions with an ISAPI filter. Before I left for military service training, there were some discussions here and there about two common IIS ISAPI filters, ISAPI_Rewrite and Ionic's ISAPI Rewrite Filter. The first is commercial and requires you to buy a license, while the second is open source and free, and I use it on my VPS to simplify the URLs on this site. For both filters, you can use patterns like these to remove default pages and redirect them permanently to the new URLs:

RewriteRule (.*)/default\.htm$ $1/ [I,RP]
RewriteRule (.*)/default\.aspx$ $1/ [I,RP]
RewriteRule (.*)/index\.htm$ $1/ [I,RP]
RewriteRule (.*)/index\.html$ $1/ [I,RP]

All in all, this isn't easy to achieve on IIS 6.0 or 5.0 because it requires you to install an ISAPI filter on the server, which may be impossible if you're hosted in a shared environment.
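If you want to check what the four rules above match before installing them on a server, the same patterns can be tried out with any regular expression engine. The Python sketch below folds them into one case-insensitive pattern. `DEFAULT_PAGE` and `strip_default_page` are names invented for this example, and the filters' own matching details may differ from Python's.

```python
import re
from typing import Optional

# Covers default.htm, default.aspx, index.htm and index.html, ignoring case.
DEFAULT_PAGE = re.compile(
    r"^(.*)/(?:default\.(?:htm|aspx)|index\.html?)$", re.IGNORECASE
)


def strip_default_page(path: str) -> Optional[str]:
    """Return the redirect target for a default-page URL, or None otherwise."""
    match = DEFAULT_PAGE.match(path)
    if match is None:
        return None
    # Keep everything before the default page and end with a slash.
    return match.group(1) + "/"


print(strip_default_page("/blog/Default.aspx"))
```

A path such as /blog/mydefault.aspx is left alone, because the pattern requires the page name to start right after a slash.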

Drop Page Extensions

The last way of simplifying URLs that I can think of is dropping page extensions. This makes URLs shorter and has the extra benefit of making them technology independent: if you later decide to change the technology behind your website, you don't need to redirect all the old URLs and just have to put the new pages at the same locations.

Unfortunately, like removing default pages, dropping page extensions isn't easy to achieve in IIS 5.0 or 6.0, and again you need to use an ISAPI filter. However, I would still recommend it for any professional website.
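Whichever filter or module does the work, two mappings are involved: public extensionless URLs have to be mapped to the physical pages, and old URLs with the extension should be redirected to the extensionless form. The Python sketch below illustrates both. The `.aspx` extension and the names `to_internal` and `to_public` are assumptions for the example.

```python
import posixpath

PAGE_EXTENSION = ".aspx"  # assumed extension of the physical pages


def to_internal(path: str) -> str:
    """Map a public extensionless path to the physical page that serves it."""
    if path.endswith("/"):
        return path  # folder URL: left to the default page handling
    _, ext = posixpath.splitext(path)
    if ext:
        return path  # already has an extension, e.g. a .css or .jpg file
    return path + PAGE_EXTENSION


def to_public(path: str) -> str:
    """Map an old URL with the page extension to its extensionless form."""
    if path.lower().endswith(PAGE_EXTENSION):
        return path[: -len(PAGE_EXTENSION)]
    return path


print(to_internal("/blog/simplify-your-urls"))
print(to_public("/blog/simplify-your-urls.aspx"))
```

If the site later moves to another technology, only `PAGE_EXTENSION` (or its equivalent in the real rewriting rule) has to change, while the public URLs stay the same.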

15 Comments

Reza
Oct 25, 2007 12:19 AM
very useful ;-)

cathal
Oct 25, 2007 5:33 AM
FYI: the DotNetNuke implementation you mention is known as "machine-friendly" URLs. It's primarily designed to help with indexing without impacting performance, via a simple regular expression transform (note: it does allow for more human-readable URLs via the other rewrite rules in its configuration). It's provider based, but so far few people have built alternative implementations, so the forthcoming (4.7) version of DotNetNuke also adds support for "human-friendly" URLs such as www.mysite.com/somepagename.aspx

mike
Oct 25, 2007 8:13 AM
Hi ... I'm wondering whether you could back up and explain why long URLs are inherently bad. The information here (very well presented) takes it as a given that a long URL is to be avoided. For example, who is inconvenienced by long URLs, in what application context are they a disadvantage, and so on. As the first commenter points out, there can be some advantages for long URLs, including indexing. The strategy of using date-based URLs in (e.g.) a blog can make it simpler for a reader to browse the entries for a specific month or day. (And one could even argue that a date-based URL like http://site/blog/2007/July isn't even particularly long). Most users simply click links, of course, and do not care how long or short the URL for the link is. (Look at Amazon links, gosh.) Any thoughts you have on this would be very interesting. Thanks!

Keyvan Nayyeri
Oct 25, 2007 8:51 AM
Mike,

What I wrote here are guidelines, not rules, so I'm not saying that every long URL is bad. I'm saying that URLs should be as simple as possible, not as short as possible. Let me explain with an example. You can refer to the tag pages on a blog by their identifiers, which is shorter, or by their names, which is longer, but you'd agree that using names is better because it makes the URLs simpler and easier to read and recall. Simplicity isn't the same as having shorter URLs, but length can be a guide. I think it's obvious that a shorter text is simpler than a longer one, whether it's a blog post, a book, an article or a URL.

On your point about having years and months in URLs, I think the days of this kind of URL are gone, and almost all CMS tools, blogging engines and web applications in general are leaving this pattern. Take a look at Graffiti and compare it with Community Server: it doesn't use this pattern. The reason is simple: there is a better way to let search engines index your site, and that is a site map. If you use a site map on your site, you'll notice that bots no longer hack your URLs to find new pages and just follow your site map references. I saw this in my site's diagnostic statistics after adding the site map.

Finally, your note on how much URL length matters to normal users would make a good research topic, but I'm sure that shorter URLs are easier to understand and recall. As a personal note, I tend to trust new sites with simpler URLs and worry about sites with long URLs, though this may be a personal habit. For technical people and technical purposes, shorter URLs are easier to parse and follow.

Thank you for your comment :-)

Robert
Oct 25, 2007 1:14 PM
-1 on the subdomains. They are considered a new site in the major search engines. You then have to work on getting relative weight to your site all over again. I personally like the MVC urls with route action and params, they are usually pretty good about not being too long and are still index friendly.

Kalpesh
Oct 25, 2007 2:09 PM
It would be great to build some kind of AI mechanism for handling URLs, e.g. books.amazon.com or amazon.com/books should go to the same location. Taken further, a site should act like a search engine. How many people would actually type a URL instead of clicking on it? So, there should be a friendly version of the URL along with the URL for search engines, e.g. amazon.com/atlas shrugged and amazon.com/obid/exapp?de34559565.. In case there is a possibility of multiple results, the website should let the user go to specific results using tags, e.g. mozilla.com/firefox+download+latest OR mozilla.com/firefox+extension+adblock. Now, if you add intelligence to it, the user can still go to the relevant thing, e.g. mozilla.com/firefox extension to block flash based ads. That's too much to ask for :)

mikeb
Oct 25, 2007 3:31 PM
As the previous mike indicated, I often favor URLs that have the date encoded in them in a readable fashion. This lets me sometimes locate pages that I might not be able to find via a search (incredible, but I have found this to be true more than once). I think your subsequent clarification of simpler URLs is more meaningful than 'shorter URLs'. I would also argue that having URLs with components that are meaningful to humans (such as components of the URL path being easily recognized as a date) can often be a good thing.

Oskar Austegard
Oct 27, 2007 11:14 AM
If you're not in control of the URL and you need to communicate it verbally, you may want to try http://squrl.us. Link for this page is http://squrl.us/42

Jevon
Nov 05, 2007 3:29 AM

Hi, interesting post but I disagree with a couple of points you've made:

First - flat-out saying that date-based urls are bad isn't necessarily true. I actually like the idea, the problem comes from the implementation of it by many (if not most) sites. Take the url for this page/site for example - nayyeri.net/.../simplify-your-urls.aspx

I would expect that shortening it to nayyeri.net/.../24 would show me the archive page with all posts on 24th October 2007, and likewise shortening to nayyeri.net/.../10 would show the archive page with all posts (or links to posts) for October 2007. However, as with many sites, this isn't the case - I get a 404 instead. Given, not everyone would manually change urls in this way, but I don't think it's an unreasonable expectation for this to work.

Second - "Use subdomains" - I used to feel the same way but actually favour subdirectories now, for a number of reasons:

1) http://forum.mysite.com/ is no shorter than http://mysite.com/forum and is arguably no less clear/easy to remember.

2) You can just as easily categorise content based on first-level subdirectory as you can on subdomains.

3) Subdomains are more effort from a coding perspective if you want to share data across different parts of the site (in ASP/ASP.Net at least) - application/session state no longer share the same memory space. This could potentially mean caching data in memory twice. Arguably it /would/ potentially improve security and reliability, but the same effect could be provided by setting subdirectories up as their own applications [in IIS].

4) SSL Certificates - with subdirectories you just need one certificate and you can secure the whole site, whereas for subdomains you either need one for each subdomain (expensive), or you need a wildcard certificate (which is even more expensive).

A few things you may not have considered anyway :)


Ben Hoyt
Nov 05, 2007 12:35 PM
All very good points -- these things need to start with developers. But if the developers have already stuffed up (as in Amazon, YouTube, and many other popular sites), I hereby shamelessly plug my DecentURL -- http://decenturl.com/

Keyvan Nayyeri
Nov 05, 2007 7:32 PM
Ben, Yes, I've used your site a few times. It looks like a good alternative to similar sites, with better features, as Scott Water had pointed out, but IMO the final URLs are longer with your service.

Kent J. Chen's WebLog
Nov 06, 2007 11:27 AM
The art of hyperlinking

Falah G.Salih
Jan 07, 2008 10:34 AM

My web site is under construction now; it is a new site.

Can you help me with how to create subdomains with ASP.NET 2.0 script code in Visual Basic .NET, and how to put it in my root

\

thanks

