Link Integrity - What is it and why is it so important?
One of the features of the modern web is hypertext link integrity, but it is often overlooked by many web developers.
Nothing is more frustrating to users or damaging to a brand than broken links on a website. A "Page Not Found" (also known as a 404 error) is a very nasty thing. Broken links can cause accessibility problems. They can even lose you money - and I mean directly. On many occasions I have seen marketing managers spend a fortune on web advertising, only for their banners and AdWords ads to point to non-existent pages. That is money flushed straight down the toilet.
Link integrity (also sometimes called referential or URL integrity) is about giving users an uninterrupted experience by ensuring that links are never broken. Understanding how websites handle this can help you to make informed decisions about purchasing software and choosing a web designer or developer.
In the old days, individual webmasters spent hours creating links and navigation systems on their sites. Inevitably people would call, complaining of broken links. This was a nightmare for the poor old HTML developer, who then had to run link checkers over the site after each update to ensure its integrity, especially if pages had been renamed or moved around. Many webmasters still do things the old-fashioned way.
A good content management system, however, goes a long way to solving these problems by automatically creating navigation for you. I stress the word good here. Many content management systems still ignore link integrity. The way they handle links also tends to differ from system to system.
In a nutshell, there are a few types of links on the web which handle connections between webpages and parts of webpages:
- Links to pages on the same site - we call these "local links"
- Links within pages
- Links to external sites
- Links from external sites
Of these, the ones that a CMS has the most control over (or direct control) are the first two.
Now I'll explain how.
"Local" Links and Link Integrity
The way we handle links to pages on the same site is what we call "logical linking". The reasoning is that it should be logical that if you move or rename a page, all of the links to it are updated too, and if you delete a page, all links to it are removed. You'd be surprised how many systems ignore this basic feature. Our Freestyler system has a number of logical linking systems built in, so we actually guarantee it. Because of its dynamic real-time publishing model, this extends to all navigation, including search results. There is even a facility to multi-home content (put it in multiple locations around a site) and update it from one central place. For the most part, these are system-generated links: when you create a page, the system creates the navigation links for you. The result is that maintaining a website is a breeze.
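To make the idea concrete, here is a minimal sketch of logical linking in Python. It is not how Freestyler is implemented; the SiteMap class and its method names are hypothetical and exist only for illustration. The assumption is that content stores a page ID rather than a URL, and that the ID is resolved to the current address each time a page is rendered.

```python
import html

# Hypothetical illustration: links store a page ID, never a URL.


class SiteMap:
    def __init__(self):
        self._paths = {}  # page_id -> current URL path

    def add_page(self, page_id, path):
        self._paths[page_id] = path

    def move_page(self, page_id, new_path):
        # Only the central record changes; no stored link needs rewriting.
        if page_id not in self._paths:
            raise KeyError(page_id)
        self._paths[page_id] = new_path

    def delete_page(self, page_id):
        self._paths.pop(page_id, None)

    def render_link(self, page_id, text):
        # Resolve the ID at render time. A deleted page yields plain text
        # instead of a broken link.
        path = self._paths.get(page_id)
        if path is None:
            return html.escape(text)
        return f'<a href="{html.escape(path)}">{html.escape(text)}</a>'


site = SiteMap()
site.add_page("about", "/about-us.html")
before = site.render_link("about", "About us")

site.move_page("about", "/company/about.html")
after_move = site.render_link("about", "About us")

site.delete_page("about")
after_delete = site.render_link("about", "About us")
```

Because every link is looked up when the page is served, a move or rename is a single update to the central record, and a deletion quietly removes the link everywhere it appeared.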
Some website authoring tools that claim to be content management systems simply publish static pages. Although they can pre-check link integrity before the site is uploaded, many of their navigation models (particularly search) rely on technologies which cannot guarantee link integrity at all, especially if pages are added outside of the CMS.
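As a rough idea of what a pre-publication check on static pages might look like, the sketch below scans a set of pages for local links whose target does not exist. The function name broken_local_links and the sample pages are made up for this example, and it only uses Python's standard library. It also shows the limitation mentioned above: it can only vouch for the pages it is given.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_local_links(pages):
    """pages maps site paths to HTML.

    Returns (source_path, href) pairs whose local target is not in pages.
    """
    broken = []
    for source, markup in pages.items():
        collector = HrefCollector()
        collector.feed(markup)
        for href in collector.hrefs:
            parts = urlsplit(urljoin(source, href))
            if parts.scheme or parts.netloc:
                continue  # external or mailto link: not checked here
            if parts.path not in pages:
                broken.append((source, href))
    return broken


# Hypothetical two-page site with one dangling link.
pages = {
    "/index.html": '<a href="about.html">About</a> '
                   '<a href="old.html">Old</a> '
                   '<a href="http://example.com/">Elsewhere</a>',
    "/about.html": '<a href="/index.html#top">Home</a>',
}
broken = broken_local_links(pages)
```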
Inline Links, Anchors and Link Integrity
Sometimes, however, users want to contribute links inline or freeform. We call the facility to do this an "inline local link". Making these follow a local linking system is a real technical challenge, but one that we as developers have been able to solve where others haven't. We have extended our rich text editor (or WYSIWYG editor) to include this feature.
External Website URLs and Link Integrity
External sites are a somewhat different kettle of fish. As the site doesn't have direct control over them, most systems use an external link checker or site maintenance spider, which produces all sorts of reports. One I've used is Xenu's Link Sleuth, which is free and quite capable. Some content management systems have either integrated a similar tool or built one of their own.
However, an advanced dynamic real-time CMS can go a step further. It can pre-check that the page is available before linking. If a page is returned, it displays the link; if not, it doesn't display the link at all, or may even recommend an alternative resource. Very few systems anticipate links like this, but it is possible. All sorts of factors, such as website response time, come into play, and such a system needs to deal with all of these scenarios.
What is more important with external linking, though, is not the embarrassment that the site you link to might not be there anymore, but the embarrassment caused when the content is not what you expect. What if, for example, the site you link to is defaced? If it loads in the same window, users may think that the offensive content is part of your own site. For this reason, it is sometimes a good idea to open external links in a new window, and even to accompany them with a disclaimer, to keep those sites at arm's length.
Some savvy users are able to read the browser status bar to get an idea of where a URL leads and will sometimes even correct mistakes, but this happens very rarely. The onus is squarely on you as a webmaster not to expect them to know whether a link to an external site is valid.
Links from Other Websites and Link Integrity
This is another area where webmasters have very little control. Search engines like Google, for example, index thousands of pages each day, and some of these pages are moved or renamed before the search engine gets a chance to recheck them. The result is that Google serves up plenty of broken links. If someone makes a small mistake in a link to your site, or you move the page that they are linking to, there are still things you can do.
Firstly, if you want to check which sites are linking to you (I'm sure there is at least one), just type into Google the word link, followed by a colon and then your web address. For example, you can see which sites link to ours by entering the following into Google:
link:http://www.datalink.com.au
Just substitute your site name and Google will produce a report for you. The number of links to your site is called "link popularity" and is important in lifting your site's ranking with search engines. The more the better. But anyway, I digress ...
A "Friendly 404" message can help people who follow these links to find the page they were looking for on your site. Instead of a nasty 404 error, an intelligent content management system will generate a page that offers various navigation methods, such as a sitemap, an alphabetical site index and search, to help users find the page. More advanced systems may even be able to recognise the URL and point to the location to which the page has moved.
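One way such a handler might decide what to do is sketched below. The handle_missing function, the table of moved pages and the sample paths are all hypothetical. The assumption is that the system keeps a record of where moved pages went, and otherwise offers the live pages whose addresses look most like the one requested, using Python's standard difflib module.

```python
import difflib


def handle_missing(path, live_paths, moved):
    """Decide how to answer a request for a path that has no page.

    moved maps old paths to their new locations (hypothetical record
    kept by the CMS whenever a page is moved or renamed).
    """
    if path in moved:
        # The page still exists elsewhere: send the visitor straight there.
        return ("redirect", moved[path])
    # Otherwise suggest up to three live pages with similar addresses,
    # to be shown alongside the sitemap and search box.
    suggestions = difflib.get_close_matches(path, live_paths, n=3, cutoff=0.6)
    return ("not_found", suggestions)


live_paths = ["/about.html", "/contact.html", "/products/index.html"]
moved = {"/old-about.html": "/about.html"}

moved_result = handle_missing("/old-about.html", live_paths, moved)
kind, suggestions = handle_missing("/contcat.html", live_paths, moved)
```

Here a request for the old address is redirected, while the mistyped "/contcat.html" gets a friendly page with "/contact.html" as its first suggestion.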
Conclusions
Link integrity is just another reason why no site should be built without a content management system. For any site that is updated more than once a month, the risk to your brand is just too great.
The best way to maintain link integrity and avoid nasty hidden costs in web development is to purchase a decent content management system (CMS) and use it to implement your site.
Look at real-time publishing systems rather than static publishing systems.
Find out if they have the sort of features mentioned above, and if they have, assess which ones of these are most important for you and within your budget.