KISS Compliant Web Sites

By Gerry Patterson

Occam's Razor has become the cornerstone of post-Renaissance scientific philosophy. The infamous razor can be applied to systems analysis and software design, where it is usually referred to as KISS (Keep It Simple Stupid).

KISS is an excellent design strategy for websites. KISS websites can work, can work fast, can work with all browsers and can get listed with the search engines.

What more could you possibly want from a website?


Occam's Razor Also Applies To Software.

Pluralitas non est ponenda sine necessitate
-- William of Ockham (ca. 1285-1349)

In other words, avoid unnecessary complexity (plurality). This is generally known as Occam's razor. The medieval English philosopher and Franciscan friar who wrote these words believed he was stating a general philosophical principle. And indeed he was! It was this guiding principle that persuaded Renaissance scientists such as Galileo to abandon the complex geocentric model of the solar system in favour of the Copernican, or heliocentric, model. In fact, this principle has become a cornerstone of modern science and philosophy, a type of universal post-Renaissance bullshit filter which has particular relevance to any scientific or philosophical endeavour that entails model building.

Occam's razor applies to computer science as it does to any science. The first stage of software design is the construction of a model or blueprint for the software: in other words, a solution to a specified problem. As in many areas of science that involve model-building, the number of possible solutions is very large or effectively infinite. However, applying Occam's razor to any set of solutions will quickly reveal the correct solution, which is, of course, the simplest.

The reasons for preferring the simplest solution are so obvious that they should not require a philosophical proof, and yet it is remarkable that a more complex solution is sometimes chosen. This may be because a project is being driven from the top by parties with a political agenda. Or it can be for the venal reason of extracting more money from an unwary client. Sometimes it can be for a silly reason like the dictates of fashion (more about this later).

In systems analysis, this application of Occam's razor is often referred to as KISS (Keep It Simple Stupid). As well as simplicity, there are some other considerations. In general, software should be guided by the following basic principles, in order of decreasing importance:

0. The application must work.

1. The application should work as quickly as possible.

2. The layout of input and output components should be standardised and concise.

3. Help and support for the application should be readily available.

4. The application should be packaged in an attractive manner.

I have started this list with the number 0 because, when dealing with software, nothing is more important. It may be possible to find a use for hardware that does not work. Non-functional automobiles can be stripped of parts or recycled as scrap. Non-functioning computers can be used as land-fill (after the safe disposal of toxic components). Non-working software, on the other hand, is totally useless. Software must work.

The next item is almost as important for online applications. This has been known for many years. I first started to work online thirty years ago, because I was a member of staff when I commenced my second degree, in computing science. Less privileged undergraduates had to submit batch jobs. Not many people work with batch processes these days. In fact, it is rare to find programmers who understand the principles of batch programming, which is unfortunate for those organisations whose financial calendars are still governed by monthly accounting periods. I see many examples of month-end processes being run online, because the programmers only have online experience. And if the only tool you have is a hammer, then every problem looks like a nail. And online applications have certain constraints.

Of course there have been many changes in the thirty years that have elapsed since I first started working with online applications. One thing that has not changed, however, is user patience. Online users do not like waiting. The maximum wait time is of the order of one minute, and most of them start to get antsy after ten seconds. This was well known and understood three decades ago, when computers were at least a million times less powerful and contained almost a million times more metal. (I am not exaggerating: one of the computers I worked on weighed hundreds of tonnes and had hundreds of litres of iced water pumped through the core to keep it cool.) Back then, successful analysts understood the limits of user patience and designed software within those limitations and within the constraints imposed by hardware that was as light in capability as it was heavy in sheer mass.

Less successful analysts ignored those constraints.

It is still the same today. If the response time for an application approaches half a minute, the only users who will stay with it are those who are locked in, because they have no choice. They have to get some data from their financial institution, or update their timesheet, or whatever. If you operate such a site, you can count on the fact that while they are waiting, they will be thinking ill of you and your organisation. They will be calling into question the parentage of the designer, venting their spleen about the CEO and cursing the CIO.

If you are thick-skinned and/or you don't give a damn about your customers, then this may not upset you. In Australia we have one major airline (with a second in the pipeline), one major telecommunications company (and a token alternative), one and only one (now privately owned) freeway going under the river and connecting the eastern suburbs of Melbourne with the airport, and so on. These large organisations have an effective monopoly in their individual playgrounds and they can (and do) take the approach that if their customers are unhappy about the service or the response time on the Internet ... then, speaking biblically, they can go forth and multiply themselves.

However, if your customers have a choice, then you can be sure that no one is going to stay long at a website that always takes more than twenty seconds to load each page. Unless, of course, those pages are really worth waiting for. Only customers who are locked in will tolerate poor performance.

The next item (number 2) is also very important for web applications. If your page is difficult to read and use, users will also lose patience and move on.

The items at the end of the list are far less important. Of course the perfect website would fulfill all of these criteria. However if it doesn't meet criteria zero, one and two, it is a waste of effort devoting any resources to the less important criteria three and four.

I don't have a formal proof of these assertions. They are largely opinion. However, this opinion has been formed from many years' experience designing and writing software. Furthermore, it is possible to find similar opinions elsewhere on the Web. Nonetheless, empirical evidence does seem to be scarce (see bibliography).


The Great Nineties Goldrush.

These principles of software design are so obvious that they hardly need evidence to support them. They are a type of "common sense" for software development. The reason I mention them is that for a period during the nineties the list in the previous section was applied in the exact reverse order. But only to websites, which were deemed to be magically exempt from "common sense".

Go back and have a look at the list and read it backwards. I'm sure you will agree that many sites during the mid to late nineties decided that the most important criterion for website design was that it should look cool. It was decided that this would be priority zero. Only when this criterion was satisfied would the designer (often a male geek with limited life experience) consider less important matters like making it work quickly, or making it work at all.

This extended period of temporary insanity was brought on by greed, stupidity, speculation, a goldrush mentality and a devout faith in the principles of perpetual motion. Very similar to the motives that drive most exercises in Pyramid Selling.

This foolishness was further propagated by the Web itself. Journalists and commentators would cite information obtained from the Internet which proved that the road to success could be found via a cool web site. CEOs and managers, some of them IT managers who should have known better, rapidly deployed websites and employed adolescents with impressive talents in the new technology, but little or no understanding of business, finance, analysis and daily operations. Everyone got rich on share options. No one actually had to work! And furthermore, it would go on forever. This was a certainty. They had a mathematical proof! Which went something like this:

The Internet is a virtual space and can therefore sustain infinite expansion.
At long last perpetual motion had been achieved. There was no such thing as tomorrow.

I don't want to appear to be taking unseemly glee at the great dot com meltdown, but if I say "I told you so", I am certainly not a voice crying in the wilderness (for more pessimism and post-bubble cynicism see the bibliography). Since then, it seems that web designers are rediscovering the truths outlined above.


How To Attract Traffic To A Website

A common question asked by website owners is: "How do I attract traffic to my website?" In an earlier article, I proposed that the Web and search engines are so complex that most organisations do not have the resources to devote to a rigorous and comprehensive study of this topic, which is why most of the discussion of search engines is opinion rather than fact. Nevertheless, it is possible to approach any hypothesis about search engines in a scientific manner:

  1. Build a model using your acquired knowledge.
  2. Test the model with experiments.
  3. Discard or re-work the model according to the data.
  4. Go back to step 2.

Note that this contains an infinite loop. So be warned that applying this algorithm can change your life or drive you insane if you do not have a scientific mind. If you come from a discipline like politics or economics you would never try this approach. People in these disciplines, when they get to step 3, prefer to discard the data rather than modify their model.

It seems that the assumptions I have made so far about the way search engines operate have been reasonably accurate. At least, I have noticed positive results so far.

What follows is a list of the options that I implemented. If you are having trouble getting your site listed, you might try them yourself:


Option 1. Choose Basic HTML

The most significant decision I made was choosing basic HTML. There are some highly gimmicky alternatives to plain old HTML. I won't go into them here. They are extremely bandwidth-intensive (slow), and for that reason unlikely to attract the attention of professional Internet users. I believe that the decision to choose basic HTML was the right one.

Tests with perl confirmed that it was an ideal language for CGI scripts. This meant that complex and difficult procedures could be coded as CGI scripts to be run on the server side, where there is tight and effective control of the environment. By using basic HTML in any forms, the system was less susceptible to changes at the client side (where there is little, if any, control). This approach suits the original open-ended design of the Internet without significant compromise in performance. Of course, CGI scripts don't have to be written in perl; there are many suitable alternatives.
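
By way of illustration, here is a minimal sketch of the kind of server-side CGI script I have in mind. It is not taken from this site; the form field name and the page wording are placeholder assumptions. It uses the standard CGI module and emits nothing fancier than basic HTML:

    #!/usr/bin/perl -w
    # Minimal CGI sketch: read one form field and reply with basic HTML.
    # The field name "query" and the page wording are placeholders only.
    use strict;
    use CGI;

    my $q     = CGI->new;
    my $query = $q->param('query') || '';

    print $q->header('text/html');
    print "<html><head><title>Search Results</title></head><body>\n";
    if ($query ne '') {
        # Escape whatever the user typed before echoing it back.
        print "<p>You searched for: ", $q->escapeHTML($query), "</p>\n";
    } else {
        print "<p>Please enter a search term.</p>\n";
    }
    print "</body></html>\n";

Because all the logic stays on the server and the reply is plain HTML, the script behaves the same regardless of which browser submitted the form.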

I also tried to make the HTML code compatible with as many browsers as I could. Although it seems that MSIE is the market leader, my research so far would seem to indicate that it is not the clear leader I had been led to believe it is. By this I mean that no single version of MSIE seemed to have more than forty percent of the market. My site statistics indicate that 74 percent of the average users (those coming direct from Google) are MSIE users. The leading browser is MSIE 6 with 34 percent. So even within the MSIE group there are quite a large number of older versions. The market is still quite fragmented. It would be unwise (not to mention dangerous) to tailor a site to one specific browser version.
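
For what it is worth, the browser figures quoted above came from the site logfiles. Below is a rough sketch of the sort of tally involved; it assumes the Apache "combined" log format, in which the user-agent string is the last double-quoted field on each line, and the file name access_log is only an example:

    #!/usr/bin/perl -w
    # Rough tally of browser families from an Apache "combined" format logfile.
    # Assumes the user-agent string is the final double-quoted field on each line.
    use strict;

    my %count;
    open my $log, '<', 'access_log' or die "Cannot open access_log: $!";
    while (<$log>) {
        next unless /"([^"]*)"\s*$/;    # last quoted field should be the user agent
        my $agent = $1;
        if    ($agent =~ /MSIE (\d+)/) { $count{"MSIE $1"}++ }
        elsif ($agent =~ /Gecko/)      { $count{'Mozilla/Netscape'}++ }
        elsif ($agent =~ /Lynx/i)      { $count{'lynx'}++ }
        else                           { $count{'Other'}++ }
    }
    close $log;

    for my $browser (sort { $count{$b} <=> $count{$a} } keys %count) {
        printf "%-20s %6d\n", $browser, $count{$browser};
    }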

Initially I would test each page before publishing. I used MSIE 6 on Windows 2000, Netscape 6 on Windows 2000, Mozilla on Linux, Konqueror on Linux, Galeon on Linux and lynx on BSD. I get weary just typing the list here. You can imagine how tedious I found actually doing it! Still I persisted. My reasoning was that there are numerous browsers out there, and I should try to support them all. Also, the best way to test code for portability is to try it out on a number of platforms. This sample of browsers seemed to cover a reasonably wide spectrum. If a page passed this test, it should be OK in any browser.

It wasn't long before I started taking shortcuts. I would check the code with lynx, MSIE and Netscape and forget about the others. This was still a chore. As I became more confident about HTML, I did not have to test as thoroughly, so I cut this down to just lynx and MSIE. I like working with lynx, and MSIE is the market leader. Still, this was not satisfactory, because I had to keep flipping between computers (I only have one Microsoft machine). I could see that I needed a single point of validation.

I found a solution on the web. This was the W3C Validation Service (see the bibliography), which seemed to offer a one-stop validation check. I was not able to fault it in the tests that I ran. Once a page had been passed as W3C compliant, it displayed correctly in all the browsers I tested.

My site is not one hundred percent W3C compliant yet. Initially there will be a little pain in converting all the pages to be W3C valid, because some sloppy coding practices have been propagated throughout the site. However, after these are eliminated the way forward will be easy. I can use a single validation check for all future pages.
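
As a hedged sketch of what a single validation check might look like in practice, the script below submits a short list of page addresses to the public W3C validator. It assumes the validator's check?uri= interface and its X-W3C-Validator-Status response header; if either assumption is wrong for the version you are using, the returned results page will have to be read by hand. The URLs are placeholders, and the script deliberately pauses between requests so as not to hammer a shared service.

    #!/usr/bin/perl -w
    # Hedged sketch: ask the W3C validator about a handful of pages.
    # Assumes validator.w3.org still accepts check?uri=... and returns an
    # X-W3C-Validator-Status header; verify this before relying on it.
    use strict;
    use LWP::UserAgent;
    use URI::Escape;

    my @pages = (
        'http://www.example.com/index.html',      # placeholder URLs only
        'http://www.example.com/articles.html',
    );

    my $ua = LWP::UserAgent->new( agent => 'kiss-check/0.1' );

    for my $page (@pages) {
        my $check    = 'http://validator.w3.org/check?uri=' . uri_escape($page);
        my $response = $ua->get($check);
        my $status   = $response->header('X-W3C-Validator-Status') || 'unknown';
        printf "%-45s %s\n", $page, $status;
        sleep 2;    # be kind to the shared validator service
    }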


Option 2. Go Easy On The Graphics.

Web pages that have too much graphical content do not load quickly. It seems the message is slowly getting through to some designers. However I still see occasional postings by web owners lamenting that they cannot get listed with the major search engines. Often such sites are amateurish, have little text content and huge graphics content.

This may come as a devastating revelation. Nobody is really interested in your graphics! You might mistakenly think that your customers are falling over themselves gushing about your cool graphics. As an experienced Internet user myself I can assure you that I am rarely enthralled by cool graphics. And if I do find graphics interesting I prefer to have a choice. A well designed site will present a link or a thumbnail and invite me to click for a larger close-up. If I am using a GUI browser, and a website imposes a huge graphic on me without my permission, I click cancel!

Often I don't behave like a typical user. I use lynx (or lately w3m), to browse. In this case I don't see any graphics. However, when I am behaving like a typical experienced web surfer I use Google and a well known GUI browser. The major search engines only index text (see later). So when an experienced user enters a search phrase, Google returns a result based mainly on the text in HTML pages. Even for specific phrases there can be hundreds or thousands of possible choices, although most people will scan only the first fifty or one hundred. The user scans down the list and clicks on those results that appear interesting.

So, as far as I am concerned, if a particular site takes somewhere in the vicinity of half a minute to load, I cancel and move on to the next one. And I warn you not to take heart from any reports of site activity. You may have got the hit in your statistics report, but because I canned it, I did not even see the page. I paused at the doorstep for twenty or thirty seconds, and then said "No thanks, I'm out of here". And this is the way most experienced users are.

If you are one of the big graphics offenders, careful examination of your logfiles may confirm this.

So how much is too much? I can't really say. I try to keep 120K as an upper limit, but it could be larger. I recommend testing the pages with a 56K modem. Check that all your pages load quickly. If they take too long, then remove whatever it is that slows them down.
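
As a rough back-of-the-envelope figure (assuming a 56K modem delivers something like 4 to 5 kilobytes per second once real-world overheads are allowed for):

    56 kbit/s is about 7 KB/s in theory, more like 4-5 KB/s in practice.
    120 KB divided by 5 KB/s is roughly 24 seconds to load the page.

In other words, 120K already sits right at the edge of dial-up users' patience, which is why it should be treated as a ceiling rather than a target.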

There are tricks you can use. Always specify the image dimensions in the tags that reference images and other embedded objects. This enables the user to at least see and navigate the rest of the page while the images are downloading. But you need to be ruthless. If it's too big, just get rid of it. It's probably just crud. If you really think an image is necessary, then include links or thumbnails and invite users to click on them, as in the example below. You will be able to see whether the images really are popular by examining your logfiles. If you are not prepared to be ruthless with graphical crud, you can rest assured that your potential customers will be. They will vote with their feet (or should that be their fingers?).
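
By way of example, a thumbnail that declares its own dimensions and invites the reader to click through for the full image might look something like this (the file names and sizes are placeholders):

    <a href="photos/workshop-full.jpg">
      <img src="photos/workshop-thumb.jpg" width="160" height="120"
           alt="The workshop (click for the full-size photograph)">
    </a>

The width and height attributes let the browser lay out the rest of the page immediately, and the link leaves the decision to download the large image with the reader.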

Possible exceptions are those sites that actually rely on graphical images. This could be true for sites offering pornography. I am not an expert in this area. However, if the ones I have viewed are anything to go by (one from an advertisement for proxy servers, as an example of what should be excluded, and another that referred visitors to the humour section of my site), then I might conjecture that customers behave differently at such sites. Because if I was in the habit of regularly viewing photographs of naked young women, which slowly rendered from the top down, I would indeed wait until the rendering got down to the lower portions of the image.

However, I would offer this advice to such sites. If I were inclined to seek such material regularly, aggressive popups would lose me as a potential customer. Furthermore, I would think that pornographic sites that wish to have a long-term future on the web should warn customers that the material features nudity and sexually explicit material which depicts blah blah, and may offend blah blah, and should only be viewed by adults; most importantly, they should feature this warning prominently on a toned-down home page. Small thumbnails would make the site more usable and allow it to load more quickly.

Far be it from me to tell pornographers how to ply their trade. But even this area of commerce should be subject to "common sense". A warning page would demonstrate that the site operator, despite being a pornographer, is otherwise trying to behave responsibly, which may prove a good each-way bet against conflict with future legal frameworks as international laws regarding the Internet converge. Also, by appealing to human curiosity, they may even drum up additional business with a plain-wrapper or merely titillating home page. They may have research that indicates otherwise, but a more prudent approach would help ensure their long-term survival.


Option 3. Use Standard Navigation

Gimmicky navigation options add little to your site. The methods they use usually add cost in bandwidth. They are more likely to confuse users than to keep them. If you stick to Option 1 (using basic HTML), then this will not be an issue.
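
To be concrete, the sort of navigation I have in mind is nothing fancier than an ordinary list of links, which any browser, graphical or text-mode, can render (the page names here are placeholders only):

    <ul>
      <li><a href="index.html">Home</a></li>
      <li><a href="articles.html">Articles</a></li>
      <li><a href="downloads.html">Downloads</a></li>
      <li><a href="contact.html">Contact</a></li>
    </ul>

A plain list like this costs a few hundred bytes, works in lynx as well as MSIE, and gives a search engine spider nothing to trip over.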

Once again I cite my own behaviour when I am using a GUI browser. You may have got me to your site because it is otherwise well presented and does not waste bandwidth on graphics. But if I can't see at a glance how to use it, then I won't. I'll just move on.

Tricky navigation procedures are less likely to be portable and more likely to lose customers.


Option 4. Get Listed In The Major Search Engines

So this is the burning question. I have already admitted that the topic is too complex to give a definitive answer (see the bibliography). However, if my own experience is any guide, it seems that applying the KISS principles is also the easiest way to get listed in the major search engines.

As well as keeping it simple, however, it is necessary to concentrate on providing regular updates to the site. The search engines exist because they provide a vital service to their clients. They must have content in order to exist. If your site provides content, they will happily index it. If you have paid attention to making it simple and made sure that the HTML complies with accepted standards, then you should eventually get listed. I don't have proof that W3C compliance helps. However, it is worth considering. Certainly, if the major engines were going to make any assumptions about standards, something like the W3C recommendations would be the likely choice.

The other theory that I am now quite confident about is the importance of links with other sites. Even if they don't result in much additional traffic, they can enhance your ratings with the search engines.

Although I don't have proof of this theory either, I also think it would be dangerous to try to abuse the system. I believe that sites that try to spam the search engines run a grave risk of discovery, in which case the penalty would most likely be a long holiday from the index.

Be aware that search engines use spiders (robots) to crawl the web and index it. The big engines (like Google) use spiders that only see text. Do you want to see how your site appears to a spider? It's easy. Use lynx (or w3m or links). These plain-text browsers are an excellent way to get a spider's eye view of the web. If your pages look good in lynx or some other plain-text browser, they will look OK to the spider.
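
For instance, lynx's -dump option writes the rendered text of a page to standard output, which gives a quick approximation of what a text-only spider sees (substitute your own address for the placeholder):

    lynx -dump http://www.example.com/index.html | more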


Option 5. Use And Endorse Open Standards

Open Standards does not mean "anything that is not Microsoft", although it looks as though it may come to mean that very soon, unless Microsoft comes to its senses and ceases its ill-advised campaign against Open Standards.

By endorsing Open Standards, small businesses are making alliances close to those that originally created the Internet. Members of the Open Source Community will have the commitment to stay with the Internet and nurture it through the lean times as well as the times of feasting and plenty.

And the Open Source Community is indeed a community. After you make the commitment to join it you realise what holds it together. The sense of community comes from the fact that you are helping build the structures in co-operation with others. We all own it. We all care for it.

Many organisations, large and small, are beginning to realise that Open Standards represent significant cost savings and may be endorsing them for pecuniary reasons alone. The math is very hard to argue with. There is, however, a more important philosophical reason why Open Source is superior. It is the basis of all past scientific achievements. By sharing scientific discoveries and theories, scientists have been able to build huge databases of knowledge that can be used by future scientists for research. The Open Source Community on the Internet follows this successful scientific approach, and behaves like a scientific community. From past experience I can say that such behaviour is not always altruistic. Competition is an important spur which can motivate people within a scientific community. However, it only works if they share the knowledge once it is discovered. And they will share the knowledge, because most desire some form of recognition. Money would be nice too. But recognition usually comes first.

Open Standards have now achieved critical mass. The superior software and the price advantage will prove irresistible and will soon cause a rapid conversion of the general market to Open Standards.

There are many people, within and without the Open Source Community, who contend that it is chaotic and fractious. This is true. People who can't cope with chaos should not use the Internet. The network has grown to such a size that it now behaves like a natural system. It appears chaotic because it is chaotic. That is the natural state of the universe. With diversity comes strength.


Conclusion.

The best way to gain a presence on the Internet and keep it is to keep it simple:

  1. Choose basic HTML.
  2. Go easy on the graphics.
  3. Use standard navigation.
  4. Get listed in the major search engines.
  5. Use and endorse Open Standards.

This is because users prefer such sites. And so do the search engines.


BIBLIOGRAPHY:

It is easy to find sources that support my assertions about user preferences. Although it is difficult to find hard data to back up these claims, empirical proof may not be necessary, since most of these claims are no more than a statement of the bleeding obvious. Still, I would like some evidence. If anyone has any, please let me know.
Mark A. Nordstrom Assessment of Web Site Effectiveness. A summary of web site effectiveness and its relationship to usability. Academic and comprehensive.

Jakob Nielsen Voodoo Usability Warns about the fact that market surveys may not be effective in assessing web user behaviour.

Tonya Layman Web users prefer content, speed to flashy graphics. This article repeats the assertions made in the previous two papers. Interestingly, the site seems to have a lot of large advertisements.

Christopher Reed That Damned South Sea A quick summary of the infamous mother of all bubbles. Well worth reading for the comparisons with the recent dot com bubble and other bubbles throughout history. Plus ça change, plus c'est la même chose (the more things change, the more they stay the same).

Dan Byrnes Modern Taboos This essay also mentions the dot com burnout and the South Sea Bubble. Along the way, the author takes aim at numerous sacred cows.

W3C Validator HTML Validation Service The first online HTML validation service was created by Dan Connolly and Mark Gaither. This page explains the current service and lists the principal contributors.

Gerry Patterson Getting Listed In The Search Engines A discussion of search engine behaviour, largely opinion, though mostly justified. Also see Parsing The Apache Logfiles With Perl, which discusses robot behaviour.