PGTS Pty. Ltd.   ACN: 007 008 568

Setting up TEST for Web Sites

By Gerry Patterson

TEST and PROD are very familiar concepts in any professional IT environment. Certainly, as a Web site increases in complexity, the need for a test system becomes more obvious. Very small sites may not have a spare machine that can be used as a test system. This article presents a model for a test system co-existing with production on the same host. It includes some example scripts to help manage this model.


Designing your directory structure.

The preferred method of testing your website is to set up an Apache web server on a non-routable subnet and configure it with a directory structure that is identical to your production system. If you are not using your site for large volumes of critical online updates, and you have a copy of the most recent production system at hand, such a test system could double as an emergency standby. However, if you are still saving up to purchase a test system at a later date, you can make a poor man's test system on the same webhost.

The important thing to decide on is the directory structure. Many sites place HTML pages in one directory and image files in another directory. The main advantage of this approach is that it helps organise your data. This may not seem important at first, but as the site matures the advantages of organising the site will become more apparent. The Apache web server will have a certain directory nominated as the document root. Suppose that this directory is /var/www/pub. If your domain is mydomain.com then, depending on how you have set up DNS, the URL http://www.mydomain.com/ should translate to the directory /var/www/pub.

Depending on how you have set up permissions, only directories below /var/www/pub will be read by Apache. However, other directories can be added with the Alias directive in the Apache configuration file. Your test directory structure should mirror the published structure. If, for example, you have the following directories:

	/var/www/pub/pictures
	/var/www/pub/stories
	/var/www/pub/trivia

This would translate to the following URLs:

	http://www.mydomain.com/
	http://www.mydomain.com/pictures/
	http://www.mydomain.com/stories/
	http://www.mydomain.com/trivia/
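
As a rough illustration of the mapping described above, the relevant part of an Apache configuration might look something like the following fragment. The document root and domain name come from the example in the text. The Alias line, including the URL prefix /icons/ and the directory /usr/local/share/icons/, is purely hypothetical, and a real configuration would also need suitable access settings for the aliased directory.

```apache
# Sketch only: everything except /var/www/pub and www.mydomain.com
# is a hypothetical name chosen for illustration.
ServerName www.mydomain.com
DocumentRoot "/var/www/pub"

# Serve a directory that lives outside the document root
# under the URL prefix /icons/
Alias "/icons/" "/usr/local/share/icons/"
```

A directory added in this way has no counterpart below /var/www/pub, which is why it needs separate thought when mirroring the structure for testing.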

If your home directory is /home/fred you will need the following:

	/home/fred/pub/
	/home/fred/pub/pictures
	/home/fred/pub/stories
	/home/fred/pub/trivia

This will be your test directory structure. Most sites are set up to serve index.html or index.shtml as the default page for the document root. This depends on your site setup. However it has been set up, you will need a copy of the root file in your test directory structure. This would allow the testing of all your pages before publication. It is also a good idea to validate the pages with as many browsers as possible before publication. If you wish to test them on a Microsoft system, you can use samba to mount /home/fred as a Microsoft drive, for example drive M:. This means that you can point your Microsoft browser at M:\pub\index.html (depending on how your system is configured) and this will load in a similar fashion to your home page.

However, this will require a lot of manual editing of URLs after you publish your pages, and such manual editing defeats the whole purpose of testing. For example, a page stored as /home/fred/pub/trivia/s1101c.html might contain a graphic that is stored as /home/fred/pub/pictures/big_picture.gif. The page can refer to it with the relative URL "../pictures/big_picture.gif" or with the absolute reference "/home/fred/pub/pictures/big_picture.gif". The absolute reference is not satisfactory, since it does not translate across operating systems. The Microsoft browser would spit the dummy if presented with such a URL. Even with the relative URL, you may get into trouble if you have used the Alias directive to add directories to your document root. The safest approach is to strip off any leading ".." on your URLs for your site when the pages are published. This looks like a job for perl.
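
To make the idea concrete, here is a minimal sketch of the kind of substitution involved. It is not the pub script itself. The function name strip_parent_refs is made up for this example, and it assumes that every page sits at most one folder below the document root (as in the directory structure above) and that attribute values are enclosed in double quotes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the pub script): rewrite href and src
# values that begin with one or more "../" so that they start at the
# document root instead, e.g. "../pictures/x.gif" becomes "/pictures/x.gif".
# Assumes pages are at most one folder below the document root.
sub strip_parent_refs {
    my ($html) = @_;
    # Only touch href="..." and src="..." values that start with "../"
    $html =~ s{\b(href|src)\s*=\s*"(?:\.\./)+}{$1="/}gi;
    return $html;
}

my $sample = '<img src="../pictures/big_picture.gif" alt="big">';
print strip_parent_refs($sample), "\n";
```

URLs that do not start with ".." (for example links to other sites) pass through unchanged, so the same function can be applied to every line of a page.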

The script pub performs the necessary translation on the URLs. It reads the nominated folders (defined in the hash %ftypes) and checks the nominated file types. If it finds a file in the test directory whose timestamp differs from that of the same file in the published directory, it copies the file to the published structure. In the process, it strips off the leading ".." in front of any folder names, so that a reference such as "../pictures/" becomes "/pictures/". It also looks for the special string "Last Updated: " at the start of a line in each page that it publishes (the pages on my own site always have such a string in the footer) and inserts the current time (GMT) there. Finally, it updates the timestamps so that the published version and the test version have identical timestamps.

If the -t option is given on the command line, the script does not actually copy any files but reports what actions would be carried out. If you wish to exempt a file from the copy action, place the file name on a line of its own in a file called .hold in the same test directory. For example, if you do not wish to publish /home/fred/pub/trivia/no_pub.html, put a line containing only "no_pub.html" in the file /home/fred/pub/trivia/.hold. When you wish the file to be published, remove the line.
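
The decisions described above can be sketched roughly as follows. This is an illustration of the logic, not the pub script. The function names (is_held, needs_publish, stamp_last_updated, sync_timestamps) are invented for the example, and the date format written after "Last Updated: " is an assumption, since only the use of GMT is stated above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use POSIX qw(strftime);

# Hypothetical helper: true if the file's name appears on a line of its
# own in the .hold file of the directory that contains it.
sub is_held {
    my ($test_file) = @_;
    my $hold = dirname($test_file) . "/.hold";
    open(my $fh, '<', $hold) or return 0;   # no .hold file: nothing is held
    my $name = basename($test_file);
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;                 # drop newline and trailing blanks
        return 1 if $line eq $name;
    }
    return 0;
}

# Hypothetical helper: a file is due for publication when it is not held
# and either has no published copy or the two timestamps differ.
sub needs_publish {
    my ($test_file, $pub_file) = @_;
    return 0 if is_held($test_file);
    return 1 unless -e $pub_file;
    my $test_mtime = (stat $test_file)[9];
    my $pub_mtime  = (stat $pub_file)[9];
    return $test_mtime != $pub_mtime ? 1 : 0;
}

# Hypothetical helper: fill in a footer line that starts with
# "Last Updated: ". The date format is an assumption.
sub stamp_last_updated {
    my ($line, $now) = @_;
    my $stamp = strftime("%Y-%m-%d %H:%M:%S GMT", gmtime($now));
    $line =~ s/^(Last Updated: ).*/$1$stamp/;
    return $line;
}

# Hypothetical helper: give both copies the same timestamp after a copy.
sub sync_timestamps {
    my ($test_file, $pub_file) = @_;
    my $now = time;
    return utime($now, $now, $test_file, $pub_file) == 2;
}

print stamp_last_updated("Last Updated: \n", time);
```

Comparing for any difference in timestamp, rather than for "newer than", follows the description above. Once sync_timestamps has run, the pair is treated as up to date until either copy is touched again.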

Lastly, the pub script resets the file permissions in the test directory, because programs run from a Microsoft system alter them. For example, if you use "vi" on a Microsoft system (under Cygwin), the Windows operating system alters the file permissions. The pub script should be run by a user that has write permission on both the test and the published directories.
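
A permissions reset of this kind might be sketched as below. The function name reset_permissions and the target modes (0644 for files, 0755 for directories) are assumptions made for the example; the modes that suit a particular site depend on how its web server and users are set up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: walk a directory tree and set plain files to 0644
# and directories to 0755 (assumed modes), returning the number of
# entries whose permissions were changed.
sub reset_permissions {
    my ($top) = @_;
    my $changed = 0;
    find(sub {
        return if -l $_;                       # leave symbolic links alone
        my $want = -d $_ ? 0755 : 0644;
        my $have = (stat $_)[2] & 07777;       # permission bits only
        return if $have == $want;
        $changed++ if chmod($want, $_);
    }, $top);
    return $changed;
}

# Usage: give the test directory as the only argument,
# for example /home/fred/pub
if (@ARGV) {
    my $count = reset_permissions($ARGV[0]);
    print "$count entries reset\n";
}
```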

The other question that might arise is what to do if you want to move files around. For example, you might wish to separate the icons from the images. If there are numerous pages on the site with links to the old location, they will have to be edited individually, and there is always the risk that some will be missed. A perl script such as the following will accomplish this relocation. It is a fairly rough script that has only been used once. I used it when I wished to separate the icons from the images ...

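#!/usr/bin/perl
# Rename a file and update every reference to it in the *html files
# of the current directory.
# Arguments: [-t] old_name new_name   (-t reports without changing anything)
use strict;
use warnings;

my $test_flag = 0;
if (@ARGV && $ARGV[0] eq "-t") {
	print "Test mode only -- no changes will be written\n";
	$test_flag = 1;
	shift @ARGV;
}
die "Usage: $0 [-t] old_name new_name\n" unless @ARGV == 2;
my ($old, $new) = @ARGV;
die "Cannot find $old\n" unless -f $old;

my $total_found = 0;	# matching lines across all files
foreach my $html_file (glob "*html") {
	my $found = 0;
	open(my $in, '<', $html_file) or die "Cannot open $html_file: $!\n";
	my @repl = <$in>;
	close($in);
	foreach my $line (@repl) {
		# \Q...\E matches the old name literally, so "." is not a wildcard
		if ($line =~ s#\Q$old\E#$new#g) {
			$found++;
			print "$html_file: $line" if $test_flag;
		}
	}
	$total_found += $found;
	next unless $found && !$test_flag;
	print "\t$html_file\n";
	open(my $out, '>', $html_file) or die "Cannot write $html_file: $!\n";
	foreach my $line (@repl) {
		# Normalise line endings: strip any CR and end each line with LF
		chomp($line);
		$line =~ s/\r//g;
		print $out "$line\n";
	}
	close($out) or die "Cannot close $html_file: $!\n";
}
# Counting across all files avoids a false failure when only the
# last file examined happens to contain no reference.
die "Could not find $old in any *html\n" unless $total_found;
if ($test_flag) {
	print "rename $old $new\n";
}
else {
	rename($old, $new) or die "Cannot rename $old to $new: $!\n";
}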