Tracking Web Site Traffic

Part 1: An explanation of terms that often cause confusion.

Sooner or later, nearly everyone with a web site wants to know how much traffic their site is getting. Other information, such as how long your visitors stay on your site, which pages they visit, and how often they return, will also help you develop a better site. There are various ways of gathering this information, some of which are more accurate than others.

Here, we'll take a look at what's in a log file, various methods of analyzing log data, and the problems you will encounter when analyzing the data, but first let's define some terms that often cause confusion.

Hits vs Pageviews

When someone visits your page, their browser sends a number of requests to your web server. One request is for the HTML file, but individual requests are also sent for each of the other elements that make up the web page - graphics files, audio files, and so on. Each of these requests is called a hit - a hit is a request to the server for a file, not a page.

As you can see, counting hits is not the same as tracking pageviews. It takes multiple hits to view a page. Pageviews count the number of times a page is accessed as a whole.

Methods of Tracking Traffic

Most counters record hits rather than pageviews.

Nearly all web servers maintain log files - text files that list each request made to the server. Analyzing log data can give you a good idea of where your site visitors are coming from, which pages they are visiting, how long they stay, and which browsers they are using.

Before signing on with a hosting company, make sure they offer access to raw log files. Even if you don't need them immediately, sooner or later you'll be glad to have them.

Next, let's take a look at what's in a log file. Then we'll look at
methods of extracting and compiling the relevant information with software,
online services, and even a few do-it-yourself techniques. Then last,
but not least, we'll look at problems with analyzing the data.

There are a number of different log file formats, but they are all fairly similar. The most common format is called CLF (Common Logfile Format). There are also different types of log files - access, referer, error, and agent are the primary ones.

The following examples were taken from my own log files. You can check with your hosting company to find out the format they provide or just have a look at the raw data. It's not hard to decipher.

Access Log

The record below shows the visitor's IP number or hostname, date and time of the request, the command received from the client, the status code returned, the size of the document transferred, and the browser and operating system the visitor was using.

nas-112-52.slc.navinet.net - - [29/Jan/2000:17:17:12 -0500] "GET page.html HTTP/1.1" 200 23443 "http://www.mydomain.com/page.html" "Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)"

Referer Log

The record below shows that the visitor followed a link from somedomain.com to the index page of my site.

http://www.somedomain.com/page.html -> /

This record shows that the visitor came to my site from a search engine link. Notice the keyword data is included in the record.

http://search.yahoo.com/bin/search?p=design+tips -> /

Agent Log

Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)

Error Log

The record below shows the type of server, date and time of the error, client identification, explanation of the error code generated by the server, and the path to the file that caused the error.

apache: [Sun Jan 30 10:09:57 2000] [error] [client 195.238.2.162] File does not exist: /u/web/mydomain/favicon.ico

As you can see, log files contain a wealth of information about how
your visitors are using your site. The next topic we will explore is
how to get the relevant data extracted from the log files and compiled
into a useable format.

There are a number of options available for extracting and compiling the information from a log file. The method you choose should be based on your specific needs. For some sites, the path visitors take and the number of pageviews are essential. Other webmasters may need to know where their visitors are coming from, what browsers they are using, or the keywords used to find their site from the search engines.

Commercial Software

Freeware and Shareware

Scripts

Analyzing log data seems straightforward enough. It's just a matter of counting and compiling information from a simple text file. One problem, though, is that many requests to the server never make it into the access log. Other requests shouldn't be counted. Then there is the problem of how to determine a unique visitor. So, here are some of the challenges you will face in log data analysis.

Caching
AOL makes extensive use of proxy servers, routing users through various proxy servers according to its own proprietary scheme. To better understand how AOL handles traffic, read the AOL Guide for Webmasters.

IP Address Considerations

Robots

Frames
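To make the earlier terms concrete, here is a minimal do-it-yourself sketch in Python that reads access log records like the one shown above, counts hits and pageviews separately, and pulls the search phrase out of a referer URL. Several things here are assumptions rather than anything taken from a particular analysis tool: the function names (parse_line, summarize, search_keywords) are made up for illustration, the rule that a pageview is a successful request for an .html, .htm, or directory path is just one reasonable definition, and the second sample record (the logo.gif request) is invented to stand in for an image hit. The sample record carries referer and browser fields after the basic CLF fields, so the pattern treats those two fields as optional.

```python
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Fields: host, identity, user, [timestamp], "request", status, size,
# optionally followed by "referer" and "user agent".
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# Assumption: a request is a pageview when it asks for an HTML document
# (or a directory index) and the server answered with status 200.
PAGE_SUFFIXES = (".html", ".htm", "/")


def parse_line(line):
    """Return a dict of fields for one log record, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_pageview(record):
    """Apply the pageview rule assumed above to one parsed record."""
    return record["status"] == "200" and record["path"].lower().endswith(PAGE_SUFFIXES)


def summarize(lines):
    """Count hits, pageviews, and distinct hosts in an iterable of log lines."""
    hits = 0
    pageviews = 0
    hosts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines that are not in the expected format
        hits += 1
        hosts[record["host"]] += 1
        if is_pageview(record):
            pageviews += 1
    return {"hits": hits, "pageviews": pageviews, "hosts": len(hosts)}


def search_keywords(referer, param="p"):
    """Pull the search phrase out of a referer URL such as the Yahoo example."""
    query = parse_qs(urlparse(referer).query)
    return query.get(param, [""])[0]


sample = [
    # The access log record quoted in this article.
    'nas-112-52.slc.navinet.net - - [29/Jan/2000:17:17:12 -0500] '
    '"GET page.html HTTP/1.1" 200 23443 "http://www.mydomain.com/page.html" '
    '"Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)"',
    # Hypothetical record for an image on the same page (not from a real log).
    'nas-112-52.slc.navinet.net - - [29/Jan/2000:17:17:13 -0500] '
    '"GET logo.gif HTTP/1.1" 200 1500 "http://www.mydomain.com/page.html" '
    '"Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)"',
]

print(summarize(sample))  # {'hits': 2, 'pageviews': 1, 'hosts': 1}
print(search_keywords("http://search.yahoo.com/bin/search?p=design+tips"))  # design tips
```

Two hits produce only one pageview here, which is exactly the gap between the two terms. Keep in mind that the count of distinct hosts is not a count of unique visitors: because of caching, proxy servers, and robots, it can only be treated as a rough estimate.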