Comparing Google Analytics to Server Logs
Posted September 12, 2007 at 11:00 am by Dustin | Filed under: Metrics
In some of our redesign work for existing websites, our clients have asked for a way to compare the traffic and performance of the old version of their site against the new one over a specific period of time. While this is an understandable request, it poses some challenges, particularly if the old site's stats are based solely on server logs. Let me explain:
First take server logs…
Server Logs
The log files on your server can be analyzed with software designed to read them. Common examples are AWStats and Webalizer. The problem is that log files are very literal: every single request is recorded without much filtering or classification. At this time, there is no efficient mechanism to distinguish between real visitors, robots, etc. The software usually has rules that dictate what constitutes a visit or a unique visit, but these rules are rarely accurate and they differ from vendor to vendor.
The Result: Inaccurate and bloated numbers for visitors, unique visitors and page views.
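To make the point about rules concrete, here is a minimal sketch in Python. It is not how AWStats or Webalizer actually work; the sample hits, the list of robot markers and the inactivity timeout are all assumptions made up for illustration. It only shows that the same log data yields different visit counts depending on which rules are applied.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified log records: (ip, timestamp, user_agent).
# Real log formats and field names vary from server to server.
SAMPLE_HITS = [
    ("10.0.0.1", "2007-09-12 11:00:00", "Mozilla/5.0"),
    ("10.0.0.1", "2007-09-12 11:05:00", "Mozilla/5.0"),
    ("10.0.0.1", "2007-09-12 12:30:00", "Mozilla/5.0"),
    ("10.0.0.2", "2007-09-12 11:01:00", "Googlebot/2.1"),
    ("10.0.0.2", "2007-09-12 11:02:00", "Googlebot/2.1"),
    ("10.0.0.3", "2007-09-12 11:10:00", "Mozilla/4.0"),
]

# Assumed markers for spotting robots; a real list would be far longer
# and still would not catch robots that pose as normal browsers.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_bot(user_agent):
    """Return True if the user agent contains one of the assumed markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def count_visits(hits, timeout_minutes, skip_bots):
    """Count visits, starting a new visit when an IP has been idle
    for longer than timeout_minutes."""
    timeout = timedelta(minutes=timeout_minutes)
    last_seen = {}  # ip -> time of that ip's most recent counted hit
    visits = 0
    # The timestamp format used here sorts correctly as plain text.
    for ip, stamp, agent in sorted(hits, key=lambda hit: hit[1]):
        if skip_bots and is_bot(agent):
            continue
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        previous = last_seen.get(ip)
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return visits


print("requests logged:", len(SAMPLE_HITS))
print("30 min rule, robots counted:", count_visits(SAMPLE_HITS, 30, False))
print("30 min rule, robots skipped:", count_visits(SAMPLE_HITS, 30, True))
print("120 min rule, robots skipped:", count_visits(SAMPLE_HITS, 120, True))
```

With this sample data the six logged requests become 4, 3 or 2 visits depending on the rule set, which is the same effect you see when two log analyzers disagree about the same log file.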
Now compare with a better solution, like Google Analytics…
Google Analytics
While not 100% perfect, we currently use Google Analytics for all our clients that require accurate metrics for their website. Web-based third-party tracking systems such as Google Analytics track visitors much more accurately and do not overcount the way log files do. A small snippet of JavaScript is inserted into the pages you wish to track. Often a cookie is used as well. Together these allow more information to be gathered about your visitors and presented in a more accurate and useful format. The downside is that visitors with very high security settings on their browsers (such as blocking all cookies or having JavaScript turned off) will be only partially tracked or not tracked at all. The percentage of visitors that fall into this group is said to be less than 10%.
The Result: Web-based tracking systems are the most accurate and provide the most useful information about your site and its visitors. The only downside is that the numbers may be slightly lower than reality if some of your visitors have very high security settings.
Comparing one to the other…
At this point the problem should be clear. Server logs are going to report much higher traffic than Google Analytics. To the untrained eye, comparing the two makes it look like the redesign caused a significant loss in visits to your site. Flat out, the two can’t be directly compared. Depending on your needs, you must either compare server logs to server logs or scrap the old data and start working with the new, more accurate data provided by Google Analytics.
If your main concern is measuring the performance of the site, you may have to get creative. Measure registered users or newsletter sign-ups, track revenue and sales for a period of time before and after the redesign, use a conversion tracking tool and set a goal, etc. Most often the desired metrics translate into increased or sustained revenue or ROI. Only the client knows what they really need out of the redesign.
The Bottom Line: Server logs and web-based analytics cannot be accurately compared. Some other form of data must be used. You could compare just the server logs before and after the change to get a rough sense of what changed, but remember that those numbers will not be as accurate or useful as what is provided by systems such as Google Analytics.
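As a rough sketch of that kind of before and after comparison, the Python below works out a daily average and the percent change for a metric that does not depend on either tool's visitor counts. The sign-up figures and period lengths are invented for illustration, and the function names are my own.

```python
def daily_average(total, days):
    """Average count per day over a period of the given length."""
    if days <= 0:
        raise ValueError("days must be positive")
    return total / days


def percent_change(before, after):
    """Percent change from the 'before' value to the 'after' value."""
    if before == 0:
        raise ValueError("cannot compute a percent change from zero")
    return (after - before) / before * 100.0


# Hypothetical numbers: newsletter sign-ups taken from the sign-up
# database, not from server logs or an analytics tool.
signups_before = daily_average(120, 30)  # 30 days before the redesign
signups_after = daily_average(150, 30)   # 30 days after the redesign

print("before: %.1f sign-ups per day" % signups_before)
print("after: %.1f sign-ups per day" % signups_after)
print("change: %+.1f%%" % percent_change(signups_before, signups_after))
```

Because both periods are measured the same way, from the same source, the comparison holds up even though the traffic numbers around it do not. Keep the two periods the same length and watch for seasonal effects or promotions that could skew one side.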

I switched over to Google Analytics from using the Urchin tool to analyse our server logs, and the number of visitors seemed to plummet from 900 to 100 per day. We have been asked by a subscriber to our service for the number of visitors to our site and will clearly have to quote the lower figure. Do you know whether websites generally quote the unfiltered figures to their clients in their marketing? I wonder if honesty puts us at a competitive disadvantage!
Neil,
Your experience is unfortunately typical. I can’t say for sure which numbers are being used in marketing, but my guess would be the ‘inflated’ numbers. Bigger is better, as they say.
In my opinion, honesty is the best policy - it may take a little extra explanation, but the relationships you build with your customers will be stronger for it.