Google Analytics does a good job of monitoring the pages people visit, but what about pages that are not found on your server and thus never displayed? You can end up with such pages by moving content around, having errors in your own links, getting malformed links from other webmasters (often caused by bad software on their side or by miscommunication), or for other reasons. Even software like WordPress does not check comment links, which can generate 404 (page not found) errors. How do you detect such links? Well, you have 3 choices, each with its own drawbacks.
2. Log analysis. This is the best method if you have access to the log files and can process them. I prefer using AWStats for that, but on Apache you can also do it by hand for smaller sites, since errors are logged to a separate file as well. The one problem with manual analysis is that the error log carries less information than the access log: there is no referrer in it. However, this can be solved by using grep to scan the access log for 404 status codes instead. The drawback is that some CMSes handle every request themselves and do not return proper error codes. This means that you will not see such errors in the logs even if they exist.
3. If you can’t access the error logs, the best course of action is to use custom error pages and build a log from them. You have to log both the referrer and the request URI for the best results. This approach can be implemented in many of the CMSes mentioned above too.
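Both the log scan and the custom error page can be sketched in a few lines of Python. This is only a rough illustration, not a finished tool. It assumes the access log is in Apache's "combined" format, and that the error page runs as a CGI script behind an ErrorDocument directive, where Apache normally exposes the original path as REDIRECT_URL (with REQUEST_URI as a fallback). The function names and the log path argument are my own inventions for the example.

```python
import re
import sys
from collections import Counter
from datetime import datetime, timezone

# Assumed "combined" format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "[^"]*")?'
)


def find_404s(lines):
    """Method 2: count (requested path, referrer) pairs that returned 404."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("status") != "404":
            continue
        parts = match.group("request").split()
        path = parts[1] if len(parts) >= 2 else match.group("request")
        hits[(path, match.group("referrer") or "-")] += 1
    return hits


def _clean(value):
    """Keep one log entry on one line with stable tab-separated columns."""
    return value.replace("\t", " ").replace("\n", " ").replace("\r", " ")


def format_entry(environ, now=None):
    """Method 3: build one log line from the error page's environment."""
    now = now or datetime.now(timezone.utc)
    # Assumption: REDIRECT_URL is set by Apache for ErrorDocument handlers.
    uri = environ.get("REDIRECT_URL") or environ.get("REQUEST_URI", "-")
    referrer = environ.get("HTTP_REFERER", "-")
    return f"{now.isoformat()}\t{_clean(uri)}\t{_clean(referrer)}"


def log_not_found(environ, log_path, out=sys.stdout):
    """Append the entry to log_path, then emit a minimal CGI 404 response."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(format_entry(environ) + "\n")
    out.write("Status: 404 Not Found\r\nContent-Type: text/html\r\n\r\n")
    out.write("<h1>Page not found</h1>\n")
```

For the log scan, open the access log and pass the file object to find_404s, then print the pairs from most_common() to see which referrers send the most broken traffic. For the error page, a CGI script would call log_not_found(os.environ, ...) with a log file the web server is allowed to write to.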
So, what should you do with bad links? That depends on what causes them. If the source is an advertising campaign or a referring site, you will have to create a redirect from the bad link to the appropriate good one. If it is a malformed comment link, I would just delete it from the database.
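If you collect many bad links, the redirects can be generated rather than typed by hand. The sketch below assumes an Apache server with mod_alias, whose Redirect directive takes a status, the old path and the new URL. The function name and the example mapping are hypothetical, and paths containing whitespace are rejected rather than quoted to keep things simple.

```python
def redirect_rules(mapping):
    """Turn {bad_path: good_url} into mod_alias 'Redirect 301' lines."""
    rules = []
    for bad_path, good_url in sorted(mapping.items()):
        if not bad_path.startswith("/"):
            raise ValueError(f"bad path must start with '/': {bad_path!r}")
        if any(ch.isspace() for ch in bad_path + good_url):
            raise ValueError(f"whitespace is not supported: {bad_path!r}")
        rules.append(f"Redirect 301 {bad_path} {good_url}")
    return rules


if __name__ == "__main__":
    # Hypothetical mapping from a broken incoming link to the right page.
    example = {"/old-post": "http://example.com/new-post"}
    print("\n".join(redirect_rules(example)))
```

The output can be pasted into the server configuration or a .htaccess file. Check it against your own setup first, as other servers and some CMSes have their own redirect mechanisms.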
Vladimir Radmilovic · August 13, 2009 at 1:04 pm
With our product Web Log Storming (http://www.weblogstorming.com) it’s easy to overcome the AwStats problem you mentioned and list 404 hits with referrers, among other things. It’s not free, though…
I hope it’s not inappropriate to post this comment here. If it is, please accept my apologies and remove it.
Giedrius · August 13, 2009 at 2:06 pm
well, grep solves that problem too 🙂
Vladimir Radmilovic · August 13, 2009 at 2:24 pm
Fair enough 😉
Vladimir Radmilovic · August 14, 2009 at 11:46 am
I’ve tried, but I can’t resist… 😉 You can also use Notepad.exe to open log files and pen and paper to count visitors, but it’s not the point. 🙂
Giedrius · August 14, 2009 at 1:26 pm
Vladimir Radmilovic · August 14, 2009 at 3:48 pm
LOL! Yeah, I got your point. There’s a lot of discussion about JS vs log analyzers (which depends on specific needs). Still, most agree that both methods should be used for the full picture (but you are already doing this).
>Maybe weblogstorming has something more to offer
I’d say: definitely… 😉 You should really check it out as there’s a chance you would find it useful. Interactive reports, “on-the-fly” filters and drill-down to individual visitors, to name some features. If you have any comments I will be glad to hear them.
BTW, thanks for listening!