The next idea was to ignore clickthroughs from any user agent (browser) other than IE, Netscape and Opera. Again, I let these visitors click through; I just don't make the advertiser pay for them.
// ignore any user agent that isn't Mozilla (IE and Netscape) or Opera
var sAgent = '' + Request.ServerVariables ( 'HTTP_USER_AGENT' );
// make lowercase
sAgent = sAgent.toLowerCase ( );
// should we count this agent?
if ( -1 == sAgent.indexOf ( 'mozilla' ) && -1 == sAgent.indexOf ( 'opera' ) )
{
    // it's an unknown user-agent, so don't count the click
    bIgnoreClick = true;
}
It was suggested that I also ignore known IP addresses used by spiders. I found an excellent source at Search Engine World that documents these.
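For illustration, here is a minimal sketch of what such an IP check might look like in the same ASP/JScript style. The address prefixes below are made-up placeholders, not entries from Search Engine World's list:

// hypothetical sample of spider IP prefixes - a real list would
// come from a source like Search Engine World
var aSpiderIPs = [ '209.67.', '216.35.' ];
var sIP = '' + Request.ServerVariables ( 'REMOTE_ADDR' );
for ( var i = 0; i < aSpiderIPs.length; i++ )
{
    if ( 0 == sIP.indexOf ( aSpiderIPs[i] ) )
    {
        // request came from a known spider address, so don't count it
        bIgnoreClick = true;
        break;
    }
}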
After some investigation, however, I noticed that every major spider except one, Lycos, used a distinctive user agent string. They would all get caught by the user agent check above!
Lycos was an exception - their spider sometimes masqueraded as IE 5.0. Luckily, they also have "Lycos_Spider" in the agent string, so I modified the test above to cope with that too:
// should we count this agent?
var bKnownBrowser = ( -1 != sAgent.indexOf ( 'mozilla' ) || -1 != sAgent.indexOf ( 'opera' ) );
if ( -1 != sAgent.indexOf ( 'spider' ) ) // lycos spider acts like mozilla
    bKnownBrowser = false;
if ( !bKnownBrowser )
{
    // it's an unknown user-agent, so ignore them
    bIgnoreClick = true;
}