<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Pear Analytics &#187; data consistency</title>
	<atom:link href="http://www.pearanalytics.com/blog/tag/data-consistency/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.pearanalytics.com/blog</link>
	<description>Pear Analytics - home of the free SEO analysis tool and website analyzer</description>
	<lastBuildDate>Mon, 23 Jan 2012 21:58:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>A Note on SiteJuice Accuracy</title>
		<link>http://www.pearanalytics.com/blog/2010/a-note-on-sitejuice-accuracy/</link>
		<comments>http://www.pearanalytics.com/blog/2010/a-note-on-sitejuice-accuracy/#comments</comments>
		<pubDate>Tue, 18 May 2010 15:51:54 +0000</pubDate>
		<dc:creator>Ryan Kelly</dc:creator>
				<category><![CDATA[SiteJuice Updates]]></category>
		<category><![CDATA[data accuracy]]></category>
		<category><![CDATA[data consistency]]></category>
		<category><![CDATA[Sitejuice]]></category>

		<guid isPermaLink="false">http://www.pearanalytics.com/blog/?p=1343</guid>
		<description><![CDATA[We wanted to write this blog post to help you understand the accuracy of our data, just in case you were thinking of canceling your subscription. Where we get our data from: we get our data directly from the Google, Yahoo and Bing APIs.  Some of the information is free, and some of it is not.  [...]<p><a href="http://www.pearanalytics.com/blog/2010/a-note-on-sitejuice-accuracy/">A Note on SiteJuice Accuracy</a> is a post from Pear Analytics, an <a href="http://www.pearanalytics.com/blog">SEO tools and software</a> company.</p>
]]></description>
			<content:encoded><![CDATA[
<p>We wanted to write this blog post to help you understand the accuracy of our data, just in case you were thinking of canceling your subscription <img src='http://www.pearanalytics.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><strong>Where we get our data from</strong></p>
<p>We get our data directly from the Google, Yahoo and Bing APIs.  Some of the information is free, and some of it is not.  The information related to search volume, competitiveness and keyword suggestions is something we have to pay for.  We make roughly 3 to 4 million calls per month to the Google API to get this data, and it&#8217;s not exactly cheap.  However, what we get from the APIs can be different from what you might see in the Google External Keyword Tool, or in other front-facing applications that can get this information.  We suspect the search engines like to keep the latest and greatest data to themselves, and probably delay the data available through the API.  This causes a mismatch between what we are getting and what you might be seeing from Google directly.</p>
<p>On another note, Google is a bit finicky about returning search volumes.  If you see that search volume is &#8220;0&#8221; and you know for a fact that is wrong, there is a reason why.  Google will only give us the search volume data if the keyword is relevant.  In other words, they don&#8217;t want us sitting there hitting the API and caching all their data, so what we&#8217;re asking for has to be tied to a URL and has to be somewhat relevant to it; otherwise they return null information.</p>
<p>We also use the Alexa API for Competitor Monitoring, and recently removed the Compete.com API since it was causing reports to error out.  We also use the W3C validation tools as an open source application on our servers to run your HTML and CSS validation tests.  We DO NOT screen scrape any data whatsoever.</p>
<p><strong>Why are my rankings off in Keyword Watch?</strong></p>
<p>If Keyword Watch shows a Google ranking for a keyword that differs from what you see when you search for it yourself, the cause could be <a href="http://googleblog.blogspot.com/2009/12/personalized-search-for-everyone.html" target="_blank">personalized search</a>.  We get your keyword rankings straight from the Google API, and we request this from a random server in Dallas.  If you routinely search for your own site on Google, you may have personalized search turned on, which will almost always show you a higher ranking than you really have.  For example, you may see that you rank #4 for a keyword, when in fact you rank #9 or #10.</p>
<p><strong>We test our data daily</strong></p>
<p>Every day we run a 35-point test on at least two random reports that were run in the last 24 hours.  We check the data against other tools to see how we fare.  For example, we test keyword rankings with the SEOBook Rank Checker plugin for Firefox.  We check the HTML and CSS validation directly with the W3C.  We check your inbound links directly with Yahoo Site Explorer, and we query the page both including and excluding links from interior pages.  We check your load time and page size against what Pingdom.com reports.  If any of the data is off by more than 5%, we create a trouble ticket and our development team looks into the problem and deploys a fix.</p>
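<p>To make the 5% rule concrete, here is a minimal Python sketch of that kind of comparison.  It is an illustration only, not our actual test code: the <code>Check</code> class, the function names and the sample numbers are all made up for this example, and it assumes that &#8220;off by more than 5%&#8221; means the difference relative to the outside tool&#8217;s value.</p>

```python
from dataclasses import dataclass

TOLERANCE = 0.05  # the 5% threshold mentioned in the post


@dataclass
class Check:
    name: str          # hypothetical label for one of the test points
    reported: float    # value shown in the report
    reference: float   # value from the outside tool used for comparison


def relative_difference(reported: float, reference: float) -> float:
    """Return |reported - reference| relative to the reference value."""
    if reference == 0:
        # No sensible ratio exists; treat any non-zero report as fully off.
        return 0.0 if reported == 0 else 1.0
    return abs(reported - reference) / abs(reference)


def needs_ticket(check: Check, tolerance: float = TOLERANCE) -> bool:
    """True when the reported value is off by more than the tolerance."""
    return relative_difference(check.reported, check.reference) > tolerance


# Made-up sample values, for illustration only.
checks = [
    Check("inbound links", reported=1040, reference=1000),   # 4% off
    Check("page size (KB)", reported=230, reference=200),    # 15% off
    Check("load time (s)", reported=2.0, reference=2.0),     # exact match
]

flagged = [c.name for c in checks if needs_ticket(c)]
print(flagged)  # ['page size (KB)']
```

<p>In this sketch only the page size check would open a trouble ticket, since it is the only one that differs from its reference value by more than 5%.</p>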
<p>Are you seeing other data inconsistencies?  <a href="mailto:support@pearanalytics.com" target="_blank">We&#8217;d like to know about them</a>.  We can look into the issue, deploy a fix and re-run your report, usually within 24 to 48 hours.</p>
<p><strong>Update: May 21, 2010, 4:45 p.m.</strong></p>
<p>The last two tests we ran scored over 80% for accuracy.  We are still having some issues detecting the meta keyword tag properly so we can populate the Keyword Watch feature, and sometimes pages with a Google PageRank of &#8220;0&#8221; show &#8220;4&#8221;.  We are also working on getting the proper inbound link number for the domain that INCLUDES links from interior pages.  Search volume has been difficult to get &#8220;on the fly&#8221; and you may be seeing &#8220;0&#8221; for many terms.  We are working on a fix for this.  Lastly, the HTML/CSS validation checks are all coming in as &#8220;passing&#8221; when many have errors.  We will have a fix for this next week.</p>
<p><a href="http://www.pearanalytics.com/blog/2010/a-note-on-sitejuice-accuracy/">A Note on SiteJuice Accuracy</a> is a post from Pear Analytics, an <a href="http://www.pearanalytics.com/blog">SEO tools and software</a> company.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pearanalytics.com/blog/2010/a-note-on-sitejuice-accuracy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

