Common Reasons a Site Audit May Not Complete

Running a Site Audit on your website or your competitor's website is a great way to gain insights into opportunities to improve your web presence or understand how your competitors may be optimizing theirs. However, when doing so, you may occasionally encounter an empty audit (only 1 or 0 resources crawled with zero results returned). This can be caused by a variety of technical circumstances or issues. Please refer to the list below to see if the site you are auditing fits any of these conditions.

Don't see a situation below that matches yours? Contact support and we will be happy to help diagnose your Site Audit problems.

  • Permissions - The gShift Site Audit crawler requires permission to crawl your site. Ensure your site is not blocking access to any pages you would like included in the audit. For example, if our crawler encounters a 403 or 999 error, it will not be able to continue. Please read this article for more information on Site Audit statuses.
  • HTML - The gShift Site Audit bot crawls your site in a way similar to Google. If you do not have valid HTML, we may not be able to parse it to gather the insights you need. Please ensure your HTML is valid prior to running an audit. To check for errors, you can run a free HTML scan here: https://validator.w3.org/
  • Robots.txt - The Site Audit crawler can be blocked by a corrupt robots.txt file. Please ensure this file uses proper syntax and is located in the root folder of your site.
    Click here to learn more about the Robots.txt file.
    Test your Robots.txt files here.
  • Time Outs - gShift's Site Audit crawler waits a reasonable amount of time (5 seconds) for a page to load before abandoning it. If your web server takes more than 5 seconds to provide a response, our crawler will skip the page. If this occurs on the first page it tries to crawl, the crawler will abandon the entire crawl, as it assumes it cannot get to the underlying links. Adjust your server configuration so that your first page loads faster. This will help not only our crawler, but also the search engines trying to crawl and index your website. (Tip: Sometimes timeouts occur due to server overload. Try running the audit at a different time of day to rule this out.)
  • Website Link Structure - The Site Audit bot crawls through your website looking for links to other pages. If it cannot find a link to a page, it will not be able to crawl it. Following site structure best practices will help ensure your website is properly crawled and indexed by both the gShift bot and search engines in general. Learn more about running a successful website audit in this gShift blog post.
  • Wrong Domain - The gShift Site Audit crawler will only crawl the domain provided. If the domain has a redirect, the crawler will note this and stop. Be sure to enter the correct domain. Common variations are www. vs non-www. and HTTP vs HTTPS. You can check for domain redirections here. (Tip: If your Web presence's website is using a redirected domain, you can use the custom domain tab when creating a Site Audit to ensure you start with a crawlable domain.)
  • Blocked by User-Agent Strings - Servers can block access to our crawler based on its user-agent string. To check for this, run the audit with a different user agent (e.g. Googlebot) and see if it completes. If it completes only with the alternate user agent, your server is likely configured to block our crawler. Please contact your webmaster to adjust the server's settings to allow our crawler.
  • Dynamic (Asynchronous JavaScript) Websites - Websites that rely on dynamic (asynchronous JavaScript) calls are difficult to crawl properly because the gShift crawler cannot parse JavaScript. Dynamic websites load content after the initial page load, and the crawler misses that content, which often includes links. Any content manipulated or inserted on a Web page via JavaScript will not be detected by our crawler.
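
Several of the conditions above (permissions, time outs, wrong domain, user-agent blocking) can be checked before running an audit by fetching the start page yourself. The following Python sketch is one way to do that. It is not the gShift crawler's actual logic: the user-agent string "ExampleAuditBot/1.0" and the function names are assumptions made for illustration, and only the 5-second limit and the 403/999 codes come from this article.

```python
import time
import urllib.error
import urllib.request
from urllib.parse import urlsplit

TIMEOUT_SECONDS = 5  # the wait time stated in this article
# Hypothetical user-agent string; the real crawler's value is not given here.
USER_AGENT = "ExampleAuditBot/1.0"


def diagnose(requested_url, final_url, status, elapsed):
    """Turn one fetch result into a list of likely problems (pure function)."""
    problems = []
    if status in (403, 999):
        problems.append(f"permissions: server answered {status}")
    if elapsed > TIMEOUT_SECONDS:
        problems.append(f"time out: response took {elapsed:.1f}s")
    asked, got = urlsplit(requested_url), urlsplit(final_url)
    # A change of scheme or host (HTTP vs HTTPS, www vs non-www) is a redirect
    # to a different domain variation.
    if (asked.scheme, asked.netloc) != (got.scheme, got.netloc):
        problems.append(f"wrong domain: redirected to {got.scheme}://{got.netloc}")
    return problems


def preflight(url, user_agent=USER_AGENT):
    """Fetch the start page once and report what a crawler might run into."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS) as response:
            status, final_url = response.status, response.geturl()
    except urllib.error.HTTPError as error:
        # The server responded, but with an error status such as 403.
        status, final_url = error.code, url
    except OSError as error:
        # Covers connection failures and socket timeouts.
        return [f"no usable response (limit {TIMEOUT_SECONDS}s): {error}"]
    return diagnose(url, final_url, status, time.monotonic() - started)
```

Calling `preflight("https://example.com/")` returns an empty list when none of these problems is detected. To look for user-agent blocking, call it twice with different `user_agent` values and compare the results: a 403 for one string but not the other suggests the server filters on that header.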

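The Robots.txt, Website Link Structure, and Dynamic Websites items can also be illustrated offline. The sketch below uses Python's standard library to test a robots.txt rule set and to list the links present in a page's raw HTML, which is roughly what a crawler that does not execute JavaScript can see. The crawler name "ExampleAuditBot", the helper names, and the sample robots.txt and page are assumptions for illustration, not gShift's actual implementation.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name; substitute the user agent you want to test.
CRAWLER_NAME = "ExampleAuditBot"


def allowed_by_robots(robots_txt, url, agent=CRAWLER_NAME):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(html):
    """Return the links a non-JavaScript crawler could discover in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Sample inputs (made up for this example).
robots = "User-agent: *\nDisallow: /private/\n"
page = """
<a href="/about">About</a>
<script>document.body.innerHTML += '<a href="/products">Products</a>';</script>
"""

print(allowed_by_robots(robots, "https://example.com/private/report"))  # False
print(allowed_by_robots(robots, "https://example.com/about"))           # True
print(static_links(page))                                               # ['/about']
```

The "/products" link never appears in the result because it only exists after the script runs in a browser. Links that matter for crawling are safer placed directly in the HTML.
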
Clear Signs your Site Audit did not complete: 

  • Resources used is listed as 1 or 0
  • Date Ran is 1
  • One or more of the following: 
    • Redirects = 1
    • Other Errors = 1
    • Other = 1
  • Keep in mind Resources Used may be very different than the total number of pages you have on your site.
