Spikes in Site Crawl Issues

What's Covered?

In this guide we’ll outline common causes for large spikes in your issue counts in Site Crawl. If you’re looking for tips on how to investigate fluctuations in your pages crawled, please see our troubleshooting guide.

Overview of Spikes in Issues

If you’re seeing a sudden spike in your Total Issues or New Issues count in your recent Site Crawl results, it may indicate that something is amiss with your site. It could be that you added new content or changed your robots.txt file, but it may also be that something isn’t coded correctly or that a link is causing trouble. Below, we’ll outline a few of the common issues that can impact your Site Crawl and cause a spike in issues, so you’re able to investigate and, if needed, make corrections to solve the problem.

Before we get started outlining things to check and investigate, it’s important to note a few key things about our crawler’s behavior and how Site Crawl data is populated in the tool.

How We Crawl

Our Site Crawl bot, Rogerbot, discovers pages by crawling all of the HTML links on your site’s homepage. It then crawls each of those pages and follows the HTML links it finds there, continuing outward through the site.
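
The crawl pattern described above is essentially a breadth-first traversal of the site’s link graph. Here is a minimal sketch of that idea using a made-up link graph (Rogerbot’s actual implementation is not public):

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/widget"],
    "/products/widget": [],
}

# Breadth-first crawl starting from the homepage.
seen, queue, order = {"/"}, deque(["/"]), []
while queue:
    page = queue.popleft()
    order.append(page)  # "crawl" the page
    for nxt in links[page]:
        if nxt not in seen:  # only follow links we haven't seen yet
            seen.add(nxt)
            queue.append(nxt)

print(order)
```

Pages closer to the homepage are crawled first, which is also why Crawl Depth (discussed later in this guide) counts the number of links followed from the homepage.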

How New Issues Are Counted

When looking at your New Issues count, it’s important to note that this is calculated by comparing the most recent crawl to the one immediately prior. We do not look at older crawls to determine this count.

This means that if an issue was seen in Crawl A but not in Crawl B, it would no longer be counted in your Total Issues. If the same issue was then flagged in Crawl C, this would be counted as a New Issue since it was not seen in Crawl B, the crawl immediately prior to Crawl C.

Understanding how New Issues are counted is crucial when investigating spikes in your issue counts, as it can reveal how changes to your site’s settings and crawl data affect the numbers.
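
The Crawl A/B/C scenario above can be sketched with a simple set difference (the issue identifiers here are made up for illustration and are not Moz’s actual data model):

```python
# Each crawl is modeled as a set of issue identifiers.
crawl_a = {"404:/old-page"}   # issue seen in Crawl A
crawl_b = set()               # Crawl B did not find it
crawl_c = {"404:/old-page"}   # it reappears in Crawl C

# New Issues compare only the latest crawl with the one immediately prior.
new_in_c = crawl_c - crawl_b
print(new_in_c)
```

Because only Crawl B is consulted, the issue counts as New in Crawl C even though Crawl A had already seen it.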

The Previous Crawl Failed Or Returned Fewer Pages

One of the most common causes of spikes in issue counts within Site Crawl is a prior crawl that failed or returned fewer crawled pages.

A crawl may fail or return fewer pages for a number of reasons, some of which are outlined in our Moz Can’t Crawl Your Site guide and our Fluctuations in Pages Crawled guide. These fluctuations matter when looking at your New Issues count because they change the data set used to determine which issues are Existing and which are New. As noted above, your New Issues count is determined by comparing the new crawl to the crawl immediately prior.

This means that if your site typically returns a crawl of 30,000 pages but we were only able to crawl 13,000 in the last crawl, the issues associated with the 17,000 pages we were unable to reach are not counted in your Total Issues count for that crawl. When we crawl again and are able to cover the full 30,000 pages, all of the issues we couldn’t find in the smaller crawl will be flagged as New Issues, causing what looks like a spike in your issue count. These may not be new issues on your site, but rather issues newly seen in this crawl.

How Do I Check If This Is Happening?

To verify whether this is what caused the spike in issues you’re seeing in your Site Crawl results, you can start investigating in the Site Crawl Overview of your Campaign.

The Total Pages Crawled and Total Issues graphs can quickly illustrate drops in your counts. As seen in the example below, we typically see a crawl of about 35,000 pages week to week; however, on March 15th, there was a significant drop in pages crawled, which resulted in a drop in total issues counted as well.

Site crawl overview page showing a drop in pages crawled on a certain date which resulted in a drop in total issues count.

We can then head to an issue category that is seeing a spike in issue count to investigate further. Knowing that the Critical Crawler Issues for the March 15th crawl came in at about 2.4K, we can compare that to the new crawl. In the crawl on March 22nd, the one immediately after the March 15th crawl, we see 9.7K total Critical Crawler Issues, 8.5K of which are considered New Issues, even though that count is close to what we were seeing before the drop in pages crawled.

Critical crawler issues page showing a spike in new issues.

This indicates that it is likely those issues are not new issues on the site being tracked but rather new issues to this particular crawl.

A Link Has Caused An Infinite Loop

Another common cause of a spike in new issues is a misformatted link in the source code of the site.

If a link in a page’s source code is misformatted, it can cause the crawler to misinterpret the link and fall into an infinite loop as it attempts to crawl.

For example:

<a href="products">Products</a>

When the crawler encounters a link like this, it resolves the href value relative to the URL of the page it is already on, appending it to the end of the current path.

If we use the example above, the crawler’s behavior would look something like this:

The crawler is on the page https://mysite.com/home and finds the misformatted link, which is supposed to send it to https://mysite.com/products. Due to the formatting, however, it instead goes to https://mysite.com/home/products, where it finds that link again and tries to go to https://mysite.com/home/products/products. This can continue until the crawler reaches the crawl limit for your Campaign, causing a giant spike in issues and pages crawled.
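
The loop above can be reproduced with standard URL resolution. This sketch assumes the server treats each resolved URL as a directory (i.e. it responds with a trailing slash), which is what lets the path keep nesting:

```python
from urllib.parse import urljoin

# Hypothetical walk-through of the loop from the example above.
url = "https://mysite.com/home/"
for _ in range(3):
    # The relative href="products" is resolved against the current page,
    # and the server's trailing-slash redirect makes it a new "directory".
    url = urljoin(url, "products") + "/"
    print(url)
```

Each pass appends another /products segment, so the URL grows without bound until the crawl limit stops it.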

Source Code with misformatted link.

How Do I Check If This Is Happening?

  1. Check to see if there was a sudden spike in your Total Pages Crawled - did we end up crawling a lot more pages than usual or than you were expecting?
  2. Sort your All Crawled Pages by Crawl Depth. Crawl Depth is the number of links from the homepage the crawler followed to reach that page. If your Crawl Depths are unusually high, it could be an indication that the crawler is following a link over and over in a loop.
  3. Check the format of the URLs crawled. Typically when this issue occurs, there is a spike in issues for the URL Too Long category due to the URL getting longer with each loop crawled. Check these URLs to see if there are any repeating patterns like “/products/products/products” or “/contact/email/contact/email.”
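
If you have an export of your crawled URLs, the repeating patterns described in step 3 can be spotted programmatically. This is a hypothetical helper, not a Moz tool; it flags any URL whose path repeats a consecutive run of segments:

```python
from urllib.parse import urlparse

def looks_like_crawl_loop(url, max_period=3):
    """Return True if the URL path repeats a run of 1..max_period segments,
    e.g. /products/products or /contact/email/contact/email."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    for period in range(1, max_period + 1):
        for i in range(len(segments) - 2 * period + 1):
            # Compare each run of `period` segments with the run right after it.
            if segments[i:i + period] == segments[i + period:i + 2 * period]:
                return True
    return False

print(looks_like_crawl_loop("https://mysite.com/home/products/products"))      # True
print(looks_like_crawl_loop("https://mysite.com/contact/email/contact/email"))  # True
print(looks_like_crawl_loop("https://mysite.com/blog/my-post"))                 # False
```

Note that some legitimate URLs repeat segments, so treat matches as candidates to inspect rather than confirmed loops.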

How Do I Resolve The Issue?

If you are seeing that a crawl loop is being created, you can resolve the issue by tracing the loop back to the original instance of the misformatted link. This is typically identified as the first part of the URL before the repeating loop pattern begins. For example, if we’re looking at https://mysite.com/home/products/products the original instance would likely be found on https://mysite.com/home.

You can then locate the broken link in your source code and reformat it as an absolute link (or a root-relative link that begins with a slash).
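
To see why this fixes the loop, note that a root-relative href like "/products" resolves to the same URL no matter which page it appears on (mysite.com is the hypothetical domain from the example above):

```python
from urllib.parse import urljoin

# The same root-relative link resolved from two different pages,
# including one deep inside the loop.
fixed_from_home = urljoin("https://mysite.com/home/", "/products")
fixed_from_loop = urljoin("https://mysite.com/home/products/products/", "/products")

print(fixed_from_home)
print(fixed_from_loop)
```

Both resolve to https://mysite.com/products, so the path can no longer grow with each hop.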

Unignored Issues From Previous Crawls

If you had previously ignored issues or issue types which have now been unignored, those issues will now be added back into your issue counts. This can cause it to look like there was a spike in your Total Issues and New Issues.

Crawler warnings page showing unignored issues.

Since ignored issues were not reported in Site Crawl while in the ignored state, they were not included in the crawl report immediately prior. As a result, they are counted as New Issues in the first crawl after being unignored.

Site Crawl Limit Is Less Than The Size Of The Site

If your site is larger than your allotted crawl limit, we will be unable to crawl every page every week. As a result, we may see a page for the first time and flag a New Issue when, in fact, that page has existed for some time.

Raising your Campaign’s Site Crawl limit so that it covers your full site helps keep the set of crawled pages consistent from week to week.

Robots.txt File Updates

If you have recently updated your robots.txt to allow sections of your site to be crawled which were previously blocked, this may cause a spike in issue counts. Since those pages had not previously been crawled, there was no way to find and report issues for them.

Changes Made To Site Crawl Limits

Similar to how updating your robots.txt file allowances can lead to more pages being crawled, and thus more issues being discovered, changing your Site Crawl limits can lead to more pages on your site being captured in each crawl. An increased number of pages crawled means there may be more New Issues found and reported in your Site Crawl report.

Newly Created Content

If you have recently created and launched new content on your site, new pages are being crawled and reported on in your Site Crawl data. When we crawl those new pages, we will report any issues we find, and they will count towards your Total Issues and New Issues counts. If you launch a lot of new content at once, this may cause a larger spike in your issue counts than you were expecting. Conversely, if you’re seeing a spike in issues but only launched a few new pages, it may be a sign that something bigger is going on, and you may want to investigate whether something is amiss with your newly launched pages.

