Our crawler was not able to access the robots.txt file on your site

tigersohelll

Hello Mozzers!

I've received an error message saying the site can't be crawled because Moz is unable to access the robots.txt. I've spoken to the webmaster and he can't understand why Moz can't access the robots.txt file.

https://www.thefurnshop.co.uk/robots.txt

Google isn't flagging anything to us either.

Does anyone know how to solve this problem?

Thanks

jaytarr

@LoganRay This was our issue. We didn't know Moz tries to retrieve the HTTP robots.txt first. Our HTTPS redirect was failing for static files only, so the HTTP path to the robots.txt was failing. We did not notice it because our HSTS policy made the browser switch to HTTPS on its own.
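For anyone wanting to reproduce this kind of check outside a browser, here is a minimal Python sketch. It requests the robots.txt once over plain HTTP and once over HTTPS without following redirects, which a browser with a cached HSTS policy will not show you. The function names and the User-Agent string are made up for illustration, and it only approximates a crawler's first request; it is not how Moz's crawler actually works.

```python
import http.client
from urllib.parse import urlsplit


def robots_urls(host):
    """Return the plain-HTTP and HTTPS robots.txt URLs for a host."""
    return [f"http://{host}/robots.txt", f"https://{host}/robots.txt"]


def fetch_without_redirects(url, timeout=10):
    """Request a URL once and return (status, Location header or None).

    http.client never follows redirects and keeps no HSTS state, so the
    result reflects the raw server response rather than browser behavior.
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        # The User-Agent value is a placeholder, not a real crawler name.
        conn.request("GET", parts.path or "/",
                     headers={"User-Agent": "robots-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def summarize(status, location):
    """Turn a status code and Location header into a one-line verdict."""
    if status == 200:
        return "ok"
    if status in (301, 302, 303, 307, 308):
        return f"redirects to {location}"
    return f"fails with HTTP {status}"


# Example use (performs real network requests):
# for url in robots_urls("www.example.com"):
#     print(url, "->", summarize(*fetch_without_redirects(url)))
```

If the HTTP line reports a failure or an odd redirect target while the HTTPS line reports "ok", the problem is probably in the HTTP-to-HTTPS redirect rather than in the robots.txt itself.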

LoganRay

Wanted to jump back in on this topic as I've just confirmed my initial suspicion.

I just added a new client to our Moz account and had the exact same issue: the crawler was unable to access the robots.txt file. It's a secure site and was configured in Moz without the HTTPS. When I go to the robots.txt file without https://www, it redirects to the same thing as yours, where the / between the TLD and page path gets removed.

Reconfigure your site in Moz with the HTTPS address and it should begin to work.

Tenlo

There are two parts of your robots.txt that could be causing this, and it all depends on how each bot reads the wildcard patterns in your robots.txt:

First, your Disallow: /? can be read as Disallow all paths starting with "/" with 0 to infinity characters "" and one character "?". Try replacing this part with Disallow: /*? to make it not crawl anything with a query string (which is what I believe you were going for).

Second, you have an open Disallow followed by the User-agent: rogerbot line, and while it should not be read this way, once again it all depends on how each bot reads the commands. To fix this, change the Disallow that follows your Googlebot-Image line to Disallow: /
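To make the difference between the two query-string rules concrete, here is a small Python sketch of wildcard matching as the Robots Exclusion Protocol (RFC 9309) and Google's documentation describe it: `*` matches any run of characters, `$` anchors the end, and `?` is a literal character. Whether rogerbot applies exactly these rules is an assumption, and the function name and sample paths are invented for illustration.

```python
import re


def rule_matches(pattern, path):
    """Check a robots.txt path pattern against a URL path plus query string."""
    if not pattern:
        # An empty Disallow value matches nothing, i.e. it blocks nothing.
        return False
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal chunk and let every "*" match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, as robots.txt rules are prefix matches.
    return re.match(regex, path) is not None


# Hypothetical paths, not taken from the site in question.
print(rule_matches("/?", "/?page=2"))        # query string on the root only
print(rule_matches("/?", "/sofas?page=2"))   # not covered by "/?"
print(rule_matches("/*?", "/sofas?page=2"))  # covered by "/*?"
print(rule_matches("/", "/sofas"))           # "Disallow: /" covers everything
```

Under these rules `Disallow: /?` only covers query strings on the home page, while `Disallow: /*?` covers a query string on any path. A parser that treats the patterns differently could read them another way, which is the risk described above.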

LoganRay

Hi there,

There's something odd going on when I try to access your robots.txt file without the www. The www gets added back on, but when it does, the slash between the TLD and page path gets deleted (see below). I'm guessing your domain in Moz is configured without the www, which means RogerBot is getting redirected to this slash-less version of the file.

https://www.thefurnshop.co.ukrobots.txt
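A quick way to spot this kind of broken redirect is to parse the Location header and compare it with the URL that was requested. The sketch below uses only Python's standard `urllib.parse`; the function name is made up, and the requested URL in the example (plain HTTP, no www) is an assumption about what was typed in.

```python
from urllib.parse import urlsplit


def diagnose_redirect(requested_url, location):
    """Compare a requested URL with the redirect target it produced."""
    requested = urlsplit(requested_url)
    target = urlsplit(location)
    if target.path == requested.path:
        return "path preserved"
    file_name = requested.path.lstrip("/")
    # With the slash missing, the file name ends up inside the hostname
    # and the parsed path comes back empty.
    if (target.path == "" and file_name and target.hostname
            and target.hostname.endswith(file_name)):
        return "path glued onto hostname (missing slash)"
    return "path changed"


# Assumed request URL; the redirect target is the one shown above.
print(diagnose_redirect("http://thefurnshop.co.uk/robots.txt",
                        "https://www.thefurnshop.co.ukrobots.txt"))
```

Because the slash is gone, the target's hostname parses as `www.thefurnshop.co.ukrobots.txt`, which is not a resolvable host, so any client that follows the redirect fails before it ever reaches the file.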
