Isn't this blocking all pages? Why would someone do this?

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: *
Disallow: /post/page/*

User-agent: *
Disallow: /category/*

  • Michael Martinez: It's not blocking all pages. It's only blocking the pages that are created in those directory hierarchies. It's not a good strategy for managing crawl.
  • Brenda Michelin: So, it's blocking the category. And I don't understand the /post/page rule. Is that blocking all pages?
  • Michael Martinez: No, it's blocking a post archive. This structure is essentially strangling crawl on a Website, but there could be mitigating reasons for doing that. For example, they might have changed the URL structure and don't want search engines to crawl old URLs (I would set up redirects, but I don't know why someone decided to do this).
  • Brenda Michelin: Thank you for the explanation. I have never seen this. You have given me a path to investigate.
  • Michael Martinez: If you're auditing a site that someone else has set up, you definitely want to learn all you can about the history of changes they made to the site. Sometimes old decisions look less than optimal, but when you learn the reasons behind them, you can see WHY they were made. You may think, "I would have done it differently," but most of the time what's done is done and you should focus on where you can take the site based on where it is now, not based on where you would have put it way back when.
  • Perry Bernard: Looks like someone is simply trying to avoid a duplicate content issue. They are better off using rel="canonical" and letting Googlebot crawl everything in those paths.
  • Perry Bernard: Sorry - I mean the last two paths, not the wp-admin one of course.
  • Michael Martinez: Well, I'm not one to second-guess a Website based on an obscure "robots.txt" example, but if one is concerned about duplicate content on a blog, all they need to do is publish excerpts on the archive index pages. rel="canonical" really isn't appropriate on that kind of content.
  • Perry Bernard: Yes, not for archive pages, but certainly for pages and posts with that path that exist elsewhere.
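To check which URLs the rules quoted at the top of the thread actually block, here is a minimal Python sketch of a matcher. It is not how any particular search engine implements it. It assumes the matching behaviour described in RFC 9309: groups that name the same user agent are merged, `*` matches any run of characters, the longest matching rule wins, and Allow wins a tie. The function names and the sample paths are made up for illustration.

```python
import re

ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: *
Disallow: /post/page/*

User-agent: *
Disallow: /category/*
"""

def parse_rules(text, agent="*"):
    """Collect (kind, pattern) rules from every group that names `agent`."""
    rules, agents, in_rules = [], [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # rule lines ended, new group starts
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if value and agent in agents:     # an empty value blocks nothing
                rules.append((field, value))
    return rules

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, optional '$' end anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules):
    """Longest matching rule wins; Allow wins ties; no match means allowed."""
    best_kind, best_len = "allow", -1
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and kind == "allow"
            if longer or tie_for_allow:
                best_kind, best_len = kind, len(pattern)
    return best_kind == "allow"

RULES = parse_rules(ROBOTS_TXT)

# Hypothetical paths, chosen only to exercise each rule.
for path in ["/", "/about/", "/post/hello-world/", "/post/page/2/",
             "/category/news/", "/wp-admin/", "/wp-admin/admin-ajax.php"]:
    print(f"{path:28} {'allowed' if is_allowed(path, RULES) else 'blocked'}")
```

Under these assumptions, only paths beneath /wp-admin/ (except admin-ajax.php), /post/page/, and /category/ come back as blocked, and everything else is crawlable. Because rules are prefix matches, the trailing `*` in `/post/page/*` and `/category/*` does not change the result.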

