Dumb SEO Questions

(Entry was posted by Dave Elliott on this post in the Dumb SEO Questions community on Facebook, 10/29/2013).

Can I trust the search engine bots to listen to my robots file?

Quick question that I'm sure I know the answer to, but I want to be lazy.

In my robots.txt I have disallowed the indexing of the login page.
Using Screaming Frog to create my XML sitemap, there are about 1 million entries (this may be a lie) of various login pages (each with a different return path).

Do I have to go through and delete them all manually, or can I trust the search engine bots to listen to my robots file?
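
To make the setup concrete, here is a minimal sketch of how a disallow rule like the one described could be checked locally with Python's standard `urllib.robotparser`. The robots.txt content, the `/login` path, the `example.com` URLs, and the helper name `is_crawl_allowed` are all assumptions for illustration; the post does not show the actual file. Note that this only reports what the rules say about crawling; it says nothing about whether a given bot honors them.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the real file and the /login path
# are not shown in the original post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login
"""


def is_crawl_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if the assumed rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: one login page with a return path, one normal page.
    for url in (
        "https://example.com/login?return=%2Faccount",
        "https://example.com/products",
    ):
        print(url, "allowed" if is_crawl_allowed(url) else "disallowed")
```

Disallow rules match by path prefix, so a single `Disallow: /login` line would cover every return-path variant of the login URL under that assumption.
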
This question begins at 00:09:45 into the clip.

YOUR ANSWERS

Selected answers from the Dumb SEO Questions Facebook & G+ community.

  • Nick Stuart-Miller: A similar problem happened to me recently as I was auditing a client's website, although my issue was to do with duplicate indexing of pages due to session IDs.

    I would suggest that as long as you have correctly disallowed the login pages in robots.txt, you shouldn't have a problem.
  • Simon Fryer: (1) Exclusions in robots.txt don't always work 100%. That's why noindex/nofollow meta tags should really be used to keep pages out of the index.

    (2) I doubt that having duplicate versions of your login page is going to cause any problems whatsoever, so I don't think this is anything to concern yourself with (re: removing the listings from your sitemap).

    (3) If you're using an open-source platform, you're better off using a dynamic sitemap extension. Normally these won't include admin pages by default.
  • Sarvesh Bagla: +Dave Elliott If you have a large number of login URLs, I would at least get rid of them from the XML sitemap, if not fix the problem on the site. You want Google to crawl relevant pages and not waste resources on URLs that you don't want indexed.
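
Removing the login URLs from the sitemap, as suggested above, does not have to be done by hand. The sketch below is one possible approach using Python's standard `xml.etree.ElementTree`; it assumes the sitemap follows the standard sitemaps.org schema and that every login URL has a path starting with `/login`. The function name `strip_login_urls`, the prefix, and the sample URLs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard XML sitemaps.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def strip_login_urls(sitemap_xml: str, blocked_prefix: str = "/login") -> str:
    """Return the sitemap with every <url> whose path starts with blocked_prefix removed."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Iterate over a copy of the list because entries are removed during the loop.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if urlsplit(loc).path.startswith(blocked_prefix):
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample: two login variants and one page worth crawling.
SAMPLE = (
    f'<urlset xmlns="{SITEMAP_NS}">'
    "<url><loc>https://example.com/login?return=%2Faccount</loc></url>"
    "<url><loc>https://example.com/login?return=%2Fcart</loc></url>"
    "<url><loc>https://example.com/products</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    print(strip_login_urls(SAMPLE))
```

For a sitemap with around a million entries, a streaming parser or excluding the login path in the crawler's own configuration before export may be the more practical route; this sketch loads the whole file into memory.
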
