Dumb SEO Questions

(Entry was posted by Athanasios Giannias in the Dumb SEO Questions community on Facebook, Saturday, February 7, 2015.)

Is there any other way to 100% block a site from all search engines?

Hi everyone

I have a test site which is a 100% duplicate of my live site. I mainly use this site for testing, uploading new modules etc. I have blocked this site with a robots.txt file.
When using site:mytestsite.com it shows up. Is this normal?
Is there any other way to 100% block a site from all search engines?

Thank you everyone for your assistance!

This question begins at 01:05:11 into the clip.

YOUR ANSWERS

Selected answers from the Dumb SEO Questions G+ community.

  • Toni Anicic: Robots.txt disallow tells search engines they can't see what's on that URL. It doesn't mean they can't show it in the search results (although without the meta description and title from your actual page, since they can't see those). If you want search engines not to display your website in search results, use the meta noindex tag. In that case, make sure you remove the robots.txt disallow, since Google can't see the meta noindex tag on your page if it's disallowed through robots.txt.
  • Athanasios Giannias: Thanks. I want this site to be for "my eyes only". If search engines display results, this means duplicate content, which is bad for my original site. With the meta noindex tag, is this site 100% isolated?
  • Toni Anicic: From Google's official page: "Googlebot will see the noindex meta tag and will drop that page entirely from Google Search results, regardless of whether other sites link to it."
  • Toni Anicic: BTW, don't worry about duplicate content in your case. If you disallowed a site through robots.txt, then even though Google can show those URLs in the SERP, Google doesn't know what's on them and can't know it's duplicate content.
  • Athanasios Giannias: Perfect. So I can leave it as is with the robots.txt file.
  • Toni Anicic: That's assuming you used the correct syntax to disallow a website in robots.txt.
  • Athanasios Giannias: When searching for the test site in Google search, the site shows up, but this message is displayed at the top of the search results: "A description for this result is not available because of this site's robots.txt – learn more."
    I think I am covered.
  • Dave Elliott: You could always add a noindex, nofollow tag to all the pages on the test site.
  • Promoz SEO: As everyone said, use the noindex, nofollow tag on each page of the test site. Additionally, you can use canonical tags on the test site's pages pointing to the relevant pages of the original site.
  • Athanasios Giannias: Thank you all.
  • Tony McCreath: Google indexing pages on your test site means it found links to it. So I would also double check the live website to see if you have any accidental links to test. Screaming Frog can help you with that.

    Or run a backlink checker on the test site to uncover what is linking.

    Or even register the test site with Google Webmaster Tools to see backlinks and indexing status.

    robots.txt blocking should be fine. Some people go further and block the whole domain via a password. Try a search like "server password protect website" to find out how.
  • Athanasios Giannias: Thank you.
  • Blade S: Password-protecting your site is the best way to block search engines. Web crawlers can’t access content in password-protected directories.

    If you're using WordPress, you can use a plugin like Password Protected.

    Robots.txt and robots meta tags are not always 100% safe when it comes to blocking search engines.
  • Athanasios Giannias: Thanks.
  • Blade S: You're most welcome!
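
The interaction Toni Anicic describes (a robots.txt disallow stops crawling, so a meta noindex tag behind it is never seen) can be checked with a short script. The following is a minimal Python sketch using only the standard library. The function names (crawl_allowed, has_noindex, diagnose) and the sample inputs are hypothetical and are not taken from the thread. It checks only the robots meta tag, not HTTP headers, and it says nothing about what a given search engine will actually display.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def crawl_allowed(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def has_noindex(html):
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def diagnose(robots_txt, html, url):
    """Summarise how the two mechanisms combine for one URL."""
    allowed = crawl_allowed(robots_txt, url)
    noindex = has_noindex(html)
    if noindex and not allowed:
        return "noindex present but hidden: robots.txt blocks crawling"
    if noindex:
        return "noindex visible to crawlers"
    if not allowed:
        return "crawling blocked, but the URL can still be listed"
    return "crawlable and indexable"


# Hypothetical sample inputs for a test site.
BLOCK_ALL = "User-agent: *\nDisallow: /"
PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

print(diagnose(BLOCK_ALL, PAGE, "http://mytestsite.com/"))
```

With both a site-wide disallow and a noindex tag in place, the sketch reports the conflict, which matches the advice in the thread to remove the disallow if you rely on noindex.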
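
Password protection, as Tony McCreath and Blade S suggest, is normally set up in the web server or CMS rather than in application code. To show the idea only, here is a minimal Python sketch of HTTP Basic authentication using the standard library. The credentials, the realm name, and the names is_authorized and ProtectedHandler are hypothetical. A real test site should use its server's own password protection over HTTPS.

```python
import base64
import hmac
from http.server import BaseHTTPRequestHandler

# Hypothetical credentials, for illustration only.
USERNAME = "tester"
PASSWORD = "change-me"


def is_authorized(header, username=USERNAME, password=PASSWORD):
    """Return True if an Authorization header carries the expected Basic credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    expected = f"{username}:{password}"
    # Constant-time comparison avoids leaking how much of the string matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


class ProtectedHandler(BaseHTTPRequestHandler):
    """Answers 401 to any request without valid credentials, crawlers included."""

    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization")):
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="test site"')
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"test site content\n")
```

To try it locally, pass ProtectedHandler to http.server.HTTPServer and call serve_forever(). A crawler that sends no credentials receives only the 401 response and never sees the page content.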

View original question in the Dumb SEO Questions community on G+, Saturday, February 7, 2015.
