Dumb SEO Questions

(Entry was posted by Greg Kristan in the Dumb SEO Questions community on Facebook, 11/12/2016.)

Best way to handle facet URL

Hello,

I work on an ecommerce site that uses facets for navigation, but I believe it is causing SEO issues. For example, a user can get to a page like /us/c/mens-shoes, but when they use the facets, they get a URL like /us/c/mens-shoes?q=%3Arelevance%3Agender%3AMENS&text=&topPosMENS=0

Even though we have a canonical that points to the correct version of the page (not the dynamic URL), I still see Google fetching this page with a 200 response, so I know it is being crawled and indexed. I can confirm this is the case with Screaming Frog too.

What is the best way to handle facet URLs like this? Is it possible for me to add a Disallow: ?q=* rule in the robots.txt file? I was always under the assumption that a robots.txt rule needed a / to block a certain path. Should I add a meta robots tag to these URLs to NOINDEX these pages? Should I do both?

Thanks!
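
For what it's worth, Google's robots.txt parsing does support * wildcards, so a disallow rule does not have to start with a literal / path prefix; a pattern targeting the q parameter is possible. Below is a minimal sketch of the two options raised in the question; the pattern and tag are illustrative, not the site's actual configuration.

    # robots.txt - block crawling of faceted URLs that carry the q parameter
    User-agent: *
    Disallow: /*?q=

    <!-- or: meta robots on the faceted pages - allow crawling, keep them out of the index -->
    <meta name="robots" content="noindex, follow">

One caveat on doing both: if robots.txt blocks crawling, Googlebot never fetches the page, so it cannot see the noindex tag or the canonical at all, and already-indexed URLs may linger in the index.
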
This question begins at 00:02:33 into the clip. Watch this question on YouTube commencing at 00:02:33.

YOUR ANSWERS

Selected answers from the Dumb SEO Questions Facebook & G+ community.

  • Łukasz Rogala: Can you share the website URL?

    Generally speaking, you can add meta name robots="noindex, follow" for URLs with parameters, but a canonical is the better idea. Sometimes Google still indexes canonicalized URLs, and it takes more time for the bot to re-crawl them. Can you share log data on how often Googlebot visits your website? A canonical is better when you get loads of natural links from users sharing your categories around the web: even if the URL won't be indexed in the future, you will still get some link juice passed.

    Also, the reason Screaming Frog crawls your canonicalized URLs can be related to the software configuration. If you share the URL, we will probably be able to help you better understand the issue.

    If you are worried about those URLs, think about specifying the parameters in Google Search Console to let Google know how to handle them.
    https://plus.google.com/photos/...
  • Greg Kristan: +Łukasz Rogala Google's parameter tool is a good idea. I've never used that tool, so I did not want to cause a bigger issue by configuring it incorrectly.

    If I share the URL, will this post get taken down?
  • Łukasz Rogala: I don't think so. It will be easier for us to dive into the issue and help you more. :) If you don't want to share it here, post it via PM or [email protected] and I will look at it and post an update here for others to discuss.
  • Greg Kristan: +Łukasz Rogala Sure, no problem. Here are the two example URLs.

    http://www.clarksusa.com/us/c/mens-shoes

    http://www.clarksusa.com/us/c/mens-shoes?q=%3Arelevance%3Acategory%3Amens-active&text=&topPosActive+Shoes=0
  • Suraj Gadage: +Greg Kristan We had a similar problem with one of our eCommerce clients. In fact, Google sent a message suggesting an increase in dynamic errors. We fixed the problem by adding a self-referencing canonical tag and, as an additional safety measure, blocking the dynamic parameters in Search Console (see the canonical sketch after this thread).

    Hope this helps!
  • Łukasz Rogala: I've crawled it via Screaming Frog now and it looks like it's correct, so there is no need to worry at all. Google treats the canonical as a signal, not a directive, so sometimes it takes more time to change the situation.

    Overall visibility of your website is quite good. :)
  • Łukasz Rogala: https://plus.google.com/photos/...
  • Greg Kristan: +Suraj Gadage based on the URL that I provided, would I just add ?= as the parameter to block in Search Console?
  • Greg Kristan: +Łukasz Rogala thanks for looking into that for me!
  • Suraj Gadage: +Greg Kristan I believe the parameter should be '?q='. While adding the parameter, you will have to select the 'No URLs' option under 'Which URLs with this parameter should Googlebot crawl?'. This tells Googlebot not to crawl any URLs containing that parameter.
  • Greg Kristan: +Łukasz Rogala Wow, that chart is incredible! I've never used Searchmetrics, but it tells a tremendous story!
  • Łukasz Rogala: True, a lot of data to analyze and combine with Google Analytics and GSC data. :)
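
As a rough illustration of the self-referencing canonical approach Suraj Gadage describes, using the example URLs Greg shared in the thread (assuming the same tag is returned on both the clean category page and its faceted variants):

    <!-- served on http://www.clarksusa.com/us/c/mens-shoes and on faceted variants such as
         /us/c/mens-shoes?q=%3Arelevance%3Acategory%3Amens-active&text=&topPosActive+Shoes=0 -->
    <link rel="canonical" href="http://www.clarksusa.com/us/c/mens-shoes" />

The faceted variants then consolidate to the clean URL, while the 'No URLs' setting for the q parameter in Search Console acts as the additional safety net discussed above.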

View the original question in the Dumb SEO Questions community on Facebook, 11/12/2016.