OK, we are completely stumped, and I'm hoping I'm just having a "moment". We submitted our domain in Search Console, we've uploaded our sitemap, and ALL pages are index, follow. The site was launched a year ago and all was fine until 12/1/17. Now Search Console shows only 7 pages indexed out of 368 submitted. In the new Search Console we get some more info: 252 pages are "Crawled - currently not indexed". The site is over a year old and I am dumbfounded. If you have any thoughts, please PM me and I'll send you the URL. Also, tell me I'm crazy, but I think our "developer" (used lightly) put a rel=canonical pointing to the home page on every page. We've since removed this from the site - could that have something to do with it?
Selected answers from the Dumb SEO Questions G+ community.
Michael Stricker: Yes, that's a serious goof, suggesting that Googlebot prefer the homepage over all others. It might not trigger mass deindexation on its own, but I'd be curious if someone:
Flubbed the robots.txt
Or set Meta Robots to noindex sitewide in the on-page code
Or has some untrustworthy cloaking or scripted redirect taking place
Or misconfigured URL handling in Google Search Console
At some point you’re going to have to provide a URL to the community to dig for more clues.
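The first two checks in that list (a stray canonical and a sitewide noindex) can be automated over a sample of pages. Here is a minimal sketch using only Python's standard library; the `check_page` helper and the example URLs are hypothetical, not anything from the site in question:

```python
from html.parser import HTMLParser

class IndexabilitySignals(HTMLParser):
    """Collects the rel=canonical href and meta robots content from page source."""
    def __init__(self):
        super().__init__()
        self.canonical = None
        self.meta_robots = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.meta_robots = (a.get("content") or "").lower()

def check_page(url, html):
    """Return a list of indexability problems found in the page's HTML."""
    parser = IndexabilitySignals()
    parser.feed(html)
    issues = []
    # A canonical pointing at a different URL asks Google to index that URL instead.
    if parser.canonical and parser.canonical.rstrip("/") != url.rstrip("/"):
        issues.append("canonical points elsewhere: " + parser.canonical)
    # A sitewide noindex would explain pages dropping out of the index.
    if parser.meta_robots and "noindex" in parser.meta_robots:
        issues.append("meta robots noindex")
    return issues
```

Running this over the fetched source of a sample of the "Crawled - currently not indexed" URLs would quickly confirm or rule out both misconfigurations; an empty list means neither problem is present on that page.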
Richard Hearne: >I think our "developer" (used lightly) put a rel=canonical pointing to the home page on every page. We removed this from the site - think this has something to do with it?
I think Google should be able to ignore most of those. They are fault tolerant, and many sites make mistakes like this. So I don't think this would be the cause of the "Crawled - currently not indexed" status.
You can also discount anything around robots.txt: pages wouldn't have been crawled at all if they were blocked, and blocking doesn't remove pages from the index anyway.
So that leaves you with the question of what's on all these pages that Google doesn't like. Have you tried fetching a sample of the not-indexed pages with Fetch as Google? Did they render OK?
Have you compared your content with other content online? Take some sentences from your articles and search for them within inverted commas "sentence" to see what Google spits back.
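With 252 affected pages, building those quoted searches by hand gets tedious. This tiny sketch (the sample sentence is hypothetical) turns a sentence into an exact-match Google search URL, so a batch of them can be checked quickly:

```python
from urllib.parse import quote_plus

def exact_match_search_url(sentence):
    """Wrap the sentence in inverted commas so Google searches for the exact string."""
    return "https://www.google.com/search?q=" + quote_plus('"' + sentence + '"')

# Hypothetical sentence pulled from one of the affected pages.
print(exact_match_search_url("a distinctive sentence from one of your articles"))
```

If such searches return the same text on other domains ranking above yours, that points toward a duplicate-content or quality problem rather than a technical one.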
This could well be a technical issue, but I'd be inclined to say it sounds much more like a quality issue. It could also be a combination of the two, but it's highly likely that whatever Googlebot is being served has significant quality issues.
Stockbridge Truslow: That rel=canonical link sounds like the culprit to me. Sheesh, I can't even begin to describe how bad that one is.
Also, you say, "Everything is index, follow." Is there really a directive on each page that says that? That's invalid, too. There are no "do this" directives for bots, only "don't do this" ones. Typically this isn't a problem, but combined with a bunch of other issues, the bots might be confused.
Richard Hearne: "index, follow" is the default assumed in the event that no directives are included on a page. It's not invalid.
I've seen many sites mess up canonical tags and values, but I've never seen it lead to near-complete de-indexation of a site. Just adding my €0.02 for further balance.