Dumb SEO Questions

(Entry was posted by Michalis Apk in the Dumb SEO Questions community on Facebook, 02/07/2019.)

I want to 410 a few thousand old articles

Hi everyone, I want to 410 a few thousand (around 8,000) old articles from our Wordpress blog. The articles have had no visits in the last 2 years, no backlinks and no internal links. I discovered a plugin called ‘410 for Wordpress’ which I was thinking of installing. Have any of you used it before? Would it harm SEO? Also, what would be the best way to remove those 8,000 articles? I haven’t done this before, so I’m a bit unsure whether I should start by removing a few of them first or remove all of them at once. Any ideas would be much appreciated, thank you all in advance!
This question begins at 00:04:53 into the clip. Watch this question on YouTube commencing at 00:04:53.
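
For readers wanting to try this without the ‘410 for Wordpress’ plugin mentioned in the question, here is a minimal sketch of what a plugin-free 410 looks like in WordPress. The template_redirect hook, status_header() and nocache_headers() are standard WordPress; the path list and the way it is loaded are hypothetical, and with around 8,000 URLs you would load the paths from a file or lookup table rather than hard-code them.

```php
<?php
// Minimal sketch: answer "410 Gone" for a known set of removed article URLs.
// Could live in a theme's functions.php or a small must-use plugin.
// $removed_paths is a hypothetical placeholder; a real site would load the
// ~8,000 paths from a file, option or lookup table instead of hard-coding them.
add_action( 'template_redirect', function () {
    $removed_paths = array(
        '/2014/03/some-old-article/',
        '/2015/07/another-old-article/',
    );

    $requested = wp_parse_url( $_SERVER['REQUEST_URI'], PHP_URL_PATH );

    if ( in_array( $requested, $removed_paths, true ) ) {
        status_header( 410 );   // sends "HTTP/1.1 410 Gone"
        nocache_headers();      // keep caches from storing the response
        exit;
    }
} );
```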

YOUR ANSWERS

Selected answers from the Dumb SEO Questions Facebook & G+ community.

  • Jobin John: Use 404
  • Michalis Apk: Jobin John thanks for this but my questions are quite specific
  • Mal Ö Tonge: 404 means the page is unavailable, for whatever reason that may be. The problem with 404s is that search engines do not know what the issue is, so they keep crawling that page for months, maybe years. This will use up your crawl budget. In 2019 the preferred method is 410. A 410 lets search engines know that the page has been deleted and will not return. Search engines will then remove the URLs from the SERPs, thus saving crawl budget and server usage.
  • Jobin John: “If a 404 error goes to a page that doesn’t exist, should I make them a 410?”

    John Mueller answered:

    “From our point of view, in the mid term/long term, a 404 is the same as a 410 for us. So in both of these cases, we drop those URLs from our index.”

    P.S. That was said in 2019.
  • Mal Ö Tonge: ‘410 for Wordpress’ is a good plugin. Make sure to check your site for broken links, or links pointing to the deleted articles; if they exist, they will keep Google coming back and checking them. Hope this helps!
  • Michalis Apk: Hi Mal Ö Tonge, thanks for this, as it speaks to what I was specific about in the post. Happy to see you’ve used it before and that it does the job. Would you start deleting the articles a bunch at a time, see how Google reacts, and then keep deleting them, OR delete all of them at once? I’m a bit scared to remove them all at once, what do you reckon?
  • Mal Ö Tonge: It depends on the pages/posts and the reason for deleting them in the first place. If they are not generating any traffic then just delete them. This will allow Google to focus more on the meaningful content.
  • Michalis Apk: Mal Ö Tonge as mentioned in the post, the pages have not generated any traffic in the last 2 years and have no backlinks. So I probably need to get rid of them all at once then. Thanks for your help!
  • Mal Ö Tonge: No problem Michalis, happy to help!
  • Michael Martinez: I would just delete the articles. Every plugin you add to a site slows it down a little. Think of plugins as the drag on an aerodynamic design in a wind tunnel.
  • Michalis Apk: Hey Michael Martinez, good to see your reply here. Well, I’m aware of that, however there’s a large number of articles that need to be deleted. Do you have any other ideas of how I can make this happen instead of installing this plugin?
  • Michael Martinez: Michalis Apk I’m not sure of what you’re asking. I don’t see a need to send 410 status codes if nothing is linking to the articles. The search engines will continue crawling 410 URLs for a while anyway, but if all they see are soft 404 URLs they’ll eventually stop without the 410 codes. Crawl is driven by links and old URLs are crawled less over time. If there is even a remote chance you might reuse some of those old URLs some day then it’s better not to send a confusing signal like a 410. If you’re trying to manage "crawl budget" you cannot do that. The search engine determines crawl budget regardless of what you do, and for anything you do to affect their decisions your site would have to contain millions of URLs.
  • Michalis Apk: Michael Martinez I’m trying somehow to ‘manage’ our crawl budget… The pages will never go back live again. The only reason I want to 410 them is because I’ve noticed that sometimes 410s fall out a bit faster than 404s. To sum up: your opinion is to just delete them and let them be removed from the index without using any plugins which can slow down the site, right?
  • Michael Martinez: Michalis Apk This whole crawl budget thing has been a massive waste of people’s time. If the site doesn’t have millions of pages you don’t have anything to worry about. You may be able to speed up site performance by reducing the size of the database. If no one is reading the articles then deleting them is all you need to do.
  • Michalis Apk: Thank you Michael Martinez. Unfortunately our site is massive so I cannot ignore the bots’ visits… I got some really interesting insights in the past. Coming back to my question, I think I’ll just delete the pages at once and eventually they will be removed from the index. Thanks to everyone who replied on this post 🙂
  • Michael Martinez: Michalis Apk Well there are 86,400 seconds in a day. Assuming Google is fetching 1 URL every 2 seconds (a rather slow crawl rate), that is about 43,200 fetches a day, or roughly 1.3 million in 30 days; how many times can it recrawl every URL on the site in 30 days? If your answer is anything greater than a fraction of 1 you don’t have a crawl budget problem. On the other hand, most sites lose about 40-60% of their bandwidth to rogue crawlers (SEO tools and such). Your time would be better spent blocking those rogue crawlers because that would have a noticeable impact on user experience and the legitimate search engines’ ability to crawl your site. "Managing crawl budget" just isn’t possible because it’s all managed by the search engine.
  • Mal Ö Tonge: I would prefer to have a very light plugin slow the site down by a millisecond than have 8,000 404 errors slowing down the server every time bots visit. A 410 gets the URLs out of the SERPs fast. 410s are not the best for every instance, but for this one it fits.
  • Michael Martinez: Mal Ö Tonge The 410 errors use the same amount of bandwidth and CPU as the 404 errors. You’re arguing for an insignificant performance change, essentially a tradeoff. If they really want to improve site responsiveness, they should be blocking the rogue bots and crawlers - preferably with a server-level firewall. Using any kind of PHP plugin to manage crawl is very inefficient. But if they are running on anything less than PHP 7 then upgrading to the latest version of PHP *MIGHT* also give them better performance, as it uses less memory and processes some tasks more efficiently. However, such an upgrade may not be a trivial task.
  • Mal Ö Tonge: Michael Martinez maybe, but a 410 will be removed from the SERPs; a 404 is not an instruction to remove and can remain in the SERPs for years.
  • Michael Martinez: Mal Ö Tonge Based on his description, it doesn’t sound like anything needs to be removed from SERPs. These URLs are not getting any traffic. All your proposal will do in this case is add overhead to the site’s performance. There is no benefit.
  • Mal Ö Tonge: How will it add overhead? Come on bro! A 404 may keep Google crawling for years. A 410 will see the end of the links in a few weeks, especially if he updates the sitemap with Google. Then he can remove the 410 plugin. What you are suggesting is wrong!!!
  • Michael Martinez: Mal Ö Tonge Every plugin that has to be run through the WordPress event cycle adds to the overhead regardless of whether you see it doing anything or not. If there are no links pointing to the URLs then Google will eventually stop crawling them. As it is, there is currently no real downside to the dead pages except that they are taking up space in the database. Merely deleting them solves that problem. Nothing else needs to be done. If they had been hacked and people were clicking on them in SERPs I could agree that using a plugin to implement 410 status codes makes sense. But he said he’s trying to manage crawl budget. Well, he can’t do that but he can certainly work on reducing rogue crawl, which will have a positive impact on site performance and at the same time NOT create a negative impact by installing a plugin he doesn’t need.
  • Mal Ö Tonge: So do 404 errors, even more so because they can last for years! A few weeks after resubmitting the sitemap he can delete all traces of the 410 plugin, for what little weight it does cause. 404s will have him using resources for possibly years to come, and for what? If you know for sure the pages will never appear again, then the correct response is 410 "Gone", not 404 "Not Found".
  • Michael Martinez: Mal Ö Tonge The 404 errors only exist if someone or something tries to fetch them. Without any links to follow Google will stop trying to fetch them. There is no justification for going to extra effort to send 410 status codes. They provide absolutely no benefit in this scenario.
  • Mal Ö Tonge: Experience tells me otherwise. 404s that have been hanging in the SERPs for years I have sometimes removed in days with 410s. That’s all I have to say on the matter. A 404 gives no instruction, so what’s the point; a 410 gives exact instructions to search engines. Nothing left to say! This is not a debate; if you think you’re right then carry on, I’m not going to break my head trying to get the point across.
  • Michael Martinez: Mal Ö Tonge Without links to follow the search engines eventually stop finding old URLs. They expire everything from their indexes. They also eventually "forget" them and stop crawling. As there is no evidence that these thousands of old articles are hurting the site in any way, other than taking up space in the database, there is no reason to go to extra effort to send 410 status codes. You haven’t shown there would be any benefit to doing this.
  • Jim Munro: Consider that the index might be composed of myriad sub-indexes and googlebot will attempt to crawl missing pages on its timetable, not yours, regardless of what you do. It’s not like weeding the garden so beware of wasting time tidying up. :)
  • Dave Elliott: If it’s not got traffic or links I’d 404 and make sure they aren’t in any sitemaps. If they represented a significant portion of your site I’d 301, but for 8k it doesn’t seem worth it.
  • Richard Hearne: If you want to speed up the process, hand Google an XML sitemap with all the 410 URLs and a recent lastmod date for each. You can also ping this via WebSub. But do take care with plugins for this; bear in mind what happened with Yoast a few months back.

    Might just be safer to delete the posts in Wordpress, which should then return a regular 404.
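
As a rough illustration of Richard Hearne's sitemap suggestion above (assuming the 410 route is taken at all), the sketch below writes a one-off sitemap listing the removed URLs with a fresh lastmod, so that Google recrawls them sooner and sees the 410. The file names are hypothetical and this is only an outline of the idea, not a vetted tool; the sitemap would be submitted in Search Console and removed again once the URLs have dropped out of the index.

```php
<?php
// Rough sketch: build a one-off sitemap of removed URLs with a recent <lastmod>.
// "removed-urls.txt" (one absolute URL per line) and "removed-sitemap.xml"
// are hypothetical file names.
$removed_urls = file( 'removed-urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES );
$lastmod      = date( 'c' ); // current date/time in W3C format

$xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
$xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

foreach ( $removed_urls as $url ) {
    $xml .= "  <url>\n";
    $xml .= '    <loc>' . htmlspecialchars( $url, ENT_XML1 ) . "</loc>\n";
    $xml .= "    <lastmod>{$lastmod}</lastmod>\n";
    $xml .= "  </url>\n";
}

$xml .= "</urlset>\n";

file_put_contents( 'removed-sitemap.xml', $xml );
```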

View the original question in the Dumb SEO Questions community on Facebook, 02/07/2019.