Dumb SEO Questions

(Entry was posted by Loren Baker on this post in the Dumb SEO Questions community on Facebook, 10/18/2018).

Canonical to a PDF file

Hi everyone, extremely Dumb SEO Question here. Contributed a client's article to a traditional magazine which only has a PDF for its online version.

Would like to also publish that content on the client's blog.

Normally, I would canonical back to the HTML page where we guest posted it. But it's a PDF, and it's indexed.

Should I:

a) Canonical to a PDF file? :P

b) Self refer and don't worry about the PDF, then see how Google treats it and possibly change the canonical if the magazine ends up launching a web version

c) Something entirely different
This question begins at 00:22:52 into the clip.

YOUR ANSWERS

Selected answers from the Dumb SEO Questions Facebook & G+ community.

  • Jobin John: Go for a) Canonical to PDF. Follow the examples here: http://moz.com/.../how-to-advanced-relcanonical-http-headers. When you've set it up, it's important to check that it returns the correct HTTP header using either a) a browser tool such as Live HTTP Headers, or b) a web-based header checker such as Rex Swain's HTTP header checker. You have to do this with the Link canonical HTTP header of your PDF file, which you specify in an .htaccess file (for Apache) or the IIS equivalent. Here is what Google says: "Indicate the canonical version of a URL by responding with the Link rel="canonical" HTTP header. Adding rel="canonical" to the head section of a page is useful for HTML content, but it can't be used for PDFs and other file types indexed by Google Web Search. In these cases you can indicate a canonical URL by responding with the Link rel="canonical" HTTP header."
  • Loren Baker: Thanks, PDF on a third party though. No canonical in header.
  • Jobin John: Loren Baker, if you want the PDF to appear as the original form of the content in search results, the HTML version should have a canonical tag that points to the PDF version. For example, the HTML page should carry a canonical tag whose href is the PDF URL.
  • Michael Martinez: Do not canonicalize if the PDF is for the entire magazine. You are only supposed to canonicalize identical content. Probably you don't need to do anything.
  • Loren Baker: Good point, PDF is one piece. I'm probably just going to self refer and see what happens. Client owns their own content anyway :)
  • Barry Schwartz: C) 404 it
  • Loren Baker: Going to canonical it here: http://www.barryschwartz.org/.../kick_me_if_i_stop...
  • Arsen Rabinovich: Barry, return a 410
  • Victor Antiu: Don't canonical. There's no point. Just link to the PDF as a reference.
  • Selena Vidya: I would do option B. If it’s not a crawlable PDF and the text can’t get indexed / they didn’t layer in a text version to serve Google, it shouldn’t be seen as a dupe.
  • Dan Thies: Link rel=alternate
  • Rob Woods: Probably B, especially if the PDF is not just the article, and who knows if the text in the PDF will even be readable.
  • Roger Montti: If the PDF is linked from somewhere, even a sitemap, then the PDF text is crawlable and indexable. And the document *can* rank for keyword searches. You can test it by doing a filetype:pdf search.
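
Several answers above turn on where a canonical is declared: in an HTTP Link header (the only option for a PDF) or in a link tag in the head of an HTML page. As a rough illustration of the "check that it returns the correct header" step, here is a minimal Python sketch that pulls the canonical URL out of either form. The function names and the example.com URLs are hypothetical, and the header parsing is deliberately simple (it does not handle commas inside quoted parameters), so treat it as a starting point rather than a full Link header parser.

```python
import re
from html.parser import HTMLParser
from typing import Optional


def canonical_from_link_header(header_value: str) -> Optional[str]:
    """Return the URL marked rel="canonical" in an HTTP Link header value, if any."""
    # A Link header can hold several entries: <url1>; rel="a", <url2>; rel="b"
    for match in re.finditer(r'<([^>]*)>([^,]*)', header_value):
        url, params = match.group(1), match.group(2)
        rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.IGNORECASE)
        # rel may list several space-separated relation types
        if rel and "canonical" in rel.group(1).lower().split():
            return url
    return None


class _CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.href = attr.get("href")


def canonical_from_html(html: str) -> Optional[str]:
    """Return the canonical URL declared in an HTML document, if any."""
    finder = _CanonicalFinder()
    finder.feed(html)
    return finder.href


if __name__ == "__main__":
    # Hypothetical values: a header a PDF response might carry, and a blog page
    # whose canonical tag points at the PDF (option a in the question).
    header = '<https://example.com/magazine/article.pdf>; rel="canonical"'
    page = (
        '<html><head><link rel="canonical" '
        'href="https://example.com/magazine/article.pdf"></head></html>'
    )
    print(canonical_from_link_header(header))
    print(canonical_from_html(page))
```

In Loren's situation the PDF sits on a third-party server, so only the second check applies to anything he controls: the blog page's own canonical tag, whether it points at the PDF (option a) or at itself (option b).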
