Help! What to do with bad pages...

by @ctmartin (137), 3 years ago

Hi all, I'm new to the forum and need some advice!

Here's the situation: my client's WordPress website was hacked back in February and the hacker installed code that created a huge number of links to some pharmacy website. The links look like "www.mywebsite.com/?p=viagra+100mg+pills" where the "?p=" query string varies to redirect to some product page on that pharmacy website. Google has indexed over 12,000 of them.

Of course, there are also thousands of backlinks out there to reinforce the product page redirects.

What's the best approach to dealing with these as far as SEO is concerned?

The hack has been removed -- in fact, I completely rebuilt the website, now with a much more secure framework and on a completely new server. No more hack.

The backlinks are still out there as well. I suppose I can use the Disavow Tool to neutralize those, though that will take time since there are so many of them. I've also seen conflicting advice on whether to use the disavow tool at all.

But now about a month later, Search Console still shows over 12,000 of these "fake" pages (redirects) that now no longer redirect, but the pages are still indexed.

I'm especially not sure what's best to do about these.

I can match those URLs with a regex and force a 404 on them... Google will eventually drop the URLs, but then I'll have 12,000 404 errors associated with the website. That's not good, obviously.
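Something like this is what I have in mind for .htaccess (just a sketch, untested, and assuming Apache with mod_rewrite enabled; the R=404 flag answers the request with a 404 status and stops rewriting):

```apache
# Sketch: answer any request carrying a p= query parameter with 404 Not Found.
# (^|&)p= matches p only as a standalone parameter name (so "amp=" is ignored).
RewriteEngine On
RewriteCond %{QUERY_STRING} (^|&)p= [NC]
RewriteRule ^ - [R=404,L]
```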

I can redirect the query-string URLs to another page on the website -- again matching them with a regex -- but that probably won't have any impact. As it is, some of these query strings go to specific (valid) blog pages on the site... I don't really know why, though.

All I know is that if I just sit here, they won't go away because Google will just continue to think they exist. Like zombies. Google doesn't seem to care that my sitemap makes no reference to these zombie page links.

SEO pros -- what SHOULD I do? I want to kill the zombies but not kill the rest of the population in the process.

MANY THANKS in advance... -Chris

6 Replies
2 Users
by @ms (3791), 3 years ago

Hi Chris, welcome to SEO Forum!

The main question here is: did the whole hack affect your search rankings or traffic in any way? You didn't mention it, so my best guess is that it did not.

Google says you should NOT need to use the Disavow Tool, as Google is pretty good at tracking down bad links like these and giving them zero value. Generally speaking, you would use the Disavow Tool if someone had linked to your website maliciously (pointing black-hat SEO juice at your site), but that's not the case here. So all I think you should do is remove the pages and all the redirects, and you should be good. Give Google some time to deal with it... it's not an instant process.

by @ctmartin (137), 3 years ago

Thank you. I guess what I ought to do is set robots.txt to disallow anything matching /?p=*, since the pages aren't really there but Google has indexed the query-string versions of the URLs. Ugh!
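Something like this is what I mean (the * wildcard in Disallow isn't part of the original robots.txt standard, but Googlebot honors it; one caveat I should note: on a default WordPress install, ?p=<ID> is the real post-ID parameter, so a blanket block could also hide legitimate URLs):

```
User-agent: *
Disallow: /*?p=
```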

I don't know if it has actually impacted SE ranking, but it just doesn't seem like a good thing to have hanging over the site as we try to build legit SE traffic and start competing for some local keywords.

by @ms (3791), 3 years ago

I don't think that would help -- as you pointed out, the pages are already in G's index. Disallowing crawler access to those pages (which respond with 404 anyway) won't remove them from the index.

by @ctmartin (137), 3 years ago

Ah, yeah, you're right... robots.txt will just stop Google from crawling the pages, but if someone, somewhere, is linking to these query-string pages, Google will still think they're real.

My problem then is: how do I remove a "query string page" without removing the main page? For example, I want to remove "www.mywebsite.com/?p=viagra+100mg+pills" but I obviously don't want to remove "www.mywebsite.com" home page in the process!

Can I do something in .htaccess? I can't just redirect to a valid page because that just validates the query string in Google's eyes. Sorry, I'm lacking imagination here...

by @ctmartin (137), 3 years ago

I guess I could forbid anything with "?p=" in .htaccess. For the record, here's what I'm doing... let's see if this has the right effect (gotta wait for Google).

# Forbid any request carrying a standalone p= query parameter.
# (^|&)p= matches p only as a full parameter name (so e.g. "amp=" is ignored);
# [F] returns 403 Forbidden ([G] would return 410 Gone instead).
RewriteCond %{QUERY_STRING} (^|&)p= [NC]
RewriteRule ^ - [F]
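One way to sanity-check the pattern offline before waiting on Google (a Python sketch; I'm assuming the anchored variant (^|&)p=, and since mod_rewrite evaluates RewriteCond against %{QUERY_STRING} alone, the samples below are bare query strings, not full URLs):

```python
import re

# mod_rewrite's RewriteCond tests the pattern against the query string only,
# so we check it against sample query strings rather than full URLs.
pattern = re.compile(r"(^|&)p=", re.IGNORECASE)

spam = ["p=viagra+100mg+pills", "foo=1&p=cialis"]
legit = ["page_id=42", "amp=1", "wrap=true&stop=now"]

assert all(pattern.search(q) for q in spam)       # spam query strings match
assert not any(pattern.search(q) for q in legit)  # ordinary parameters don't
print("pattern behaves as expected")
```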
by @ms (3791), 3 years ago

I don't know. Personally, I wouldn't touch it.

You don't need to sanitize every possible parameter and its value. I can hit any site with ?p=whatever; that doesn't mean the site has parameterized content for my query. Doesn't make sense to fix, imo.
