Recently, I was working on a project that required me to copy a database with 3 million spam comments. The database export was 12GB and just about everything I tried failed because of the sheer size of this file. I eventually did get it to work. Here’s how:
First, I tried doing things server-side using Pippin’s Batch Comment Spam Deletion Plugin. It works really well, however, at the rate it was going, it would have taken about two full days. I then decided that doing things locally would likely be faster.
Pulling down the database
Because of the size of the database, pulling it down in and of itself was a large task. I did everything through the command line, using SSH to get into the server and using the mysql
command:
mysql [name of database] < backup.sql
Then from my Desktop, I used the SCP command to pull it down (SFTP would also work here):
scp user@servername:backup.sql ./
Installing the database
Now for the fun part. I use an application called Querious for my local database management. However, when trying to use it here, the program stalled. Once again, I relied on the command line. After SSHing into my virtual machine (which is vagrant-based, so vagrant ssh
worked), I ran this command (assuming mydb
is the name of my database):
mysql -u root -p mydb < backup.sql
Due to the size, it took about 10 minutes to fully install, but when it did, everything was running smoothly.
Deleting the comments
NOTE: Make sure to back up your database before performing any of these commands.
With 3 million spam comments to delete, the only real option was to use some command line query to get the job done. I have WP-CLI, a command line interface for WordPress, installed on my virtual machine, and there’s a nifty command to mass delete comments:
wp comment delete $(wp comment list --status=spam --format=ids)
I ran this command, and after a few minutes, it stalled out. It was starting to look like I would have to do this the old fashioned way: running queries against the database.
Luckily, WP-CLI allows you to do this easily, too, using wp db query
. This was a two-step process:
- Delete all comments flagged as spam.
- Delete any comment meta associated with the spam comments.
Deleting all comments flagged as spam
To do this, from my wp-cli enabled command line, I ran this query:
wp db query 'DELETE FROM wp_comments WHERE comment_approved = "spam"'
Again, due to the sheer size of this database, the command took several minutes to run. However, it completed, and at the end of it, there were no spam comments. Hooray!
Deleting associated comment meta
This isn’t the end of the story, however. There was one more step I needed to perform: removing any meta data associated with these comments. In general, this isn’t a critical task, but it did dramatically reduce the size of the database. This query should take care of that:
wp db query 'DELETE FROM wp_commentmeta WHERE meta_key LIKE "%akismet%"'
This essentially removes any comment meta that has been touched by Akismet, WordPress’s most popular comment spam plugin.
Be sure to test!
Be sure to always take backups as needed and test! You can make sure you’re seeing the right data by using SELECT
statements to see the comments that will be deleted as well.
Happy cleaning!