I’ve noticed a nasty trend in comment spam here lately. So far it’s only a couple of posts, but it could become a flood. Comment spammers are copying sentences out of legitimate comments and resubmitting them with a link or two changed.
If you’re not careful, this can fool even a human reviewer, since the spam is on topic and looks relevant. If it arrives a couple of months after a heavily commented article was posted, it’s very easy to miss.
We may need to adjust comment filters to flag comments that copy content from previous comments. I’m not sure whether any of the existing filters do that.
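As a rough sketch of what such a filter might look like, here is one way to measure how much of a new comment is lifted from earlier ones. It compares overlapping five-word sequences ("shingles"). The function names, the five-word window, and the 0.5 threshold are all my own assumptions for illustration. None of this comes from an existing filter, and the numbers would need tuning against real comments.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def copied_fraction(new_comment, earlier_comments, n=5):
    """Fraction of the new comment's shingles that already appear in earlier comments."""
    new = shingles(new_comment, n)
    if not new:
        # Too short to judge (fewer than n words); treat as not copied.
        return 0.0
    seen = set()
    for comment in earlier_comments:
        seen |= shingles(comment, n)
    return len(new & seen) / len(new)


def looks_copied(new_comment, earlier_comments, threshold=0.5, n=5):
    """Flag a comment for review when most of its text repeats earlier comments.

    The threshold is an arbitrary starting point, not a tested value.
    """
    return copied_fraction(new_comment, earlier_comments, n) >= threshold
```

A filter built this way should probably only hold a comment for moderation rather than delete it, since legitimate commenters quote each other all the time. It also does nothing about text copied from other sites.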
Even worse, now I’ve caught at least one apparently Polish spammer copying text out of other blog entries that reference this one and submitting that as comments here. The only hint that it’s spam comes from the site linked to. I don’t know if Bayesian analysis will catch these. Possibly a quick, automated Google plagiarism search might be in order?