Spam Filtering AI Content

Dec 08, 2022

As generative AI becomes more advanced, we will likely see an increase in spam that is difficult to distinguish from human-generated content. Here are some ways we can combat the next wave of AI-generated content.

Adversarial models that are trained to detect AI-generated content. Of course, this works both ways: the same detectors can be used to train generators that evade them.
Anomaly detection algorithms. These can be tuned to flag likely AI-generated content by looking at signals such as posting frequency and rate of topic change.
Client-level restrictions. Rate limits, limited API access, shadow-banning.
Better tools for humans to identify and report spam.
Harsher penalties. A very naive punishment policy is one where the penalty is inversely related to the chance of getting caught. For example, a worker at a remote site who is caught sleeping on the job during an infrequent visit would face a severe penalty precisely because visits are rare. For spam, that might mean removing users from the platform or imposing monetary fines (e.g., under the CAN-SPAM Act).
Transaction fees, either explicit (e.g., micro-transactions) or implicit, through something like proof-of-work. The early underpinnings of Bitcoin were in Hashcash, a proof-of-work system designed to curtail email spam.
Reputation systems, analogous to email sender or IP reputation systems.
Challenge/response systems like CAPTCHA.
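
To make the proof-of-work idea concrete, here is a minimal sketch in the spirit of Hashcash. It is not the real Hashcash stamp format: the use of SHA-256, the `message:nonce` encoding, and the names `mint_stamp`, `verify_stamp`, and `difficulty_bits` are all assumptions made for illustration. The point is the asymmetry: the sender has to try many hashes to produce a stamp, while the receiver checks it with one.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # 8 - bit_length() is the number of zero bits before the first set bit.
        bits += 8 - byte.bit_length()
        break
    return bits


def mint_stamp(message: str, difficulty_bits: int) -> int:
    """Search for a nonce whose hash clears the target (costly for the sender)."""
    for nonce in count():
        digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify_stamp(message: str, nonce: int, difficulty_bits: int) -> bool:
    """Check a stamp with a single hash (cheap for the receiver)."""
    digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


if __name__ == "__main__":
    # Hypothetical post content; 12 bits means roughly 4,096 attempts on average.
    post = "post: hello world"
    nonce = mint_stamp(post, difficulty_bits=12)
    print(nonce, verify_stamp(post, nonce, difficulty_bits=12))
```

Each extra bit of difficulty doubles the expected work per message, which is negligible for someone posting occasionally but adds up quickly for anyone posting at spam volume.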

Matt Rickard
