About model distillation: I saw an article a few weeks back about the problems that arise when synthetic data is fed back into a generative AI model:
https://futurism.com/ai-trained-ai-generated-data-interview
And a paper: https://arxiv.org/pdf/2307.01850.pdf
As the article puts it, “it turns out, when you feed synthetic content back to a generative AI model, strange things start to happen. Think of it like data inbreeding, leading to increasingly mangled, bland, and all-around bad outputs. (Back in February, Monash University data researcher Jathan Sadowski described it as "Habsburg AI," or "a system that is so heavily trained on the outputs of other generative AI's that it becomes an inbred mutant, likely with exaggerated, grotesque features.")”
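For intuition, here's a minimal toy sketch of that feedback loop (my own illustration, not code from the article or the paper): each "generation" fits a Gaussian to samples drawn from the previous generation's fit, with mild truncation standing in for a generator that favors its most typical outputs. The variance collapses within a couple dozen generations:

```python
import numpy as np

# Toy sketch of the "data inbreeding" loop (an assumption-laden
# illustration, not the paper's experiment): each generation fits a
# Gaussian to samples drawn from the previous generation's model.
# Truncation stands in for a generator favoring high-likelihood outputs.

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0        # generation 0: the "real" data distribution
n = 10_000

for gen in range(1, 21):
    samples = rng.normal(mu, sigma, n)   # sample from the current model
    # Keep only the ~90% most typical samples (cherry-picking the mode).
    kept = samples[np.abs(samples - mu) < 1.645 * sigma]
    mu, sigma = kept.mean(), kept.std()  # refit on synthetic data alone
    if gen % 5 == 0:
        print(f"generation {gen:2d}: sigma = {sigma:.3f}")
```

In this toy setup sigma shrinks by roughly a constant factor per generation: each round discards the tails, and the refit model never sees them again, which is the same mechanism behind the bland, mode-collapsed outputs the article describes.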