About model distillation: I saw an article a few weeks back about issues with feeding synthetic data back into a generative AI:
https://futurism.com/ai-trained-ai-generated-data-interview
And a paper: https://arxiv.org/pdf/2307.01850.pdf
As per the article, “it turns out, when you feed synthetic content back to a generative AI model, strange things start to happen. Think of it like data inbreeding, leading to increasingly mangled, bland, and all-around bad outputs. (Back in February, Monash University data researcher Jathan Sadowski described it as "Habsburg AI," or "a system that is so heavily trained on the outputs of other generative AI's that it becomes an inbred mutant, likely with exaggerated, grotesque features.")”
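To make the "data inbreeding" idea concrete, here's a toy sketch I put together (mine, not from the article or the paper, and a cartoon compared to a real model): the simplest possible generative model, just empirical token frequencies over a Zipf-like vocabulary, refit each generation on the previous generation's own samples. Any token type that happens to draw zero samples gets probability zero forever, so the tail of the distribution erodes generation after generation; that's a miniature version of the blandness and diversity loss the article describes. The vocab and corpus sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: a vocabulary of 1,000 token types with a long-tailed
# (Zipf-like) frequency distribution, roughly like natural language.
VOCAB_SIZE = 1_000
CORPUS_SIZE = 5_000  # arbitrary toy numbers, chosen so the tail is thin

probs = 1.0 / np.arange(1, VOCAB_SIZE + 1)
probs /= probs.sum()
counts = rng.multinomial(CORPUS_SIZE, probs)  # generation 0: the real corpus

for generation in range(20):
    surviving = int((counts > 0).sum())
    print(f"gen {generation:2d}: {surviving:4d} of {VOCAB_SIZE} token types survive")
    # "Train" on the last corpus: the maximum-likelihood fit of a
    # categorical distribution is just the empirical frequencies.
    fitted = counts / counts.sum()
    # "Generate" the next corpus by sampling from the fitted model.
    # A type with zero count has zero probability and can never return.
    counts = rng.multinomial(CORPUS_SIZE, fitted)
```

Running it, the count of surviving token types only ever shrinks, since extinction is permanent in this loop. Real LLMs smooth over rare events rather than zeroing them out exactly, but the paper above argues the same tail-thinning pressure is what drives the "Habsburg AI" degradation.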
That's a shame to hear. I keep hoping all my data hoarding will come in handy one of these days!