Model distillation uses one model to generate training data for a second model.
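The repeated-distillation question can be sketched with a toy stand-in: here the "model" is just a Gaussian fit to samples drawn from the previous generation, so each round plays the role of training a student on a distilled teacher's outputs. This is a minimal illustration of the feedback loop, not an actual neural distillation pipeline; the parameter choices (50 samples, 200 generations) are arbitrary assumptions.

```python
import random
import statistics

def iterated_distillation(mean, std, n_samples, generations, seed=0):
    """Repeatedly fit a Gaussian 'student' to samples from the previous 'teacher'.

    Each generation only sees a finite sample from its predecessor, so
    estimation error compounds across rounds -- a toy analogue of
    distilling on distilled data.
    """
    rng = random.Random(seed)
    history = [(mean, std)]  # track (mean, std) of every generation
    for _ in range(generations):
        # The student's entire "training set" is samples from the teacher.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # The student is just the Gaussian fit to those samples.
        mean = statistics.fmean(samples)
        std = statistics.stdev(samples)
        history.append((mean, std))
    return history

history = iterated_distillation(0.0, 1.0, n_samples=50, generations=200)
```

Inspecting `history` shows the fitted parameters drifting away from the original (0, 1) as sampling noise accumulates generation over generation, which is the intuition behind asking whether the chain eventually stops being "reasonable."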
Curious what happens if you keep distilling models on the outputs of already-distilled models indefinitely... is there a point at which the model is no longer reasonable?