Matt Rickard

Benefits of Small LLMs

blog.matt-rickard.com

Oct 19, 2023
In a world where “scale is all you need,” the biggest models don’t always win. Here are some reasons why smaller LLMs might pull ahead.

Many of these points follow from each other.

  1. Quicker to train. Obvious, but quicker feedback means faster iterations. Faster training, faster fine-tuning, faster results.

  2. Runs locally. The smaller the model, the more environments it can run in.

  3. Easier to debug. If a model runs on your laptop, you can inspect and debug it with the local tools you already know.

  4. No specialized hardware. Small LLMs rarely require specialized hardware to train or run inference. In a market where the biggest chips are in high demand and short supply, this matters.

  5. Cost-effective. Smaller models are cheaper to run, which makes more applications NPV-positive.

  6. Lower latency. Smaller models generate completions faster. Most large models are still too slow for low-latency environments today.

  7. Runs on the edge. Low latency, smaller file size, and shorter startup times mean that small LLMs can run at the edge.

  8. Easier to deploy. Getting to production is sometimes the hardest part.

  9. Can be ensembled. It’s rumored that GPT-4 is a mixture of eight smaller models. Ensembling smaller models is a strategy that’s worked for decades of pragmatic machine learning.
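The ensembling idea can be sketched as majority voting over several models’ answers. Everything here is illustrative: the `model_*` callables are hypothetical stand-ins for small local LLMs, which in practice would call a local inference runtime.

```python
from collections import Counter

def ensemble_vote(prompt, models):
    """Ask every model for an answer and return the most common one."""
    answers = [model(prompt) for model in models]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

# Hypothetical stand-ins for small local models; real ones would run
# local inference instead of returning canned strings.
model_a = lambda prompt: "Paris"
model_b = lambda prompt: "Paris"
model_c = lambda prompt: "Lyon"

print(ensemble_vote("What is the capital of France?", [model_a, model_b, model_c]))
```

Because each member is small, the whole ensemble can still be cheaper to run than one giant model, and disagreement among members is a useful uncertainty signal.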

A few more conjectures on why small models might be better:

  • More interpretable? We don’t have a definitive theory of LLM interpretability, but I imagine we’ll understand what’s going on in 7-billion-parameter models before we understand 60-billion-parameter ones.

  • Enhanced reproducibility? Small LLMs can easily be trained from scratch again. Contrast this with the largest LLMs, which might undergo multiple checkpoints and continued training. Reproducing a model that was trained in an hour is much easier than one trained in six months.
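The reproducibility point can be shown with a toy example (this is a sketch, not a real training setup): if all randomness is seeded, two from-scratch runs are bit-for-bit identical, which is exactly what cheap retraining of a small model lets you verify.

```python
import random

def train_toy_model(seed, steps=1000):
    """Toy 'training' run: a random walk over a single weight.

    With a fixed seed, every from-scratch run is bit-for-bit identical."""
    rng = random.Random(seed)
    weight = 0.0
    for _ in range(steps):
        weight += rng.uniform(-1.0, 1.0) * 0.01  # stand-in for a gradient step
    return weight

# Two from-scratch runs with the same seed reproduce exactly.
assert train_toy_model(seed=42) == train_toy_model(seed=42)
# Different seeds (different init, data order, etc.) generally diverge.
assert train_toy_model(seed=42) != train_toy_model(seed=7)
```

For a model trained in an hour, you can actually run this check end to end; for one trained in six months across many checkpoints, you realistically can’t.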

© 2023 Matt Rickard