Matt Rickard

What Diffusion Models Can Teach Us About LLMs

Jun 4, 2023

The image diffusion model ecosystem evolved quickly thanks to the license-friendly, open-source Stable Diffusion models. Now, with LLaMa, Vicuna, Alpaca, RedPajama, Falcon, and many more open-source LLMs, text-generation models are evolving nearly as quickly. Developer tools, infrastructure, and other techniques that originated with diffusion models might eventually come to text-generation LLMs.

LoRA — Low-Rank Adaptation of Large Language Models quickly became the standard way to extend the base Stable Diffusion models. LoRAs became extremely popular for a few reasons:

  • Much smaller file size

  • Faster to fine-tune (more in a hacker’s guide to LLM optimization)

With QLoRA, we’re getting closer to this reality for text-generation LLMs. 
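The core idea behind LoRA can be shown in a few lines. This is a minimal, illustrative sketch (plain Python lists, toy dimensions, made-up values, no real framework): instead of fine-tuning a full weight matrix W (d × k), you train two small matrices B (d × r) and A (r × k) with rank r much smaller than d or k, and apply W' = W + BA.

```python
# Minimal sketch of the LoRA idea. All names, shapes, and values here are
# illustrative; real implementations operate on framework tensors.

def matmul(X, Y):
    """Multiply two matrices represented as lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

d, k, r = 4, 4, 1  # full dims vs. adapter rank (r << min(d, k))

# Frozen base weights (identity, just for illustration).
W = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(d)]
B = [[0.1] for _ in range(d)]       # d x r, trainable
A = [[0.2, 0.0, 0.0, 0.0]]          # r x k, trainable

W_adapted = add(W, matmul(B, A))    # W' = W + BA

# The adapter stores d*r + r*k numbers instead of d*k -- hence the small files.
adapter_params = d * r + r * k      # 8
full_params = d * k                 # 16
```

The parameter count is why LoRA files are so small: at realistic dimensions (say d = k = 4096, r = 8), the adapter is hundreds of times smaller than the full matrix.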

Prompt Matrix — Used to test different parameters for image generation. You might test CFG Scale at a few different values on the X-axis against sampling steps on the Y-axis.

This is starting to happen, except with parameters like temperature on the X-axis and different models on the Y-axis. Or different prompts tested across different models. Why now? Enough models to want to test, and cheap and quick enough to reasonably test multiple models.
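A text-model prompt matrix can be sketched as a simple parameter grid. This is a hypothetical example; `fake_generate` stands in for a real model or API call, and the model names are made up.

```python
from itertools import product

def fake_generate(model, prompt, temperature):
    """Stand-in for a real model call; returns a labeled string."""
    return f"{model}@{temperature}: {prompt[:10]}"

models = ["model-a", "model-b"]     # Y-axis: different models
temperatures = [0.0, 0.7, 1.0]      # X-axis: sampling temperature
prompt = "Explain LoRA in one sentence."

# One cell per (model, temperature) pair -- the "prompt matrix".
grid = {
    (m, t): fake_generate(m, prompt, t)
    for m, t in product(models, temperatures)
}
```

Swapping the axes for different prompts across different models is the same loop with a different `product(...)`.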

Prompt Modifiers / Attention — Using () in the prompt increases the model’s attention to the enclosed words, and [] decreases it. You can also add numeric modifiers, e.g., (word:1.5). There’s no direct equivalent for LLMs, but logit bias is one way to steer them toward a particular result. See ReLLM and ParserLLM.
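The mechanics of logit bias are straightforward to illustrate. In this self-contained sketch (toy vocabulary and values), a constant is added to a token's logit before softmax, shifting probability mass toward it:

```python
import math

def softmax(logits):
    """Convert a dict of logits into a dict of probabilities."""
    m = max(logits.values())
    exps = {t: math.exp(l - m) for t, l in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Three candidate tokens, initially equally likely.
logits = {"yes": 1.0, "no": 1.0, "maybe": 1.0}
bias = {"yes": 2.0}  # steer the model toward "yes"

biased = {t: l + bias.get(t, 0.0) for t, l in logits.items()}
probs = softmax(biased)  # "yes" now dominates
```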

Negative Prompts — LLMs don’t natively support negative prompts (the way Stable Diffusion does). One way to achieve a similar result is, again, through logit bias. 
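A rough text-model analogue of a negative prompt is pushing unwanted tokens' logits strongly negative so they are effectively never sampled. This sketch uses a toy vocabulary; the -100 value mirrors the range some APIs allow for logit bias, but treat the numbers as illustrative.

```python
import math

def softmax(logits):
    m = max(logits.values())
    exps = {t: math.exp(l - m) for t, l in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

logits = {"cat": 2.0, "dog": 1.5, "ferret": 1.0}
banned = {"ferret"}  # the "negative prompt"

# A large negative bias makes the banned token's probability vanish.
adjusted = {t: (l - 100.0 if t in banned else l) for t, l in logits.items()}
probs = softmax(adjusted)
```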

Loopback — Automatically feed output images as input in the next batch. This is somewhat equivalent to how we’re starting to think about agents in LLMs. 
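The loopback pattern for text is just a loop that chains model calls, each output becoming the next input. A minimal sketch, with `step` standing in for a real model call:

```python
def step(text):
    """Stand-in for a model call; a real version would call an LLM."""
    return text + " ->"

def loopback(prompt, rounds):
    """Feed each output back as the next input, keeping the history."""
    history = [prompt]
    for _ in range(rounds):
        history.append(step(history[-1]))
    return history

runs = loopback("start", 3)
```

An agent loop is this structure plus a stopping condition and, usually, tool calls between steps.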

Checkpoint Merger — There are utilities to merge checkpoints from different models. For example, blend styles, apply multiple LoRAs, and more. However, we haven’t seen this as much in the text-generation models (other than applying the LoRA weights). I’m unsure how well it works, but it's something to look into. 
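The simplest form of checkpoint merging is a weighted average of corresponding parameters. A sketch under simplifying assumptions: checkpoints are plain dicts of floats rather than tensors, and the two models share an architecture (the arithmetic is the same either way).

```python
def merge(ckpt_a, ckpt_b, alpha=0.5):
    """Return alpha * A + (1 - alpha) * B for every shared parameter."""
    assert ckpt_a.keys() == ckpt_b.keys(), "checkpoints must share architecture"
    return {k: alpha * ckpt_a[k] + (1 - alpha) * ckpt_b[k] for k in ckpt_a}

# Toy "checkpoints" -- real ones hold tensors keyed the same way.
a = {"layer.weight": 1.0, "layer.bias": 0.0}
b = {"layer.weight": 3.0, "layer.bias": 2.0}

merged = merge(a, b, alpha=0.25)  # 25% of A, 75% of B
```

Whether naive averaging preserves quality in text models is exactly the open question the paragraph above raises; diffusion UIs expose it as a slider and let users judge the blend visually.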
