Matt Rickard


Horizontal Tuning: Instruction, Chat, and What Else?

blog.matt-rickard.com


Oct 23, 2023
So far, LLMs have been fine-tuned in two main ways beyond generic next-token completion.

  1. Instruction-tuned models are specialized in answering questions or following commands: “Write me a story” or “What is the capital of France?”

  2. Chat-tuned models are specialized in dialogue between entities (usually a human and an AI). Think of all the conversational agents (ChatGPT, etc.). For example, you can ask a chat-tuned model to summarize a document, but an instruction-tuned model will probably do a better job. However, chat-tuned models can usually hold a more coherent conversation, and they have been used to power many different applications like answering questions, tutoring, and customer support.
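To make the distinction concrete, here is a rough sketch of the prompt templates these two tuning styles typically train on. The Alpaca-style instruction template and the ChatML-style chat template are used as representative examples, not as the only formats in use:

```python
# Two representative fine-tuning prompt formats (illustrative, not exhaustive).

# Alpaca-style template, common for instruction-tuned models: a single
# instruction followed by a single response.
def instruction_prompt(instruction: str) -> str:
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

# ChatML-style template, common for chat-tuned models: a sequence of
# role-tagged turns that can continue over many exchanges.
def chat_prompt(turns: list[tuple[str, str]]) -> str:
    return "".join(
        f"<|im_start|>{role}\n{content}<|im_end|>\n" for role, content in turns
    )

print(instruction_prompt("What is the capital of France?"))
print(chat_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "What is the capital of France?"),
]))
```

The structural difference is the point: the instruction template assumes one command and one answer, while the chat template is open-ended and multi-turn, which is why each does better at the tasks it was tuned for.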

But what’s beyond instruction-tuning and chat-tuning? Are there similar horizontal applications of tuning that would make sense for LLMs? That is, beyond fine-tuning for specific tasks, can we come up with better formats to query LLMs? I don’t know, but my intuition says yes. It might entail a small structure that lives over the input and compiles down to some intermediate representation (which is why ChatML is so interesting). Some ideas:

  • Question-tuned: Given a block of text, return a list of insightful and relevant questions about the text. (Imperative, declarative, interrogative, and exclamatory interfaces).

  • Editor-tuned: Given a block of text, return the same block of text edited for correctness and clarity.

  • Schedule-tuned: Given a command, break it down into multiple smaller tasks.

  • Filter-tuned: Given a block of text and a set of fuzzy filters, return only the text that passes the filters.

  • Reverse-instruction-tuned: Given some output, generate the prompt. Could be useful for training or evaluating instruction-tuned models.

  • Reverse-chat-tuned: I don’t know exactly what this would be used for, but it would reverse the input-output pairs for chat-tuning. It might at least shed some more light on how these models work.

  • Diff-tuned: Given a block of text and a diff, return the original with the changes applied. Could be useful for everything from merge conflicts in code to document-based collaboration.
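A minimal sketch of the "small structure that compiles down to an intermediate representation" idea: each horizontal format could be a thin template over the input that compiles to a shared ChatML-like prompt. The task names and template strings below are invented for illustration, not an existing spec:

```python
# Hypothetical sketch: "horizontal" tuning formats as thin templates that
# compile down to a shared ChatML-like intermediate representation.
# Task names and template wording are made up for illustration.

TEMPLATES = {
    "question": "List insightful, relevant questions about the text below.",
    "editor": "Edit the text below for correctness and clarity.",
    "filter": "Return only the parts of the text that pass these filters: {filters}",
    "diff": "Apply the following diff to the text and return the result:\n{diff}",
}

def compile_request(task: str, text: str, **params: str) -> str:
    """Compile a task-typed request into a ChatML-style prompt string."""
    system = TEMPLATES[task].format(**params)
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{text}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(compile_request("editor", "Teh cat sat on teh mat."))
```

The appeal of this shape is that the task-specific structure stays small and declarative, while the compiled representation is the same across tasks, so one serving stack could route to different horizontally-tuned models.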

2 Comments
Alex Gorischek
Oct 24

Similar to your note on "Schedule-tuned", I could imagine models fine-tuned for use as the planning/reasoning engines for autonomous agents. This is a bit more general than taking a direct "command" as an input; the usual inputs would likely be "context about the state of the world", "goal", and "available tools".

Andrew Smith
Oct 23

Do you use Bard much? I use it fairly regularly, mainly for research. It seems to be kind of in between instruction-tuned and conversational. Do I have that right?

© 2023 Matt Rickard