🔗 Why You (Probably) Don't Need to Fine-tune an LLM
In this post, we'll talk about why fine-tuning is probably not necessary for your app, and why applying two of the most common techniques to the base GPT models (few-shot prompting and retrieval-augmented generation, or RAG) is sufficient for most use cases.
This post is targeted towards folks focused on building LLM applications (as opposed to research).
If you're a builder, it's important to know what's available in your toolbox, and the right time to use a given tool. Depending on what you're doing, there are probably ones you use more often (hammer, screwdriver), and ones that you use less often (say, a hacksaw).
A lot of very smart people are experimenting with LLMs right now, which has resulted in a pretty jam-packed toolbox, acronyms and all (fine-tuning, RLHF, RAG, chain-of-thought, etc.). It's easy to get stuck in the decision paralysis stage of "what technical approach do I use", even if your ultimate goal is to "build an app for X".
People often run into issues when using base model LLMs on their own: "the model didn't return what I wanted" or "the model hallucinated, its answer makes no sense" or "the model doesn't know anything about Y because it wasn't trained on it".
People sometimes turn to a fairly involved technique called fine-tuning, in hopes that it will solve all of the above. In this post, we'll talk about why fine-tuning is probably not necessary for your app.
⚠️ This post links to an external website. ⚠️
If this post was enjoyable or useful for you, please share it! If you have comments, questions, or feedback, you can send them to my personal email. To get new posts, subscribe using the RSS feed.