Still relevant today. Many problems people throw at LLMs can be handled more efficiently with text completion than by begging a model 20x the size (and probably more than 20x the cost) to produce the right structured output. https://www.reddit.com/r/LocalLLaMA/comments/1859qry/is_anyo...
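To put rough numbers on the cost claim: for dense transformers, a forward pass costs on the order of 2 × parameters FLOPs per token, so per-token compute scales linearly with model size. A quick back-of-envelope sketch (the 0.4B and 8B sizes are hypothetical, chosen only to illustrate a 20x gap):

```python
# Back-of-envelope inference cost for dense transformers.
# Rule of thumb: one forward pass costs roughly 2 * params FLOPs per token,
# so per-token cost scales linearly with parameter count.
def inference_flops(params: float, tokens: int) -> float:
    """Approximate forward-pass FLOPs for a dense model over `tokens` tokens."""
    return 2 * params * tokens

small = 0.4e9   # hypothetical 0.4B task-specific model
large = 8e9     # hypothetical 8B general-purpose model (20x the parameters)
tokens = 1_000  # tokens processed per request

ratio = inference_flops(large, tokens) / inference_flops(small, tokens)
print(f"{ratio:.0f}x more compute per request")  # → "20x more compute per request"
```

Hosted-API pricing tends to grow at least as fast as raw FLOPs once batching and margins are factored in, which is presumably the "probably more than 20x the cost" point.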
May I add GLiNER to this? There's the original Python version and a Rust port. Fantastic (non-LLM) models for entity extraction. There are many others.
I really think using small models for a lot of small tasks is the best way forward, but it's not easy to orchestrate.
> I think the future will be full of much smaller models trained to do specific tasks.
This was the very recent past! Up until we got LLM-crazy in 2021, this was the primary thing that deep learning papers produced: New models meant to solve very specific tasks.
My threshold for “does not need to be smaller” is “can this run on a Raspberry Pi”. This is a helpful benchmark for maximum likely useful optimization.
A Pi has 4 cores and 16GB of memory these days, so running Qwen3 4B on a Pi is pretty comfortable: https://leebutterman.com/2025/11/01/prompt-optimization-on-a...
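The memory math checks out on paper, too. A rough sketch, assuming ~0.55 bytes per parameter for a 4-bit GGUF quantization (a common rule of thumb that folds in quantization overhead) and a generous budget for KV cache plus the OS:

```python
# Back-of-envelope: does a 4-bit quant of a 4B model fit in a 16 GB Pi?
params = 4e9             # Qwen3 4B parameter count
bytes_per_param = 0.55   # ~4-bit GGUF incl. overhead (rule-of-thumb assumption)
weights_gb = params * bytes_per_param / 1e9

kv_and_os_gb = 3.0       # generous allowance for KV cache + OS (assumption)
total_gb = weights_gb + kv_and_os_gb

print(f"weights ≈ {weights_gb:.1f} GB, total ≈ {total_gb:.1f} GB of 16 GB")
```

Plenty of capacity headroom; in practice the bottleneck on a Pi is memory bandwidth and CPU throughput (tokens per second), not fitting the weights.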
I don’t understand why today’s laptops are so large. Some of the smallest "ultrabooks" getting coverage sit at 13 inches, but even this seems pretty big to me.
If you need raw compute, I totally get it. Things like compiling the Linux kernel or training local models require a high level of thermal headroom, and the chassis has to dissipate heat in a manner that prevents throttling. In cases where you want the machine to act like a portable workstation, it makes sense that the form factor would need to be a little juiced up.
That said, computing is a whole lot more than just heavy development work. Some domains have a tightly scoped set of inputs and require the user to interact in a very simple way. Responding to an email is a good example: typing "LGTM" requires very little screen area, no physical keyboard, and no active cooling. Checking the weather is similar: you don't need 16 inches of screen real estate to go from wondering if it's raining to seeing a cloud icon.
I say all this because portability is expensive, and not only in terms of back pain: maintaining the ecosystem required to run these machines gets complicated. You either end up shelling out for specialized backpacks or fighting for outlet space at a coffee shop just to keep the thing running. Either way, you're paying a big cost in money (and calories) every time a user types "remind me to eat a sandwich".
I think the future will be full of much smaller devices. Some hardware to build these already exists, and you can even fit them in your pocket. This mode of deployment is inspiring to me, and I’m optimistic about a future where 6.1 inches is all you need.
I dunno. It kinda works, and points for converting the whole article. But something is lost in the switch-up here. The size of a laptop is more or less the size of the display (unless we're going to get weird and have a projector built in), so it is basically a figure of merit.
Nobody actually wants more weights in their LLMs, right? They want the things to be “smarter” in some sense.
A typical use case for a large laptop is one you stow away after work or only carry occasionally. I have a PC for coding at home, but I use a ThinkPad with the largest screen I could get for coding in my camper van (stowing it when not in use, for lack of space) or for longer stays at my mother's home (setting it up once at the start of a visit). I also have a very small, light, and inexpensive subnotebook that's easy to carry around, but I rarely use it these days, and not for coding at all.
I think I have almost the opposite intuition. The fact that attention models are capable of making sophisticated logical constructions within a recursive grammar, even for a simple DSL like SQL, is kind of surprising. I think it’s likely that this property does depend on training on a very large and more general corpus, and hence demands the full parameter space that we need for conversational writing.
2000: My spoon is too big
2023: My model is too big