Local Models: Specialists, Not Substitutes
A local OCR pipeline demonstrates where small models genuinely win
For context, I’m running an M3 MacBook Pro with 36GB of unified memory. It’s a capable machine for local inference, but it has a ceiling around 30B parameters. Anything larger either won’t load or crawls to unusable speeds.
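To see why the ceiling lands near 30B, a quick back-of-envelope helps: weight memory scales linearly with parameter count and bits per weight, and the remainder of the budget goes to the KV cache and runtime. The sketch below is a rough estimate, not a benchmark; the 25% overhead factor is an assumption I'm making for illustration, and unified memory is also shared with the OS, so the real budget is below 36 GiB.

```python
# Back-of-envelope memory estimate for local LLM inference.
# The ~25% KV-cache/runtime overhead is a rough assumption, not a
# measured value, and macOS reserves part of unified memory for itself.

GIB = 1024**3

def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed just to hold the weights."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB

for params in (7, 14, 30, 70):
    for bits, label in ((16, "fp16"), (4, "q4")):
        weights = weight_memory_gib(params, bits)
        total = weights * 1.25  # crude allowance for KV cache + runtime
        fits = "fits" if total <= 36 else "too big"
        print(f"{params:>3}B {label:>4}: ~{weights:5.1f} GiB weights, "
              f"~{total:5.1f} GiB with overhead -> {fits} in 36 GiB")
```

At 4-bit quantization a 30B model needs roughly 14 GiB for weights and comfortably fits; a 70B model lands around 33 GiB before overhead and doesn't, which matches what I see in practice.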
Local models haven’t wowed me for coding tasks. I’ve tried routing smaller models into my workflow, and the results consistently fall short; a minimal sketch of what that routing looked like follows.
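The mechanics of routing are the easy part; quality is where it breaks down. Here is a minimal sketch under a few assumptions: a local OpenAI-compatible server (Ollama's default port), placeholder model names, and a naive length-based routing rule that I'm using purely for illustration.

```python
# Minimal routing sketch: send small prompts to a local model, fall back
# to a hosted one otherwise. Assumes an OpenAI-compatible local server
# (e.g. Ollama at localhost:11434); the model names are placeholders.
from openai import OpenAI

local = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
hosted = OpenAI()  # reads OPENAI_API_KEY from the environment

def complete(prompt: str) -> str:
    # Naive rule: short, self-contained prompts go local; anything
    # longer goes to the hosted model. Real routing needs task-aware
    # heuristics, which is exactly where my attempts fell down.
    client, model = (
        (local, "qwen2.5-coder:14b") if len(prompt) < 2000
        else (hosted, "gpt-4o")
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(complete("Write a Python one-liner to reverse a string."))
```

A crude length threshold like this was typical of my experiments: cheap to implement, but it routes plenty of short-yet-hard prompts to a model that can't handle them.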



