News

Switching to Local YOLO + CLIP Detection

This was a big architecture day. I replaced the old cloud-based vision flow with a fully local pipeline built around YOLO object detection and CLIP visual search. That changed both the cost profile and the feel of the product in one shot.

The new setup means pins show up right after detection, while pricing continues asynchronously in the background. That is a much nicer experience than waiting on everything at once, especially when someone uploads a busy board with a lot of pins on it. The app feels quicker even when there is still work happening behind the scenes.
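That "pins first, prices later" flow can be sketched with a plain thread pool: hand the detected pins back immediately, and let pricing jobs fill in prices as they finish. `price_pin` here is a hypothetical stand-in for the real lookup, so only the control flow is real.

```python
from concurrent.futures import ThreadPoolExecutor, wait

def price_pin(pin: dict) -> dict:
    """Hypothetical pricing lookup; returns a fixed value for illustration."""
    pin["price"] = 4.99  # placeholder for the real async price source
    return pin

def handle_upload(pins: list[dict], pool: ThreadPoolExecutor):
    # Kick off pricing in the background and return the pins immediately,
    # so the UI can render them before any price is known.
    futures = [pool.submit(price_pin, pin) for pin in pins]
    return pins, futures

pool = ThreadPoolExecutor(max_workers=4)
pins, futures = handle_upload([{"id": 1}, {"id": 2}], pool)
# The caller can render `pins` right away; prices appear as futures complete.
wait(futures)
pool.shutdown(wait=True)
```

Because the pin dicts are mutated in place, the UI layer only needs to re-read them (or listen for completion) to pick up prices as they land.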

Just as important, there are no more per-image cloud API costs hanging over every upload. All inference happens locally now, which makes the whole thing more sustainable for a hobby project. It took some rewiring, but this feels much closer to the long-term shape of the system.