Today I vibe coded a small project with Gemini, which fetches all the entries from a WordPress blog, and then extracts core arguments and key facts from them.
To my surprise, Gemini generates the code correctly in almost one-shot.
I used Ollama locally with a gemma3:12b model for this project, because I like to have absolute privacy and don’t want to pay any token fees. Also, for simple tasks like this, a local LLM is good enough.
The project turns out to be very simple, with just two Python scripts.
- scrapper.py
- analyzer.py
The scrapper is very simple, with requests and BeatifulSoup packages being used.
For the analyzer, there are a few tricky parts I like to take notes here.
In order to talk to the model, you just need to create a client with the ollama package, and the Ollama host is on http://localhost:11434 by default.
For each LLM chat call, I specified a system prompt and a user prompt in the message, together with a format argument with json as the value, so that I always get well formed json responses from the model.
I also assigned temporature and num_ctx (context length) arguments. A lower temporature hyperparameter allows the model to generate more deterministic results, and you need to assign proper context length argument so that it does not exceed the model’s limit.
Lastly, since I have hundreds of posts to be summarized, I used a batching + 2 step summarization approach to avoid exceeding context window limit.
I’m very satisfied with the result. Absolute privacy + zero token fee + no 3rd party agent frameworks used!
Maybe in the future I would explore more interesting ideas using local models with Ollama.