Yet Another How I Did A Thing With An LLM Post
Leaning into my awesome leadership skillz and my laziness about reading documentation, I decided to practice the last marketable skill and write code with an LLM as the majority contributor.
I used opencode and its default, free model, Big Pickle. This is a postmortem.
Disclaimer: I have written a bunch of Go code before, but not in recent years.
What was achieved
A working personal workflow runner tool, built in about a week: https://gitlab.com/andras.horvath1/marathon. I chose this because I wanted one for long-running tasks in the homelab, and couldn't find a decent tool that scratched my itches.
I wrote the basic data structures and a sort of "walking skeleton" by hand, then let opencode update various things, using a NixOS sandbox because I'm paranoid about leaking my other data.
I reviewed and hand-tested everything the LLM wrote.
What went well
As long as I
- Had a crystal clear mental model of what I wanted (features, control flows, data structures)
- Had a reasonable understanding of the toolkit that should be used (libraries, abstractions)
- Kept the task at hand minimal and super well defined
- Trusted, but verified every little detail
the LLM
- Produced okay code
- Read the docs so I didn't have to understand the details of TUI elements or CLI libraries.
- Was able to write tests and assorted boilerplate, which is neat.
- Had a reasonable structure in putting stuff into the "right" files, without being told so.
- Produced semi-useful, if sometimes incorrect, summaries of what it had done. They sped up my understanding of the code, but I couldn't trust them.
- Rarely broke existing functionality, perhaps because of my insistence on testing.
What didn't go well
- It struggled with concurrency patterns: some took many tries and manual correction to get right (right enough for a "personal tool", at least).
- It tended to reinvent the wheel, often badly. The scrolling UI window for example took 193489713 tries and still didn't work right, until I found a library, ran git reset --hard, and retried with that.
- Sometimes it refactored Just Because, sometimes it generated two almost-identical functions for slightly different use cases.
- It often took a long time to come back with a result.
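The post doesn't say which concurrency patterns caused trouble. As a rough illustration of the kind of thing a workflow runner tends to need, here is a minimal sketch of running tasks with a cap on how many are in flight at once. Everything in it (Task, RunAll, demoTasks, the task names) is hypothetical and not taken from marathon.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work; it is not marathon's actual type.
type Task struct {
	Name string
	Run  func() error
}

// RunAll runs every task, with at most limit tasks running at the same time,
// and returns each task's error (nil on success) keyed by task name.
func RunAll(tasks []Task, limit int) map[string]error {
	if limit < 1 {
		limit = 1
	}
	var (
		mu      sync.Mutex // guards results
		wg      sync.WaitGroup
		results = make(map[string]error, len(tasks))
		sem     = make(chan struct{}, limit) // counting semaphore
	)
	for _, t := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already in flight
		go func(t Task) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this task ends
			err := t.Run()
			mu.Lock()
			results[t.Name] = err
			mu.Unlock()
		}(t)
	}
	wg.Wait()
	return results
}

// demoTasks returns made-up tasks for illustration only.
func demoTasks() []Task {
	return []Task{
		{Name: "backup", Run: func() error { return nil }},
		{Name: "sync", Run: func() error { return errors.New("remote unreachable") }},
		{Name: "cleanup", Run: func() error { return nil }},
	}
}

func main() {
	// Map iteration order is random, so the print order varies between runs.
	for name, err := range RunAll(demoTasks(), 2) {
		fmt.Printf("%s: %v\n", name, err)
	}
}
```

The details that are easy to get wrong are the ones a reviewer has to check by hand: the map write needs the mutex, the semaphore slot must be released even when a task fails, and wg.Add has to happen before the goroutine starts.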
What was learned/confirmed
- Git continues to be a godsend as a save-game mechanism: using branches, I could easily save incremental progress or reset out of dead ends.
- The LLM is slow and verbose, but I was doing this for not-work, so I often context-switched into chores instead of other work tasks. Not sure how well switching to other thinking-heavy work would have gone, and I often forgot to come back until the next day.
- Corollary: it's very uncertain how long a feature is going to take to develop. Will it get it in 1 try, 5, or do I give up and code it by hand?
- As expected, I had to build a good mental model, exactly the same as if I was coding this by hand. This is the hard part, but I appreciated not having to hand-write basic CLI functions.
- As expected, sometimes you have to go in and muck things out by hand.
Summary
These tools are useful to the low-attention-span manager (me) or senior/staff engineer who doesn't want to live and breathe all the if err != nil patterns but has a good idea of what they want.
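For anyone who hasn't written Go: the if err != nil pattern is the explicit error check that follows almost every call that can fail. A minimal sketch, using a hypothetical parseRetries helper that is not from marathon:

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// parseRetries is a hypothetical helper that turns a CLI argument into a
// non-negative retry count.
func parseRetries(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		// Wrap the error so the caller sees what was being parsed.
		return 0, fmt.Errorf("parse retries %q: %w", s, err)
	}
	if n < 0 {
		return 0, errors.New("retries must not be negative")
	}
	return n, nil
}

func main() {
	n, err := parseRetries("3")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("retries:", n)
}
```

Every fallible step gets its own check, which is exactly the kind of repetitive but necessary code that is pleasant to delegate and still quick to review.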
The tool I used would have been absolutely terrible paired with a junior dev without further guidance. At least, today.