2025-01-07

Building

  • Most of my time lately has been split between my new startup and my new baby, but when I can squeeze in some free time I'm trying my hand at an autonomous programming agent based around Aider. First attempts with prompts I found elsewhere were just mediocre, but I'm thinking something like this:
  • Aside from Aider's internal repo map, I'll create a list of all the files in the repository and use an LLM to create a short description of each one.
  • Core loop:
    • Get a task from plan.md
    • Use file_map.yaml to ask about relevant files, add these to Aider.
    • Generate a plan to implement this task. Lay out the interfaces between the frontend and backend, but do not generate any code except for type definitions.
    • Generate integration tests for the plan
    • Generate the frontend code based on the plan.
    • Generate the backend code based on the plan and the generated frontend code.
    • Check types and edit until type checks pass
    • Generate unit tests for the code
    • run/edit loop on tests
    • once tests pass, run format step
    • run postmortem step updating docs and lessons
    • update file_map.yaml file to describe any newly added files or files that
  • As I make progress on this I may also need to come up with a system for developing detailed plan documents for each task. It can be done in ChatGPT/Claude, but it gets a bit hard to manage all the different feature-specific chats in one place, and it would be nice to have this live alongside the actual coding agent and be informed by the repository structure.
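
As a rough sketch of the file map step above: the `describe` callback is a hypothetical stand-in for the LLM call, the skipped directories are my own guess, and a plain dict stands in for whatever format file_map.yaml ends up using.

```python
from pathlib import Path
from typing import Callable

# Directories to ignore while walking the repo (an assumption for this sketch).
SKIP_DIRS = {".git", "node_modules", "target"}


def build_file_map(root: Path, describe: Callable[[Path, str], str]) -> dict[str, str]:
    """Map each text file under `root` to a short description.

    `describe` is a hypothetical placeholder for the LLM call; it receives
    the path relative to `root` and the file contents.
    """
    file_map: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary files
        file_map[rel.as_posix()] = describe(rel, text)
    return file_map
```

The resulting dict could then be dumped to YAML and handed to the model when asking which files are relevant to a task.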

2024-07-11

Learning

Today I read StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows. This paper talks about using state machines to explicitly model the problem that an agent is trying to solve. They got great results, and I agree that it's a good way to go about solving a problem, but I'm not sure what's really new here. Maybe it's just my background, but using state machines to direct the high-level behavior of your agent feels like a very obvious thing to do, and a lot of people in the LLM space have been talking about it for a while now.
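
For illustration, a state-driven agent loop in this spirit might look roughly like the sketch below. The state names and the handler signature are hypothetical and not taken from the StateFlow paper.

```python
from enum import Enum, auto
from typing import Callable


class State(Enum):
    OBSERVE = auto()
    SOLVE = auto()
    VERIFY = auto()
    DONE = auto()


# Each handler reads and updates the shared context, then picks the next state.
Handler = Callable[[dict], State]


def run(handlers: dict[State, Handler], context: dict, max_steps: int = 20) -> list[State]:
    """Drive the agent from OBSERVE until DONE or the step budget runs out."""
    state = State.OBSERVE
    trace = [state]
    for _ in range(max_steps):
        if state is State.DONE:
            break
        state = handlers[state](context)
        trace.append(state)
    return trace
```

In a real agent each handler would wrap an LLM call or a tool invocation; the point is only that the transitions are explicit instead of being left to the model.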

2024-07-09

Building

  • Fixed some issues with svelte-maplibre and Mapbox's Draw plugin today. Both of these turned out to be general incompatibilities with Maplibre, where the Mapbox plugin (understandably) assumed the presence of certain Mapbox CSS classes. I released v0.9.9 of the package, which includes new CSS that allows the mouse pointers to work properly in Maplibre. An example is also now up on the website.

2024-07-02

One of the best parts about tools like Phind is you can ask it a question, follow up with "I'm on a Mac and this didn't work" and it will just tell you why.

Turns out Mac tools often use libedit instead of readline, so your keybindings need to go in .editrc instead of .inputrc.

2024-06-15

Building

  • I've managed to avoid learning Kubernetes for years, but it's finally time, so I'm setting up SBBP to run with it. I'll probably put Caddy in front. Aside from the learning, the main purpose of this is just so I can easily add the transcribed videos to Readwise.

Learning

SIEVE is a smarter cache eviction algorithm that also manages to be simpler and faster than most other algorithms. It's a clever idea that optimizes for quick eviction, taking advantage of the property that in many cache use cases, a large proportion of the fetched objects are not actually fetched often.
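
Here is a minimal sketch of SIEVE as I understand it from the paper: a FIFO list, one "visited" bit per entry, and a hand that sweeps from the oldest entry toward the newest. The class and method names are my own, and this is not a tuned implementation.

```python
class _Node:
    __slots__ = ("key", "value", "visited", "prev", "next")

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.visited = False
        self.prev = None  # neighbor toward the head (newer)
        self.next = None  # neighbor toward the tail (older)


class SieveCache:
    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.nodes = {}
        self.head = None  # newest entry
        self.tail = None  # oldest entry
        self.hand = None  # where the next eviction scan resumes

    def get(self, key, default=None):
        node = self.nodes.get(key)
        if node is None:
            return default
        node.visited = True  # a hit only flips a bit; nothing is moved
        return node.value

    def put(self, key, value):
        node = self.nodes.get(key)
        if node is not None:
            node.value = value
            node.visited = True
            return
        if len(self.nodes) >= self.capacity:
            self._evict()
        node = _Node(key, value)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.nodes[key] = node

    def _evict(self):
        node = self.hand or self.tail
        # Visited entries get a second chance: clear the bit and keep moving
        # toward the head, wrapping around to the tail.
        while node.visited:
            node.visited = False
            node = node.prev or self.tail
        self.hand = node.prev  # None means "start from the tail next time"
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        del self.nodes[node.key]
```

Unlike LRU, a hit never reorders the list, which is where the simplicity and speed come from. Entries that are never touched again get evicted the first time the hand reaches them.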

Links

  • This podcast with Flo Crivello has a lot of good information on how Lindy is training their AI agents. Most interesting to me was the talk about explicit training phases for agents when they encounter new tasks, where the user can give quick feedback on how the agent is performing on a set of initial work items, and have it adapt in real-time based on that feedback until the performance is acceptable.

2024-06-12

Learning

Hit an unusual problem today where reading pixels back from a canvas HTML element won't always give you the same value you wrote, if Brave's anti-fingerprinting measures are on. (Anti-aliasing is an issue too, but we knew about that already.) I remember reading something about how browser-fingerprinting libraries can sometimes use techniques like this to determine unique users, but was baffled for a while until turning off Brave's privacy features made the bug go away.

2024-03-11

Building

  • Filigree
    • Got the object storage abstraction and configuration finished up over the weekend.
    • Now I'm starting on a "File" type of model, which will allow the template system to generate code that handles uploads, and optionally stores things like file hash, image type, etc. in the database.

2024-03-06

Building

  • Filigree
    • Starting on object storage support. Once done, this will add a File type to the database model, set up the client, and also make it easy to do things like stream files to and from the client. And of course the template system will initialize everything for you.

2024-03-05

Building

  • Smelter
    • The worker framework now handles SIGINT for you, and gives your code a channel that will change when the task should be cancelled.
    • Published v0.1.0 of the Smelter crates, since it's about to go into production at work!
  • Effectum
    • More work on outbox pattern. The code is pretty much done, just needs tests.

2024-03-04

Building

  • Effectum
    • Started on support for transactional outbox pattern, since I'm starting to add background worker support to Filigree
  • Smelter
    • The AWS Rust SDK doesn't recognize the ThrottlingError returned from the ECS API as retryable, so those requests don't get retried. I added a custom `ClassifyRetry` layer to make this work.
    • Improved job cancellation on error, specifically around ensuring that there is time to send cancel requests to all the workers. The ergonomics here could still be better I think, but it's a step forward.

2024-03-03

Building

  • Smelter
    • Worker containers collect statistics such as RAM used and load average, and return these to the job manager.
    • Added comments to all the public exports, in preparation for the first published version on crates.io

2024-03-02

Building

  • Smelter
    • Added better job cancellation support
    • Each job now has its own UUID
    • QOL improvements around status messages

2024-03-01

Smelter had been on hold for a while, but this week at work a need came up for parallel processing, so I finished it up and ran the first real job!

This data processing pipeline formerly took 6 hours to run with a single process on a 24 vCPU machine, and now runs in 23 minutes using 32 8vCPU containers on Fargate.
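
Just to put numbers on that, using only the figures above:

```python
before_minutes = 6 * 60  # single process on the 24 vCPU machine
after_minutes = 23       # parallel run on Fargate
containers, vcpus_each = 32, 8

speedup = before_minutes / after_minutes
total_vcpus = containers * vcpus_each

print(f"speedup: {speedup:.1f}x on {total_vcpus} vCPUs")
# prints "speedup: 15.7x on 256 vCPUs"
```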

The original vision for Smelter was to run it with Lambdas and other similar FaaS runtimes, but because it uses an adapter model to support new platforms, I was able to add Fargate with no changes to the existing code. Really happy how this turned out.

I do still want to implement the originally envisioned massively-parallel Lambda computation at some point too, for more interactive use cases.

Building

  • Filigree
    • Started on many-to-many relationships using a through table and model
  • Smelter
    • Tested out the Fargate orchestration code with real containers and made some fixes
    • Ergonomic improvements to spawning Fargate tasks
    • First real run of Smelter! This is a data processing pipeline at work that formerly ran with a single process, now running with 32 containers in parallel across the dataset.

2024-02-29

I finally had a real-world need for Smelter, so I've revived the project and it now supports AWS Fargate. With the adapter model I'm using, implementing it for Lambda as well (the original vision) shouldn't be a big lift.

Building

  • Filigree
    • Finished up tests for populated child models

Links

  • PRQL is a data query language that compiles to SQL. There are a number of these projects, with Malloy being the other one you hear about the most.

2023-11-28

I've been reorganizing my code snippets repo, and among other things I've updated my dark mode support for modern SvelteKit.

This supports SSR by persisting both the user's choice and the default setting in cookies, to avoid that "flash of light". Check it out here.

I also set up CI and websites for PromptBox and sqlweld, where they can easily be downloaded.

Links

  • AWS announced a new S3 storage tier designed for low-latency analytics among other things. If I ever start working on Smelter again this will be useful. 7x the cost of normal S3 but I'm sure that's worth it for certain systems.
  • Placekey looks like a nice solution for address entity resolution. It has a generous free tier and cheap paid tiers as well. Heard about it on the Mapscaping Podcast.

2023-11-27

Got the first version of Glance up and running. A bit more work to be done on design and then I'll be ready to write more mini-apps to feed in all the info I want to see... at a glance.

2023-11-23

Spent some time yesterday and today updating my sqlx JSON companion crate to support sqlx 0.7. I also added a type that makes it easier to get a Box<RawValue> out of the query.

Also finally added tests! And wrote up some notes on using JSON in sqlx.

2023-11-21

I created a small utility named sqlweld for assembling SQL query files from liquid templates with partials, so I can get statically-defined queries while still allowing reuse for common things like permissions checks. This way I can write raw SQL in my projects while still getting code reuse on my queries.

The first real use of this will be on Glance, which I'm finally starting on in earnest. Check out sqlweld on GitHub!

2023-11-20

Updated SBBP so that I can add videos directly from the app, and also queue up multiple videos for download. Whisper currently doesn't do great on technical terms, so that needs some improvement. I think there's a way to do that with OpenAI's API, but it's not obvious how to do it with the Huggingface high-level packages, so I may need to dive into the internals to see if that can be done.

2023-11-19

Somehow I missed that the WebP format includes a separate lossless encoding mode. I've made two updates to Pic Store as a result.

  1. Added configurable quality to the conversion profile, and a 100 quality will do lossless encoding when the input is a PNG.
  2. Updated the "reconvert" endpoint to read the conversion profile again, so that changes will take effect.

With this, I can finally upload a good screenshot from SBBP as well, where I used it to read through a MotherDuck talk on hybrid query execution and see the slides alongside.

sbbp-1.png

2023-11-18

I also sped up the image similarity process by 100x, by just running it all in one Python execution instead of starting a new invocation for every checked pair of images.

Overall I'm really happy with how this is turning out. Still need to make the design better but it's already very useful for things like long conference talks.

2023-11-17

I added structural image similarity to SBBP, so that screenshots from the video that are too similar to the previous one are automatically dropped. This makes the viewing experience much better since you only see a new image when something has really changed since the previous one.

Learning

Calculating Structural Similarity of Images

Turns out scikit-image in Python makes this really easy.

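import sys
import cv2
from skimage.metrics import structural_similarity as ssim

# Load the two images to compare from the paths given on the command line.
img1 = cv2.imread(sys.argv[1])
img2 = cv2.imread(sys.argv[2])
# channel_axis=2 tells SSIM that the last axis holds the color channels.
sim = ssim(img1, img2, channel_axis=2)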
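
The filtering step built on top of this could look something like the sketch below. The `drop_similar` name and the 0.9 threshold are made up for illustration, and comparing against the last kept frame (instead of the immediately preceding one) is an assumption. The `similarity` argument would be a wrapper around the `ssim` call above.

```python
from typing import Callable, Sequence, TypeVar

Frame = TypeVar("Frame")


def drop_similar(
    frames: Sequence[Frame],
    similarity: Callable[[Frame, Frame], float],
    threshold: float = 0.9,  # hypothetical cutoff, not a tuned value
) -> list[Frame]:
    """Keep a frame only if it differs enough from the last kept frame."""
    kept: list[Frame] = []
    for frame in frames:
        if not kept or similarity(kept[-1], frame) < threshold:
            kept.append(frame)
    return kept
```

Doing the whole list in one function call like this also keeps everything in a single Python process, so the interpreter and library startup cost is paid once.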

2023-11-16

I started a new project called "Should have been a Blog Post," or SBBP for short. This application downloads a YouTube video, extracts screenshots every 10 seconds, and runs the audio through Whisper. The text is then presented alongside the screenshots, so you can just read through the transcript and see the visuals alongside, instead of needing to spend an hour watching a single video.

Check it out on Github.

2023-11-13

Added the ability in PromptBox to append strings from the command line and from stdin in addition to what's in the prompt template. Next up will be the ability to submit images to GPT4 vision and Ollama (once the pending PR for it is merged).

I uploaded some sample prompt templates as well.

Links

  • Spent the weekend playing some with Dagster for downloading and processing medical device approval data from the FDA. It's a nice system; the data asset model and partitioning fits well with how I like to think about things. Definitely worth checking out if you need to write data pipelines of some complexity.

2023-11-09

Wrote a small note on how to Preview a CSV In-Browser with Papa Parse.

Links

  • pipx is a program that exists solely to install and run Python executable packages, each in their own venv. Takes a lot of pain out of the nonsense and frustration that continues to characterize using Python.
  • Rye does seem to work better for some packages, particularly those that refuse to run on the latest version of Python, which is basically anything ML-related for a few months after the release.

2023-11-05

I’ve been playing with using PromptBox for code generation given context from the repository. Having some trouble figuring out a good output format though.

I have been able to get it to output diffs, sort of. The main problem is that it loses track of how long the diff is supposed to be compared to what it puts in the header, so patch rejects it.

Maybe there’s some better way? Maybe I just need a post processing step to fix the diff or apply it some other way? I'll have to play with it some more.
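
One possible shape for that post-processing step, as a sketch only: recount the lines in each hunk body and rewrite the `@@` header to match. The function names are my own. It only recomputes the counts and leaves the start line numbers as the model wrote them.

```python
import re

HUNK_RE = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@(.*)$")


def _is_body_line(lines: list[str], i: int) -> bool:
    line = lines[i]
    # Treat "--- x" followed by "+++ y" as the next file's header. This can
    # misfire if a removed line starting with "-- " is directly followed by
    # an added line starting with "++ ".
    if line.startswith("--- ") and i + 1 < len(lines) and lines[i + 1].startswith("+++ "):
        return False
    return line[:1] in (" ", "+", "-", "\\")


def fix_hunk_counts(diff: str) -> str:
    """Rewrite each hunk header so its line counts match the hunk body."""
    lines = diff.splitlines()
    out: list[str] = []
    i = 0
    while i < len(lines):
        match = HUNK_RE.match(lines[i])
        i += 1
        if match is None:
            out.append(lines[i - 1])
            continue
        body = []
        while i < len(lines) and _is_body_line(lines, i):
            body.append(lines[i])
            i += 1
        # Context and removed lines count toward the old file; context and
        # added lines count toward the new one. "\ No newline" markers count
        # toward neither.
        old_count = sum(1 for b in body if b[0] in " -")
        new_count = sum(1 for b in body if b[0] in " +")
        old_start, new_start, tail = match.groups()
        out.append(f"@@ -{old_start},{old_count} +{new_start},{new_count} @@{tail}")
        out.extend(body)
    return "\n".join(out) + ("\n" if diff.endswith("\n") else "")
```

This won't help if the model also mangles the context lines themselves, but it addresses the specific header/body mismatch described above.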

SoundQueue

Every year I help out with the sound at my church's Christmas play. This year we're low on help, and the play is much more involved than your average Christmas play, so I find myself both running the sound board and triggering the music/effects. This was a good excuse to automate the latter task, and a good opportunity to explore Tauri a bit.

SoundQueue is a small program that reads in a manifest of a bunch of sound files, and lets you easily play them one at a time with the press of the space bar, queueing up the next sound when one finishes playing. It allows custom volume and in/out points for each sound as well.

I was pretty happy with the productivity of Tauri. Despite not being too familiar with it, this app took only a few hours to make.

2023-11-04

I've spent the last few days building PromptBox, a utility that allows maintaining libraries of LLM prompt templates which can be filled in and submitted from the command line. The templates are just TOML files like this.

# File: summarize.pb.toml

description = "Summarize some files"

# This can also be template_path to read from another file.
template = '''
Create a {{style}} summary of the below files
which are on the topic of {{topic}}. The summary should be about {{ len }} sentences long.

{% for f in file -%}
File {{ f.filename }}:
{{ f.contents }}


{%- endfor %}
'''

[model]
# These model options can also be defined in a config file to apply to the whole directory of templates.
model = "gpt-3.5-turbo"
temperature = 0.7
# Also supports top_p, frequency_penalty, presence_penalty, stop, and max_tokens

[options]
len = { type = "int", description = "The length of the summary", default = 4 }
topic = { type = "string", description = "The topic of the summary" }
style = { type = "string", default = "concise" }
file = { type = "file", array = true, description = "The files to summarize" }

Each of these options becomes a CLI option which can help fill in the template.
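
To illustrate the idea (this is not PromptBox's actual code), here is roughly how an `[options]` table like the one above could be turned into CLI flags with Python's argparse. Treating options without a default as required is an assumption of this sketch.

```python
import argparse

# Map the template's option types to Python converters. "file" is kept as a
# plain path string here; reading the contents would happen later.
TYPES = {"int": int, "string": str, "file": str}


def build_parser(options: dict[str, dict]) -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    for name, spec in options.items():
        kwargs = {"type": TYPES[spec["type"]], "help": spec.get("description")}
        if spec.get("array"):
            kwargs["action"] = "append"  # flag may be repeated
        if "default" in spec:
            kwargs["default"] = spec["default"]
        else:
            kwargs["required"] = True  # assumption: no default means required
        parser.add_argument(f"--{name}", **kwargs)
    return parser


# The same options as the TOML above, written as a dict.
options = {
    "len": {"type": "int", "description": "The length of the summary", "default": 4},
    "topic": {"type": "string", "description": "The topic of the summary"},
    "style": {"type": "string", "default": "concise"},
    "file": {"type": "file", "array": True, "description": "The files to summarize"},
}

args = build_parser(options).parse_args(
    ["--topic", "caching", "--file", "a.md", "--file", "b.md"]
)
```

The parsed values in `args` would then be used to fill in the template variables.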

It works with OpenAI for the usual case, but you can also run it against LM Studio or Ollama if you like local LLMs. If you give it a try, let me know what you think!