2023-09-21

A useful GitHub trick is to watch a repo with custom notifications, so I only get notified about releases.

GitHub custom watch notification settings

Links

  • How Fast to Hire by Sarah Guo focuses on navigating the transition around finding product-market fit, and hiring appropriately on both sides of that transition.

2023-09-20

The upcoming runes system in Svelte 5 looks really nice. I think this will fix most or all of the biggest bugs in Svelte 3 and 4.

I did have some concerns about pages that need to edit complex objects, but it looks like you can write a function to "rune-ify" any object without too much trouble. I made a small project in the Svelte 5 preview REPL to try it out. There are probably some bugs there but that relieved my main concern.

Looking at the compiled code with runes, everything is a lot simpler too. No more need for dirty tracking or passing nested context into components for slots. Slot functions are just closures that directly access the runes in the outer component's state now.

The internal scheduler looks more complex than before, but that's to be expected. Importantly, there are far fewer moving parts and the moving parts aren't being generated by the compiler for every new project, which should both reduce potential for bugs and make it easier to test.

2023-09-19

Buzzy is built out enough now that I could show the prototype to my kids. My 6-year-old said I should change the name to Alexa, or maybe Google :)

Up next will be intent detection and web search to help the model answer factual questions, but first I redid some of the web app internals to use a real state machine. This was my first time using XState's type generation helpers, and they work well!

2023-09-17

After some Python dependency hassles, I ended up using NVIDIA's FastConformer CTC model for the speech-to-text side of Buzzy. It works great and is roughly 8x faster than Whisper when running without a GPU.

On the speech generation side, I ended up using Mycroft's Mimic 3. There are a number of different voices to choose from, and some sound better than others, but it's super fast and the quality is more than acceptable. I found that many of the voices sound better if you use a lengthScale of 1.2 to slow it down a bit.

2023-09-15

I initially was using Bun for Buzzy, but had to switch back to Node until some Vite ecosystem issues are resolved. I really liked Bun though, and look forward to using it more in the future. And of course, the first thing I hit when going back to Node was an ESM issue ;)

2023-09-14

So my latest side project is Buzzy, an AI assistant geared toward answering my kids' questions. They like talking to Alexa but it frequently gives irrelevant results, so I'm hoping to give them something both smarter and more fun.

Any voice assistant needs wake word detection and while it would be fun to train my own model here, PicoVoice works great out of the box. A tad expensive if you're a solopreneur without a budget, but totally free for personal use.

Buzzy lives at https://github.com/dimfeld/buzzy and I'll continue updating here as things get going!

2023-04-18

Links

  • Where to set the standard by Sarah Guo is an exhortation to move quickly. The biggest advantage startups have when trying to break into existing markets is the agility that larger, more established companies cannot match. This short essay urges you to embrace that advantage, and not to worry about high standards scaring people away. High performers embrace high standards.
  • Excuse me, is there a problem? by Jason Cohen is a great overview of analyzing the market for your potential startup. But the big contribution from this article is a quantitative framework for rating your business idea and classifying it as appropriate for a high-growth, VC-funded scale-up, a medium-growth bootstrapped business, or just something not worth pursuing.

2023-04-16

Yesterday I wrote some notes on PostgreSQL Row Level Security and it ended up on HN. It got onto the front page, which resulted in this crazy graph in my Plausible stats.

Plausible stats showing the Hacker News traffic spike

If this happens more regularly I'll have to upgrade my plan! 😆

2023-04-14

Yesterday I set up a custom command bar for Neovim, and I also wrote up some details on how to do it for yourself: Creating a Custom Command Bar in Neovim

Custom command bar in Neovim

Links

  • Building LLM Applications for Production is a bit light on the actual topic, but that's forgivable since the community as a whole is still figuring out best practices for using LLMs in production. Regardless, it is a good overview of the LLM landscape and various tools and methods around it. On the title topic, I did enjoy the discussion of unit testing.
  • One thing I’ve also found useful is to ask models to give examples for which it would give a certain label. For example, I can ask the model to give me examples of texts for which it’d give a score of 4. Then I’d input these examples into the LLM to see if it’ll indeed output 4.
  • Scott Alexander's review of a book about IRBs is a good read if you're interested in research practices and process. But this quote was too jaw-dropping not to share.
  • maybe it was unethical to do RCTs on ventilator settings at all. He asked whether they might be able to give every patient the right setting while still doing the study. The study team tried to explain to him that they didn’t know which was the right setting, that was why they had to do the study. He wouldn’t budge.

2023-04-12

I wrote a short essay yesterday titled Are LLMs Databases?, and posted it up here too. Ultimately I don't think there's any real comparison, but it's still a useful question to think about a bit.

In more practical news, I've been thinking about how using LLMs to generate SQL seems like an especially potent way for existing applications to integrate LLMs and AI in a way that's actually useful.

Users ask the questions they care about, without having to wait for you to build the answer. The most popular questions can become full features. I'll likely be exploring this more in the future.

2023-04-11

fast.ai released part 2 of their Practical Deep Learning for Coders free course. This one covers building a model like Stable Diffusion from scratch. I’m looking forward to going through it, since the foundations needed to implement the techniques are said to apply to many other types of models, but a more important lesson jumped out at me here.

Instead of starting with the foundations of transformers, attention, autoencoders, and so on, the course adopts a top-down method of learning. In this style you start by introducing the full solution to a problem, and proceed from there to examine the component parts of the solution. If you've ever been reading a book or taking a course, and have had trouble figuring out why or how a particular chapter's content is useful, you can probably see why this style of learning would be helpful.

Starting with a full solution leaves some things unexplained or fuzzy at first, but I think it has a few advantages.
  • It’s clear how new concepts fit into the bigger picture, because they can be introduced in the context of the initial, full example.
  • Having a working example also suggests ways to explore the new concepts, which might otherwise be presented in a vacuum with just a promise that it will all come together later.
  • It makes it easier to jump right in and get something working right away, and for those who like to experiment, lets them start doing so more quickly.

I’ve been thinking about top-down learning so much because it applies directly to the spatial data book I’ve been writing.

After seeing how the fast.ai course is structured, this week I added a new chapter right at the beginning of the book. This new chapter is a working example that quickly introduces GeoJSON and D3 and presents a very simple web page that renders a map.

Simple Country Boundary Map

This map is nothing special — just some country boundaries and a few points — but serves a very important purpose by incorporating most of the concepts explored in more detail later in the book.

There are a lot of foundational concepts to cover when starting out with spatial data: the structure of GeoJSON, how to find and load spatial data into your application, and more. Now the reader can go into these foundational chapters with some idea of what’s actually going on, instead of having to wait until the last quarter of the book to make something they can actually see on the screen.

Aside from this new chapter, I’ve been updating the “Types of Map Visualization” chapter. This had been a very quick overview, but now I’ve been adding more examples and more pitfalls for each of the types of maps. The chapter will probably be three times as long as it had been, but I expect that it will be much more useful and hopefully provide some guidance through what can be a tricky process even for seasoned data visualization practitioners.

Links

  • Malleable software in the Age of LLMs by Geoffrey Litt looks forward to how LLMs may bring software development to the masses, not so much in the sense of replacing existing software developers, but in making it much easier for non-technical people to create small one-off applications to solve specific problems. Worth some time to read and think about.

2023-03-13

Today I added a nav bar to the svelte-maplibre demo site, so you don't have to hit the back button all the time to go between examples.

Otherwise, I've mostly been heads down on the book. It's a bit over 7,000 words right now and coming along well. Still lots more to go!

2023-03-08

To procrastinate on writing my book, I wrote a Neovim plugin that generates word counts per section in a document.

Neovim lets you place an extmark on a position to add virtual text. So the idea was to set an extmark with the word count on each header line, but I never quite worked out how an extmark actually moves around with the text. It feels somewhat unintuitive, but there are probably some subtleties that I haven't figured out yet.

Instead of dealing with that, I set up my plugin as a "decoration provider." This lets you set an ephemeral extmark which only lasts for the draw cycle, and then the next time a line is redrawn you just create it again (or not). This ended up being a bit more code, but much simpler, since now I only have to track the word count and where the headers are, and not worry about whether an extmark is still on the right line.

Overall Lua feels nice, if rather barebones. I'm definitely missing the functional programming idioms afforded by most modern languages. But the Neovim APIs are very easy to use.

If you're interested in using this, you can get it on GitHub.

2023-03-07

Today I encountered an issue using Vite with the Rush monorepo software. Rush keeps its lockfile in an unusual place, and so Vite could not automatically check the lockfile to know when to clear its cache. But it turns out to be pretty easy to code your own logic here.

// vite.config.js
import * as fs from 'fs';
import * as path from 'path';
import { fileURLToPath } from 'url';

const dirname = path.dirname(fileURLToPath(import.meta.url));

// Returns true if the Rush lockfile has changed since the last run, by
// comparing its mtime against a timestamp saved next to this config file.
function lockfileNeedsRebuild() {
  const lockFilePath = path.resolve(
    dirname,
    '../../common/config/rush/pnpm-lock.yaml'
  );
  const lockFileDatePath = path.resolve(dirname, '.last-lockfile-date');

  let lockFileSavedDate = '';
  try {
    lockFileSavedDate = fs.readFileSync(lockFileDatePath).toString();
  } catch (e) {
    // It's fine if there is no existing file.
  }

  const lockFileDate = fs.statSync(lockFilePath).mtime.valueOf().toString();

  if (lockFileSavedDate.trim() !== lockFileDate) {
    fs.writeFileSync(lockFileDatePath, lockFileDate);
    return true;
  }

  return false;
}

/** @type {import('vite').UserConfig} */
const config = {
  optimizeDeps: {
    force: lockfileNeedsRebuild(),
  },
  // ... and other config
};
export default config;

2023-02-25

I've made great progress on my MapLibre Svelte library, enough so that it's basically ready for use, and I'm starting to move on to more advanced features such as new types of layers with custom shaders. Really happy with how this turned out and how easy it is to create great maps with MapLibre.

The Demo Site has a lot more demos now, so check it out!

I also created a small utility called merge-geo, which takes a GeoJSON file and related CSV files with information about the regions in the GeoJSON, and imports the CSV data into the GeoJSON. This comes up a lot when working with US Census data and similar sources, so it can save a lot of time.

2023-02-19

The new article on loading geodata is published and I'm happy with how it turned out. Now I've started on the Svelte MapLibre library.

Today I got the basic map and simple marker support working. Tomorrow I'm starting on real sources and layers, and along the way I'm planning on some wrappers to make it easier to create fancy styles and shaders too.

If you're interested in following the progress here, you can check out the GitHub Repo or the Demo Site.

2023-02-17

Making some last-minute changes to the new post tonight and aiming to publish it on Saturday. After that's out, I'm going to start on a Svelte wrapper for the MapLibre mapping package. This will be the basis for future posts on working with geographic data in the browser.

2023-02-16

Finished the initial draft of my 2nd GIS post today. Coming soon: "Loading Geographic Data in a Format You Can Actually Use". Then we can move on to the fun stuff.

2023-02-13

Starting on my next geodata blog post. This is another foundational topic — how to actually get geographic data into a format you can use in your application. I'll cover shapefiles, KML, US census data, OpenStreetMap, and more.

Links

  • Why you should go after a market with strong demand — Some things to consider when looking to start a new product
  • Creatively Misusing TLA+ by Hillel Wayne. This quote stuck out:
  • TLA+ does worst-case model checking, so it fails if it finds any path to an error. This opens a famous trick: if you want to find the set of steps that solve a problem, write a property saying “the problem isn’t solved” and make that an invariant. Then any behavior that finds the solution also breaks the invariant, and the model checker will dutifully spit out the set of steps in that behavior.
  • A bit of math trivia from John D Cook's Blog (a quick proof sketch follows the list):
  • If a three-digit number is divisible by 37, it remains divisible by 37 if you rotate its digits. For example, 148 is divisible by 37, and so are 814 and 481.
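
For the curious, the reason this works: write the number as n = 100a + 10b + c. Rotating the digits gives

rot(n) = 100b + 10c + a = 10(100a + 10b + c) - 999a = 10n - 999a

and since 999 = 27 × 37, 37 divides 999, so any multiple of 37 stays a multiple of 37 after rotation.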

2023-02-11

Yet more changes to my publishing flow over the past two days. I added the ability to set custom HTML classes and elements in my Logseq page exporter, which lets me add nice custom styling like in the "code and image" pairs in the new Introduction to GeoJSON article.

And speaking of that article, it's live on the site now. This is targeted toward newcomers to the geographic data world, so if you are a beginner, then I hope it's useful, and if not, then expect additional content soon!

2023-02-09

I wrote 1,100 words on an intro to GeoJSON last night. The article is coming together and should be up soon, once I make the content a bit less dry :) This one is just the basics, but will set the stage for doing actual work.

Did a bit more tonight, but I took a break to fix some CSS stuff on this site and update my Logseq exporter to better support exporting longform writing. Now I can write blog posts in Logseq and still use the outline hierarchy to organize sections, but get flattened HTML output when I publish here.

2023-02-08

Starting the first geographic data article tonight. This one will be an overview of GeoJSON, to lay the foundation for more complex topics.

Links

  • This article on rollback netcode is a nice overview of some different ways of synchronizing state when latency and timing really matter. Mostly only useful for game programming, but well explained.

2023-02-07

I'm thinking about doing some more content on working with geographic data. This would include topics such as GeoJSON, working with PostGIS, and writing full apps with SvelteKit and Leaflet. If there is anything that you've found confusing or hard to learn in this area, please reach out. My email is just daniel at this domain, and I'm also on Twitter and Mastodon.

In other news, I haven't been posting updates on Ergo recently, but the dataflow model is working. Still needs a bit more work to generate type definitions for the editors and for more visualizations, but the core functionality is there.

2023-01-04

Small Perceive update today. Given a particular item, you can find other items that are most similar. Since the semantic search is already basically a similarity match, this was just a matter of changing the code to read the embedding vector for the item from the database, instead of creating one from a typed-in string.
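
A minimal sketch of the idea, assuming a hypothetical items table that stores each embedding as a little-endian f32 blob (Perceive's actual schema may differ):

use rusqlite::Connection;

// Hypothetical: read an item's precomputed embedding so it can be fed
// into the existing similarity search in place of a query embedding.
fn item_embedding(conn: &Connection, item_id: i64) -> rusqlite::Result<Vec<f32>> {
    let blob: Vec<u8> = conn.query_row(
        "SELECT embedding FROM items WHERE id = ?1",
        [item_id],
        |row| row.get(0),
    )?;
    // Deserialize the blob into the same vector format that encoding
    // a typed-in query string would produce.
    Ok(blob
        .chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect())
}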

2023-01-03

Added support for semantic search on browser bookmarks. It's very convenient that Chrome's metadata files are all just SQLite databases or JSON files. I'm thinking that bookmark management is going to become a first-class feature of Perceive, so you can get semantic search not only on bookmarks imported from the browser, but can also add bookmarks inside the tool itself and search through them as well. So next up, going to try out a GUI in Tauri.

2023-01-02

Did various cleanup on Perceive over the weekend, including fixing the HTML parsing, which had previously been removing a lot of spaces between words. Since a lot of the data comes from the browsing history and similar online sources, I added a command to allow reprocessing all the data without downloading it again.

This brought about an unexpected issue: a data processing pipeline deadlock, where all of the Rayon thread pool's threads were waiting on blocking channel sends. A different stage later in the pipeline, which also used Rayon, was then unable to get any threads to do anything, and so no progress was made.

Attaching with the debugger was very useful here. I had my suspicions, mostly from eliminating pretty much every other potential cause, but looking at the call stacks of all the different threads made it very obvious.

Fortunately the solution was easy. It turns out Rayon lets you create separate thread pools, and so I did exactly that to remove contention. A couple hours of debugging, and only a couple minutes to make the fix.
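
The shape of the fix, as a sketch rather than Perceive's actual code: give each pipeline stage its own pool, so a stage blocking on channel sends can't starve the others.

use rayon::ThreadPoolBuilder;

fn main() {
    // One pool per pipeline stage; the thread counts here are arbitrary.
    let parse_pool = ThreadPoolBuilder::new().num_threads(4).build().unwrap();
    let encode_pool = ThreadPoolBuilder::new().num_threads(4).build().unwrap();

    parse_pool.spawn(|| {
        // ... parse documents and send them down a channel, possibly
        // blocking, without tying up encode_pool's threads ...
    });

    encode_pool.install(|| {
        // ... process documents pulled from the channel ...
    });
}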

2022-12-30

After yesterday's work and the resulting performance issues, I did some quick profiling. This revealed that most of the time in building the HNSW search index was in dot product calculations between the 768-element vectors that make up each document embedding. Fortunately, the ndarray crate came to the rescue. With support for BLAS and Apple's Accelerate framework enabled, the time to build the search index went from 45 seconds down to 5, so I'm quite happy with that.
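
Most of the change lives in Cargo.toml; this is roughly the setup I believe is involved, though the exact crate versions are assumptions:

// Cargo.toml:
//   [dependencies]
//   ndarray = { version = "0.15", features = ["blas"] }
//   blas-src = { version = "0.8", features = ["accelerate"] }

use ndarray::Array1;
// Referencing blas-src forces the BLAS implementation to be linked in.
use blas_src as _;

fn dot(a: &Array1<f32>, b: &Array1<f32>) -> f32 {
    // With the "blas" feature enabled, this dispatches to BLAS
    // (Accelerate on macOS) instead of a scalar loop.
    a.dot(b)
}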

Next up I'm going to try out putting all this into a simple GUI using Tauri.

I've also been enjoying Rust's new let/else syntax. It takes a bit to get used to, but it's really convenient, for example when you want to unwrap an Option or return from the function early if it's None.
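
For example, something like this, which previously needed a match or an if let:

fn first_line(text: &str) -> String {
    // Unwrap the Option, or bail out of the function early.
    let Some(line) = text.lines().next() else {
        return String::new();
    };
    line.to_uppercase()
}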

2022-12-29

Got browser history searching done today. Chrome's history file is just a SQLite database so that was pretty easy to pull in.
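
Reading it is just a SQL query against the urls table; here's a sketch with rusqlite, with the schema details being my recollection of Chrome's History database rather than anything official:

use rusqlite::Connection;

fn read_history(path: &str) -> rusqlite::Result<Vec<(String, String)>> {
    // Chrome keeps the live History file locked, so point this at a copy.
    let conn = Connection::open(path)?;
    let mut stmt =
        conn.prepare("SELECT url, title FROM urls ORDER BY last_visit_time DESC")?;
    let rows = stmt
        .query_map([], |row| Ok((row.get(0)?, row.get(1)?)))?
        .collect::<Result<Vec<_>, _>>()?;
    Ok(rows)
}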

Lots of little details related to actually fetching the data, which took up most of the day but aren't worth talking about too much. I also revamped the import pipeline a bit to reduce head-of-line blocking when some requests take a while.

With this done, the database contains about 13,000 items, and the HNSW search index takes a while to build. So I'll work on optimizing that tomorrow.

2022-12-28

Torch on M1 GPU

It was a bit tricky getting rust-bert to work on the M1 GPU. The issue, apparently, is that PyTorch JIT models not trained for MPS (the macOS GPU framework) cannot be loaded directly onto MPS.

But it turns out there's an easy solution: load the model on the CPU first, then convert it. After loading the VarStore on the CPU device, it's just a matter of var_store.set_device(tch::Device::Mps) and then you're running on the GPU!
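
In code, the workaround looks like this (a sketch; the rust-bert model setup around it is omitted):

use tch::{nn::VarStore, Device};

fn main() {
    // Load the weights on the CPU first; loading directly onto MPS
    // fails for JIT models that weren't traced for it.
    let mut var_store = VarStore::new(Device::Cpu);
    // ... build the model and load its weights into var_store here ...

    // Then move everything over to the GPU.
    var_store.set_device(Device::Mps);
}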

In my initial tests with an M1 Pro, this is about 2-3x as fast as running on CPU/AMX. This took the time to scan and index my Logseq database (~1000 documents) down to 6 seconds. Curious if this would have been 3 seconds on an M1 Max, but I didn't spend the extra $400 a couple years ago to find out now. :)

Switching Models

The MiniLmL12V2 model that I started out with is trained more for "sentence similarity" than for searching longer documents, and it shows. I switched the model to msmarco-bert-base-dot-v5, which is supposed to work a lot better for semantic search, and indeed the search results improved immensely. The import process takes a lot longer (40 seconds for ~1000 documents), but that's still not bad. That GPU inference is pulling its weight.

These models aren't automatically supported by rust-bert, but the instructions on how to download and use other models worked great, and this one is similar enough to the existing sentence embedding pipeline that I didn't have to change much.
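
Loading the converted model ends up looking roughly like this, if I'm remembering rust-bert's sentence embeddings API correctly (treat the builder methods and the model path as assumptions):

use rust_bert::pipelines::sentence_embeddings::SentenceEmbeddingsBuilder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Point the builder at the locally downloaded and converted weights.
    let model = SentenceEmbeddingsBuilder::local("resources/msmarco-bert-base-dot-v5")
        .create_model()?;
    let embeddings = model.encode(&["some document text"])?;
    println!("embedding dims: {}", embeddings[0].len());
    Ok(())
}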

Search Highlighting

Finally, I implemented search result highlighting, so that you get not only the title of the found document, but a snippet of text from the document relevant to the query. I'm now using two models in the program at once. The primary model used for the search is still a BERT-based model, and handles the full document encoding.

The BERT model is powerful but relatively slow, so for highlighting, I used the MiniLmL6V2 model, which is both much faster and focused on small strings of text.

Then for each matching document, I tokenize it, break the list of tokens into overlapping chunks, and encode each chunk with the model. Finally, I also encode the original query and take the dot product between the query embedding and each chunk embedding; the chunk with the highest dot product in each document is the best-matching one.
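
The chunk-and-score step looks something like this sketch, where encode stands in for the real MiniLmL6V2 call and the window sizes are made-up values:

// Stand-in for running a token chunk through MiniLmL6V2.
fn encode(_tokens: &[String]) -> Vec<f32> {
    unimplemented!()
}

fn best_chunk_start(tokens: &[String], query_embedding: &[f32]) -> Option<usize> {
    const CHUNK: usize = 64; // assumed window size
    const STRIDE: usize = 32; // 50% overlap between windows

    let mut best: Option<(usize, f32)> = None;
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + CHUNK).min(tokens.len());
        let emb = encode(&tokens[start..end]);
        // Dot product of this chunk's embedding against the query's.
        let score: f32 = emb.iter().zip(query_embedding).map(|(a, b)| a * b).sum();
        if best.map_or(true, |(_, s)| score > s) {
            best = Some((start, score));
        }
        if end == tokens.len() {
            break;
        }
        start += STRIDE;
    }
    best.map(|(start, _)| start)
}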

I think it could use some tweaking to pay more attention to actual word boundaries in the tokens. But overall I'm quite happy with this as a first effort in a few hours.

Search results with snippets

2022-12-27

Got the import pipeline, model encoding, and embedding search working for Perceive. I ended up using the instant-distance crate to do the nearest neighbor searching, but hnsw_rs looks good as well.
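
The instant-distance API is small and pleasant; the setup looks roughly like this (the distance function and using titles as the values are my illustrative choices, not the crate's requirements):

use instant_distance::{Builder, Search};

#[derive(Clone)]
struct Embedding(Vec<f32>);

impl instant_distance::Point for Embedding {
    fn distance(&self, other: &Self) -> f32 {
        // Euclidean distance; a dot-product-based distance also works
        // well for normalized embeddings.
        self.0
            .iter()
            .zip(&other.0)
            .map(|(a, b)| (a - b) * (a - b))
            .sum::<f32>()
            .sqrt()
    }
}

fn build_and_query(docs: Vec<(Embedding, String)>, query: &Embedding) {
    let (points, titles): (Vec<_>, Vec<_>) = docs.into_iter().unzip();
    let map = Builder::default().build(points, titles);

    let mut search = Search::default();
    for item in map.search(query, &mut search).take(5) {
        println!("{}", item.value);
    }
}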

Tomorrow I'll look at generating better embeddings. The model I'm using cuts off the input at 250 tokens, so I'm going to add something to cut documents up and do a weighted average of the resulting vectors for each piece. Might play around with some other methods too.
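
The weighted-average part could be as simple as this sketch, weighting each piece's vector by its token count (the names here are hypothetical):

fn weighted_average(pieces: &[(usize, Vec<f32>)]) -> Vec<f32> {
    // Each entry is (token_count, embedding).
    let dim = pieces.first().map_or(0, |(_, v)| v.len());
    let total: f32 = pieces.iter().map(|(n, _)| *n as f32).sum();

    let mut out = vec![0.0; dim];
    for (n, v) in pieces {
        let w = *n as f32 / total;
        for (acc, x) in out.iter_mut().zip(v) {
            *acc += w * x;
        }
    }
    out
}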

Terminal Output of Search