- Small Perceive update today. Given a particular item, you can now find the other items most similar to it. Since the semantic search is already basically a similarity match, this was just a matter of changing the code to read the item's embedding vector from the database, instead of creating one from a typed-in string.
- Added support for semantic search on browser bookmarks. It's very convenient that Chrome's metadata files are all just SQLite databases or JSON files. I'm thinking that bookmark management is going to become a first-class feature of Perceive, so you can not only get semantic search on bookmarks imported from the browser, but also add bookmarks inside the tool itself and search through them as well. So next up, I'm going to try out a GUI in Tauri.
- Did various cleanup on Perceive over the weekend, and fixed the HTML parsing, which had been removing a lot of spaces between words. Since a lot of the data comes from the browsing history and similar online sources, I added a command to reprocess all the data without downloading it again.
- This brought about an unexpected issue. I ended up with a deadlock in the data processing pipeline, where all of the Rayon thread pool's threads were waiting on blocking channel sends. A later stage in the pipeline that also used Rayon was then unable to get any threads to do its work, so the channel never drained and no progress was made.
- Attaching a debugger was very useful here. I had my suspicions, mostly from eliminating pretty much every other potential cause, but looking at the call stacks of all the different threads made it very obvious.
- Fortunately the solution was easy. It turns out Rayon lets you create separate thread pools, and so I did exactly that to remove contention. A couple hours of debugging, and only a couple minutes to make the fix.
In addition to these short updates, I sometimes write longer articles too. My latest is Avoiding Deadlock from Rayon Thread Pool Exhaustion.
I'm a co-founder of Carevoyance (acquired by H1 Insights), a sales acceleration tool that enables healthcare sellers to zero in on their best prospects and generate custom reports and insights with just a few clicks.
I spend most of my time there creating new data analyses, working on the backend API and database systems, and developing tooling to research data anomalies and automate repetitive tasks. Recently I've been active on the front-end too, and have been enjoying the Svelte framework.
Before starting my own venture, I interfaced with advanced network switching chips at Arista Networks and worked on JTAG hardware debuggers and embedded operating systems at Green Hills Software. Running a small startup feels very different from working at these companies, and it has its ups and downs, but I love it.
I usually have some sort of side project going on, and my most recent obsession is Ergo, a low-code workflow orchestrator that is still in early stages, but coming along well.
Sometimes I wish I could code all day and night, but when not hacking on something or spending time with my family, I enjoy good coffee, nature photography, reading nonfiction and sci-fi, and improving my nascent design and UX skills. I'm also active in my church and run the sound board there every few weeks.
Where to find me
About this site
The prose content on this site is licensed under a Creative Commons Attribution 4.0 International License. The code can be viewed on GitHub. The underlying code as well as all code examples are licensed under the MIT license.