2022-12-27

🔗
  • Got the import pipeline, model encoding, and embedding search working for Perceive. I ended up using the instant-distance crate to do the nearest neighbor searching, but hnsw_rs looks good as well.
  • Tomorrow I'll look at generating better embeddings. The model I'm using cuts off the input at 250 tokens so I'm going to add something to cut documents up and do a weighted average of the resulting vectors for each piece. Might play around with some other methods too.
  • Terminal Output of Search