SBBP

  • "Should've been a Blog Post" (SBBP) is a project that downloads a video, generates a transcript using Whisper, and places that transcript alongside screenshots from the video so you can follow along with the slides.
  • The name is a bit of snark, but there are two real motivations behind the project:
    • Some videos really should have just been blog posts, and watching them feels like wasted time.
    • Others are genuinely valuable, but as a parent with young kids my time is limited. A transcript with screenshots lets me get through them faster, or in settings where watching a video isn't feasible.
  • GitHub Link
  • Task List

    • Up Next

    • Soon

      • Improve reader layout
      • Improve Whisper accuracy for technical terms
        • See if OpenAI's prompt parameter can be passed to Hugging Face's Whisper API
      • Check out insanely-fast-whisper
    • Later

      • Host this somewhere so I can access it over the web and add to Readwise
      • Hosting: Use an actual database
      • Hosting: Consider using OpenAI Whisper or a similar service (maybe Deepgram?) so I don't have to run Whisper locally
      • Hosting: Do SSIM in Rust to save RAM?
      • Hosting: Store non-temporary downloaded files in Backblaze B2
      • Hosting: Basic auth
    • Done

      • Allow triggering video download and processing from inside the app — Nov 20th, 2023
      • Queue up multiple videos for processing — Nov 20th, 2023
      • Add "read" state and progress tracking — Nov 20th, 2023
      • Speed up SSIM process — Nov 18th, 2023
      • Generate a summary of the video — Nov 18th, 2023
      • Use structural similarity (SSIM) metric to remove duplicate screenshots — Nov 17th, 2023
      • Screenshot extraction
      • Whisper transcript generation
      • Align transcript to images and
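
The SSIM deduplication and transcript alignment items above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `(timestamp, pixels)` frame shape, and the 0.95 threshold are all assumptions made for the example. It also computes SSIM over a single global window for simplicity, whereas the usual formulation averages SSIM over many local windows.

```python
from bisect import bisect_right
from statistics import fmean

# Stabilizing constants from the SSIM definition, for 8-bit pixels (L = 255).
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def global_ssim(a, b):
    """Single-window SSIM between two equal-length grayscale pixel lists (0-255)."""
    mu_a, mu_b = fmean(a), fmean(b)
    var_a = fmean((x - mu_a) ** 2 for x in a)
    var_b = fmean((y - mu_b) ** 2 for y in b)
    cov = fmean((x - mu_a) * (y - mu_b) for x, y in zip(a, b))
    return ((2 * mu_a * mu_b + C1) * (2 * cov + C2)) / (
        (mu_a ** 2 + mu_b ** 2 + C1) * (var_a + var_b + C2)
    )


def drop_duplicate_frames(frames, threshold=0.95):
    """Keep a (timestamp, pixels) frame only if it differs from the last kept one.

    The threshold is a hypothetical value; a real pipeline would tune it.
    """
    kept = []
    for timestamp, pixels in frames:
        if not kept or global_ssim(kept[-1][1], pixels) < threshold:
            kept.append((timestamp, pixels))
    return kept


def align_segments(frame_times, segments):
    """Group (start_seconds, text) segments under the latest frame at or before them.

    frame_times must be sorted ascending. Segments that start before the
    first frame are attached to the first frame.
    """
    groups = {t: [] for t in frame_times}
    for start, text in segments:
        i = max(bisect_right(frame_times, start) - 1, 0)
        groups[frame_times[i]].append(text)
    return groups
```

Comparing each frame only against the last kept frame means a slide that slowly animates in is kept once, while a return to an earlier slide later in the talk is kept again, which seems like the right behavior for following along with a transcript.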

Thanks for reading! If you have any questions or comments, please send me a note on Twitter.