SBBP

  • "Should've been a Blog Post" is a project that downloads a video, generates a transcript using Whisper, and places that transcript alongside screenshots from the video so you can follow along with the slides.
  • The name is a bit of snark, but I have two aims for this project:
    • Some videos really should have just been blog posts, and watching them feels like wasted time.
    • Others are actually valuable, but as a parent with young kids my time is limited, so this lets me get through them faster, or in settings where watching a video isn't feasible.
  • GitHub Link
  • Task List

    • Up Next

      • Host this somewhere so I can access it over the web and add to Readwise
        • Maybe just host locally and expose a subdomain
    • Soon

      • Get thumbnail image from video metadata
      • Summarize comments (can yt-dlp grab them?)
      • For videos with chapters, summarize each chapter
      • Table of contents with video chapters
      • Improve reader layout
    • Later

      • For longer videos without chapters, use an LLM to try to generate chunks
      • Click on/near a paragraph to play the video at that point
      • Better image similarity algorithm
      • Some way for multiple users to link to the same video without downloading it each time (if opened to public)
        • Probably put the video metadata itself in a separate table and have the user's video model link to that.
      • Video Host whitelist? (if opened to public)
        • Might be a hassle to maintain but also potentially prevents issues with shady content
    • Done

      • Move processing into Rust
        • Steps
          • Download video
          • Images
            • Extract Images
            • Similar Image Algorithm
          • Transcript
            • Extract Audio
            • Transcription
            • Summary of Transcription
      • Simple auth
      • Allow storing files in cloud storage
      • Use Deepgram for transcription
      • Hosting: Store non-temporary downloaded files in Backblaze or Cloudflare object storage
      • Integrate with Filigree
      • Allow triggering video download and processing from inside the app — Nov 20th, 2023
      • Queue up multiple videos for processing — Nov 20th, 2023
      • Add "read" state and progress tracking — Nov 20th, 2023
      • Speed up SSIM process — Nov 18th, 2023
      • Generate a summary of the video — Nov 18th, 2023
      • Use structural similarity (SSIM) metric to remove duplicate screenshots — Nov 17th, 2023
      • Screenshot extraction
      • Whisper transcript generation
      • Align transcript to images and
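
To make the SSIM-based duplicate screenshot removal from the Done list a bit more concrete, here is a minimal sketch in Rust. It is not the project's actual code: the names `ssim` and `dedup_frames`, the 0.95 threshold, and the idea of comparing each frame only against the last kept frame are all assumptions for illustration. It also computes a single global SSIM over the whole grayscale frame, whereas SSIM is usually computed over sliding windows and then averaged.

```rust
/// Global (single-window) SSIM between two equally sized 8-bit grayscale frames.
/// Returns None if the frames are empty or differ in length.
/// Assumption: a real pipeline would likely use windowed SSIM on decoded images.
fn ssim(a: &[u8], b: &[u8]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let n = a.len() as f64;
    let mean_a = a.iter().map(|&p| p as f64).sum::<f64>() / n;
    let mean_b = b.iter().map(|&p| p as f64).sum::<f64>() / n;

    // Accumulate variances and covariance in one pass over both frames.
    let (mut var_a, mut var_b, mut cov) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&pa, &pb) in a.iter().zip(b) {
        let (da, db) = (pa as f64 - mean_a, pb as f64 - mean_b);
        var_a += da * da;
        var_b += db * db;
        cov += da * db;
    }
    var_a /= n;
    var_b /= n;
    cov /= n;

    // Standard stabilizing constants for a dynamic range of 255.
    const C1: f64 = (0.01 * 255.0) * (0.01 * 255.0);
    const C2: f64 = (0.03 * 255.0) * (0.03 * 255.0);

    let numerator = (2.0 * mean_a * mean_b + C1) * (2.0 * cov + C2);
    let denominator = (mean_a * mean_a + mean_b * mean_b + C1) * (var_a + var_b + C2);
    Some(numerator / denominator)
}

/// Returns the indices of frames to keep. A frame is dropped when its SSIM
/// against the most recently kept frame is at or above `threshold`
/// (hypothetical policy; frames that cannot be compared are kept).
fn dedup_frames(frames: &[Vec<u8>], threshold: f64) -> Vec<usize> {
    let mut kept: Vec<usize> = Vec::new();
    for (i, frame) in frames.iter().enumerate() {
        let is_duplicate = match kept.last() {
            Some(&last) => ssim(&frames[last], frame).map_or(false, |s| s >= threshold),
            None => false,
        };
        if !is_duplicate {
            kept.push(i);
        }
    }
    kept
}

fn main() {
    // Tiny made-up "frames": a slide, the same slide with slight noise, a new slide.
    let slide_a = vec![10u8, 200, 10, 200, 10, 200, 10, 200];
    let mut slide_a_noisy = slide_a.clone();
    slide_a_noisy[0] = 12;
    let slide_b = vec![200u8, 10, 200, 10, 200, 10, 200, 10];

    let frames = vec![slide_a, slide_a_noisy, slide_b];
    let kept = dedup_frames(&frames, 0.95);
    println!("kept frame indices: {:?}", kept); // prints: kept frame indices: [0, 2]
}
```

Comparing against the last kept frame, rather than the immediately preceding one, means a slow fade between slides cannot slip through as a long chain of "almost identical" neighbours. The threshold would need tuning against real screenshots.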

Thanks for reading! If you have any questions or comments, please send me a note on Twitter.