Pic Store

  • This is a project that can take image uploads, convert them to appropriate formats and sizes, and store them in web-accessible blob storage for use in websites.
  • Goals
    • Image Upload
    • Convert and resize
    • Helper library for image upload from front-end
    • Helper library (vite plugin maybe?) for using uploaded images
      • This should also check by hash and upload if needed.
  • Store image convert queue in sqlite/postgres with some adapter
  • Store input image on disk or in blob storage while it waits to be converted
  • Upload output to S3
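The hash-check-and-upload behavior described above could look roughly like this. This is a minimal sketch, not the actual helper library: the `Server` type and `upload_if_missing` name are made up, and a stdlib hasher stands in for whatever content hash the real service uses (likely something like SHA-256).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hash the raw image bytes. A real implementation would use a
// cryptographic hash; DefaultHasher stands in here for illustration.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

// Stand-in for the "does this hash already exist?" endpoint.
struct Server {
    known: HashSet<u64>,
}

impl Server {
    // Returns true if the image had to be uploaded, false if it was
    // already present and the existing URL can be reused.
    fn upload_if_missing(&mut self, bytes: &[u8]) -> bool {
        let hash = content_hash(bytes);
        if self.known.contains(&hash) {
            return false; // already stored, skip the upload
        }
        self.known.insert(hash); // the actual upload would happen here
        true
    }
}

fn main() {
    let mut server = Server { known: HashSet::new() };
    let img = b"fake image bytes";
    assert!(server.upload_if_missing(img));  // first time: uploaded
    assert!(!server.upload_if_missing(img)); // second time: skipped
    println!("hash check ok");
}
```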
  • Tasks
    • Immediate Task List
      • Endpoint to check by hash if an image already exists
      • Generate <picture> tag corresponding to the base image and its output images.
      • Update S3 auth to support instance metadata auth method
        • This mostly just means adding the config option and not setting any explicit credentials, so the SDK falls through to instance metadata
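The `<picture>` tag generation task could be sketched like this. The `OutputImage` fields are assumptions about what the server tracks for each converted image; the real schema may differ.

```rust
// Assumed shape of a converted output image record.
struct OutputImage {
    url: String,
    mime_type: String, // e.g. "image/avif"
    width: u32,
}

// Build a <picture> tag: one <source> per output image, with the
// base image as the <img> fallback for browsers that match nothing.
fn picture_tag(base_url: &str, alt: &str, outputs: &[OutputImage]) -> String {
    let mut out = String::from("<picture>");
    for o in outputs {
        out.push_str(&format!(
            r#"<source srcset="{} {}w" type="{}">"#,
            o.url, o.width, o.mime_type
        ));
    }
    out.push_str(&format!(r#"<img src="{}" alt="{}">"#, base_url, alt));
    out.push_str("</picture>");
    out
}

fn main() {
    let outputs = vec![OutputImage {
        url: "https://cdn.example.com/cat.avif".into(),
        mime_type: "image/avif".into(),
        width: 800,
    }];
    let tag = picture_tag("https://cdn.example.com/cat.jpg", "a cat", &outputs);
    assert!(tag.starts_with("<picture>"));
    assert!(tag.contains(r#"type="image/avif""#));
    println!("{tag}");
}
```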
    • Side Helper Tasks
      • CLI app that can hash an image and get the corresponding <picture> tag or other URL for it, and upload it if needed.
      • Some sort of GUI version of the same, maybe even something that can be triggered straight from macOS Finder.
        • Good excuse to try Tauri
    • Later Tasks
      • When adding a storage location, allow testing read/write access to it
      • Make it possible to delete obsolete output images
      • Support HEIC as input format
      • Add some simple logic to conversions
        • Only upsize
        • Only downsize
        • Only do this conversion if source image is a particular format
          • e.g. only generate JPEGs if the source image is also a JPEG.
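The conversion rules above could be modeled as a small filter. All of the type and field names here are hypothetical; this just shows the shape of the "should this conversion run?" decision.

```rust
// Hypothetical rule model for deciding whether a conversion applies.
#[derive(PartialEq, Clone, Copy)]
enum Format { Png, Jpeg, Avif, Webp }

struct Source { format: Format, width: u32 }

enum SizeRule { Any, OnlyUpsize, OnlyDownsize }

struct Conversion {
    target_format: Format,
    target_width: u32,
    size_rule: SizeRule,
    // Only run if the source is one of these formats; None = any source.
    source_formats: Option<Vec<Format>>,
}

// Decide whether a conversion should run for a given source image.
fn should_convert(src: &Source, c: &Conversion) -> bool {
    let size_ok = match c.size_rule {
        SizeRule::Any => true,
        SizeRule::OnlyUpsize => c.target_width > src.width,
        SizeRule::OnlyDownsize => c.target_width < src.width,
    };
    let format_ok = match &c.source_formats {
        None => true,
        Some(formats) => formats.contains(&src.format),
    };
    size_ok && format_ok
}

fn main() {
    let src = Source { format: Format::Png, width: 1000 };
    // "Only generate JPEGs if the source is also a JPEG": skipped for PNG.
    let jpeg_only = Conversion {
        target_format: Format::Jpeg,
        target_width: 800,
        size_rule: SizeRule::Any,
        source_formats: Some(vec![Format::Jpeg]),
    };
    assert!(!should_convert(&src, &jpeg_only));
    // Downsize-only rule applies, since 800 < 1000.
    let downsize = Conversion {
        target_format: Format::Webp,
        target_width: 800,
        size_rule: SizeRule::OnlyDownsize,
        source_formats: None,
    };
    assert!(should_convert(&src, &downsize));
    println!("rules ok");
}
```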
      • More tests
        • Upload a base image and see that the output images get created
        • Targeted tests for each image type
          • Read PNG, JPEG, Avif, Webp
          • Convert image to PNG, JPEG, Avif, Webp
        • Check that image conversion has a reasonable result
        • Upload to some kind of mock S3
        • Set upload locations, conversion profile, etc. and read them back
        • Test all the permissions, checking for authorized and unauthorized operations
        • Set up e2e test framework
          • Scaffold base data
          • Testing functionality, new database and so on
          • Create temporary folder to hold output images
          • Some method of checking that output images actually look ok. The image-compare crate may work well here.
    • Finished
      • Ensure that replacing a base image properly replaces the output images
      • When replacing a base image, delete the current output images associated with it.
        • We'll want an option here for how long to keep the old images around
        • Forever should be an option
        • Resolved: Images of the same size/format are overwritten, the rest are retained and marked for manual deletion.
      • Check out tracing-tree as a replacement for the bunyan formatter
      • Integrate DB into Axum
      • Set up storage providers
      • Accept HTTP uploads of images
      • Put uploaded images into S3 (can we stream them directly?)
      • Download files to convert from S3
      • Do conversion on files and save them back to S3
      • Graceful shutdown for the server on SIGINT
      • Bootstrapping command
        • team
        • user
        • project
        • storage locations
        • conversion profile
        • upload profile
  • Codecs
    • AVIF
      • ravif for encoding
      • libavif for decoding since it's more tolerant of slightly out-of-spec AVIF headers than other libraries
    • WebP - webp
    • PNG and JPG - image
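Before routing a decode to one of these crates, the server needs to know what it received. Input-format detection by magic bytes is one way to do that; the function name and enum here are just an illustration, though the signatures checked are the real ones for each container.

```rust
#[derive(Debug, PartialEq)]
enum DetectedFormat { Png, Jpeg, Webp, Avif, Unknown }

// Identify an image format from its leading bytes.
fn detect_format(bytes: &[u8]) -> DetectedFormat {
    if bytes.starts_with(&[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A]) {
        DetectedFormat::Png
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        DetectedFormat::Jpeg
    } else if bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        // WebP: RIFF container with a WEBP chunk type at offset 8.
        DetectedFormat::Webp
    } else if bytes.len() >= 12 && &bytes[4..8] == b"ftyp" && &bytes[8..12] == b"avif" {
        // AVIF: ISO BMFF "ftyp" box with an "avif" major brand.
        DetectedFormat::Avif
    } else {
        DetectedFormat::Unknown
    }
}

fn main() {
    assert_eq!(
        detect_format(&[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A, 0, 0]),
        DetectedFormat::Png
    );
    assert_eq!(detect_format(b"RIFF\x00\x00\x00\x00WEBP"), DetectedFormat::Webp);
    println!("format detection ok");
}
```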
  • Task Queue Desires
    • Schedule jobs for later times
    • Multiple channels and workers only pull the channels they're interested in
    • Automatically keep job alive while it runs
    • Retry on failure
    • Nice to have
      • Checkpoints with payload updates
      • Run everything in a Diesel transaction and include the job completion at the end of the transaction
        • Just implement outbox pattern for this. Best to add helpers for it to Effectum
      • Likewise with checkpoints
    • Originally this used a modified version of sqlxmq, changed to work with Diesel. While I liked the queue code, maintaining a fork like that isn't a great idea, so I switched to Effectum.
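The outbox pattern mentioned above can be sketched without any queue library: the job row is written in the same transaction as the data change, and a relay later drains it into the real queue, so neither write can exist without the other. This is an in-memory illustration of the idea, not Effectum's or Diesel's API; all names are made up.

```rust
use std::collections::VecDeque;

// In-memory stand-in for the database: application rows plus an
// outbox table that are committed together or not at all.
#[derive(Default)]
struct Db {
    images: Vec<String>,
    outbox: VecDeque<String>, // pending job payloads
}

impl Db {
    // "Transaction": apply both writes as one unit. In the real system
    // this would be a single Diesel transaction, so a job row can never
    // be enqueued without its data change (or vice versa).
    fn add_image_with_job(&mut self, image: &str, job: &str) {
        self.images.push(image.to_string());
        self.outbox.push_back(job.to_string());
    }
}

// Relay: move committed outbox rows into the actual task queue.
fn drain_outbox(db: &mut Db, queue: &mut Vec<String>) {
    while let Some(job) = db.outbox.pop_front() {
        queue.push(job);
    }
}

fn main() {
    let mut db = Db::default();
    let mut queue = Vec::new();
    db.add_image_with_job("cat.png", "convert:cat.png");
    drain_outbox(&mut db, &mut queue);
    assert_eq!(queue, vec!["convert:cat.png".to_string()]);
    assert!(db.outbox.is_empty());
    println!("outbox drained");
}
```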
  • Workflows
    • Image Upload
      • Add base image
      • Upload image
        • If presigned url upload
          • Client uploads the image
          • Client calls the server to indicate that it is finished
        • If direct upload, then accept the upload
      • Look at the conversion profile to determine which conversions need to take place
      • Enqueue the conversions
      • Task queue does the conversions
      • When they are all done, the image is marked as done.
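The tail end of that workflow, marking the image done once the last conversion finishes, can be tracked with a simple pending-count. The state names and methods here are assumptions about how the server might model it, not its actual schema.

```rust
// Hypothetical lifecycle for a base image during the upload workflow.
#[derive(Debug, PartialEq)]
enum ImageState { AwaitingUpload, Converting, Done }

struct BaseImage {
    state: ImageState,
    pending_conversions: usize,
}

impl BaseImage {
    // Called after the upload finishes and conversions are enqueued.
    fn start_conversions(&mut self, count: usize) {
        self.pending_conversions = count;
        self.state = ImageState::Converting;
    }

    // Called by the task queue as each conversion finishes; the image
    // is marked done once the last one completes.
    fn conversion_finished(&mut self) {
        self.pending_conversions -= 1;
        if self.pending_conversions == 0 {
            self.state = ImageState::Done;
        }
    }
}

fn main() {
    let mut img = BaseImage {
        state: ImageState::AwaitingUpload,
        pending_conversions: 0,
    };
    img.start_conversions(2);
    img.conversion_finished();
    assert_eq!(img.state, ImageState::Converting); // one still pending
    img.conversion_finished();
    assert_eq!(img.state, ImageState::Done); // last one flips the state
    println!("image done");
}
```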
    • Text Overlay
      • Given an existing base image
  • Roadmap
    • Support conversion profiles
    • Auth key validation
    • Permissions table
    • Simple web UI
  • Permissions Lookup
    • Each object has a linked "permissioned object," which is the object on which we need to actually check that the permission is present.
    • This means that given the roles, we need to
      • Option 1
        • Left join on the permissions table using a subquery that checks if any of the user's roles have the required permission on the linked permissioned object.
        • Select an "allowed" field based on whether that subquery returns null or not
        • This feels bulky...
      • Option 2
        • Similar left join, but just return nothing if it's not allowed.
        • This has the problem that we can't distinguish a missing object from a lack of permissions.
      • Option 3
        • Fetch the object
        • Do the permissions check on the relevant field in a separate query.
        • This is clean but
          • requires two queries
          • requires remembering to do the permissions check
      • An ideal solution, I think, would be option 1, implemented so that it can be automated as part of the fetch process.
        • Could do this with a function that takes some arguments
          • Table to read
          • Operation requested
          • The field on which the operation must be allowed (i.e. project_id)
          • the user info (roles and such)
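Option 1's shape can be illustrated with in-memory data: the lookup always returns the object when it exists, and computes an `allowed` flag from the role grants on the linked permissioned object, so "missing object" and "not permitted" stay distinguishable. The table and field names here are made up, and plain Rust iteration stands in for the SQL left join.

```rust
use std::collections::HashMap;

type RoleId = u32;

// role -> list of (permissioned_object_id, operation) grants.
struct Permissions {
    grants: HashMap<RoleId, Vec<(u32, &'static str)>>,
}

struct Image {
    id: u32,
    project_id: u32, // the linked "permissioned object"
}

// Mimics the left-join query: fetch the object and an `allowed` flag
// in one pass. Returns None only when the object doesn't exist.
fn fetch_with_permission<'a>(
    images: &'a [Image],
    perms: &Permissions,
    roles: &[RoleId],
    image_id: u32,
    operation: &str,
) -> Option<(&'a Image, bool)> {
    let image = images.iter().find(|i| i.id == image_id)?;
    let allowed = roles.iter().any(|role| {
        perms
            .grants
            .get(role)
            .map(|g| g.iter().any(|(obj, op)| *obj == image.project_id && *op == operation))
            .unwrap_or(false)
    });
    Some((image, allowed))
}

fn main() {
    let images = vec![Image { id: 1, project_id: 10 }];
    let mut grants = HashMap::new();
    grants.insert(5, vec![(10, "write")]);
    let perms = Permissions { grants };

    // Role 5 may write to project 10.
    let allowed = fetch_with_permission(&images, &perms, &[5], 1, "write").map(|(_, a)| a);
    assert_eq!(allowed, Some(true));
    // Role 6 has no grant: the object is found, but not allowed.
    let denied = fetch_with_permission(&images, &perms, &[6], 1, "write").map(|(_, a)| a);
    assert_eq!(denied, Some(false));
    // A missing object is distinguishable from a permission failure.
    assert!(fetch_with_permission(&images, &perms, &[5], 99, "write").is_none());
    println!("permission lookup ok");
}
```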
  • Lazy Conversion
    • Might end up doing this for v2
    • Store the raw image and convert it as requests come in. Requires a little more coordination since we may end up needing to handle concurrent requests for an image format that is currently being generated.
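The coordination problem above, where concurrent requests arrive for a format that is already being generated, can be handled by tracking in-flight conversions and having later requests wait on the first. A sketch of that idea using stdlib primitives (all names hypothetical; a real server would do this with async primitives and return the stored result):

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

// Tracks conversions currently running, so concurrent requests for the
// same (image, format) key share one conversion instead of duplicating it.
#[derive(Default)]
struct InFlight {
    jobs: Mutex<HashMap<String, Arc<(Mutex<bool>, Condvar)>>>,
}

impl InFlight {
    fn get_or_convert(&self, key: &str, convert: impl FnOnce()) {
        let existing = {
            let mut jobs = self.jobs.lock().unwrap();
            if let Some(entry) = jobs.get(key) {
                Some(entry.clone())
            } else {
                jobs.insert(key.to_string(), Arc::new((Mutex::new(false), Condvar::new())));
                None
            }
        };
        match existing {
            // Someone else is converting: wait until they finish.
            Some(pair) => {
                let (done, cv) = &*pair;
                let mut done = done.lock().unwrap();
                while !*done {
                    done = cv.wait(done).unwrap();
                }
            }
            // We own the conversion: run it, then wake any waiters.
            None => {
                convert();
                let pair = self.jobs.lock().unwrap().remove(key).unwrap();
                let (done, cv) = &*pair;
                *done.lock().unwrap() = true;
                cv.notify_all();
            }
        }
    }
}

fn main() {
    let inflight = Arc::new(InFlight::default());
    let conversions = Arc::new(AtomicUsize::new(0));
    let mut handles = Vec::new();
    for _ in 0..4 {
        let inflight = inflight.clone();
        let conversions = conversions.clone();
        handles.push(thread::spawn(move || {
            inflight.get_or_convert("cat.png:webp", || {
                conversions.fetch_add(1, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(100)); // pretend to convert
            });
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    // All four requests were served by a single conversion.
    assert_eq!(conversions.load(Ordering::SeqCst), 1);
    println!("deduped to one conversion");
}
```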

Thanks for reading! If you have any questions or comments, please send me a note on Twitter.