
[hma] Ability to fetch media from S3-compatible storage #1828

@ThisIsMissEm


Currently, when you want to work with media stored on an S3-compatible storage solution (AWS S3, DigitalOcean Spaces, MinIO, etc.), you have two options: generate a presigned URL and use the GET /h/hash endpoint, or have an intermediary process first download the file from media storage and then send it to HMA as a FormData / multipart request (POST /h/hash).

It'd be very useful to be able to configure HMA with credentials for a bucket and then pass a payload to HMA saying "fetch this object key from this bucket and hash it", so the API might look something like a JSON POST /h/hash with:

{
  "bucket": "example-media",
  "key": "/path/to/media/file.png"
}
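
As a rough illustration of the "fetch this object key from this bucket and hash it" step, here is a minimal sketch. It is not HMA's implementation: the function name hash_s3_object is hypothetical, the client is assumed to expose get_object(Bucket=..., Key=...) the way boto3's S3 client does, and MD5 is only a stand-in for whichever hashing algorithm HMA would actually apply.

```python
import hashlib
from typing import Any

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time to bound memory use


def hash_s3_object(client: Any, bucket: str, key: str) -> str:
    """Stream an object from S3-compatible storage and return its MD5 hex digest.

    Assumption: `client` exposes get_object(Bucket=..., Key=...) and returns a
    mapping whose "Body" has a read(size) method, as boto3's S3 client does.
    """
    response = client.get_object(Bucket=bucket, Key=key)
    body = response["Body"]
    digest = hashlib.md5()  # stand-in for the real signal type (hypothetical choice)
    while True:
        chunk = body.read(CHUNK_SIZE)
        if not chunk:  # empty bytes means the stream is exhausted
            break
        digest.update(chunk)
    return digest.hexdigest()
```

The key is passed through unchanged; in S3 a leading slash is part of the key itself, so "/path/to/media/file.png" and "path/to/media/file.png" name different objects.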

The same POST endpoint could also support URL-based hashing:

{
  "url": "https://some.url.example/path/to/media/file.png"
}

(It may be worth including an expires_at timestamp so that HMA can quickly check whether the URL is still valid when hashing is queue-based rather than request-based.)
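
To make the two payload shapes and the expiry check concrete, here is a hedged sketch of how a request body might be validated before any fetch is attempted. Everything here is an assumption for illustration: the names BucketSource, UrlSource and parse_hash_request are hypothetical, and expires_at is assumed to be an ISO 8601 string (treated as UTC when no offset is given), which the proposal does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Union


@dataclass(frozen=True)
class BucketSource:
    bucket: str
    key: str


@dataclass(frozen=True)
class UrlSource:
    url: str
    expires_at: Optional[datetime] = None


def parse_hash_request(
    payload: dict, now: Optional[datetime] = None
) -> Union[BucketSource, UrlSource]:
    """Map a JSON body to a media source, rejecting URLs that have expired."""
    now = now or datetime.now(timezone.utc)
    if "url" in payload:
        if "bucket" in payload or "key" in payload:
            raise ValueError("provide either 'url' or 'bucket' and 'key', not both")
        expires_at = None
        raw = payload.get("expires_at")
        if raw is not None:
            # Assumed format: ISO 8601, e.g. "2024-01-01T13:00:00+00:00"
            expires_at = datetime.fromisoformat(raw)
            if expires_at.tzinfo is None:
                expires_at = expires_at.replace(tzinfo=timezone.utc)
            if expires_at <= now:
                # A queued job can fail fast here instead of attempting the fetch
                raise ValueError("url has expired")
        return UrlSource(url=payload["url"], expires_at=expires_at)
    if "bucket" in payload and "key" in payload:
        return BucketSource(bucket=payload["bucket"], key=payload["key"])
    raise ValueError("payload must contain 'url' or both 'bucket' and 'key'")
```

A queue worker could call parse_hash_request again when it dequeues a job, so that a presigned URL which expired while waiting is rejected without a network round trip.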

Labels: hma (items related to the hasher-matcher-actioner system)
