Multi-tenant /Catalogs Extension#880
Conversation
|
thank for the PR @jonhealy1 Before I start the review I have a quick question: should this extension be in |
|
Hi @vincentsarago I was wondering about that. I can move it to third party. |
|
@hrodmn Implementing scoped search of collections within a sub-catalog would be a huge contribution. I also agree that interoperability with existing clients like stac-browser and pystac would be ideal. |
**Related Issue(s):** - #308 - stac-utils/stac-fastapi#880 - stac-utils/stac-fastapi#891 **Description:** This PR migrates the internal Catalogs Extension implementation to the new, standardized stac-fastapi-catalogs-extension library available on PyPI, reducing local maintenance burden. This transition aligns our multi-tenant catalog support with the official STAC API ecosystem: https://github.com/StacLabs/stac-fastapi-catalogs-extension **PR Checklist:** - [x] Code is formatted and linted (run `pre-commit run --all-files`) - [x] Tests pass (run `make test`) - [x] Documentation has been updated to reflect changes, if applicable - [x] Changes are added to the changelog
hrodmn
left a comment
There was a problem hiding this comment.
I realize that it is inherently backend-specific, but it would be good to note somewhere that, since links tend to be dynamically generated by stac-fastapi applications, the relationships between catalogs and collections (and catalogs and catalogs) needs to be tracked somehow.
I am also wondering if the children extension gets activated does that mean the API root will potentially have a rel=child link for all collections (not just sub-catalogs)?
| Logic: | ||
| 1. Verifies the parent catalog exists. | ||
| 2. If the sub-catalog already exists: Appends the parent ID to its | ||
| parent_ids (enabling poly-hierarchy - a catalog can have multiple | ||
| parents). | ||
| 3. If the sub-catalog is new: Creates it with parent_ids initialized | ||
| to [catalog_id]. |
There was a problem hiding this comment.
Is parent_id assumed to be a linking field in the backend data model?
There was a problem hiding this comment.
Yes I think you would need to track parent_ids in both catalogs and collections. That's how we do it in SFEOS anyways. The wording here definitely doesn't make this clear.
| """ | ||
|
|
||
| client: AsyncBaseCatalogsClient = attr.ib(kw_only=True) | ||
| enable_transactions: bool = attr.ib(default=False, kw_only=True) |
There was a problem hiding this comment.
@jonhealy1 what do you think about having a separate CatalogsTransactions extension? The transaction capability is a pretty large feature set to put behind a feature flag.
It's true that you would never add the transactions endpoints alone, so maybe keeping it as an add-on in the CatalogsExtension is better.
There was a problem hiding this comment.
I was thinking about that too. The normal Transactions extension wouldn't be used without the core routes either. It would make some sense to separate them?
There was a problem hiding this comment.
It will create some more meta-work in the specification and in this PR but I think we should separate them. Keeping them separate simplifies the conformance class logic in the main CatalogsExtension and separates the concerns of access and management.
There was a problem hiding this comment.
Sounds good - would that mean 2 prs?
There was a problem hiding this comment.
We could implement both extensions in stac-fastapi in this PR. I am not sure if it would require a new extension repo (i.e. StacLabs/multi-tenant-catalogs-transactions).
|
so the thought i just had. could this be implemented backend agnostic by just acting as a proxy to rewrite queries inro collection search requests similar to how stac-auth-proxy works rather than needing to implement in stac-fastapi-pgstac at all?? |
Definitely. We should give some guidance on this. In stac-fastapi-elasticsearch-opensearch, we track an internal-only parent_ids list on both collections and catalogs stored in the database.
This is something to think about. We don't presently do this. |
@bitner It's an interesting idea. Some quick thoughts:
I think the complexity of creating a proxy that added all of the features our extension provides - and more in the future -combined with the performance issues and JSON-rewriting overhead would make me think it's not the best solution. |
|
Closing in favor of #891 and hosting this extension externally: https://github.com/StacLabs/multi-tenant-catalogs If anyone wants to contribute there - issues, prs, discussions, questions - we would love to have you. |
|
Reopening. Thoughts:
@gadomski has said that there isn't a consensus towards eventually approving this pr or not. I would like to know what the objections are. Has any other extension ever been refused in this project? Some extensions hosted here don't even have tests #901 |
|
@hrodmn I have some time coming up to clean up the docstrings and divide this into two separate extensions. Thanks for all of the input |
Create transactions client, clean up docstrings,
|
Slowly adding to stac-fastapi-pgstac: stac-utils/stac-fastapi-pgstac#366 |
|
I've updated this branch but I think this now lives in https://github.com/StacLabs/stac-fastapi-catalogs-extension |
This was open a for a while so I moved the extension there. Then there were opinions about where the extension should rightfully live. I'm good with either option. |
Related Issue(s):
Description: Multi-Tenant Catalogs Extension
This extension introduces a recursive
/catalogsendpoint to the STAC API, enabling complex, nested hierarchies beyond the standard flatRoot -> Collectionsstructure. It transforms a STAC API into a Multi-Tenant system capable of serving distinct catalog trees (e.g.,Provider -> Theme -> Project) within a single instance.Key Architectural Concepts:
/children) that lists both Sub-Catalogs and Collections, supporting optional type filtering.Full Endpoint List:
GET /catalogs- List all root catalogs.POST /catalogs- Register a new root catalog.GET /catalogs/{catalog_id}- Get catalog metadata.PUT /catalogs/{catalog_id}- Update catalog metadata.DELETE /catalogs/{catalog_id}- Disband a catalog (does not delete data).GET /catalogs/{catalog_id}/catalogs- List sub-catalogs.POST /catalogs/{catalog_id}/catalogs- Create or link a sub-catalog.DELETE /catalogs/{catalog_id}/catalogs/{sub_catalog_id}- Unlink a sub-catalog.GET /catalogs/{catalog_id}/collections- List linked collections.POST /catalogs/{catalog_id}/collections- Create or link a collection.GET /catalogs/{catalog_id}/collections/{collection_id}- Get collection details.DELETE /catalogs/{catalog_id}/collections/{collection_id}- Unlink a collection.GET /catalogs/{catalog_id}/collections/{collection_id}/items- Search items within catalog context.GET /catalogs/{catalog_id}/collections/{collection_id}/items/{item_id}- Get single item.GET /catalogs/{catalog_id}/children- Unified list of child Catalogs and Collections.GET /catalogs/{catalog_id}/conformance- Conformance classes for this catalog tree.GET /catalogs/{catalog_id}/queryables- Queryable fields for this catalog tree.Specification Reference:
Healy-Hyperspatial/multi-tenant-catalogs
PR Checklist:
pre-commithooks pass locallymake test)make docs)