Allow optional ingestion of sitemap data #181

@navarone-feekery

Problem Description

The crawler uses sitemaps to seed URLs for a crawl, but the content of the sitemap files themselves is never ingested. Some users may want to ingest data from these files (for example, metadata).

Proposed Solution

  • Add a config option that enables ingestion of sitemap content
  • If the option is enabled, ingest the content of each sitemap after it has been used to seed more URLs
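
The two steps above could look roughly like the sketch below. It is only an illustration in Python, not the crawler's actual code: the function name `process_sitemap`, the flag name `ingest_sitemap_content`, and the shape of the returned documents are all assumptions made for this example. It parses a standard `<urlset>` sitemap, collects the `<loc>` values as seed URLs, and, only when the flag is enabled, also builds one document per entry from the remaining child elements (such as `<lastmod>`).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def process_sitemap(xml_text, ingest_sitemap_content=False):
    """Return (seed_urls, docs) for one sitemap file.

    `ingest_sitemap_content` is a hypothetical config flag; when it is
    False the sitemap is used for seeding only, as the crawler does today.
    """
    root = ET.fromstring(xml_text)
    entries = list(root.iter(f"{SITEMAP_NS}url"))

    # Step 1: seed URLs from every <loc> element.
    seed_urls = []
    for entry in entries:
        loc = (entry.findtext(f"{SITEMAP_NS}loc") or "").strip()
        if loc:
            seed_urls.append(loc)

    # Step 2 (optional): build one document per entry from its metadata.
    docs = []
    if ingest_sitemap_content:
        for entry in entries:
            loc = (entry.findtext(f"{SITEMAP_NS}loc") or "").strip()
            if not loc:
                continue
            doc = {"url": loc}
            for child in entry:
                name = child.tag.split("}")[-1]  # drop the namespace prefix
                if name != "loc":
                    doc[name] = (child.text or "").strip()
            docs.append(doc)

    return seed_urls, docs


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/a</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/b</loc>
  </url>
</urlset>"""

seeds, docs = process_sitemap(SAMPLE, ingest_sitemap_content=True)
```

Keeping the flag off by default would preserve the current behaviour, so existing crawls would not start producing extra documents unless a user opts in.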
