[0.1.5] Repo webhook on GHES side : 404 page not found #332

@Fabiosilvero

Hello,

We're running GHES 3.12 and are trying to set up a repository webhook.

The GARM server runs in Kubernetes using the ghcr.io/cloudbase/garm:v0.1.5 image:

configMaps:
  config.toml: |-
    [default]
    enable_webhook_management = true
    
    [logging]
    # If using nginx, you'll need to configure connection upgrade headers
    # for the /api/v1/ws location. See the sample config in the testdata
    # folder.
    enable_log_streamer = true
    # Set this to "json" if you want to consume these logs in something like
    # Loki or ELK.
    log_format = "text"
    log_level = "debug"
    log_source = false

    [metrics]
      enable = true
      disable_auth = false

    [jwt_auth]
    secret = "awesome_secret_redacted"
    time_to_live = "8760h"

    [apiserver]
      bind = "0.0.0.0"
      port = 80
      use_tls = false

    [database]
      backend = "sqlite3"
      # This needs to be changed.
      passphrase =  "awesome_secret_redacted"
      [database.sqlite3]
        db_file = "/etc/garm/garm.db"
    
    [[provider]]
      name = "gcp"
      provider_type = "external"
      description = "gcp provider"
      [provider.external]
        provider_executable = "/opt/garm/providers.d/garm-provider-gcp"
        config_file = "/etc/garm/garm-provider-gcp.toml"
        # This is needed if you want GARM to pass this along to the provider.
        environment_variables = ["GOOGLE_APPLICATION_CREDENTIALS"]

  garm-provider-gcp.toml: |-
    project_id = "project_id"
    zone = "gcp_zone"
    network_id = "network_self_link"
    subnetwork_id = "subnetwork_self_link"
    # The credentials file is optional.
    # Leave this empty if you want to use the default credentials.
    credentials_file = "/etc/garm/service-account-key.json/sa.key"
    external_ip_access = false

The volumes are correctly mounted: GARM is up and running, and GCE VMs are created and deleted as expected and can reach both GHES and GARM.

I used these commands to create the repository and the pool in GARM:

/home/user/bin/garm-cli repository add \
    --name github-actions \
    --owner <The_Org> \
    --credentials github-pat \
    --install-webhook \
    --pool-balancer-type roundrobin \
    --random-webhook-secret

/home/user/bin/garm-cli pool create \
    --os-type linux \
    --os-arch amd64 \
    --enabled=true \
    --flavor e2-medium \
    --image  <GCE_IMAGE_SELF_LINK> \
    --min-idle-runners 0 \
    --repo <the_ID> \
    --tags poc-garm \
    --provider-name gcp

On the GHES side, the webhook exists and appears to be configured, but every event delivery gets a 404 and the workflow hangs forever:

./garm-cli runner list --all (for 5 min)
+----+------+--------+---------------+---------+
| NR | NAME | STATUS | RUNNER STATUS | POOL ID |
+----+------+--------+---------------+---------+
+----+------+--------+---------------+---------+

+--------------------------------------+-------+----------------+-------------+------------------+--------------------+------------------+
| ID                                   | OWNER | NAME           | ENDPOINT    | CREDENTIALS NAME | POOL BALANCER TYPE | POOL MGR RUNNING |
+--------------------------------------+-------+----------------+-------------+------------------+--------------------+------------------+
|  <the_ID> | The_Org | github-actions | my-ghes | github-pat       | roundrobin         | true             |
+--------------------------------------+-------+----------------+-------------+------------------+--------------------+------------------+

./garm-cli controller show
+-------------------------+---------------------------------------------------------------------------+
| FIELD                   | VALUE                                                                     |
+-------------------------+---------------------------------------------------------------------------+
| Controller ID           | ID                                                                        |
| Hostname                | garm-server-0                                                             |
| Metadata URL            | https://stg-garm.my.dns.zone/api/v1/metadata                              |
| Callback URL            | https://stg-garm.my.dns.zone/api/v1/callbacks                             |
| Webhook Base URL        | https://stg-garm.my.dns.zone/webhook                                      |
| Controller Webhook URL  | https://stg-garm.my.dns.zone/webhook/ID                                   |
| Minimum Job Age Backoff | 30                                                                        |
| Version                 | v0.1.5                                                                    |
+-------------------------+---------------------------------------------------------------------------+
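In the output above, the Controller Webhook URL appears to be the Webhook Base URL with the controller ID appended. A minimal Python sketch for checking that the URL configured on the GHES hook has that shape follows. The function names `controller_webhook_url` and `same_endpoint` are hypothetical, and the strict path comparison (a trailing slash counts as a difference) is an assumption about what is worth flagging, not a statement about how GARM routes requests.

```python
from urllib.parse import urlsplit


def controller_webhook_url(base_url: str, controller_id: str) -> str:
    """Join the webhook base URL and the controller ID with a single slash."""
    return base_url.rstrip("/") + "/" + controller_id


def same_endpoint(configured: str, expected: str) -> bool:
    """Compare scheme, host (case-insensitively) and path (exactly)."""
    a, b = urlsplit(configured), urlsplit(expected)
    return (a.scheme, a.netloc.lower(), a.path) == (b.scheme, b.netloc.lower(), b.path)


if __name__ == "__main__":
    # "ID" is the redacted placeholder used in the table above.
    expected = controller_webhook_url("https://stg-garm.my.dns.zone/webhook", "ID")
    # Hypothetical value copied from the hook settings page on GHES.
    configured = "https://stg-garm.my.dns.zone/webhook/ID"
    print(expected, same_endpoint(configured, expected))
```

Any mismatch reported here (host, path, or a stray trailing slash) would be worth comparing against the payload URL shown in the hook's settings on GHES.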

./garm-cli github endpoint list
+-------------+------------------------------+-------------------------+
| NAME        | BASE URL                     | DESCRIPTION             |
+-------------+------------------------------+-------------------------+
| github.com  | https://github.com           | The github.com endpoint |
+-------------+------------------------------+-------------------------+
| my-ghes     | https://github.my.dns.zone   | My GHES                 |
+-------------+------------------------------+-------------------------+

The job_count in the logs stays at 0 even after I trigger a redelivery of the failed webhook event.

Am I missing something?

With min-idle-runners set to 2, runners show up both on the GHES side and in GARM (via the garm-cli runner list --all command), but since the webhook doesn't work, the pool doesn't scale :/

Note: I'm using the GARM operator, but I created the repo and pool manually to rule the operator out as the cause.

I also confirmed that the network path from GHES to GARM works:

admin@github:~$ curl https://stg-garm.my.dns.zone/
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1/metadata
{"error":"Authentication failed","details":""}
admin@github:~$ curl https://stg-garm.my.dns.zone/webhook/84474b45-b8e7-4350-9be0-012531f22388
404 page not found
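The curl checks above issue GET requests, whereas GitHub delivers webhooks as POST requests signed with the hook secret, so they may not exercise the same route. A minimal Python sketch of a closer probe follows. It assumes the webhook secret is known, which would not be the case with --random-webhook-secret; the helper names `sign_payload` and `build_delivery`, the sample payload, and the secret value are hypothetical. Whether GARM accepts such a hand-built delivery is not confirmed here.

```python
import hashlib
import hmac
import json
import urllib.request


def sign_payload(secret: str, body: bytes) -> str:
    """Return an X-Hub-Signature-256 style value: HMAC-SHA256 of the body, hex encoded."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def build_delivery(url: str, secret: str, event: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request shaped like a webhook delivery."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-GitHub-Event": event,
        "X-Hub-Signature-256": sign_payload(secret, body),
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder URL and secret: substitute the Controller Webhook URL
    # and the real hook secret before sending anything.
    req = build_delivery(
        "https://stg-garm.my.dns.zone/webhook/ID",
        "hypothetical-secret",
        "ping",
        {"zen": "probe"},
    )
    print(req.get_method(), req.full_url)
    for name, value in req.header_items():
        print(f"{name}: {value}")
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

If a POST built this way also returned 404, that would point at routing or the URL itself rather than at the HTTP method used by the curl checks.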

The log is attached to this issue. Let me know if I can help troubleshoot further.

garm.log

Thanks,
