Skip to content

🐛 fix url for glad dataset#577

Merged
philippemiron merged 16 commits intomainfrom
philippemiron-patch-1
Aug 30, 2025
Merged

🐛 fix url for glad dataset#577
philippemiron merged 16 commits intomainfrom
philippemiron-patch-1

Conversation

@philippemiron
Copy link
Copy Markdown
Contributor

@philippemiron philippemiron commented Aug 20, 2025

  • update glad GRIIDC URL
  • add header to mimic browsers
  • increase the timeout

@philippemiron philippemiron changed the title Update glad.py 🐛 fix url for glad dataset Aug 20, 2025
@philippemiron
Copy link
Copy Markdown
Contributor Author

philippemiron commented Aug 20, 2025

Those tests do take forever to complete.. Their site seems to be responding super slow; I'll check/retry again later.

@philippemiron
Copy link
Copy Markdown
Contributor Author

I removed one test that takes over 1h to complete, but still getting issues with the glad dataset. All tests pass locally so I'm not exactly sure what is happening... We will need to investigate a bit more.

@philippemiron
Copy link
Copy Markdown
Contributor Author

philippemiron commented Aug 29, 2025

It looks like GRIIDC blocked the GitHub's IP. We are getting a 403 error.

     def test_glad_url(self):
        url = "https://data.gulfresearchinitiative.org/api/file/download/169841"
        response = requests.get(url, timeout=60)
        if response.status_code != 200:
            print(f"Error downloading GLAD dataset: {response.status_code}")
            print(response.text)
>       self.assertEqual(response.status_code, 200)
E       AssertionError: 403 != 200

@philippemiron philippemiron requested a review from selipot August 30, 2025 02:59
@philippemiron
Copy link
Copy Markdown
Contributor Author

Good to go!

@selipot selipot requested a review from Copilot August 30, 2025 12:00
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue with the GLAD dataset download by updating the URL and improving the HTTP request configuration. The changes enhance the reliability of downloads by mimicking browser behavior and increasing timeout values.

  • Updates the GLAD dataset URL to remove the deprecated /pelagos-symfony path segment
  • Adds browser-like User-Agent headers to HTTP requests to avoid bot detection
  • Increases timeout values for better reliability with large file downloads

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
clouddrift/adapters/glad.py Updates the GLAD dataset URL to the current API endpoint
clouddrift/adapters/utils.py Adds browser headers and increases timeout values for HTTP requests
tests/adapters/utils_test.py Updates test assertions to match the new headers and timeout parameters

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +26 to +27
_BROWSER_HEADERS = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
Copy link

Copilot AI Aug 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The hardcoded User-Agent string may become outdated over time. Consider using a more generic or library-generated User-Agent, or add a comment explaining why this specific version is required.

Suggested change
_BROWSER_HEADERS = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
"User-Agent": requests.utils.default_user_agent()

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

@selipot selipot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the modifications, and they seem to do the trick! Thank you for spending the time to fix this!

@philippemiron philippemiron merged commit 537411d into main Aug 30, 2025
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants