[BUG] UnicodeDecodeError when save_to_file=True and CalledProcessError for building types #155

@felipedantas-pi

Contributing guidelines

  • I understand the contributing guidelines

Documentation

  • My proposal is not addressed by the documentation or examples

Existing issues

  • Nothing similar appears in an existing issue

Describe the bug

I am experiencing two distinct issues when using load_overture_data:

A UnicodeDecodeError occurs whenever save_to_file=True (specifically during the gpd.read_file step).

A CalledProcessError occurs when attempting to download building data, regardless of the save_to_file setting.

Issue 1: UnicodeDecodeError

When setting save_to_file=True, the data is downloaded correctly, but the library fails to read it back into a GeoDataFrame.

Error Traceback:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 24: invalid continuation byte
...
File ~\...\city2graph\data.py:513, in _download_and_process_type
--> 513     gdf = gpd.read_file(output_path)

Changing some parameters (for example, setting save_to_file=False) avoids the error in some cases, but requests for the building type crash consistently.
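One plausible reading of the traceback, offered as a hypothesis and not a confirmed diagnosis: byte 0xea is "ê" in Windows-1252, so the GeoJSON may have been written with the Windows locale encoding instead of UTF-8. The sketch below checks whether a file is valid UTF-8 and, if not, rewrites it. The helper name reencode_to_utf8 and the cp1252 default are assumptions made for illustration; neither is part of city2graph.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite `path` as UTF-8 if it is not already valid UTF-8.

    Returns True when the file was rewritten, False when it was left alone.
    `source_encoding` is a guess (cp1252 is a common Windows default).
    """
    file_path = Path(path)
    raw = file_path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        text = raw.decode(source_encoding)
        file_path.write_bytes(text.encode("utf-8"))
        return True


# The byte from the traceback matches "ê" encoded as cp1252:
assert "ê".encode("cp1252") == b"\xea"
```

If the helper returns True for the saved file, calling gpd.read_file on it again would show whether the encoding of the file on disk was the cause.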

Issue 2: CalledProcessError with 'building' type

The download for the building layer fails when executed via subprocess.

Error Traceback:

CalledProcessError: Command '['overturemaps', 'download', '--bbox=...', '-f', 'geojson', '--type=building', '-r', '2026-04-15.0']' returned non-zero exit status 1.

Update: I tested the command generated by the library directly in the terminal, and it works perfectly.

overturemaps download --bbox=-42.8902293715,-5.2530579218,-42.6397663876,-4.9296567813 -f geojson --type=building -r 2026-04-15.0

It seems the Overture Maps CLI returns an error when invoked by city2graph through subprocess, and city2graph does not handle it gracefully.
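Because CalledProcessError only reports the exit status, the CLI's own error message is lost. A minimal sketch of how the same command could be run with stderr captured, so the underlying failure becomes visible. The function run_cli is a hypothetical helper for debugging, not city2graph's implementation.

```python
import subprocess
import sys


def run_cli(cmd):
    """Run `cmd` and return its stdout; on failure, raise with stderr included."""
    result = subprocess.run(
        cmd,
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # never fail while decoding the child's output
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with status {result.returncode}:\n"
            f"{result.stderr.strip()}"
        )
    return result.stdout


# Usage with the command from the traceback (bbox elided as in the report):
# run_cli(["overturemaps", "download", "--bbox=...", "-f", "geojson",
#          "--type=building", "-r", "2026-04-15.0"])
```

Running the failing command this way from the same Python environment should print the message that the CLI wrote before exiting with status 1.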

To Reproduce

import city2graph as c2g

# zonaUrbana_3km (GeoDataFrame/Polygon) and RAW_DATA_PATH are defined elsewhere.

# Case 1: save_to_file=True (Issue 1, UnicodeDecodeError)
subdatasets_types = ["segment", "connector"]
dados = c2g.load_overture_data(
    area=zonaUrbana_3km,  # GeoDataFrame/Polygon
    types=subdatasets_types,
    output_dir=RAW_DATA_PATH,
    save_to_file=True,
    return_data=True,
    release='2026-04-15.0',
)

# Case 2: building type (Issue 2, CalledProcessError)
subdataset_building = ['building']
dados_building = c2g.load_overture_data(
    area=zonaUrbana_3km,
    types=subdataset_building,
    save_to_file=False,  # or True
    use_stac=True,
    release='2026-04-15.0',
)

Expected behavior

The load_overture_data function should successfully download the requested datasets (including building types), save them to the specified output_dir as GeoJSON files, and return a GeoDataFrame without encountering encoding errors.

Specifically:

When save_to_file=True, the library should handle Windows encoding defaults (likely by enforcing UTF-8) to avoid UnicodeDecodeError during gpd.read_file.

The building subdataset should be downloaded via subprocess successfully, as the command is valid and works when executed directly in the terminal.
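One way UTF-8 could be enforced for the child process, sketched under the assumption that the overturemaps CLI runs on Python: setting PYTHONUTF8=1 enables Python's UTF-8 mode in the subprocess, so its standard streams and default file encoding no longer follow the Windows locale. The helpers utf8_env and run_utf8 are hypothetical names, and whether this resolves the error reported here is untested.

```python
import os
import subprocess
import sys


def utf8_env(base=None):
    """Return a copy of the environment with Python UTF-8 mode enabled."""
    env = dict(os.environ if base is None else base)
    env["PYTHONUTF8"] = "1"  # only affects child processes that are Python
    return env


def run_utf8(cmd):
    """Run `cmd` with UTF-8 mode on and decode its output as UTF-8."""
    return subprocess.run(
        cmd,
        env=utf8_env(),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
```

The same variable can also be set in the terminal before starting Python, which is a quick way to test the hypothesis without changing any library code.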

Environment (please complete the following information):

  • OS: Windows 11
  • CPU: Intel Xeon E5-2696-V3
  • GPU: NVIDIA GTX 1060 3GB
  • Python version: 3.12.12
  • city2graph version: 0.3.1

How did you install city2graph?

via PyPI (e.g., pip)

Additional context

No response
