to_utf
Maarten Hilferink edited this page Apr 8, 2026
The to_utf function converts strings from the system's local encoding to UTF-8.
to_utf(strings: E->String) -> E->String
Converts strings from the current system locale encoding (e.g., Windows-1252, Latin-1) to UTF-8 encoding. This is useful when:
- Reading data from legacy systems with non-UTF-8 encoding
- Processing files created with Windows code pages
- Ensuring consistent UTF-8 output
UTF-8 is the standard encoding for:
- Web content
- Modern databases
- Cross-platform data exchange
| argument | description | type |
|---|---|---|
| strings | Strings in local/system encoding | E->String |
Time complexity: O(n × L) where n is the number of strings and L is the average string length.
UTF-8 encoded strings may be longer than the input strings: ASCII characters remain one byte each, but every character outside the ASCII range takes two or more bytes in UTF-8.
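The size difference can be illustrated outside GeoDMS with a small Python sketch. It assumes the local encoding is Windows-1252 (`cp1252` in Python); the helper name `to_utf8_bytes` is hypothetical and not part of GeoDMS, and the sketch does not claim to mirror how GeoDMS performs the conversion internally.

```python
def to_utf8_bytes(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Compare byte lengths before and after conversion.
for text in ["cafe", "caf\u00e9", "\u20ac"]:
    legacy = text.encode("cp1252")   # bytes as a legacy Windows-1252 file would store them
    utf8 = to_utf8_bytes(legacy)     # the same text, now UTF-8 encoded
    print(f"{text!a}: {len(legacy)} byte(s) in cp1252, {len(utf8)} byte(s) in UTF-8")
```

Pure ASCII text keeps its length, the accented `é` grows from one byte to two, and the euro sign grows from one byte to three.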
- Input is assumed to be in the system's default encoding
- Invalid byte sequences may produce replacement characters or errors
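- Strings that are already UTF-8 encoded should not be converted again: their bytes would be reinterpreted as local-encoding characters, which garbles every non-ASCII character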
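The last two conditions can be sketched in Python, again outside GeoDMS and assuming Windows-1252 as the local encoding. The helper `convert` is hypothetical; the strict and replacing behaviours shown are those of Python's codec, and which of them GeoDMS applies to invalid bytes is not specified here.

```python
def convert(raw: bytes, source_encoding: str = "cp1252", errors: str = "strict") -> bytes:
    """Decode from the assumed source encoding, then re-encode as UTF-8."""
    return raw.decode(source_encoding, errors=errors).encode("utf-8")


# Byte 0x81 is unassigned in Windows-1252, so strict decoding fails.
bad = b"abc\x81"
try:
    convert(bad)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace" the invalid byte becomes U+FFFD (EF BF BD in UTF-8).
replaced = convert(bad, errors="replace")

# Converting twice: the two UTF-8 bytes of "é" are each read as a cp1252 character.
once = convert("\u00e9".encode("cp1252"))   # correct UTF-8 for "é"
twice = convert(once)                       # UTF-8 for the two characters "Ã©"

print(strict_failed, replaced, once, twice)
```

The double conversion does not raise an error, which is why it tends to go unnoticed until the garbled text shows up in output.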
unit<uint32> LegacyData: nrofrows = 100;
attribute<String> names (LegacyData); // read from Windows-1252 encoded file
// Convert to UTF-8 for consistent processing
attribute<String> names_utf8 (LegacyData) := to_utf(names);
// Now safe to use with UTF-8 aware operations
- from_utf - reverse operation (UTF-8 to system encoding)
- String functions
7.0
GeoDMS ©Object Vision BV. Source code distributed under GNU GPL-3. Documentation distributed under CC BY-SA 4.0.