Maps are a built-in data structure in Go that associate keys with values, similar to dictionaries, hash tables, or associative arrays in other languages.
// Declare a map with string keys and integer values
wordFrequency := make(map[string]int)
// Set a value
wordFrequency["hello"] = 1
// Update a value
wordFrequency["hello"]++
// Get a value
count := wordFrequency["hello"]
// Check if a key exists
count, exists := wordFrequency["world"]
if exists {
    fmt.Println("'world' found with count:", count)
} else {
    fmt.Println("'world' not found in map")
}
// Delete a key
delete(wordFrequency, "hello")
// Iterate over a map
for word, count := range wordFrequency {
    fmt.Printf("%s: %d\n", word, count)
}

Go provides several functions for string manipulation in the strings package.
import "strings"
// Convert to lowercase
lowercase := strings.ToLower("Hello") // "hello"
// Split a string
words := strings.Split("hello world", " ") // ["hello", "world"]
// Join strings
joined := strings.Join([]string{"hello", "world"}, " ") // "hello world"
// Replace all occurrences
replaced := strings.ReplaceAll("hello, hello!", ",", "") // "hello hello!"
// Contains substring
contains := strings.Contains("hello world", "hello") // true
// Trim whitespace
trimmed := strings.TrimSpace(" hello ") // "hello"

For more complex string processing, Go's regexp package provides powerful pattern matching:
import "regexp"
// Create a regex to match only letters and digits
re := regexp.MustCompile(`[^a-zA-Z0-9]+`)
// Replace all non-alphanumeric characters with a space
cleaned := re.ReplaceAllString("Hello, world! 123", " ") // "Hello world 123"
// Split using regex
words := re.Split("Hello,world!123", -1) // ["Hello", "world", "123"]

When processing large texts, keep these performance tips in mind:
- Pre-allocation: If you know the approximate size of your map, initialize it with capacity:
  wordFrequency := make(map[string]int, 1000) // Pre-allocate for 1000 words
- Builder Pattern: For complex string manipulation, use strings.Builder:
  var builder strings.Builder
  for _, word := range words {
      builder.WriteString(word)
      builder.WriteString(" ")
  }
  result := builder.String()
- Single-pass Processing: Avoid multiple iterations over the same data when possible
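As a sketch of the single-pass idea, the loop below lowercases, filters, and counts words in one traversal of the input instead of cleaning, splitting, and counting in separate passes. The function name countSinglePass and its structure are illustrative assumptions, not from the text:

```go
package main

import (
	"fmt"
	"unicode"
)

// countSinglePass builds words and counts them in a single traversal.
// (Illustrative sketch; the name and structure are assumptions.)
func countSinglePass(text string) map[string]int {
	counts := make(map[string]int)
	var word []rune
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			// Normalize case while accumulating the current word.
			word = append(word, unicode.ToLower(r))
		} else if len(word) > 0 {
			// A non-word character ends the current word.
			counts[string(word)]++
			word = word[:0]
		}
	}
	if len(word) > 0 {
		counts[string(word)]++ // flush a trailing word
	}
	return counts
}

func main() {
	fmt.Println(countSinglePass("Go, go, GO!")) // map[go:3]
}
```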
Key concepts to understand when implementing word frequency counting:
- Text normalization: Converting text to consistent format (lowercase, removing punctuation)
- Word boundary detection: Identifying where words start and end
- Character filtering: Deciding which characters are valid for words
- Frequency tracking: Using maps to count occurrences efficiently
- Memory optimization: Pre-allocating maps when possible
A typical implementation follows these steps:
- Normalize input text (case conversion)
- Clean text by removing/replacing unwanted characters
- Split text into individual words
- Count frequency of each word using a map
- Handle edge cases (empty strings, whitespace)
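The steps above can be sketched as a single function that combines the map, strings, and regexp techniques shown earlier. The function name WordFrequency is an assumption for illustration:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonAlnum matches runs of characters that are not letters or digits.
var nonAlnum = regexp.MustCompile(`[^a-zA-Z0-9]+`)

// WordFrequency follows the steps above: normalize case, clean unwanted
// characters, split into words, and count occurrences in a map.
// (Sketch; the function name is an assumption, not from the text.)
func WordFrequency(text string) map[string]int {
	// Normalize: lowercase the whole input.
	text = strings.ToLower(text)
	// Clean: replace non-alphanumeric runs with spaces, then split.
	// strings.Fields also handles the edge cases of empty input and
	// leading/trailing whitespace, returning no empty words.
	words := strings.Fields(nonAlnum.ReplaceAllString(text, " "))
	// Count each word using a map.
	freq := make(map[string]int)
	for _, w := range words {
		freq[w]++
	}
	return freq
}

func main() {
	fmt.Println(WordFrequency("Hello, hello world!")) // map[hello:2 world:1]
}
```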
A few Go-specific details are worth keeping in mind:
- Hash Maps: Go's maps are implemented as hash tables, providing O(1) average lookup time
- Strings as UTF-8: Go strings are UTF-8 encoded by default, so be careful when handling non-ASCII characters
- Immutability: Strings in Go are immutable, so operations like ToLower() create new strings
- Runes: For proper Unicode character handling, consider working with runes ([]rune) instead of bytes
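To make the rune/byte distinction concrete, here is a minimal example showing that len counts bytes while []rune counts Unicode code points, and that range over a string yields runes:

```go
package main

import "fmt"

func main() {
	s := "héllo"
	// len counts bytes: "é" occupies two bytes in UTF-8.
	fmt.Println(len(s)) // 6
	// Converting to []rune counts Unicode code points instead.
	fmt.Println(len([]rune(s))) // 5
	// range yields (byte index, rune) pairs; note the index jumps
	// from 1 to 3 because "é" spans two bytes.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()
}
```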