A high-performance recommendation engine written in Go. It exposes HTTP APIs for feed and related item recommendations, powered by configurable multi-stage pipelines.
Request
│
▼
┌─────────────────────────────────────────────────┐
│ Pipeline │
│ │
│ 1. Recall (parallel) │
│ ├── MatchRecall (inverted index + expr) │
│ └── ModelRecall (external model service) │
│ │
│ 2. Merge & Deduplicate │
│ │
│ 3. Rank │
│ ├── ChannelRank (priority by recall source) │
│ └── RuleRank (expression-based scoring) │
│ │
│ 4. Constraints │
│ ├── Scatter (diversity control) │
│ ├── WeightAdjust (conditional reweight) │
│ └── FixedPosition (pin item to position) │
│ │
│ 5. Truncate to requested count │
└─────────────────────────────────────────────────┘
│
▼
Response
- Resources: Lock-free hot-reloading of Items and InvertedIndex data via
atomic.Pointer. Resource directories use timestamp-based versioning with aSUCCESSmarker file. - Program: Lazy-compiled expression evaluator built on expr-lang/expr. Thread-safe via
sync.Once. - UserContext: Aggregates user features (from request + remote API), item features, frequency filters, and related-item context.
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/feeds |
Feed stream recommendations for a user |
| POST | /api/v1/related |
Related item recommendations |
| GET | /ping |
Health check (returns "PONG") |
| GET | /git_hash |
Build info (git commit, build time) |
features and context fields use the MutableFeatures serialization format: each feature is {"type": <DataType>, "value": <val>}.
DataType enum: 0 = Int64, 1 = Float32, 2 = String, 3 = Int64s, 4 = Float32s, 5 = Strings
{
"trace_id": "abcd-1234-efgh-5678",
"user_id": "u12345",
"pipeline": "main_feed",
"relate_id": "item987",
"count": 10,
"context": {
"device": {"type": 2, "value": "ios"},
"version": {"type": 2, "value": "3.2"}
},
"features": {
"u_s_country": {"type": 2, "value": "us"},
"u_s_language": {"type": 2, "value": "en"},
"u_s_cat": {"type": 5, "value": ["cat21", "cat22"]}
}
}| Field | Type | Required | Description |
|---|---|---|---|
trace_id |
string | no | Request tracking ID |
user_id |
string | yes | User identifier |
pipeline |
string | yes | Pipeline name (must match config) |
relate_id |
string | no | Target item for /related requests |
count |
int | no | Max items to return |
context |
object | no | Contextual features, each value as {"type": N, "value": V} |
features |
object | no | Caller-provided features, each value as {"type": N, "value": V} |
{
"code": 0,
"message": "success",
"trace_id": "abcd-1234-efgh-5678",
"user_id": "u12345",
"pipeline": "main_feed",
"items": [
{
"item": "item001",
"channels": ["hot_recall", "model_recall"],
"reasons": ["recall by key: hot", "recall by model: http://model-svc/recall"]
}
],
"count": 1
}| Field | Type | Description |
|---|---|---|
code |
int | 0 = success, -1 = pipeline not found |
message |
string | Status or error message |
items |
array | Recommended items with channels and reasons |
count |
int | Number of items returned |
The engine is configured via a TOML file (default: conf/config.toml).
Config supports JSON, YAML, or TOML format. The [server] section includes HttpServerConfig fields inline.
[server]
project_name = "recgo_engine"
debug = false
http_port = 8080
grpc_port = 3527
prome_port = 9090
pprof_port = 6060
user_feature_url = "http://user-feature-svc/user"
timeout_ms = 500
[items]
name = "pool"
dir = "/data/resources/pool.txt"
[[indexes]]
name = "country_index"
dir = "/data/resources/invert_index/country_index.json"
[[feeds]]
# Pipeline config (JSON format, see below)
[[related]]
# Pipeline config (JSON format, see below)Pipelines are defined in the config file (JSON, YAML, or TOML). Each pipeline has freqs (frequency filters), recalls, a ranker, and optional constraints.
Constraint type values must match code constants: scatter, weight_adjusted, fixed_position.
{
"name": "main_feed",
"freqs": [
{
"name": "imp_filter",
"timespan": 86400,
"frequency": 3,
"action": "imp_%s"
}
],
"recalls": [
{
"type": "match",
"name": "country_recall",
"expr": "u_s_cat",
"index": "country_index",
"count": 100
},
{
"type": "model",
"name": "model_recall",
"url": "http://model-svc:8080/recall",
"count": 50
}
],
"rank": {
"type": "rule",
"name": "main_ranker",
"rule": "i_d_ctr * 0.6 + i_s_score * 0.4"
},
"constrains": [
{
"type": "scatter",
"name": "country_scatter",
"field": "i_s_country",
"count": 3
},
{
"type": "weight_adjusted",
"name": "boost_high_level",
"condition": "i_s_level > 3 ? 1 : 0",
"ratio": 1.5
},
{
"type": "fixed_position",
"name": "pin_featured",
"condition": "i_s_cat1 == \"cat11\" ? 1 : 0",
"position": 0
}
]
}Expressions are powered by expr-lang/expr. Feature keys from user context and item features are available as variables.
| Feature Type | DataType | Expression Type | Example |
|---|---|---|---|
| Int64 | 0 |
int64 |
i_s_level |
| Float32 | 1 |
float64 |
i_d_ctr, i_s_score (promoted to float64) |
| String | 2 |
string |
u_s_country, i_s_language |
| Int64s | 3 |
[]int64 |
(array of int64) |
| Float32s | 4 |
[]float64 |
(array promoted float32 to float64) |
| Strings | 5 |
[]string |
u_s_cat, i_s_cat2 |
Note: Float32 features are promoted to
float64in the expression environment, since expr-lang uses float64 natively.
Must return []string - the index keys to look up in the inverted index.
Available variables: user features, related-item features (for /related requests).
# Return user's preferred categories for index lookup
u_s_cat
# Conditional recall keys based on user language
u_s_language == "en" ? u_s_cat : ["default"]
Must return a numeric value (float64, int64, int, or float32) - used as the item score for sorting.
Available variables: user features, item basic features, item runtime features.
# Simple CTR-based ranking
i_d_ctr
# Weighted multi-factor scoring
i_d_ctr * 0.6 + i_s_score * 0.4
# Conditional boost: items matching user's country get a bonus
i_d_ctr + (i_s_country == u_s_country ? 10.0 : 0.0)
# Level-weighted scoring
i_d_ctr * float(i_s_level)
Must return int64 - 1 means the condition is met (apply the weight ratio), any other value means skip.
Available variables: user features, item basic features, item runtime features.
# Boost high-level items
i_s_level > 3 ? 1 : 0
# Boost items matching user's country
i_s_country == u_s_country ? 1 : 0
# Boost items with high CTR and same language
i_d_ctr > 0.5 && i_s_language == u_s_language ? 1 : 0
Must return int64 - 1 means move the item to the configured position, any other value means skip.
Available variables: user features, item basic features, item runtime features.
# Pin specific category items to a fixed position
i_s_cat1 == "cat11" ? 1 : 0
# Pin high-score items from a specific country
i_s_score > 4.0 && i_s_country == "us" ? 1 : 0
| Category | Operators |
|---|---|
| Arithmetic | +, -, *, /, %, ** (power) |
| Comparison | ==, !=, <, <=, >, >= |
| Logical | &&, ||, ! |
| String | + (concatenation), contains, startsWith, endsWith, matches |
| Ternary | condition ? then : else |
| Array | in, len(), filter(), map(), all(), any(), count() |
| Nil check | ?? (nil coalescing) |
For full syntax, see expr-lang documentation.
Tab-separated text file. Each line: <key>\t<JSON_features>.
Each feature value uses the same {"type": <DataType>, "value": <val>} format.
item_id_0 {"id":{"type":2,"value":"item_id_0"},"i_s_level":{"type":0,"value":3},"i_s_score":{"type":1,"value":2.5},"i_s_language":{"type":2,"value":"en"},"i_s_country":{"type":2,"value":"us"},"i_s_cat1":{"type":2,"value":"cat11"},"i_s_cat2":{"type":5,"value":["cat21","cat23"]},"i_d_ctr":{"type":1,"value":0.75},"i_s_cat":{"type":2,"value":"cat3"}}
Feature types: 0 = Int64, 1 = Float32, 2 = String, 3 = Int64s, 4 = Float32s, 5 = Strings.
Loaded into memory as ImmutableFeatures. The resource directory must contain a SUCCESS marker file to indicate data readiness.
JSON array of IndexEntry objects. Each entry has a key and a list of values (key-score pairs):
[
{
"key": "us",
"values": [
{"key": "item_id_0", "score": 0.95},
{"key": "item_id_3", "score": 0.88}
]
},
{
"key": "uk",
"values": [
{"key": "item_id_1", "score": 0.92},
{"key": "item_id_5", "score": 0.76}
]
}
]Candidates are automatically sorted by score descending on load. The engine uses round-robin Z-interleaving across keys to ensure diversity.
Resources are organized by timestamp directories:
data/
items/
20240101120000/
data.txt
SUCCESS
20240102120000/
data.txt
SUCCESS
index/
20240101120000/
hot_index.json
SUCCESS
The engine automatically picks the latest directory containing a SUCCESS file.
# Build with version info
go build -ldflags "-X main.gitCommitInfo=$(git rev-parse HEAD) -X main.buildTime=$(date +%Y%m%d%H%M%S)" -o recgo-engine .
# Run
./recgo-engine -config conf/config.toml -log ./logs| Flag | Default | Description |
|---|---|---|
-config |
conf/config.toml |
Configuration file path |
-log |
./logs |
Log output directory |
- Prometheus: Metrics exported on the configured
prome_port. All pipeline stages are instrumented withprome.NewStat. - PProf: Go runtime profiling on
pprof_port(e.g.,http://localhost:6060/debug/pprof/).
flowchart TD
A[POST /feeds or /related] --> B[Validate Request]
B --> C[Lookup Pipeline]
C -->|Not Found| E[Return code=-1]
C -->|Found| D[Build UserContext]
D --> F[Parallel Recall]
F --> G[Merge & Deduplicate]
G --> H[Rank]
H --> I[Apply Constraints]
I --> J[Truncate & Build Response]
J --> K[Return JSON]
Part of the recgo-engine project. See LICENSE for details.