Recgo-Engine

A high-performance recommendation engine written in Go. It exposes HTTP APIs for feed and related-item recommendations, powered by configurable multi-stage pipelines.


Architecture

Request
  │
  ▼
┌─────────────────────────────────────────────────┐
│                   Pipeline                       │
│                                                  │
│  1. Recall (parallel)                            │
│     ├── MatchRecall  (inverted index + expr)     │
│     └── ModelRecall  (external model service)    │
│                                                  │
│  2. Merge & Deduplicate                          │
│                                                  │
│  3. Rank                                         │
│     ├── ChannelRank  (priority by recall source) │
│     └── RuleRank     (expression-based scoring)  │
│                                                  │
│  4. Constraints                                  │
│     ├── Scatter          (diversity control)     │
│     ├── WeightAdjust     (conditional reweight)  │
│     └── FixedPosition    (pin item to position)  │
│                                                  │
│  5. Truncate to requested count                  │
└─────────────────────────────────────────────────┘
  │
  ▼
Response
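
The first two stages and the final truncation can be sketched in a few lines of Go. This is a minimal illustration, not the engine's code: `Candidate`, `recallFunc`, `runRecalls`, `mergeDedup`, and `truncate` are hypothetical names, and the recalls are stubbed with fixed lists.

```go
package main

import (
	"fmt"
	"sync"
)

// Candidate is a hypothetical merged item plus the channels that recalled it.
type Candidate struct {
	Item     string
	Channels []string
}

// recallFunc is a hypothetical stand-in for MatchRecall or ModelRecall.
type recallFunc func() []string

// runRecalls runs every recall in its own goroutine. Each goroutine writes
// to its own slot, so results keep the order of names without a mutex.
func runRecalls(names []string, recalls map[string]recallFunc) [][]string {
	results := make([][]string, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, f recallFunc) {
			defer wg.Done()
			results[i] = f()
		}(i, recalls[name])
	}
	wg.Wait()
	return results
}

// mergeDedup keeps the first occurrence of each item and records every
// channel that returned it.
func mergeDedup(names []string, results [][]string) []Candidate {
	index := make(map[string]int)
	var merged []Candidate
	for i, items := range results {
		for _, it := range items {
			if pos, ok := index[it]; ok {
				merged[pos].Channels = append(merged[pos].Channels, names[i])
				continue
			}
			index[it] = len(merged)
			merged = append(merged, Candidate{Item: it, Channels: []string{names[i]}})
		}
	}
	return merged
}

// truncate limits the list to the requested count.
func truncate(c []Candidate, count int) []Candidate {
	if count >= 0 && len(c) > count {
		return c[:count]
	}
	return c
}

func main() {
	names := []string{"hot_recall", "model_recall"}
	recalls := map[string]recallFunc{
		"hot_recall":   func() []string { return []string{"item001", "item002"} },
		"model_recall": func() []string { return []string{"item002", "item003"} },
	}
	merged := mergeDedup(names, runRecalls(names, recalls))
	fmt.Println(truncate(merged, 2))
}
```

Ranking and constraints would sit between `mergeDedup` and `truncate`; they are left out here to keep the sketch short.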

Key Components

  • Resources: Lock-free hot-reloading of Items and InvertedIndex data via atomic.Pointer. Resource directories use timestamp-based versioning with a SUCCESS marker file.
  • Program: Lazy-compiled expression evaluator built on expr-lang/expr. Thread-safe via sync.Once.
  • UserContext: Aggregates user features (from request + remote API), item features, frequency filters, and related-item context.
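
The two concurrency patterns named above can be illustrated as follows. This is a sketch under assumptions: `Items`, `Resource`, and `Program` are simplified stand-ins rather than the engine's real types, and the "compile" step only builds a string instead of calling expr-lang/expr. `atomic.Pointer[T]` requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Items is a hypothetical immutable snapshot of the item pool.
type Items struct {
	Version string
	Data    map[string]string
}

// Resource holds the live snapshot. Readers never take a lock.
type Resource struct {
	current atomic.Pointer[Items]
}

// Load returns the snapshot that is live right now (nil before the first Swap).
func (r *Resource) Load() *Items { return r.current.Load() }

// Swap publishes a fully built snapshot in one atomic step. Readers that
// already hold the old pointer keep using it until they finish.
func (r *Resource) Swap(next *Items) { r.current.Store(next) }

// Program compiles its expression at most once, on first use.
type Program struct {
	source   string
	once     sync.Once
	compiled string // stand-in for a compiled expression
	compiles int    // counts how often the compile step ran
}

// Compiled runs the (stubbed) compile step lazily and exactly once, even
// when called from many goroutines.
func (p *Program) Compiled() string {
	p.once.Do(func() {
		p.compiles++
		p.compiled = "compiled(" + p.source + ")"
	})
	return p.compiled
}

func main() {
	var res Resource
	res.Swap(&Items{Version: "20240101120000"})
	old := res.Load() // an in-flight request keeps this snapshot
	res.Swap(&Items{Version: "20240102120000"})
	fmt.Println(old.Version, res.Load().Version)

	p := &Program{source: "i_d_ctr * 0.6 + i_s_score * 0.4"}
	p.Compiled()
	p.Compiled()
	fmt.Println("compile runs:", p.compiles)
}
```

The point of the pointer swap is that a reload never mutates data a request is reading: the new snapshot is built off to the side and becomes visible in a single store.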

API Endpoints

Method  Path             Description
POST    /api/v1/feeds    Feed stream recommendations for a user
POST    /api/v1/related  Related item recommendations
GET     /ping            Health check (returns "PONG")
GET     /git_hash        Build info (git commit, build time)

Request

features and context fields use the MutableFeatures serialization format: each feature is {"type": <DataType>, "value": <val>}.

DataType enum: 0 = Int64, 1 = Float32, 2 = String, 3 = Int64s, 4 = Float32s, 5 = Strings

{
  "trace_id": "abcd-1234-efgh-5678",
  "user_id": "u12345",
  "pipeline": "main_feed",
  "relate_id": "item987",
  "count": 10,
  "context": {
    "device": {"type": 2, "value": "ios"},
    "version": {"type": 2, "value": "3.2"}
  },
  "features": {
    "u_s_country": {"type": 2, "value": "us"},
    "u_s_language": {"type": 2, "value": "en"},
    "u_s_cat": {"type": 5, "value": ["cat21", "cat22"]}
  }
}

Field      Type    Required  Description
trace_id   string  no        Request tracking ID
user_id    string  yes       User identifier
pipeline   string  yes       Pipeline name (must match config)
relate_id  string  no        Target item for /related requests
count      int     no        Max items to return
context    object  no        Contextual features, each value as {"type": N, "value": V}
features   object  no        Caller-provided features, each value as {"type": N, "value": V}
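
A client can model this request with plain structs. The sketch below is one possible client-side encoding, not the engine's own types: `Feature`, `Request`, `Validate`, and `encode` are hypothetical names, while the JSON field names and DataType codes come from the description above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// DataType mirrors the enum listed above.
type DataType int

const (
	Int64 DataType = iota
	Float32
	String
	Int64s
	Float32s
	Strings
)

// Feature is one {"type": N, "value": V} entry.
type Feature struct {
	Type  DataType `json:"type"`
	Value any      `json:"value"`
}

// Request mirrors the documented request fields.
type Request struct {
	TraceID  string             `json:"trace_id,omitempty"`
	UserID   string             `json:"user_id"`
	Pipeline string             `json:"pipeline"`
	RelateID string             `json:"relate_id,omitempty"`
	Count    int                `json:"count,omitempty"`
	Context  map[string]Feature `json:"context,omitempty"`
	Features map[string]Feature `json:"features,omitempty"`
}

// Validate checks the two fields the table marks as required.
func (r Request) Validate() error {
	if r.UserID == "" {
		return errors.New("user_id is required")
	}
	if r.Pipeline == "" {
		return errors.New("pipeline is required")
	}
	return nil
}

// encode validates the request and returns its JSON body.
func encode(r Request) (string, error) {
	if err := r.Validate(); err != nil {
		return "", err
	}
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	body, err := encode(Request{
		UserID:   "u12345",
		Pipeline: "main_feed",
		Count:    10,
		Features: map[string]Feature{
			"u_s_country": {Type: String, Value: "us"},
			"u_s_cat":     {Type: Strings, Value: []string{"cat21", "cat22"}},
		},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

The resulting string can be sent as the body of a POST to /api/v1/feeds or /api/v1/related.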

Response

{
  "code": 0,
  "message": "success",
  "trace_id": "abcd-1234-efgh-5678",
  "user_id": "u12345",
  "pipeline": "main_feed",
  "items": [
    {
      "item": "item001",
      "channels": ["hot_recall", "model_recall"],
      "reasons": ["recall by key: hot", "recall by model: http://model-svc/recall"]
    }
  ],
  "count": 1
}

Field    Type    Description
code     int     0 = success, -1 = pipeline not found
message  string  Status or error message
items    array   Recommended items with channels and reasons
count    int     Number of items returned
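
On the client side, the response can be decoded and the `code` field checked before the items are used. A minimal sketch, assuming the field names shown above; `Response`, `Item`, and `parseResponse` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Item is one recommended item with its recall channels and reasons.
type Item struct {
	Item     string   `json:"item"`
	Channels []string `json:"channels"`
	Reasons  []string `json:"reasons"`
}

// Response mirrors the documented response fields.
type Response struct {
	Code     int    `json:"code"`
	Message  string `json:"message"`
	TraceID  string `json:"trace_id"`
	UserID   string `json:"user_id"`
	Pipeline string `json:"pipeline"`
	Items    []Item `json:"items"`
	Count    int    `json:"count"`
}

// parseResponse decodes a response body and turns a non-zero code
// (for example -1, pipeline not found) into a Go error.
func parseResponse(body []byte) (*Response, error) {
	var r Response
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	if r.Code != 0 {
		return nil, fmt.Errorf("engine returned code %d: %s", r.Code, r.Message)
	}
	return &r, nil
}

func main() {
	body := []byte(`{"code":0,"message":"success","pipeline":"main_feed",
		"items":[{"item":"item001","channels":["hot_recall","model_recall"],"reasons":[]}],
		"count":1}`)
	resp, err := parseResponse(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, it := range resp.Items {
		fmt.Println(it.Item, it.Channels)
	}
}
```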

Configuration

The engine reads its configuration from a file (default: conf/config.toml).

Top-Level Config

The config file may be written in JSON, YAML, or TOML; the example below uses TOML. The [server] section carries the HttpServerConfig fields inline.

[server]
project_name = "recgo_engine"
debug = false
http_port = 8080
grpc_port = 3527
prome_port = 9090
pprof_port = 6060

user_feature_url = "http://user-feature-svc/user"
timeout_ms = 500

[items]
name = "pool"
dir = "/data/resources/pool.txt"

[[indexes]]
name = "country_index"
dir = "/data/resources/invert_index/country_index.json"

[[feeds]]
# Pipeline config (JSON format, see below)

[[related]]
# Pipeline config (JSON format, see below)

Pipeline Config

Pipelines are defined in the config file (JSON, YAML, or TOML). Each pipeline has freqs (frequency filters), recalls, a ranker, and optional constraints.

Constraint type values must match code constants: scatter, weight_adjusted, fixed_position.

{
  "name": "main_feed",
  "freqs": [
    {
      "name": "imp_filter",
      "timespan": 86400,
      "frequency": 3,
      "action": "imp_%s"
    }
  ],
  "recalls": [
    {
      "type": "match",
      "name": "country_recall",
      "expr": "u_s_cat",
      "index": "country_index",
      "count": 100
    },
    {
      "type": "model",
      "name": "model_recall",
      "url": "http://model-svc:8080/recall",
      "count": 50
    }
  ],
  "rank": {
    "type": "rule",
    "name": "main_ranker",
    "rule": "i_d_ctr * 0.6 + i_s_score * 0.4"
  },
  "constrains": [
    {
      "type": "scatter",
      "name": "country_scatter",
      "field": "i_s_country",
      "count": 3
    },
    {
      "type": "weight_adjusted",
      "name": "boost_high_level",
      "condition": "i_s_level > 3 ? 1 : 0",
      "ratio": 1.5
    },
    {
      "type": "fixed_position",
      "name": "pin_featured",
      "condition": "i_s_cat1 == \"cat11\" ? 1 : 0",
      "position": 0
    }
  ]
}
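
Because a mistyped constraint type is easy to miss, a config can be checked before it is deployed. The sketch below is a standalone pre-flight check, not part of the engine: `checkConstraintTypes` and the two structs are hypothetical, and only the three type values and the `constrains` key are taken from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// knownConstraints lists the type values documented above.
var knownConstraints = map[string]bool{
	"scatter":         true,
	"weight_adjusted": true,
	"fixed_position":  true,
}

// constraint and pipeline decode only the fields this check needs.
type constraint struct {
	Type string `json:"type"`
	Name string `json:"name"`
}

type pipeline struct {
	Name       string       `json:"name"`
	Constrains []constraint `json:"constrains"`
}

// checkConstraintTypes reports the first constraint whose type is not one
// of the documented values.
func checkConstraintTypes(raw []byte) error {
	var p pipeline
	if err := json.Unmarshal(raw, &p); err != nil {
		return fmt.Errorf("parse pipeline: %w", err)
	}
	for _, c := range p.Constrains {
		if !knownConstraints[c.Type] {
			return fmt.Errorf("pipeline %q: constraint %q has unknown type %q", p.Name, c.Name, c.Type)
		}
	}
	return nil
}

func main() {
	cfg := []byte(`{"name":"main_feed","constrains":[
		{"type":"scatter","name":"country_scatter"},
		{"type":"weight_adjust","name":"boost_high_level"}]}`)
	fmt.Println(checkConstraintTypes(cfg))
}
```

In this example the second constraint is reported, because `weight_adjust` is not the same as `weight_adjusted`.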

Expression Language

Expressions are powered by expr-lang/expr. Feature keys from user context and item features are available as variables.

Feature Types

Feature Type  DataType  Expression Type  Example
Int64         0         int64            i_s_level
Float32       1         float64          i_d_ctr, i_s_score (promoted to float64)
String        2         string           u_s_country, i_s_language
Int64s        3         []int64          (array of int64)
Float32s      4         []float64        (array promoted from float32 to float64)
Strings       5         []string         u_s_cat, i_s_cat2

Note: Float32 features are promoted to float64 in the expression environment, since expr-lang uses float64 natively.
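
The mapping in the table can be written out as a conversion function. This is a sketch of the documented type mapping only; `toExprValue` is a hypothetical helper, and how the engine performs the promotion internally is not shown in this README.

```go
package main

import "fmt"

// DataType codes as listed in the table above.
const (
	Int64 = iota
	Float32
	String
	Int64s
	Float32s
	Strings
)

// toExprValue converts a typed feature value into the Go type an expression
// would see: Float32 and Float32s are widened to float64, the other types
// pass through unchanged.
func toExprValue(dataType int, v any) (any, error) {
	switch dataType {
	case Int64:
		if x, ok := v.(int64); ok {
			return x, nil
		}
	case Float32:
		if x, ok := v.(float32); ok {
			return float64(x), nil
		}
	case String:
		if x, ok := v.(string); ok {
			return x, nil
		}
	case Int64s:
		if x, ok := v.([]int64); ok {
			return x, nil
		}
	case Float32s:
		if x, ok := v.([]float32); ok {
			out := make([]float64, len(x))
			for i, f := range x {
				out[i] = float64(f)
			}
			return out, nil
		}
	case Strings:
		if x, ok := v.([]string); ok {
			return x, nil
		}
	default:
		return nil, fmt.Errorf("unknown DataType %d", dataType)
	}
	return nil, fmt.Errorf("value %v does not match DataType %d", v, dataType)
}

func main() {
	ctr, _ := toExprValue(Float32, float32(0.75))
	level, _ := toExprValue(Int64, int64(3))
	fmt.Printf("%T %T\n", ctr, level)
}
```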

Expression Contexts

1. Match Recall (expr field)

Must return []string: the keys to look up in the inverted index.

Available variables: user features, related-item features (for /related requests).

# Return user's preferred categories for index lookup
u_s_cat

# Conditional recall keys based on user language
u_s_language == "en" ? u_s_cat : ["default"]

2. Rule Rank (rule field)

Must return a numeric value (float64, int64, int, or float32), which is used as the item score for sorting.

Available variables: user features, item basic features, item runtime features.

# Simple CTR-based ranking
i_d_ctr

# Weighted multi-factor scoring
i_d_ctr * 0.6 + i_s_score * 0.4

# Conditional boost: items matching user's country get a bonus
i_d_ctr + (i_s_country == u_s_country ? 10.0 : 0.0)

# Level-weighted scoring
i_d_ctr * float(i_s_level)
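
What the ranker does with the returned value can be shown without the expression engine. In this sketch the rule is an ordinary Go function standing in for a compiled expression; `Item`, `Scored`, `rule`, and `rank` are hypothetical names, and sorting highest score first is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Item carries the two features used by the example rule.
type Item struct {
	ID    string
	CTR   float64 // i_d_ctr
	Score float64 // i_s_score
}

// Scored pairs an item ID with the score its rule produced.
type Scored struct {
	ID    string
	Score float64
}

// rule is a Go stand-in for the expression "i_d_ctr * 0.6 + i_s_score * 0.4".
func rule(it Item) float64 { return it.CTR*0.6 + it.Score*0.4 }

// rank scores every item and sorts by score, highest first (assumed order).
// The stable sort keeps recall order for items with equal scores.
func rank(items []Item, rule func(Item) float64) []Scored {
	out := make([]Scored, len(items))
	for i, it := range items {
		out[i] = Scored{ID: it.ID, Score: rule(it)}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	items := []Item{
		{ID: "item_id_1", CTR: 0.9, Score: 1.0},
		{ID: "item_id_0", CTR: 0.75, Score: 2.5},
	}
	for _, s := range rank(items, rule) {
		fmt.Printf("%s %.2f\n", s.ID, s.Score)
	}
}
```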

3. Weight Adjust (condition field)

Must return int64. A result of 1 means the condition is met and the weight ratio is applied; any other value leaves the item unchanged.

Available variables: user features, item basic features, item runtime features.

# Boost high-level items
i_s_level > 3 ? 1 : 0

# Boost items matching user's country
i_s_country == u_s_country ? 1 : 0

# Boost items with high CTR and same language
i_d_ctr > 0.5 && i_s_language == u_s_language ? 1 : 0
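
The effect of a weight adjustment can be sketched as follows. Two things here are assumptions: that "applying the ratio" means multiplying the item's score by it, and the names `Item`, `highLevel`, and `adjustWeights`. The condition is a plain Go function standing in for the compiled expression.

```go
package main

import "fmt"

// Item carries the feature used by the example condition.
type Item struct {
	ID    string
	Level int64 // i_s_level
	Score float64
}

// highLevel is a Go stand-in for the expression "i_s_level > 3 ? 1 : 0".
func highLevel(it Item) int64 {
	if it.Level > 3 {
		return 1
	}
	return 0
}

// adjustWeights multiplies the score by ratio for every item whose
// condition returns exactly 1 and leaves all other items untouched.
func adjustWeights(items []Item, ratio float64, cond func(Item) int64) {
	for i := range items {
		if cond(items[i]) == 1 {
			items[i].Score *= ratio
		}
	}
}

func main() {
	items := []Item{
		{ID: "item_id_0", Level: 5, Score: 1.0},
		{ID: "item_id_1", Level: 2, Score: 2.0},
	}
	adjustWeights(items, 1.5, highLevel)
	fmt.Println(items)
}
```

Whether the list is re-sorted after the adjustment is not stated here, so the sketch leaves the order as it was.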

4. Fixed Position Insert (condition field)

Must return int64. A result of 1 moves the item to the configured position; any other value leaves it in place.

Available variables: user features, item basic features, item runtime features.

# Pin specific category items to a fixed position
i_s_cat1 == "cat11" ? 1 : 0

# Pin high-score items from a specific country
i_s_score > 4.0 && i_s_country == "us" ? 1 : 0
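
Moving an item to a fixed position is a small slice operation. The sketch below pins only the first matching item and clamps the position to the list bounds; both choices are assumptions, as are the names `Item`, `isCat11`, and `pinFirst`.

```go
package main

import "fmt"

// Item carries the feature used by the example condition.
type Item struct {
	ID   string
	Cat1 string // i_s_cat1
}

// isCat11 is a Go stand-in for the expression `i_s_cat1 == "cat11" ? 1 : 0`.
func isCat11(it Item) int64 {
	if it.Cat1 == "cat11" {
		return 1
	}
	return 0
}

// pinFirst moves the first item whose condition returns 1 to position and
// shifts the others to make room. The input slice is not modified.
func pinFirst(items []Item, position int, cond func(Item) int64) []Item {
	for i, it := range items {
		if cond(it) != 1 {
			continue
		}
		if position < 0 {
			position = 0
		}
		if position > len(items)-1 {
			position = len(items) - 1
		}
		out := make([]Item, 0, len(items))
		out = append(out, items[:i]...)   // everything before the match
		out = append(out, items[i+1:]...) // everything after the match
		out = append(out, Item{})         // grow by one slot
		copy(out[position+1:], out[position:])
		out[position] = it
		return out
	}
	return items
}

func main() {
	items := []Item{
		{ID: "item_id_1", Cat1: "cat12"},
		{ID: "item_id_2", Cat1: "cat13"},
		{ID: "item_id_0", Cat1: "cat11"},
	}
	fmt.Println(pinFirst(items, 0, isCat11))
}
```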

Operators Reference

Category    Operators
Arithmetic  +, -, *, /, %, ** (power)
Comparison  ==, !=, <, <=, >, >=
Logical     &&, ||, !
String      + (concatenation), contains, startsWith, endsWith, matches
Ternary     condition ? then : else
Array       in, len(), filter(), map(), all(), any(), count()
Nil check   ?? (nil coalescing)

For full syntax, see expr-lang documentation.


Resource Files

Items File

Tab-separated text file. Each line: <key>\t<JSON_features>.

Each feature value uses the same {"type": <DataType>, "value": <val>} format.

item_id_0	{"id":{"type":2,"value":"item_id_0"},"i_s_level":{"type":0,"value":3},"i_s_score":{"type":1,"value":2.5},"i_s_language":{"type":2,"value":"en"},"i_s_country":{"type":2,"value":"us"},"i_s_cat1":{"type":2,"value":"cat11"},"i_s_cat2":{"type":5,"value":["cat21","cat23"]},"i_d_ctr":{"type":1,"value":0.75},"i_s_cat":{"type":2,"value":"cat3"}}

Feature types: 0 = Int64, 1 = Float32, 2 = String, 3 = Int64s, 4 = Float32s, 5 = Strings.

Loaded into memory as ImmutableFeatures. The resource directory must contain a SUCCESS marker file to indicate data readiness.

Inverted Index File

JSON array of IndexEntry objects. Each entry has a key and a list of values (key-score pairs):

[
  {
    "key": "us",
    "values": [
      {"key": "item_id_0", "score": 0.95},
      {"key": "item_id_3", "score": 0.88}
    ]
  },
  {
    "key": "uk",
    "values": [
      {"key": "item_id_1", "score": 0.92},
      {"key": "item_id_5", "score": 0.76}
    ]
  }
]

Candidates are automatically sorted by score descending on load. When a lookup spans several keys, the engine interleaves their candidate lists round-robin (Z-interleaving) so that no single key dominates the result.
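
One way to read the load-time sort and the round-robin interleaving is sketched below: take the best candidate of each key, then the second best of each key, and so on, skipping duplicates. This is an interpretation, not the engine's code; `Entry`, `sortDesc`, and `interleave` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is one candidate under an index key.
type Entry struct {
	Key   string
	Score float64
}

// sortDesc orders candidates by score, highest first.
func sortDesc(vals []Entry) {
	sort.SliceStable(vals, func(i, j int) bool { return vals[i].Score > vals[j].Score })
}

// interleave walks the candidate lists of keys in round-robin order: rank 0
// of every key, then rank 1 of every key, and so on. Items already emitted
// are skipped, and the walk stops once count items are collected.
func interleave(index map[string][]Entry, keys []string, count int) []string {
	seen := make(map[string]bool)
	var out []string
	for depth := 0; len(out) < count; depth++ {
		progressed := false
		for _, k := range keys {
			list := index[k]
			if depth >= len(list) {
				continue
			}
			progressed = true
			id := list[depth].Key
			if seen[id] {
				continue
			}
			seen[id] = true
			out = append(out, id)
			if len(out) == count {
				return out
			}
		}
		if !progressed {
			break // every list is exhausted
		}
	}
	return out
}

func main() {
	index := map[string][]Entry{
		"us": {{Key: "item_id_3", Score: 0.88}, {Key: "item_id_0", Score: 0.95}},
		"uk": {{Key: "item_id_1", Score: 0.92}, {Key: "item_id_5", Score: 0.76}},
	}
	for _, list := range index {
		sortDesc(list)
	}
	fmt.Println(interleave(index, []string{"us", "uk"}, 3))
}
```

With the two keys from the example file, the walk alternates between "us" and "uk" instead of exhausting one key first.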

Resource Versioning

Resources are organized by timestamp directories:

data/
  items/
    20240101120000/
      data.txt
      SUCCESS
    20240102120000/
      data.txt
      SUCCESS
  index/
    20240101120000/
      hot_index.json
      SUCCESS

The engine automatically picks the latest directory containing a SUCCESS file.
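
The selection rule can be sketched with the standard library alone. This is an illustration under assumptions: `latestReady` and `demo` are hypothetical names, and "latest" is taken to mean the greatest directory name, which matches chronological order for fixed-width timestamps like those above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

var errNoReadyVersion = errors.New("no version directory with a SUCCESS marker")

// latestReady returns the newest version directory under root that contains
// a SUCCESS file. os.ReadDir returns entries sorted by name, so walking the
// list backwards visits the newest timestamp first.
func latestReady(root string) (string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return "", err
	}
	for i := len(entries) - 1; i >= 0; i-- {
		e := entries[i]
		if !e.IsDir() {
			continue
		}
		dir := filepath.Join(root, e.Name())
		if _, err := os.Stat(filepath.Join(dir, "SUCCESS")); err == nil {
			return dir, nil
		}
	}
	return "", errNoReadyVersion
}

// demo builds a throwaway layout in a temp directory and returns the name
// of the version that latestReady picks.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "items")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	// The newest directory has no SUCCESS file yet, as if still being written.
	versions := map[string]bool{
		"20240101120000": true,
		"20240102120000": true,
		"20240103120000": false,
	}
	for name, ready := range versions {
		dir := filepath.Join(root, name)
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return "", err
		}
		if ready {
			if err := os.WriteFile(filepath.Join(dir, "SUCCESS"), nil, 0o644); err != nil {
				return "", err
			}
		}
	}
	dir, err := latestReady(root)
	if err != nil {
		return "", err
	}
	return filepath.Base(dir), nil
}

func main() {
	fmt.Println(demo())
}
```

Writing the SUCCESS file last, after the data files are complete, is what makes it safe for a reader to trust any directory that has one.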


Build & Run

# Build with version info
go build -ldflags "-X main.gitCommitInfo=$(git rev-parse HEAD) -X main.buildTime=$(date +%Y%m%d%H%M%S)" -o recgo-engine .

# Run
./recgo-engine -config conf/config.toml -log ./logs

CLI Flags

Flag     Default           Description
-config  conf/config.toml  Configuration file path
-log     ./logs            Log output directory
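
For reference, two flags with these names and defaults can be declared with the standard `flag` package as shown below. This is a sketch, not the engine's main function; `parseFlags` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseFlags declares the two documented flags with their defaults and
// parses args (normally os.Args[1:]).
func parseFlags(args []string) (configPath, logDir string, err error) {
	fs := flag.NewFlagSet("recgo-engine", flag.ContinueOnError)
	fs.StringVar(&configPath, "config", "conf/config.toml", "Configuration file path")
	fs.StringVar(&logDir, "log", "./logs", "Log output directory")
	err = fs.Parse(args)
	return configPath, logDir, err
}

func main() {
	configPath, logDir, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("config:", configPath)
	fmt.Println("log dir:", logDir)
}
```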

Monitoring

  • Prometheus: Metrics exported on the configured prome_port. All pipeline stages are instrumented with prome.NewStat.
  • PProf: Go runtime profiling on pprof_port (e.g., http://localhost:6060/debug/pprof/).

Call Flow

flowchart TD
    A[POST /feeds or /related] --> B[Validate Request]
    B --> C[Lookup Pipeline]
    C -->|Not Found| E[Return code=-1]
    C -->|Found| D[Build UserContext]
    D --> F[Parallel Recall]
    F --> G[Merge & Deduplicate]
    G --> H[Rank]
    H --> I[Apply Constraints]
    I --> J[Truncate & Build Response]
    J --> K[Return JSON]

License

Part of the recgo-engine project. See LICENSE for details.
