Infrastructure | UAE Visa Processing Platform

Technology Stack

On-Prem Tech Stack (LLD)

Each layer of the stack is independently scalable and uses open-source components where possible.

Compute Layer

Kubernetes Node Pools

System pool — Ingress, telemetry agents
Orchestrator pool — Orchestrator + Summarizer pods
Specialist pool — Eligibility, Form, Document, Booking, Payment, Rule Compiler
Tool executor pool — Tool middleware + API client pods
GPU pool — vLLM/TGI model serving

Horizontal autoscaling is driven by CPU and memory utilization plus custom queue-depth metrics.
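As a rough illustration of how several metrics could combine into one scaling decision, the sketch below follows the ratio rule documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current replicas × current value / target value), taking the largest proposal across metrics). The metric names, targets, and replica bounds are assumptions, and the HPA's tolerance band and stabilization window are left out.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """Replica count proposed by a single metric, using the HPA ratio rule."""
    return math.ceil(current_replicas * current_value / target_value)


def scale_decision(current_replicas, metrics, min_replicas=1, max_replicas=20):
    """metrics maps a metric name to (current_value, target_value).

    The largest per-metric proposal wins, clamped to the pool's bounds.
    """
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


# Hypothetical readings for the specialist pool: CPU is below target,
# but queue depth per pod is 2.5x its target, so queue depth drives the result.
replicas = scale_decision(
    current_replicas=4,
    metrics={
        "cpu_utilization": (45.0, 70.0),
        "queue_depth_per_pod": (50.0, 20.0),
    },
)
```

With these assumed readings the CPU metric alone would propose 3 replicas and the queue metric 10, so the pool would scale to 10.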

LLM Serving Layer

GPU Model Inference

Fast model (8-13B) — Orchestrator, Summarizer loops
Capable model (70B) — Specialist agent reasoning
Self-hosted via vLLM/TGI on GPU nodes
Local model routing and failover between replicas
KV-cache optimization and prompt caching
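One way the role-to-model mapping and replica failover listed above could look is sketched here. The `ModelRouter` class, tier names, role names, and replica URLs are all hypothetical; health state would come from real probes against the vLLM/TGI endpoints rather than a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    url: str
    healthy: bool = True


@dataclass
class ModelRouter:
    """Maps agent roles to a model tier and picks a healthy replica."""

    tiers: dict  # tier name -> list of Replica
    role_to_tier: dict  # agent role -> tier name
    _cursor: dict = field(default_factory=dict)

    def pick(self, role: str) -> Replica:
        tier = self.role_to_tier[role]
        replicas = self.tiers[tier]
        start = self._cursor.get(tier, 0)
        # Round-robin from the cursor, skipping replicas marked unhealthy.
        for offset in range(len(replicas)):
            idx = (start + offset) % len(replicas)
            if replicas[idx].healthy:
                self._cursor[tier] = idx + 1
                return replicas[idx]
        raise RuntimeError(f"no healthy replica for tier {tier!r}")


# Hypothetical topology: two fast-model replicas, one capable-model replica.
router = ModelRouter(
    tiers={
        "fast": [Replica("http://fast-0:8000"), Replica("http://fast-1:8000")],
        "capable": [Replica("http://capable-0:8000")],
    },
    role_to_tier={"orchestrator": "fast", "summarizer": "fast", "eligibility": "capable"},
)
router.tiers["fast"][0].healthy = False  # simulate a failed replica
chosen = router.pick("orchestrator")  # fails over to the second fast replica
```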

Messaging & Queueing

Event-driven backbone

Intake queue/topic
Agent-specific work queues/topics
Tool execution queue/topic
Scheduled retry queue
Status update topics/subscriptions

Data Layer

Multi-model persistence

Relational

PostgreSQL HA — primary transactional storage for eChannels request tables.

Document Store

MongoDB (document) or Cassandra/ScyllaDB (wide-column): UDB domain objects, session snapshots, reasoning traces.

Object Storage

MinIO/Ceph — document attachments, SKILL.md sources, compiled YAML.

Cache

Redis cluster — rule cache, session cache, API response cache, rate-limit counters.
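The rate-limit counters mentioned above could follow a fixed-window pattern. The sketch below uses an in-memory dict as a stand-in for Redis (where the same idea is commonly built on `INCR` plus `EXPIRE`); the class name, key layout, limit, and window size are assumptions.

```python
import time


class FixedWindowLimiter:
    """Counts requests per client per time window; a dict stands in for Redis."""

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters = {}  # in Redis, each key would also carry a TTL

    def allow(self, client_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        key = f"ratelimit:{client_id}:{window}"
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit


# A controllable clock keeps the example deterministic.
now = [1000.0]
limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: now[0])
results = [limiter.allow("echannels-api") for _ in range(3)]
now[0] += 60  # move into the next window
after_window = limiter.allow("echannels-api")
```

The third call inside the window is rejected, and the counter starts fresh once the window rolls over.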

API Gateway & Security

Traffic governance

Self-hosted gateway (Kong/Tyk/APISIX) for traffic governance and multi-backend routing
Vault/KMS for secrets and certificates
K8s service accounts + RBAC + mTLS for east-west auth
Network segmentation with internal firewall policies

Observability

Monitoring & Telemetry Architecture

All platform components emit traces, metrics, and logs through OpenTelemetry collectors into Prometheus (metrics), Loki (logs), and optionally Tempo (traces), unified in Grafana dashboards.

Metrics

Agent throughput, API latency, error rates, queue depth, cache hit ratio, token budgets.

Logs

Structured reasoning traces, tool call logs, auth failures, circuit breaker events.
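A minimal sketch of what "structured" could mean for these logs, using only the Python standard library: each event becomes one JSON object per line, which log pipelines can parse into fields. The field names (`event`, `api`, `state`) and the `make_logger` helper are assumptions, not part of the platform.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Renders each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"fields": {...}} are attached to the record.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(name: str, stream=None) -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)  # defaults to stderr
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


log = make_logger("tool-executor")
log.warning(
    "circuit breaker opened",
    extra={"fields": {"event": "circuit_breaker", "api": "payment", "state": "open"}},
)
```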

Dashboards

Unified views across all signals with alerting integration for email, SMS, and ITSM webhooks.

Monitored Stages

Intake: ingress latency, API rate, auth failures
Orchestration: loop duration, retries, summarization, tokens
Agents: per-agent throughput, error rates, queue wait
Tool/API: API latency, status codes, circuit breaker state
Data Layer: DB latency, cache hit ratio, storage growth
Status: event lag, notification success, loop-back frequency

Security & Compliance

Security, Privacy & Compliance

Encryption & Access

✓ Encryption in transit and at rest for all data stores

✓ Least-privilege IAM and service identity controls

✓ Secret rotation through centralized vault

✓ K8s RBAC + mTLS for service-to-service auth

Audit & Retention

✓ Full audit trail for every critical request action

✓ Immutable, queryable logs for regulatory review

✓ Data retention policies by artifact type

✓ 7-year retention norm (UAE standard)

Reliability

Failure Handling Patterns

Circuit Breakers

Per external API — isolates failures to prevent cascade across the agent pipeline.
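The breaker behaviour can be sketched as a small state machine: closed while calls succeed, open after repeated failures, and half-open (one trial call) once a cool-down has elapsed. The threshold, timeout, and class shape below are assumptions; one instance would be kept per external API.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast without touching the external API.
                raise RuntimeError("circuit open: call skipped")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


# Hypothetical registry: one breaker per external API.
breakers = {"payment_api": CircuitBreaker(), "booking_api": CircuitBreaker()}
```

Keeping a separate breaker per API means an outage in one dependency does not block calls to the others.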

Exponential Retries

Transient failures are retried with exponential backoff. Operations that must be deferred go to the scheduled retry queue.
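A minimal sketch of the backoff schedule, assuming a base delay of 0.5 s, a doubling factor, a 30 s cap, and random jitter so that retries from many workers do not align. `TransientError` is a hypothetical marker for failures worth retrying; once attempts run out, the caller would hand the operation to the scheduled retry queue.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying."""


def backoff_delays(max_attempts, base=0.5, factor=2.0, cap=30.0):
    """Delay before each retry: base * factor**n, capped at `cap` seconds."""
    return [min(cap, base * factor ** n) for n in range(max_attempts - 1)]


def call_with_retries(func, max_attempts=4, sleep=time.sleep, jitter=random.random):
    delays = backoff_delays(max_attempts)
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; the caller may defer to the retry queue
            # Jitter: sleep a random fraction of the computed delay.
            sleep(delays[attempt] * jitter())


delays = backoff_delays(5)  # [0.5, 1.0, 2.0, 4.0]
```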

Dead-Letter Queues

Non-recoverable messages routed to DLQ for manual inspection and replay.
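The routing decision could be sketched with in-memory queues standing in for the real broker. The message shape, the `PermanentError` marker, and the delivery limit of 3 are assumptions: a message goes to the DLQ either when its failure is known to be permanent or when it has exhausted its deliveries, and replay puts it back on the work queue with a fresh delivery count.

```python
from collections import deque


class PermanentError(Exception):
    """Hypothetical marker for failures that retrying cannot fix."""


def consume(queue, dlq, handler, max_deliveries=3):
    """Drain the queue, dead-lettering messages that cannot be processed."""
    while queue:
        msg = queue.popleft()
        msg["deliveries"] = msg.get("deliveries", 0) + 1
        try:
            handler(msg["body"])
        except PermanentError as exc:
            dlq.append({**msg, "error": str(exc)})
        except Exception as exc:
            if msg["deliveries"] >= max_deliveries:
                dlq.append({**msg, "error": str(exc)})
            else:
                queue.append(msg)  # redeliver later


def replay(dlq, queue):
    """After manual inspection, move dead-lettered messages back for processing."""
    while dlq:
        msg = dlq.popleft()
        queue.append({"body": msg["body"]})  # delivery count starts over
```

Recording the error alongside the dead-lettered message gives the operator the context needed during manual inspection.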

Graceful Fallback

Read paths serve cached state where policy permits. Unresolved cases go to a manual review queue.
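A sketch of the read-path fallback, assuming a plain dict as the cache, a list as the manual review queue, and an `allow_stale` flag that represents the policy decision. All names are hypothetical.

```python
def read_with_fallback(fetch, cache, key, allow_stale, review_queue):
    """Return (value, source), where source is 'live', 'cached', or 'manual_review'."""
    try:
        value = fetch(key)
        cache[key] = value  # refresh the cache on every successful read
        return value, "live"
    except Exception:
        if allow_stale and key in cache:
            # Policy permits serving the last known state for this read.
            return cache[key], "cached"
        # No usable fallback: park the case for a human.
        review_queue.append(key)
        return None, "manual_review"
```

Returning the source with the value lets callers and dashboards distinguish live answers from cached ones.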

Deployment

Environments & Rollout

DEV (Development) → SIT (System Integration) → UAT (User Acceptance) → PROD (Production)

Rollout Strategy

Controlled canary for orchestrator changes
Blue/green rollout for tool middleware updates
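The difference between the two strategies can be illustrated with a traffic-split function. This is a sketch under assumptions: in practice the split would usually be configured at the gateway or ingress rather than in application code, and hashing the request ID is just one way to make routing sticky per request.

```python
import hashlib


def routes_to_canary(request_id: str, canary_percent: int) -> bool:
    """Deterministically assign a request to the canary based on its ID."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


# Canary: a small, adjustable share of traffic reaches the new orchestrator.
canary_hits = sum(routes_to_canary(f"req-{i}", 10) for i in range(10000))

# Blue/green behaves like flipping the share from 0 to 100 in a single step.
before_cutover = routes_to_canary("req-1", 0)
after_cutover = routes_to_canary("req-1", 100)
```

Because the bucket depends only on the request ID, the same request is routed the same way on every retry while the percentage is unchanged.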

Configuration Separation

Rule packs versioned per environment
API endpoint and quota profiles per environment
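One possible shape for per-environment configuration is a frozen profile object selected by environment name. The field names, rule pack versions, URLs, and quotas below are placeholders for illustration, not values from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvProfile:
    rule_pack_version: str
    api_base_url: str
    requests_per_minute: int


# Placeholder values; real profiles would live in versioned configuration.
PROFILES = {
    "DEV": EnvProfile("rules-dev", "https://api.dev.example.internal", 60),
    "SIT": EnvProfile("rules-sit", "https://api.sit.example.internal", 120),
    "UAT": EnvProfile("rules-uat", "https://api.uat.example.internal", 300),
    "PROD": EnvProfile("rules-prod", "https://api.example.internal", 1000),
}


def load_profile(env: str) -> EnvProfile:
    """Look up the profile for an environment name, case-insensitively."""
    try:
        return PROFILES[env.upper()]
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None


profile = load_profile("uat")
```

Freezing the dataclass prevents code from mutating a profile at runtime, so each environment's rule pack and quota stay fixed for the life of the process.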