
# Azure Architecture Reference

> Overview of the SGP Azure architecture

## Architecture Diagram

```mermaid
graph TD
    USER(["Client"])
    INTERNET(["Internet"])

    subgraph AZ["Azure Subscription"]
        FD["Azure Front Door<br/>WAF · CDN · TLS termination"]

        subgraph VNET["Virtual Network — 10.0.0.0/16"]
            INGRESS["Istio Ingress<br/>Internal Load Balancer"]
            NAT["NAT Gateway<br/>Outbound egress"]

            subgraph AKS["AKS Cluster — private API server"]
                SYS["System Node Pool<br/>3× Standard_D4s_v3"]
                APP["User Node Pool<br/>3–10× Standard_D16s_v3"]
            end

            BASTION["Azure Bastion"]
            JUMP["Jump Host VM"]
        end

        KV["Key Vault<br/>Secrets + CMK keys"]
        LAW["Log Analytics Workspace"]

        PG[("PostgreSQL<br/>Flexible Server")]
        REDIS[("Redis Cache<br/>Premium")]
        STOR[("Storage Account")]
        SB[("Service Bus<br/>Premium")]
    end

    USER --> FD
    FD -->|"Private Link"| INGRESS
    INGRESS --> APP
    APP -->|"Private Endpoint"| PG
    APP -->|"Private Endpoint"| REDIS
    APP -->|"Private Endpoint"| STOR
    APP -->|"Private Endpoint"| SB
    APP --> KV
    SYS --> KV
    APP --> LAW
    SYS --> LAW
    APP --> NAT --> INTERNET
    BASTION --> JUMP -->|"kubectl"| AKS
```

***

## Resources by Type

### Compute Resources

| Resource         | Count | Purpose                      |
| ---------------- | ----- | ---------------------------- |
| AKS Cluster      | 1     | Kubernetes orchestration     |
| System Node Pool | 1     | System pods (fixed 3 nodes)  |
| User Node Pool   | 1     | Application workloads        |
| GPU Node Pool    | 0-1   | AI/ML workloads (optional)   |
| Cassandra Pool   | 0-1   | Temporal database (optional) |

**Total VMs:** 6-24 (3 system + 3-10 user + 0-5 GPU + 0-6 Cassandra)
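
The total above can be checked by summing the per-pool minimums and maximums. The following is a minimal sketch: the ranges are taken from the line above, while the `POOL_RANGES` name and the helper function are illustrative and not part of any SGP tooling.

```python
# Node-count ranges per pool as (min, max), taken from the totals line above.
# The dictionary name and layout are illustrative assumptions.
POOL_RANGES = {
    "system": (3, 3),      # fixed at 3 nodes
    "user": (3, 10),
    "gpu": (0, 5),         # optional pool
    "cassandra": (0, 6),   # optional pool
}


def total_vm_range(pools):
    """Return (min_total, max_total) summed across all node pools."""
    min_total = sum(low for low, _ in pools.values())
    max_total = sum(high for _, high in pools.values())
    return min_total, max_total


print(total_vm_range(POOL_RANGES))  # (6, 24)
```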

***

### Network Resources

| Resource          | Count | Purpose                  |
| ----------------- | ----- | ------------------------ |
| Virtual Network   | 1     | Network boundary         |
| Subnets           | 4-5   | Network segmentation     |
| NSGs              | 4-5   | Traffic control          |
| Route Tables      | 3-4   | Traffic routing          |
| Private DNS Zones | 7     | Internal name resolution |
| Private Endpoints | 7     | Secure PaaS access       |
| Bastion Host      | 0-1   | Secure VM access         |
| Public IPs        | 0-1   | Bastion endpoint         |

***

### Data & Storage Resources

| Resource          | Count | Purpose                     |
| ----------------- | ----- | --------------------------- |
| PostgreSQL Server | 1     | Relational database         |
| Redis Cache       | 1     | Distributed cache           |
| Storage Account   | 1     | Blob/File storage           |
| AI Search Service | 0-1   | Full-text search (optional) |
| OpenAI Service    | 0-1   | LLM models (optional)       |

***

### Security Resources

| Resource              | Count | Purpose                |
| --------------------- | ----- | ---------------------- |
| Key Vault             | 1     | Secrets management     |
| Managed Identity      | 1-2   | Service authentication |
| RBAC Role Assignments | 10+   | Access control         |

***

### Monitoring Resources

| Resource                | Count | Purpose             |
| ----------------------- | ----- | ------------------- |
| Log Analytics Workspace | 1     | Centralized logging |
| Data Collection Rule    | 1     | AKS metrics         |
| Diagnostic Settings     | 7+    | Resource logging    |
| Datadog Connection      | 0-1   | External monitoring |

***

## Network Architecture

### Address Space Planning

```
VNet: 10.0.0.0/16 (65,536 IPs)
├── AKS Subnet: 10.0.1.0/24 (256 IPs)
├── Bastion Subnet: 10.0.2.0/26 (64 IPs)
├── Database Subnet: 10.0.3.0/24 (256 IPs)
└── Private Endpoints: 10.0.4.0/25 (128 IPs)

Pod CIDR: 10.244.0.0/16 (65,536 IPs)
Service CIDR: 10.243.0.0/16 (65,536 IPs)
```
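
An address plan like the one above can be sanity-checked with Python's standard `ipaddress` module. The sketch below verifies that each subnet sits inside the VNet, that no two subnets overlap, and that the pod and service CIDRs do not collide with the VNet or each other. The CIDR values come from the plan above; the function and variable names are illustrative. The usable-address figure assumes the standard Azure behavior of reserving 5 addresses in every subnet.

```python
import ipaddress
from itertools import combinations

# CIDR values copied from the address plan above; names are illustrative.
VNET = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "aks": ipaddress.ip_network("10.0.1.0/24"),
    "bastion": ipaddress.ip_network("10.0.2.0/26"),
    "database": ipaddress.ip_network("10.0.3.0/24"),
    "private_endpoints": ipaddress.ip_network("10.0.4.0/25"),
}
CLUSTER_CIDRS = {
    "pod": ipaddress.ip_network("10.244.0.0/16"),
    "service": ipaddress.ip_network("10.243.0.0/16"),
}

# Assumption: Azure reserves 5 addresses per subnet (first four and the last).
AZURE_RESERVED_PER_SUBNET = 5


def validate_plan(vnet, subnets, cluster_cidrs):
    """Return a list of human-readable problems; an empty list means the plan is consistent."""
    problems = []
    for name, net in subnets.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside the VNet {vnet}")
    for (a, net_a), (b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} ({net_a}) overlaps {b} ({net_b})")
    for name, net in cluster_cidrs.items():
        if net.overlaps(vnet):
            problems.append(f"{name} CIDR ({net}) overlaps the VNet {vnet}")
    for (a, net_a), (b, net_b) in combinations(cluster_cidrs.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} CIDR ({net_a}) overlaps {b} CIDR ({net_b})")
    return problems


def usable_addresses(net):
    """Addresses left for workloads after Azure's per-subnet reservation."""
    return net.num_addresses - AZURE_RESERVED_PER_SUBNET


print(validate_plan(VNET, SUBNETS, CLUSTER_CIDRS))  # []
for name, net in SUBNETS.items():
    print(name, net.num_addresses, usable_addresses(net))
```

Under that reservation assumption, the AKS subnet's 256 addresses leave 251 for node NICs, which is worth keeping in mind when sizing node pools.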

### Traffic Flow

**Egress (Internet):**

```
Pods/VMs → NAT Gateway → Public IP → Internet
(Stateful, return traffic allowed)
```

**Ingress (Internal):**

```
Service IP → Load Balancer → Pod IP (via CNI)
```

**Database Access:**

```
AKS Pods → Private Endpoint → Private Link → PostgreSQL
(DNS: server.postgres.database.azure.com)
```
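
One way to confirm that database traffic stays on the private path is to resolve the server name from inside the VNet and check that the answer is a private endpoint address. The sketch below assumes the private endpoint NIC lives in the Private Endpoints subnet (10.0.4.0/25) from the address plan, and that PostgreSQL listens on its standard port 5432. The hostname is the placeholder shown above, and the function names are illustrative.

```python
import ipaddress
import socket

# Assumption: private endpoint NICs are placed in this subnet (from the address plan).
PRIVATE_ENDPOINT_SUBNET = ipaddress.ip_network("10.0.4.0/25")


def is_private_endpoint_address(address, subnet=PRIVATE_ENDPOINT_SUBNET):
    """True if the address falls inside the private endpoint subnet."""
    return ipaddress.ip_address(address) in subnet


def check_private_resolution(hostname, port=5432):
    """Resolve hostname over IPv4 and map each address to whether it is in the subnet.

    Intended to be run from inside the VNet, for example from a pod or the jump host.
    """
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = sorted({info[4][0] for info in infos})
    return {addr: is_private_endpoint_address(addr) for addr in addresses}


# Offline demonstration of the subnet check (no DNS lookup involved):
print(is_private_endpoint_address("10.0.4.10"))   # True
print(is_private_endpoint_address("10.0.4.200"))  # False

# From inside the VNet, with the placeholder replaced by a real server name:
# check_private_resolution("server.postgres.database.azure.com")
```

If the lookup returns a public address, the usual suspect is a private DNS zone that is not linked to the VNet, though that diagnosis depends on how DNS is set up in a given deployment.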

**External Service Access:**

```
AKS Pods → API Gateway / Load Balancer → OpenAI / AI Search
(Via Private Endpoints)
```

***

## AKS Configuration Deep Dive

### API Server Access

* **Type:** Private cluster (recommended)
* **Endpoint:** Internal only
* **Access Method:** Bastion host or VPN
* **DNS:** `{prefix}k8s.{region}.azmk8s.io` (private)

### Network Policies

* **Engine:** Azure Network Policy
* **Scope:** Pod-to-pod communication
* **Default:** Allow all (traffic is unrestricted until a NetworkPolicy selects a pod)
* **Configuration:** Defined in Kubernetes manifests
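
Because the default is allow-all, tightening pod-to-pod traffic means adding NetworkPolicy objects. The sketch below builds the standard Kubernetes default-deny ingress policy as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The namespace name `sgp` and the helper function are hypothetical; this is one common starting point, not necessarily the policy set SGP ships with.

```python
import json


def default_deny_ingress(namespace):
    """Build a NetworkPolicy that denies all ingress to every pod in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            # An empty podSelector matches every pod in the namespace.
            "podSelector": {},
            # Ingress is listed as a policy type but no ingress rules are given,
            # so all inbound traffic to the selected pods is denied.
            "policyTypes": ["Ingress"],
        },
    }


# "sgp" is a hypothetical namespace name used for illustration only.
print(json.dumps(default_deny_ingress("sgp"), indent=2))
```

With a policy like this in place, each allowed flow then needs its own NetworkPolicy with explicit ingress rules.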

### Container Registry

* **Integration:** Azure Container Registry (optional)
* **Authentication:** Managed identity or pull secrets
* **Image pulls:** Over a private endpoint (optional)

### Monitoring & Observability

* **Azure Monitor Agent:** Deployed in `kube-system`
* **Metrics:** CPU, Memory, Disk, Network
* **Logs:** Container stdout/stderr, Kubernetes events
* **Dashboards:** Pre-built in Log Analytics
