CF Sentinel

The AI that actually manages your Cloudflare account

Think OpenClaw — but for infrastructure health

Built on Agents SDK · Containers · Sandbox · AI Gateway · Workers AI

Peer Point Challenge · 2026

Inspired by OpenClaw

openclaw.ai — "The AI that actually does things"

OpenClaw (Personal)

  • Clears your inbox, sends emails
  • Manages your calendar
  • Checks you in for flights
  • Works via WhatsApp / Telegram
  • Connects to real services, takes real actions
  • Not a chatbot — an agent

CF Sentinel (Infrastructure)

  • Monitors error rates, detects anomalies
  • Watches audit logs for suspicious changes
  • Manages SSL renewals, DNS health
  • Works via dashboard / chat / alerts
  • Connects to CF APIs, takes real actions
  • Not a dashboard — an agent

Same philosophy: AI that acts, not AI that summarizes.

The Problem

Scattered Visibility

  • Analytics dashboard for traffic
  • Security tab for WAF/DDoS
  • Separate audit log viewer
  • SSL status buried in settings
  • Workers metrics in another panel
  • No unified health view

Reactive, Not Proactive

  • Alert fatigue from noisy notifications
  • No correlation between signals
  • Manual root-cause investigation
  • No historical pattern analysis
  • Config drift goes unnoticed
  • Audit log insights require SQL skills

What if your Cloudflare account
had its own OpenClaw?

An AI agent that continuously monitors, correlates, analyzes,
and acts on what's happening — before you even ask.

"Your 5xx rate on api.example.com spiked after a WAF rule change 12 min ago.
I've prepared a rollback. Approve?"

CF Sentinel — What It Does

Continuous Monitoring

  • Error rates (4xx/5xx), origin health
  • WAF/DDoS events, SSL expiry
  • DNS health, Workers errors

Audit Intelligence

  • Real-time audit log analysis
  • Config change correlation
  • API token usage anomalies

AI-Powered Analysis

  • Anomaly detection + root cause
  • Past incident lookup (RAG)
  • Natural language summaries

Actionable Alerts

  • Smart dedup + severity scoring
  • Human-in-the-loop approval
  • Auto-remediation (with consent)
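
To make "smart dedup + severity scoring" concrete, here is a minimal sketch. The Alert shape, the ratio thresholds, and the function names are assumptions for illustration, not part of any Cloudflare API.

```typescript
// Hypothetical alert shape; field names are assumptions for this sketch.
interface Alert {
  zone: string;
  signal: string;    // e.g. "http_5xx"
  value: number;     // observed value
  baseline: number;  // expected value
  timestamp: number; // ms since epoch
}

type Severity = "low" | "medium" | "high";

// Score severity by how far the observed value sits above its baseline.
// The 2x and 5x thresholds are assumed, not tuned.
function scoreSeverity(alert: Alert): Severity {
  const ratio =
    alert.baseline > 0 ? alert.value / alert.baseline : alert.value > 0 ? Infinity : 0;
  if (ratio >= 5) return "high";
  if (ratio >= 2) return "medium";
  return "low";
}

// Keep the first alert per zone+signal and drop repeats that arrive
// within windowMs of the last alert that was kept for the same key.
function dedupe(alerts: Alert[], windowMs: number): Alert[] {
  const lastKept = new Map<string, number>();
  const kept: Alert[] = [];
  const ordered = [...alerts].sort((a, b) => a.timestamp - b.timestamp);
  for (const alert of ordered) {
    const key = `${alert.zone}:${alert.signal}`;
    const previous = lastKept.get(key);
    if (previous === undefined || alert.timestamp - previous >= windowMs) {
      kept.push(alert);
      lastKept.set(key, alert.timestamp);
    }
  }
  return kept;
}
```

Keying on zone plus signal keeps a noisy zone from hiding a new problem elsewhere; a real deployment would likely persist lastKept in agent state.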

Built on Bleeding-Edge Cloudflare

Every component runs on Cloudflare. Zero external dependencies.

  • Agents SDK (NEW) — Stateful AI agent orchestration on Durable Objects. Tool use, memory, scheduling, human-in-the-loop. The brain of CF Sentinel.
  • Containers (NEW) — Full Docker containers on Workers. Run complex analysis pipelines, Python ML models, custom monitoring scripts that exceed Workers limits.
  • Sandbox SDK (BETA) — Isolated code execution for AI agents. Safely run LLM-generated diagnostic scripts, query builders, and remediation code.
  • AI Gateway (GA) — Proxy & manage all LLM calls. Caching, rate limiting, cost tracking, fallback routing. Controls the AI spend.
  • Workers AI (GA) — Serverless LLM inference (Llama 3, Mistral, embeddings). Powers analysis, summarization, anomaly explanation.

Architecture

  • Cron Triggers: every 1-5 min
  • Agent (DO): Agents SDK
  • CF APIs: GraphQL + REST
  • Queues: event buffer
  • Workers AI: via AI Gateway
  • Analysis: Containers + Sandbox
  • Alerts: Email / Webhook
  • D1: metrics & config
  • R2: raw logs & reports
  • Vectorize: incident embeddings
  • KV: status cache
  • Pages: dashboard UI

Agents SDK — The Brain

Stateful AI agents on Durable Objects · npm: agents

Key Capabilities

  • Persistent state + scheduled alarms
  • Tool calling + human-in-the-loop
  • WebSocket real-time UI

Sentinel Tools

  • queryAnalytics · getAuditLogs
  • checkSSL · analyzeFirewall
  • compareBaseline · sendAlert
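
import { Agent } from "agents";

// Env, SentinelState, and the helpers below (queryAnalytics, detectAnomalies,
// buildAnalysisPrompt, highestSeverity, sendAlert) are project code, not SDK exports.
export class SentinelAgent extends Agent<Env, SentinelState> {
  // Periodic check. Assumes it is registered as a scheduled callback,
  // e.g. this.schedule("*/5 * * * *", "onSchedule").
  async onSchedule() {
    const metrics = await queryAnalytics(this.env, { last: "5m" });
    const anomalies = detectAnomalies(metrics, this.state.history);
    if (anomalies.length > 0) {
      const analysis = await this.env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
        messages: [{ role: "user", content: buildAnalysisPrompt(anomalies) }]
      });
      // The model returns free text only, so severity comes from the detector
      // (highestSeverity is a hypothetical helper), not from the LLM response.
      await sendAlert(this.env, { severity: highestSeverity(anomalies), summary: analysis.response });
    }
    // Keep a bounded window (288 five-minute samples = 24 h) so state does not grow forever.
    const history = [...this.state.history, metrics].slice(-288);
    this.setState({ ...this.state, lastCheck: Date.now(), history });
  }
}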
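
The agent code calls detectAnomalies without showing it. One plausible shape is a z-score check against recent history, sketched below. The snapshot format, the threshold of 3, and the minimum sample count are assumptions; the deck does not specify the detection method.

```typescript
// Assumed snapshot format: metric name -> value for one collection interval.
type MetricSnapshot = Record<string, number>;

interface Anomaly {
  metric: string;
  value: number;
  mean: number;
  zScore: number; // magnitude of the deviation in standard deviations
}

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function stddev(values: number[]): number {
  const m = mean(values);
  return Math.sqrt(mean(values.map((v) => (v - m) ** 2)));
}

// Flag metrics whose current value deviates from the historical mean
// by at least `threshold` standard deviations.
function detectAnomalies(
  current: MetricSnapshot,
  history: MetricSnapshot[],
  threshold = 3,
  minSamples = 5,
): Anomaly[] {
  const anomalies: Anomaly[] = [];
  for (const [metric, value] of Object.entries(current)) {
    const past = history
      .map((snapshot) => snapshot[metric])
      .filter((v): v is number => typeof v === "number");
    if (past.length < minSamples) continue; // not enough data for a baseline
    const m = mean(past);
    const sd = stddev(past);
    const deviation = Math.abs(value - m);
    // A flat history (sd = 0) makes any change infinitely surprising.
    const zScore = sd === 0 ? (deviation === 0 ? 0 : Infinity) : deviation / sd;
    if (zScore >= threshold) anomalies.push({ metric, value, mean: m, zScore });
  }
  return anomalies;
}
```

A plain z-score ignores daily and weekly seasonality, which is one reason the deck moves heavier time-series analysis into Containers.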

Containers — Heavy Lifting

Full Docker containers on Cloudflare Workers

What They Enable

  • Python/Go/Rust analysis pipelines
  • Anomaly detection (scipy, numpy)
  • Escape Workers CPU/memory limits

Use in CF Sentinel

  • Time-series anomaly detection
  • Batch log processing from R2
  • Compliance report generation
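
# wrangler.toml
[[containers]]
name = "anomaly-detector"
# Containers are backed by a Durable Object class. "AnomalyDetector" is an assumed class name
# that also needs a durable_objects binding and a migration entry (not shown).
class_name = "AnomalyDetector"
image = "cf-sentinel/anomaly-detector:latest"
max_instances = 3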

Sandbox SDK — Safe Execution

Isolated environments for AI-generated code

The Challenge

AI agents need to run dynamically generated code: diagnostic queries, data transformations, remediation scripts. Running untrusted LLM output directly is dangerous.

Sandbox Solution

Sandbox SDK provides isolated, container-backed execution environments with controlled access to APIs, time limits, and memory caps. The agent can run generated code there without exposing the Worker, its bindings, or account credentials.

CF Sentinel Use Cases

  • Dynamic GraphQL query construction
  • LLM-generated diagnostic scripts
  • Custom alert rule evaluation
  • User-defined monitoring expressions
  • Safe "what-if" config analysis
  • Ad-hoc data transformation
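
// Agent generates a diagnostic script, then runs it in isolation.
// env.SANDBOX.run() is a simplified, illustrative interface, not the literal Sandbox SDK call signature.
const diagnosticCode = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
  messages: [{ role: "user", content: `Write JS to analyze: ${anomaly.description}` }]
});
const result = await env.SANDBOX.run({
  code: diagnosticCode.response, // untrusted LLM output: never eval() this in the Worker itself
  timeout: 5000,                 // assumed to be milliseconds
  bindings: { ANALYTICS: env.ANALYTICS_API }, // expose only what the script needs
});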
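
LLM responses often wrap code in a fenced block with surrounding chatter, so the raw response is rarely safe to hand to the sandbox as-is. The sketch below shows one possible pre-processing step. The function name and the size cap are assumptions, and this is input hygiene only; isolation still has to come from the sandbox.

```typescript
// Build the fence marker programmatically so this snippet can itself be embedded in docs.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "[a-zA-Z]*\\n([\\s\\S]*?)" + FENCE);
const MAX_CODE_CHARS = 8192; // assumed cap to reject runaway generations

// Return the first fenced code block if present, otherwise the whole response.
// Returns null when there is nothing usable or the code exceeds the cap.
function extractCode(llmResponse: string): string | null {
  const match = FENCED_BLOCK.exec(llmResponse);
  const code = (match ? match[1] : llmResponse).trim();
  if (code.length === 0 || code.length > MAX_CODE_CHARS) return null;
  return code;
}
```

Usage: pass extractCode(diagnosticCode.response) to the sandbox call and skip execution when it returns null.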

AI Gateway — Cost & Quality Control

Proxy layer managing all LLM interactions

Caching

  • Semantic cache for similar queries
  • Exact match cache for repeated analysis
  • Target: 40-60% lower token spend (to be validated)
  • Target: sub-100 ms cached responses

Rate Limiting

  • Per-zone request budgets
  • Priority queue for critical alerts
  • Graceful degradation under load
  • Cost ceiling enforcement

Observability

  • Token usage per analysis type
  • Latency percentiles
  • Error rate tracking
  • Full request/response logging

Resilience

  • Fallback: Workers AI → external provider (optional; the default path stays all-Cloudflare)
  • Automatic retries with backoff
  • Model routing (fast vs accurate)
  • Guardrails & content filtering
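
The retry and fallback behavior under Resilience can be pictured as an ordered attempt plan: retry the primary provider with exponential backoff, then move to the next one. The sketch below only computes that plan. Provider names and delay values are assumptions, and in practice AI Gateway would apply this at the proxy layer rather than in application code.

```typescript
interface Attempt {
  provider: string;
  delayMs: number; // wait before this attempt
}

// Build the ordered list of attempts: each provider is tried attemptsPerProvider
// times before falling back to the next provider in the list.
function buildAttemptPlan(
  providers: string[],
  attemptsPerProvider: number,
  baseDelayMs: number,
  maxDelayMs: number,
): Attempt[] {
  const plan: Attempt[] = [];
  for (const provider of providers) {
    for (let i = 0; i < attemptsPerProvider; i++) {
      // The first attempt on each provider is immediate; retry i waits
      // baseDelayMs * 2^(i-1), capped at maxDelayMs.
      const delayMs = i === 0 ? 0 : Math.min(baseDelayMs * 2 ** (i - 1), maxDelayMs);
      plan.push({ provider, delayMs });
    }
  }
  return plan;
}
```

For example, buildAttemptPlan(["workers-ai", "external"], 3, 200, 500) tries Workers AI three times before the external provider is touched.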

What CF Sentinel Monitors

Signal            | Source                | Frequency | AI Analysis
HTTP Error Rates  | GraphQL Analytics     | 1 min     | Anomaly detection, trend correlation
WAF / DDoS Events | Firewall Events API   | 1 min     | Attack pattern classification
Origin Health     | Health Check API      | 1 min     | Degradation prediction
SSL Certificates  | SSL API               | 1 hour    | Expiry risk scoring
DNS Health        | DNS Analytics         | 5 min     | Resolution failure analysis
Workers Errors    | Workers Invocations   | 1 min     | Error clustering, root cause
Audit Log         | Audit Logs API        | 1 min     | Suspicious activity detection
Bot Traffic       | Bot Management API    | 5 min     | Bot score distribution shifts
Config Changes    | Audit Logs + Zone API | 1 min     | Drift detection, impact assessment
API Token Usage   | Audit Logs            | 5 min     | Anomalous access patterns
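
Because the signals poll at different frequencies, a single one-minute cron tick can decide which collectors are due. The sketch below encodes the table as data. The SignalConfig shape and the dueSignals name are assumptions for illustration.

```typescript
interface SignalConfig {
  signal: string;
  source: string;
  intervalMinutes: number;
}

// Polling intervals taken from the monitoring table.
const SIGNALS: SignalConfig[] = [
  { signal: "HTTP Error Rates", source: "GraphQL Analytics", intervalMinutes: 1 },
  { signal: "WAF / DDoS Events", source: "Firewall Events API", intervalMinutes: 1 },
  { signal: "Origin Health", source: "Health Check API", intervalMinutes: 1 },
  { signal: "SSL Certificates", source: "SSL API", intervalMinutes: 60 },
  { signal: "DNS Health", source: "DNS Analytics", intervalMinutes: 5 },
  { signal: "Workers Errors", source: "Workers Invocations", intervalMinutes: 1 },
  { signal: "Audit Log", source: "Audit Logs API", intervalMinutes: 1 },
  { signal: "Bot Traffic", source: "Bot Management API", intervalMinutes: 5 },
  { signal: "Config Changes", source: "Audit Logs + Zone API", intervalMinutes: 1 },
  { signal: "API Token Usage", source: "Audit Logs", intervalMinutes: 5 },
];

// Given the number of minutes since some fixed origin (for example the epoch),
// return the signals whose interval divides that minute evenly.
function dueSignals(minute: number, signals: SignalConfig[] = SIGNALS): string[] {
  return signals
    .filter((s) => minute % s.intervalMinutes === 0)
    .map((s) => s.signal);
}
```

Usage from a cron handler might look like dueSignals(Math.floor(Date.now() / 60000)), with each returned signal enqueued for collection.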

Incident Memory — RAG Pipeline

Incident (anomaly detected) → Embed (bge-base-en) → Vectorize (similar incidents) → LLM (via AI Gateway) → Action (fix + explain)

Vectorize stores

  • Incident embeddings + resolutions
  • Root cause classifications
  • Affected zones & TTR metrics

AI produces

  • "Similar to incident #42 (3 weeks ago)"
  • "Cause: origin timeout after deploy"
  • "Fix: roll back Worker v1.2.3 (87%)"
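
The "similar incidents" lookup is a nearest-neighbor search over embeddings. The in-memory sketch below stands in for the Vectorize query so the ranking logic is visible. The IncidentRecord shape is an assumption, and a real deployment would delegate the search to the vector index.

```typescript
// Hypothetical stored incident; embeddings would come from the embedding model.
interface IncidentRecord {
  id: string;
  embedding: number[];
  resolution: string;
}

interface SimilarIncident {
  id: string;
  score: number; // cosine similarity, higher is more similar
  resolution: string;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Rank stored incidents by similarity to the query embedding and keep the top K.
function findSimilar(
  query: number[],
  incidents: IncidentRecord[],
  topK: number,
): SimilarIncident[] {
  return incidents
    .map((incident) => ({
      id: incident.id,
      score: cosineSimilarity(query, incident.embedding),
      resolution: incident.resolution,
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

The top matches and their resolutions would then be placed in the LLM prompt as context for the suggested fix.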

Human-in-the-Loop Remediation

AI suggests, human approves, agent executes

  • Detect: 5xx spike on api.example.com
  • Analyze: correlate with audit log change
  • Propose: "Roll back WAF rule #12345"
  • Approve: human clicks ✓ in dashboard
  • Execute: agent applies via CF API

Safety Levels

  • Auto — cache purge, alert escalation
  • Approve — rule changes, DNS updates
  • Manual — account settings, SSL config
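
The three safety levels amount to a policy lookup that runs before any action is executed. A minimal sketch follows; the action type strings and the decide function are assumptions, and unknown actions default to the most restrictive level.

```typescript
type SafetyLevel = "auto" | "approve" | "manual";
type Decision = "execute" | "await_approval" | "hand_off";

// Action types mapped to the safety levels listed above (identifiers are illustrative).
const ACTION_POLICY = new Map<string, SafetyLevel>([
  ["cache_purge", "auto"],
  ["alert_escalation", "auto"],
  ["rule_change", "approve"],
  ["dns_update", "approve"],
  ["account_setting", "manual"],
  ["ssl_config", "manual"],
]);

// Decide what the agent may do with a proposed action.
// "manual" actions are never executed by the agent, even if someone approved them.
function decide(actionType: string, approved: boolean): Decision {
  const level = ACTION_POLICY.get(actionType) ?? "manual"; // unknown actions: most restrictive
  if (level === "auto") return "execute";
  if (level === "approve") return approved ? "execute" : "await_approval";
  return "hand_off";
}
```

Defaulting unknown action types to "manual" means a new tool added to the agent cannot act on the account until someone classifies it.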

Powered By

  • Agents SDK human-in-the-loop approval flow (shown here as requestHumanApproval(), an illustrative name)
  • WebSocket push to dashboard (DO)
  • Sandbox for safe "dry run" preview
  • Full audit trail in D1

Dashboard — Pages + Durable Objects

Real-time health view with WebSocket updates

Account Overview

  • Health score per zone (0-100)
  • Active incidents & alerts
  • Error rate sparklines
  • Traffic volume trends

Incident Timeline

  • Chronological event feed
  • AI-generated summaries
  • Correlated audit log entries
  • Resolution status tracking

Chat Interface

  • "Why is zone X error rate high?"
  • "Show me audit changes today"
  • "Compare this week vs last"
  • Natural language → analytics

Approval Queue

  • Pending remediation actions
  • AI reasoning & confidence
  • One-click approve/reject
  • Action history & rollback

The Complete Stack

Layer           | Service                      | Role
Orchestration   | Agents SDK + Durable Objects | Stateful agent lifecycle, scheduling, tool use
Compute         | Workers + Cron Triggers      | API calls, data processing, routing
Heavy Compute   | Containers                   | ML models, batch analysis, report generation
Safe Execution  | Sandbox SDK                  | LLM-generated code, dynamic queries
AI Inference    | Workers AI                   | LLM analysis, embeddings, classification
AI Management   | AI Gateway                   | Caching, rate limiting, cost control, fallbacks
Relational Data | D1                           | Metrics history, config, incidents, alert rules
Object Storage  | R2                           | Raw logs (Logpush), reports, snapshots
Vector Search   | Vectorize                    | Incident embeddings for RAG
Cache           | Workers KV                   | Current status, dashboard state, config cache
Messaging       | Queues                       | Decouple collection → analysis → alerting
Frontend        | Pages                        | Dashboard SPA with real-time WebSocket
Notifications   | Email Workers                | Alert delivery, daily digests

Why All-Cloudflare?

  • 0 external dependencies
  • <50 ms API latency (same network, target)
  • $5/month Workers Paid plan base price (usage billed on top)

Technical Advantages

  • Same-network API calls = minimal latency
  • Native auth via Service Bindings
  • No egress costs (R2)
  • Automatic global distribution
  • Single deploy target (wrangler)

Operational Benefits

  • Single vendor = one bill, one support channel
  • Unified auth & permissions model
  • Deployable with wrangler deploy
  • Scales from 1 zone to 1000+
  • Open-sourceable reference architecture

Roadmap

Phase 1 — Foundation (now)
Core monitoring agent with Agents SDK. Cron-based data collection from GraphQL Analytics + Audit Logs. D1 storage. Basic alerting via Email Workers.
Phase 2 — Intelligence
Workers AI analysis via AI Gateway. Vectorize RAG for incident memory. Anomaly detection. Natural language summaries. Chat interface on Pages dashboard.
Phase 3 — Autonomy
Container-based ML pipelines. Sandbox for dynamic diagnostics. Human-in-the-loop remediation. Multi-account support. Compliance reporting.
Phase 4 — Platform
Open-source release. Custom monitoring plugin API. Community-contributed detection rules. Integration with Cloudflare Notifications system.

Demo Scenarios

Scenario 1: 5xx spike detected → agent correlates with WAF rule change in audit log → suggests rollback → human approves → agent executes
Scenario 2: SSL certificate expiring in 7 days → agent checks renewal status → finds validation stuck → alerts with specific fix steps
Scenario 3: Unusual API token activity → agent detects token used from new IP range → cross-references audit log → flags for security review
Scenario 4: "Why is latency high?" in chat → agent queries analytics, finds origin degradation, checks health checks, summarizes in plain English
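
Scenario 1 hinges on linking a spike to a recent config change. A minimal sketch of that correlation step is a time-window filter over audit entries. The AuditEntry fields and the 30-minute default window are assumptions; real audit log records carry more detail.

```typescript
// Simplified audit entry; field names are assumptions for this sketch.
interface AuditEntry {
  id: string;
  action: string;    // e.g. "waf_rule_update"
  resource: string;  // e.g. a zone or rule identifier
  timestamp: number; // ms since epoch
}

// Return audit entries from the window leading up to the spike, most recent first.
// Changes made after the spike began cannot have caused it, so they are excluded.
function correlateChanges(
  spikeAt: number,
  entries: AuditEntry[],
  windowMinutes = 30,
): AuditEntry[] {
  const windowStart = spikeAt - windowMinutes * 60000;
  return entries
    .filter((entry) => entry.timestamp >= windowStart && entry.timestamp <= spikeAt)
    .sort((a, b) => b.timestamp - a.timestamp);
}
```

The entries returned here are candidates only; the agent would still pass them to the LLM together with the metrics to judge which change is the likely cause.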

CF Sentinel

OpenClaw for your Cloudflare account.
The AI that actually manages your infrastructure.

100% Cloudflare Stack · Agents SDK · Containers · Sandbox · AI Gateway · Workers AI

Questions? Let's build this.