Cupcake is a policy enforcement layer for AI coding agents such as Claude Code, Cursor, and OpenCode. It enforces deterministic security controls by evaluating agent activity against policy-as-code rules written in Open Policy Agent's Rego language. This offers a unified way to close configuration gaps that individual AI coding agents leave open.
In this post, I will explore how Cupcake can be used to block prompts that may contain keywords hinting at secret leaks to the model APIs. Claude Code will be our coding agent.
Installation and setup
After following the installation instructions, which include installing Cupcake and Open Policy Agent, Cupcake needs to be initialized in your project:
cupcake init --harness claude --builtins protected_paths
This initializes the hooks for Claude and enables one of the configurable built-in policies. The protected paths policy prevents access to a configured list of directories.
Creating new policies
The reference examples show how to block tools from reading a specific set of directories or from writing secrets into files.
Let’s write a policy that reacts to the submission of a prompt and checks, via keyword matching, whether it leaks secrets. We need to create a policy file leak.rego and place it in the correct folder for Claude: .cupcake/policies/claude/leak.rego. The important bit is that the required events include the event name that we check for in our policy rule.
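A minimal version of the policy could look as follows. This is a sketch reconstructed from the evaluation logs and the cupcake inspect output shown later (package name, title, required events, and deny reason); the exact METADATA routing keys, the decision object fields, and the keyword list beyond API_KEY are my assumptions and may differ from Cupcake's actual conventions:

```rego
# METADATA
# scope: package
# title: Prevent Secret Leak
# custom:
#   routing:
#     required_events: ["UserPromptSubmit"]
package cupcake.policies.prevent_secret_leak

import rego.v1

# Illustrative keywords hinting at secrets pasted into a prompt.
secret_keywords := ["API_KEY", "SECRET_KEY", "PRIVATE_KEY"]

# Deny the prompt when any keyword appears in it verbatim.
deny contains decision if {
	some keyword in secret_keywords
	contains(input.prompt, keyword)
	decision := {
		"rule_id": "PREVENT-SECRET-LEAK",
		"reason": sprintf("secret pattern discovered in prompt: %s", [keyword]),
		"severity": "HIGH",
	}
}
```

Note that a plain substring match like this allows the prompt "API" but denies "API_KEY", which is exactly what the two test events below exercise.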
Policy evaluation
Having written the policy, we can evaluate it against test events. The reference manual contains examples of event fields, which makes testing easy. Let’s start with a test event that the rule does not match, so the policy allows the prompt.
$ cat prompt-ok.json
{
"hook_event_name": "UserPromptSubmit",
"prompt": "API",
"session_id": "test",
"cwd": "/tmp",
"transcript_path": "/tmp/transcript.md"
}
Running the evaluation, we observe that the output mentions parsing the policy file we created, that one policy matched, and that the final decision was to allow the prompt.
$ cupcake eval --harness claude < prompt-ok.json
2026-03-14T20:02:56.439299Z INFO Processing harness: ClaudeCode
2026-03-14T20:02:56.439405Z INFO Initializing Cupcake Engine
...
2026-03-14T20:11:34.550357Z INFO Successfully parsed policy: cupcake.policies.prevent_secret_leak from "./.cupcake/policies/claude/leak.rego"
...
2026-03-14T20:02:56.532071Z INFO Engine initialization complete
2026-03-14T20:02:56.532218Z INFO evaluate{trace_id=019cedf1-db94-7f10-a302-6bf08391af67 event_name="UserPromptSubmit" session_id="test"}: Evaluating event: UserPromptSubmit tool: None
2026-03-14T20:02:56.532232Z INFO evaluate{trace_id=019cedf1-db94-7f10-a302-6bf08391af67 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: Found 1 matching policies
2026-03-14T20:02:56.532690Z INFO evaluate{trace_id=019cedf1-db94-7f10-a302-6bf08391af67 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}:synthesize{total_decisions=0 halts=0 denials=0 blocks=0 asks=0}: Synthesizing decision from 0 total decisions
2026-03-14T20:02:56.532705Z INFO evaluate{trace_id=019cedf1-db94-7f10-a302-6bf08391af67 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: Synthesized final decision: Allow { context: [] }
2026-03-14T20:02:56.532710Z INFO evaluate{trace_id=019cedf1-db94-7f10-a302-6bf08391af67 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: execute_actions_with_rulebook_and_debug called with decision: Allow { context: [] }
{}
Now, let’s take a look at a prompt that we expect to be blocked by the policy:
$ cat prompt-block.json
{
"hook_event_name": "UserPromptSubmit",
"prompt": "API_KEY",
"session_id": "test",
"cwd": "/tmp",
"transcript_path": "/tmp/transcript.md"
}
This time the evaluation output includes a summary of the decisions across policies, broken down by action, as well as the reason for the denial.
$ cupcake eval --harness claude < prompt-block.json
...
2026-03-14T20:03:59.779653Z INFO Engine initialization complete
2026-03-14T20:03:59.779760Z INFO evaluate{trace_id=019cedf2-d2a3-7232-8527-d48221f09998 event_name="UserPromptSubmit" session_id="test"}: Evaluating event: UserPromptSubmit tool: None
2026-03-14T20:03:59.779771Z INFO evaluate{trace_id=019cedf2-d2a3-7232-8527-d48221f09998 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: Found 1 matching policies
2026-03-14T20:03:59.780248Z INFO evaluate{trace_id=019cedf2-d2a3-7232-8527-d48221f09998 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}:synthesize{total_decisions=1 halts=0 denials=1 blocks=0 asks=0}: Synthesizing decision from 1 total decisions
2026-03-14T20:03:59.780262Z INFO evaluate{trace_id=019cedf2-d2a3-7232-8527-d48221f09998 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: Synthesized final decision: Deny { reason: "secret pattern discovered in prompt: API_KEY", agent_messages: [] }
2026-03-14T20:03:59.780267Z INFO evaluate{trace_id=019cedf2-d2a3-7232-8527-d48221f09998 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: execute_actions_with_rulebook_and_debug called with decision: Deny { reason: "secret pattern discovered in prompt: API_KEY", agent_messages: [] }
2026-03-14T20:03:59.780326Z INFO evaluate{trace_id=019cedf2-d2a3-7232-8527-d48221f09998 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: Executing actions for DENY decision: secret pattern discovered in prompt: API_KEY
2026-03-14T20:03:59.780330Z INFO evaluate{trace_id=019cedf2-d2a3-7232-8527-d48221f09998 event_name="UserPromptSubmit" session_id="test" matched_policy_count=1}: execute_rule_specific_actions_with_debug: Checking actions for 1 decision objects
{"decision":"block","reason":"secret pattern discovered in prompt: API_KEY"}
This is how the error message looks in Claude Code when the policy denies the prompt execution:

Error message in Claude Code indicating the policy block.
Troubleshooting
For troubleshooting, Cupcake offers an inspection command which should list your custom policies added to the project:
$ cupcake inspect
...
Policy: .cupcake/policies/claude/leak.rego
Package: cupcake.policies.prevent_secret_leak
Required Events: UserPromptSubmit
Title: Prevent Secret Leak
There is also a verification command that validates the policy syntax and ensures the OPA rules compile correctly:
$ cupcake verify --harness claude
...
2026-03-14T21:01:03.123720Z INFO Successfully parsed policy: cupcake.policies.prevent_secret_leak from "./.cupcake/policies/claude/leak.rego"
Cupcake also prints a clear message when no policies match an event.
Let’s take a look at a PreCompact hook test event:
$ cat pre-compact.json
{
"hook_event_name": "PreCompact",
"session_id": "abc123",
"transcript_path": "/path/to/transcript.md",
"cwd": "/working/directory",
"trigger": "manual",
"custom_instructions": "Preserve the API documentation"
}
In the evaluation output we see "No policies matched for this event":
$ cupcake eval --harness claude < pre-compact.json
...
2026-03-14T20:11:34.710253Z INFO Engine initialization complete
2026-03-14T20:11:34.710724Z INFO evaluate{trace_id=019cedf9-c3b6-7c01-8138-c4dd12658c82 event_name="PreCompact" session_id="abc123"}: Evaluating event: PreCompact tool: None
2026-03-14T20:11:34.710750Z INFO evaluate{trace_id=019cedf9-c3b6-7c01-8138-c4dd12658c82 event_name="PreCompact" session_id="abc123"}: No policies matched for this event - allowing
Summary
When I started exploring Cupcake, I hoped policies would be easy to reuse across different AI coding agents. There are currently two limitations that prevent this.
First, policy files need to be placed in tool-specific directories (e.g. .cupcake/policies/claude/ and .cupcake/policies/opencode/). Since the policy file itself is identical, symlinks can of course be used. A global setup is also possible for organization-wide policies that apply to all projects.
Second, as of cupcake 0.5.1, prompt events are not available for OpenCode, which only supports pre- and post-tool-use hooks. I hope that future versions of OpenCode and Cupcake will make this possible.
A few other properties that make Cupcake interesting as a project:
- Signals allow integration of additional context passed to the policy evaluation. This keeps the decision rules simple.
- Decision verbs designed for AI governance, which allow extending the context (add_context) or prompting the user for confirmation (ask) before a potentially dangerous action is executed.
- A watchdog integrating LLM-as-a-judge capability for advanced decision making. While it currently only offers OpenRouter integration for model access, the codebase can be extended to any OpenAI-API-compatible backend, enabling the use of local guard models.
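As an illustration of the ask verb, a policy could require user confirmation before the agent runs a destructive shell command. This is a hypothetical sketch: the PreToolUse event and tool_input.command field come from Claude Code's hook schema, while the required_tools routing key and the decision fields are my assumptions about Cupcake's conventions:

```rego
# METADATA
# scope: package
# title: Confirm Dangerous Commands
# custom:
#   routing:
#     required_events: ["PreToolUse"]
#     required_tools: ["Bash"]
package cupcake.policies.confirm_dangerous_commands

import rego.v1

# Ask the user for confirmation before a destructive command runs.
ask contains decision if {
	contains(input.tool_input.command, "rm -rf")
	decision := {
		"rule_id": "CONFIRM-DANGEROUS-COMMANDS",
		"reason": "destructive command detected, please confirm",
		"severity": "MEDIUM",
	}
}
```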
Overall, it’s an interesting project, and my short experiment was a useful way to explore how locally executed, deterministic policies can restrict AI coding agents, a step toward enabling controlled autonomous execution.