Auditing Claude Code:

How I Used Claude Code's Hook System to Build an Evidence-Based Permission Policy

You probably shouldn't let your AI assistant run whatever it wants. But you also shouldn't prompt-approve every single command. Here's how I found the middle ground.

Curtis (& Claude)


I've been using Claude Code — Anthropic's CLI-based coding assistant — as part of my daily development workflow. It reads files, runs bash commands, edits code, searches codebases, and even spawns sub-agents to parallelize work. It's powerful. And that power is exactly why I started asking myself: what is this thing actually doing on my machine?

Claude Code has a permissions system. You can allow certain commands to run without prompting, deny others outright, and leave the rest to require manual approval each time. But when I first set mine up, I was guessing. I allowed what seemed safe, denied what seemed dangerous, and left a big gray area in the middle where I was clicking "approve" dozens of times per session.

I wanted data. So I built a lightweight audit system using Claude Code's hook feature, ran it for a week, and used the results to build a permission policy based on what actually happens — not what I imagined might happen.


The Hook System

Claude Code supports lifecycle hooks in its settings file (~/.claude/settings.json). These are shell commands that fire in response to events — things like PostToolUse, PreToolUse, and PostToolUseFailure. The hook receives the tool call as JSON on stdin, which means you can capture, reshape, and log whatever you want.

Here's what I had Claude add:

"hooks": {
  "PostToolUse": [
    {
      "hooks": [
        {
          "type": "command",
          "command": "jq -c '{timestamp: (now | todate), tool: .tool_name, session: .session_id, params: .tool_input}' >> ~/.claude/tool-audit-$(date +%Y-%m-%d).jsonl"
        }
      ]
    }
  ],
  "PostToolUseFailure": [
    {
      "hooks": [
        {
          "type": "command",
          "command": "jq -c '{timestamp: (now | todate), tool: .tool_name, session: .session_id, error: .error, success: false}' >> ~/.claude/tool-audit-$(date +%Y-%m-%d).jsonl"
        }
      ]
    }
  ]
}

Two hooks. One catches every successful tool call, the other catches failures. Both pipe the incoming JSON through jq to extract the fields I care about — timestamp, tool name, session ID, and either the parameters or the error — and append a compact one-line JSON object to a daily log file.
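Before wiring the filter into settings.json, it can help to dry-run it against a hand-written payload. The sketch below reuses the PostToolUse filter as-is. The sample payload is made up for illustration: only the field names (tool_name, session_id, tool_input) come from the hook above, and a real payload likely carries more fields than this.

```shell
#!/usr/bin/env bash
# Dry-run the PostToolUse jq filter against a sample payload.
# The payload values are hypothetical; only the field names come from the hook.

audit_filter() {
  # Same filter as the PostToolUse hook; reads one JSON object from stdin.
  jq -c '{timestamp: (now | todate), tool: .tool_name, session: .session_id, params: .tool_input}'
}

sample_payload() {
  # Made-up values for illustration only.
  cat <<'EOF'
{"session_id": "example-session", "tool_name": "Bash", "tool_input": {"command": "git log --oneline -5"}}
EOF
}

# Usage: sample_payload | audit_filter
```

The output should be a single compact JSON line with the timestamp, tool, session, and params keys, which is the shape every later query in this post relies on.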

The $(date +%Y-%m-%d) in the filename means each day gets its own file: tool-audit-2026-03-12.jsonl, tool-audit-2026-03-13.jsonl, and so on. This keeps files manageable and makes it easy to analyze specific time ranges.
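Because the date is in the filename, picking a time range is just a matter of picking files. Here is a minimal sketch of that idea; the function name audit_files_between is my own invention, and it assumes the files follow the tool-audit-YYYY-MM-DD.jsonl naming shown above.

```shell
#!/usr/bin/env bash
# List daily audit files whose date falls within [start, end], inclusive.
# ISO dates (YYYY-MM-DD) sort lexicographically, so string comparison is enough.

audit_files_between() {
  local dir=$1 start=$2 end=$3 f day
  for f in "$dir"/tool-audit-*.jsonl; do
    [ -e "$f" ] || continue        # the glob matched nothing
    day=${f##*/tool-audit-}        # strip the directory and filename prefix
    day=${day%.jsonl}              # strip the extension
    # Inside [[ ]], < and > compare strings lexicographically.
    if [[ ! $day < $start && ! $day > $end ]]; then
      printf '%s\n' "$f"
    fi
  done
}

# Usage: audit_files_between ~/.claude 2026-03-12 2026-03-13
```

The output is one path per line, so it can be fed to jq through xargs or a while-read loop.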

The only dependency is jq, which you probably already have. If not, on macOS with Homebrew: brew install jq.

That's it. No agents, no dashboards, no infrastructure. Just two one-liners that silently log every tool call as you work.


A Week of Data

I let this run for a normal work week — code reviews, feature work, debugging, the usual. I didn't change my behavior or try to generate interesting data. I just worked.

At the end of the week, I had the audit files sitting in ~/.claude/. And in a bit of satisfying recursion, I asked Claude Code itself to analyze them.


create a summary of all the commands/tools used by claude over the
past week. There are tool audit files that are recorded by hooks in
the settings.json
    

Here's what a week looked like:

2,161 tool calls across 25 sessions.

Tool Usage Breakdown

Tool         Count   Share
Bash         1,248   57.7%
Read           562   26.0%
Agent           95    4.4%
Grep            95    4.4%
Glob            79    3.7%
TodoWrite       39    1.8%
Edit            29    1.3%
Skill            7    0.3%
Write            2    0.1%
Other            5    0.2%

The first thing that jumped out: this is a read-heavy workflow. Read, Grep, and Glob together account for over 34% of all tool calls. Claude spends a lot of time understanding code before it touches anything. Edits and writes combined were barely 1.4% of total activity.
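The numbers above came from Claude's own analysis, so I can't show the exact commands it ran. A breakdown of the same shape can be approximated with jq and awk, though. This is a sketch that assumes each log line has the tool field written by the hooks; tool_breakdown is a name I made up.

```shell
#!/usr/bin/env bash
# Print tool name, call count, and share of total for the given audit files.

tool_breakdown() {
  jq -r '.tool' "$@" \
    | sort | uniq -c | sort -rn \
    | awk '{ count[NR] = $1; name[NR] = $2; total += $1 }
           END { for (i = 1; i <= NR; i++)
                   printf "%-12s %6d %5.1f%%\n", name[i], count[i], 100 * count[i] / total }'
}

# Usage: tool_breakdown ~/.claude/tool-audit-*.jsonl
```

Failures are logged as separate lines by the PostToolUseFailure hook, so they are counted here too. Filter on the success field first if you only want successful calls.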

The second thing: Bash dominates at nearly 58%. But "Bash" is a big bucket. What's inside it matters more.

What's Inside the Bash Calls

Command      Count
gh pr          271
gh api         148
git log        144
git show       112
git diff        15
git blame        9
gh run           9
git branch       7
git status       6
git add          4
git push         2
git commit       1

This was the real revelation. The overwhelming majority of Bash usage was read-only GitHub and Git operations. gh pr view, gh api calls to fetch PR data, git log, git show: as used here, these are all safe, side-effect-free commands, and I was still sometimes being prompted to approve them.
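To open up the Bash bucket yourself, one option is to group calls by the first two words of the command line. The sketch below assumes Bash entries keep the command string at .params.command (that is, tool_input.command in the hook payload). bash_breakdown is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Count Bash calls by their first two words (for example "gh pr" or "git log").
# ASSUMPTION: Bash entries store the command line at .params.command.

bash_breakdown() {
  jq -r 'select(.tool == "Bash")
         | .params.command // empty
         | select(type == "string" and length > 0)
         | split("\n")[0]' "$@" \
    | awk 'NF { print ($2 == "" ? $1 : $1 " " $2) }' \
    | sort | uniq -c | sort -rn
}

# Usage: bash_breakdown ~/.claude/tool-audit-*.jsonl
```

This is a rough grouping: only the first line of a multi-line command is considered, and a compound command such as cd repo && git log is counted under its first two words, not under git log.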

Meanwhile, write operations were rare: git add, git push, and git commit together appeared just 7 times.

The Code Review Pattern

A big chunk of the activity was driven by code reviews. I used Claude Code's code review skill six times during the week, and each review spawns a cascade of operations: fetching the PR diff, reading changed files, checking for CLAUDE.md conventions, running sub-agents in parallel to analyze different aspects of the code, and posting results via gh api.

A single code review session could generate 200+ tool calls, most of them gh pr, gh api, git show, and Read. Apart from the gh api call that posts the results, all read-only. All things I'd want to run without friction.

Activity Patterns

The data also showed when I work (no surprises there — weekday afternoons, US Central time) and that weekends were nearly silent (12 total calls across Saturday and Sunday, all just Glob searches).

The busiest single session had 508 tool calls. The top three sessions accounted for over half of all weekly activity. This is the nature of agentic work — it's bursty. When Claude is deep in a code review or exploring a codebase, it chains hundreds of fast operations together. If each one required a permission prompt, the workflow would be unusable.
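The per-day and per-session figures can be approximated from the timestamp and session fields the hooks write. This is a sketch with made-up function names. One caveat: jq's todate emits UTC, so the day buckets below are UTC days, not local ones.

```shell
#!/usr/bin/env bash
# Summarise audit activity by day and by session.
# Note: timestamps written via jq's todate are UTC, so days here are UTC days.

calls_per_day() {
  # The first 10 characters of an ISO timestamp are the date (YYYY-MM-DD).
  jq -r '.timestamp[:10]' "$@" | sort | uniq -c
}

calls_per_session() {
  jq -r '.session' "$@" | sort | uniq -c | sort -rn
}

top_sessions_share() {
  # top_sessions_share N FILE...  prints the share of calls from the N busiest sessions.
  local n=$1; shift
  calls_per_session "$@" \
    | awk -v n="$n" '{ total += $1; if (NR <= n) top += $1 }
                     END { if (total) printf "%.1f%%\n", 100 * top / total }'
}

# Usage: top_sessions_share 3 ~/.claude/tool-audit-*.jsonl
```

Bucketing by local hour would need a timezone conversion on top of this, which I've left out to keep the sketch short.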


From Data to Policy

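Claude Code's settings.json has three permission tiers: allow (runs without prompting), deny (blocked outright), and a prompt tier where everything else requires manual approval each time.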

Before the audit, my allow list was conservative and my deny list was based on gut feel. After the audit, I could make evidence-based decisions.

Allow: High-Frequency, Low-Risk

These are commands that appeared hundreds of times, are read-only or have minimal side effects, and would create constant friction if prompted:

"allow": [
  "Bash(git log*)",
  "Bash(git show*)",
  "Bash(git status*)",
  "Bash(git diff*)",
  "Bash(gh pr *)",
  "Bash(gh api *)",
  "Bash(gh run view *)",
  "Bash(gh run list *)"
]

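The audit confirmed these are the backbone of Claude's workflow. Allowing them removes hundreds of approval prompts per week at very low risk. One caveat: gh pr * and gh api * also match write operations (gh pr merge, or a gh api call that sends a POST), so tighten those two patterns if you want the allow list to be strictly read-only.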

Prompt: Occasional, Has Side Effects

These appeared a handful of times and do things I want to see before they happen, but that I don't want to block entirely. Commits, pushes, PR creation, and file writes fall here.

These are infrequent enough that the prompts don't create friction, but consequential enough that I want the checkpoint.

Deny: Destructive, Never Unattended

These are commands that should never run without me at the keyboard, regardless of what Claude thinks it needs:

"deny": [
  "Bash(rm *)",
  "Bash(git push*--force*)",
  "Bash(git reset --hard*)",
  "Bash(git clean*-f*)",
  "Bash(git branch -D*)",
  "Bash(git stash drop*)",
  "Bash(git stash clear*)",
  "Bash(sudo *)",
  "Bash(docker system prune*)",
  "Bash(docker compose down*--volumes*)"
]

None of these appeared in my audit data — which is a good sign. But the deny list exists as a safety net. AI assistants can hallucinate tool calls. A model under pressure to fix a failing test might decide git reset --hard is a reasonable approach. The deny list makes that impossible.
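One way to check a proposed policy against the audit data is to replay the logged commands through the allow and deny patterns and count how many land in each tier. The sketch below makes two loud assumptions: each pattern is treated as a plain shell glob over the whole command string, and deny is checked before allow. Claude Code's real matcher may behave differently (for example around compound commands), so treat the counts as an estimate. The function names are my own.

```shell
#!/usr/bin/env bash
# Replay logged commands against a proposed policy and count each tier.
# ASSUMPTIONS: patterns are matched as plain shell globs against the whole
# command string, and deny is checked before allow. This approximates,
# and does not reproduce, Claude Code's own permission matching.

ALLOW=( 'git log*' 'git show*' 'git status*' 'git diff*'
        'gh pr *' 'gh api *' 'gh run view *' 'gh run list *' )
DENY=(  'rm *' 'git push*--force*' 'git reset --hard*' 'git clean*-f*'
        'git branch -D*' 'git stash drop*' 'git stash clear*' 'sudo *'
        'docker system prune*' 'docker compose down*--volumes*' )

matches_any() {
  # matches_any CMD PATTERN...  succeeds if CMD matches any glob pattern.
  local cmd=$1 pat; shift
  for pat in "$@"; do
    # An unquoted right-hand side of == inside [[ ]] is matched as a glob.
    if [[ $cmd == $pat ]]; then return 0; fi
  done
  return 1
}

classify() {
  if   matches_any "$1" "${DENY[@]}";  then echo deny
  elif matches_any "$1" "${ALLOW[@]}"; then echo allow
  else echo prompt
  fi
}

policy_summary() {
  # Reads one command per line on stdin; prints a count per tier.
  local cmd n_allow=0 n_prompt=0 n_deny=0
  while IFS= read -r cmd; do
    [ -n "$cmd" ] || continue
    case $(classify "$cmd") in
      allow) n_allow=$((n_allow + 1)) ;;
      deny)  n_deny=$((n_deny + 1)) ;;
      *)     n_prompt=$((n_prompt + 1)) ;;
    esac
  done
  printf 'allow=%d prompt=%d deny=%d\n' "$n_allow" "$n_prompt" "$n_deny"
}

# Usage (assumes Bash entries keep the command at .params.command):
#   jq -r 'select(.tool == "Bash") | .params.command // empty | split("\n")[0]' \
#     ~/.claude/tool-audit-*.jsonl | policy_summary
```

A deny count of zero matches what I saw in my own data. The allow count gives a rough sense of how many prompts the allow list would remove, and whatever is left under prompt is the gray area worth looking at by hand.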


What I'd Recommend

If you're using Claude Code (or any AI coding assistant with configurable permissions), here's the approach:

  1. Start with audit, not policy. Add the logging hooks and work normally for a few days. You need to see what the tool actually does before you can make good decisions about what to allow.
  2. Look at the commands, not just the tools. "Bash" is not a useful category. git log and rm -rf are both Bash commands. The audit data lets you see the actual commands and make granular decisions.
  3. Allow the read-heavy stuff. In my data, read-only operations (git log, git show, gh pr view, file reads) accounted for the vast majority of tool calls. Prompting on every one of these adds friction without adding safety.
  4. Keep write operations on a leash. Commits, pushes, PR creation, file writes — these were rare but consequential. The prompt-to-approve default is the right place for them.
  5. Hard-deny the destructive stuff. Force pushes, hard resets, recursive deletes, privilege escalation — put these on the deny list from day one. You don't need audit data to know these should never run unattended.
  6. Revisit periodically. Your workflow changes. New tools get added. Run the audit again in a month and see if your policy still fits.

The Meta Moment

There's something worth noting about this whole exercise: I used Claude Code to analyze its own audit logs. I asked it to summarize the JSONL files, break down tool usage by day and type, identify the top bash commands, and count sessions. It did all of that in about 30 seconds, chaining together jq, sort, uniq, and awk commands.

And every one of those commands got logged by the hooks, adding to the very dataset it was analyzing.

This is the feedback loop that makes the hook system valuable. You're not just monitoring — you're building institutional knowledge about how AI tools integrate into your workflow. And that knowledge lets you make better decisions about trust, permissions, and risk.


Getting Started

Add the hooks to your ~/.claude/settings.json, install jq if you don't have it, and work normally for a week. Then run:

cat ~/.claude/tool-audit-*.jsonl | jq -r '.tool' | sort | uniq -c | sort -rn

Or just ask Claude to summarize it for you. It's pretty good at that.


If you're using Claude Code or exploring AI-assisted development workflows, I'd love to hear how you're thinking about permissions and trust boundaries. The tooling is new, the patterns are still forming, and we're all figuring this out together.
