
Exploring Copilot CLI Session Management to Improve Squad

· 13 min read

I've been using Squad, an AI team framework built on top of Copilot CLI, and I kept wondering: Copilot CLI already tracks everything that happens in a session — could that data make Squad's agents smarter? I spent some time digging into how both systems manage session data, and I think there's an untapped opportunity.

This post is my investigation notes — what I found, how the two systems compare, and where I think they could be combined for more value.

My working theory: Copilot is your diary (what happened). Squad is your playbook (what to do about it). Right now they're like two lighthouses on opposite shores of Bellingham Bay — both useful, but no bridge between them.

Two lighthouses on opposite shores of Bellingham Bay, their beams not quite connecting

What I Found: Two Memory Systems

Copilot CLI: The Raw Record

Copilot CLI records every session — prompts, responses, tool calls, file changes, and checkpoints. I discovered it powers:

  • /resume — pick up where you left off in any previous session
  • /chronicle — generate standup reports, get personalized tips, improve your custom instructions
  • /session — view and manage your sessions directly from the CLI

Session data lives in ~/.copilot/session-state/ as files and in ~/.copilot/session-store.db as a structured SQLite database.

What Copilot remembers: Everything that happened in every session — the full transcript.

What it doesn't do: Extract meaning. Copilot stores the raw conversation, not the conclusions you drew from it.

Squad: The Distilled Knowledge

Squad's memory is different — and this is where I see the gap. It's not a transcript — it's distilled knowledge, stored as markdown files in your repo:

| What | File | Purpose |
| --- | --- | --- |
| Team decisions | .squad/decisions.md | Shared brain — every agent reads this |
| Agent memory | .squad/agents/{name}/history.md | Personal learnings per agent |
| Skills | .copilot/skills/{name}/SKILL.md | Repeatable tasks with everything needed to execute |
| Session state | .squad/sessions/*.json | Resume data (gitignored by default) |
| Scribe logs | .squad/log/*.md | Session summaries (gitignored by default) |

What Squad remembers: Decisions, patterns, preferences, and skills — the things that should change how agents behave next time.

What it doesn't do: Record the full conversation. That's Copilot's job.

The Gap I See

The two systems complement each other, but right now they're completely disconnected — like looking across Deception Pass and seeing the other side but having no way to cross.

Water rushing through a rocky gorge at Deception Pass — two cliff faces close together with no bridge between them

Here's where each system shines:

| Question | Where to look |
| --- | --- |
| "What did I do last Tuesday?" | Copilot — /session or /chronicle standup |
| "What did the team decide about auth?" | Squad — .squad/decisions.md |
| "Have I worked on this file before?" | Copilot — /session to browse past sessions |
| "How do we run a content audit?" | Squad — .copilot/skills/content-audit/SKILL.md |
| "What went wrong last time I tried this?" | Copilot — session transcript via /resume |
| "What does this agent know about TypeSpec?" | Squad — .squad/agents/{name}/history.md |

This separation works, but it's manual. You have to be the bridge — ferrying insights across the water yourself. That's the opportunity I'm investigating.

Where I Think Session Data Could Improve Squad

Squad has a built-in skill called reskill ("team, reskill") that audits agent charters and histories, extracts shared patterns into skills, and compresses bloated files. Think of it as sorting the morning catch on a Bellingham dock — keeping what's valuable, tossing the rest back.

Fisherman on a Bellingham dock sorting the morning catch into labeled crates

But reskill today is purely file-based — it reads .squad/ markdown and looks for textual duplication. It has no idea what actually happened in sessions.

Here's what I think session data could add:

| Signal from Copilot sessions | What Squad could do with it |
| --- | --- |
| Agent X was spawned 40 times but only useful 25 times | Refine charter to reduce misfires |
| Agent Y always gets the same 3 files as input | Bake those into charter's "What I Own" |
| Users keep correcting the same mistake | Extract as anti-pattern in a skill |
| An agent never gets spawned | Flag for removal during reskill |
| Two agents always get spawned together | Suggest merging or formalizing the pairing |
| Certain skills are read but never applied | Deprecate during reskill |
| Session durations spike after charter changes | Detect regressions from past reskills |
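A few of these signals don't even need the Copilot layer; they're detectable from Squad's own files today. As a taste, here's a rough shell sketch of the "never spawned" check, using the .squad/ directory layout from earlier. The loop is my illustration, not an existing reskill feature, and it builds stand-in demo files first:

```shell
# Sketch: flag agents that never appear in the orchestration log.
# Creates a stand-in .squad/ layout for the demo; point it at a real repo instead.
mkdir -p .squad/agents/scout .squad/agents/scribe .squad/orchestration-log
echo "spawned scout for triage" > .squad/orchestration-log/2025-01-01-scout.md  # stand-in log entry

for dir in .squad/agents/*/; do
  agent=$(basename "$dir")
  # if no orchestration-log entry ever mentions this agent, surface it
  if ! grep -rq "$agent" .squad/orchestration-log/; then
    echo "never spawned: $agent"   # candidate for removal during reskill
  fi
done
```

With the stand-in data above, only "never spawned: scribe" is printed, because scout appears in the log.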

There are two existing proposals in the Squad repo that go in this direction — tiered memory (#600, open) for hot/cold/wiki context layers, and reflect (#621, closed PR — not merged) for in-session learning capture. Neither one references Copilot CLI session data though. They're both Squad-internal. The bridge between Copilot's behavioral data and Squad's knowledge system doesn't exist yet.

Ideas I'm Exploring

The theme here is a feedback loop — raw session data flows downstream, gets refined into knowledge, and that knowledge shapes the next session. Like the Nooksack River circling back toward the mountains that feed it.

The Nooksack River looping back toward Mount Baker, papers transforming into books at the bend

1. Feed /chronicle into Reskill

After a productive session, Squad agents already extract the important parts:

  • Decisions go to .squad/decisions.md
  • Learnings go to agents/{name}/history.md
  • Reusable patterns become skills

But what if reskill could also query Copilot's session store to find patterns agents missed? /chronicle improve already analyzes session history to suggest custom instruction improvements. That same analysis could feed into Squad's skill extraction pipeline — Copilot finds the behavioral pattern, Squad encodes it permanently.

2. Use /chronicle for Behavioral Analysis

Copilot's /chronicle improve analyzes session history to find where agents struggled or needed correction. I'm thinking about how to make this a systematic input to Squad:

  • Run /chronicle improve periodically
  • Take the suggestions and apply them to agent charters or team directives
  • This creates a feedback loop: Copilot finds the pattern, Squad encodes it permanently

Today this is manual. I'd love to see a squad reskill --from-chronicle that automates the loop.

3. Use /session for Context

When starting work on something you've touched before, use /session to browse previous sessions and find relevant context:

"Before starting, check /session for any previous sessions 
that touched these files. Summarize what was done and any issues."

This gives agents a head start without you having to remember and re-explain.

4. Use Squad for Cross-Agent Memory

Copilot's session history is per-user. Squad's memory is per-team. When Agent A discovers something that Agent B needs to know, Squad's shared files make that happen:

  • Scribe writes cross-agent updates to affected agents' history.md
  • Decisions in decisions.md are read by every agent at spawn time
  • Skills are shared — any agent can use any skill

The Gitignore Decision

Squad gitignores session-related files by default. Here's what that means and when to change it:

| File | Default | Change when |
| --- | --- | --- |
| .squad/sessions/ | Gitignored | Commit if you need session transcripts in git (training repos, research) |
| .squad/log/ | Gitignored | Commit if you want Scribe's summaries as an audit trail |
| .squad/orchestration-log/ | Gitignored | Commit if you want agent routing history preserved |
| .squad/decisions.md | Committed | Never gitignore — this is the team's shared brain |
| .squad/agents/*/history.md | Committed | Never gitignore — this is each agent's knowledge |
| .copilot/skills/ | Committed | Never gitignore — these are your reusable patterns |

The recommended hybrid: Keep sessions gitignored, but commit Scribe's logs for a lightweight audit trail. Remove .squad/log/ and .squad/orchestration-log/ from .gitignore to enable this.
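Concretely, the hybrid is just a .gitignore edit. A minimal sketch, assuming the default .gitignore lists each path on its own line (the demo writes a stand-in file; in a real repo, edit yours by hand or adapt the grep):

```shell
# Stand-in default .gitignore for the demo -- do NOT overwrite a real one like this.
printf '%s\n' '.squad/sessions/' '.squad/log/' '.squad/orchestration-log/' > .gitignore

# Drop the two log entries so Scribe's summaries and routing history get committed.
grep -v -e '^\.squad/log/$' -e '^\.squad/orchestration-log/$' .gitignore > .gitignore.tmp
mv .gitignore.tmp .gitignore

cat .gitignore   # only .squad/sessions/ should remain ignored
```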

⚠️ One caveat: If your org requires audit trails of AI interactions, git probably isn't the right system of record — no retention policies, no redaction, no legal hold. Worth checking before treating committed sessions as a compliance solution.

Under the Hood (Skip Unless Debugging)

Copilot CLI stores session data in two places: file-based events in ~/.copilot/session-state/{session-id}/events.jsonl and a searchable SQLite database at ~/.copilot/session-store.db. The database powers /chronicle and /session, both of which require experimental mode: set "experimental": true in ~/.copilot/config.json, or run /experimental on in any session.

Each session folder contains the event stream (every tool call, message, and model metric), workspace metadata, and checkpoint snapshots that /resume uses to reconstruct context. The session.shutdown event in events.jsonl is worth finding — it shows your token usage, cache hit rates, and code changes in one place.
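If you want to pull that shutdown record out yourself, a plain grep will do. A small self-contained sketch: the events.jsonl path is real, but the JSON fields below are stand-ins rather than the actual event schema:

```shell
# Find the session.shutdown event in a session's event stream.
events=events.jsonl   # normally ~/.copilot/session-state/<session-id>/events.jsonl

# Stand-in data so the sketch runs anywhere; field names are illustrative only.
printf '%s\n' \
  '{"type":"message","role":"user"}' \
  '{"type":"session.shutdown","tokens":12345,"cache_hit_rate":0.8}' > "$events"

grep 'session.shutdown' "$events"   # prints the one-line shutdown summary
```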

The SQLite database (~59 MB after ~770 sessions in my case) holds structured records across several tables, including sessions, turns, checkpoints, session_files, and session_refs, plus an FTS5 search index. Records persist even after session directories are cleaned up. Don't delete the .db-wal file while Copilot is running; you'll lose recent writes.
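Because it's ordinary SQLite, the sqlite3 CLI can query it directly. The sketch below builds a toy copy first, since the column names here are my guesses rather than the documented schema:

```shell
# Toy stand-in for ~/.copilot/session-store.db; columns are assumptions, not the real schema.
db=toy-session-store.db
sqlite3 "$db" "CREATE TABLE sessions (id TEXT, started_at TEXT);"
sqlite3 "$db" "INSERT INTO sessions VALUES ('abc123', '2025-01-01T09:00:00Z');"

# The kind of query you'd run for "my five most recent sessions".
sqlite3 "$db" "SELECT id, started_at FROM sessions ORDER BY started_at DESC LIMIT 5;"
```

Against the real ~/.copilot/session-store.db, start with the .tables and .schema meta-commands to see the actual layout before writing queries.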

What's in the .squad Session Files

If you're using Squad to orchestrate AI agents, there's a parallel session storage layer inside .squad/ in your repo. While .copilot/ tracks platform sessions, .squad/ accumulates session-by-session team memory.

Session-Scoped Files (Created Per Session)

These files can be traced back to a specific session:

| File | What It Contains |
| --- | --- |
| orchestration-log/{timestamp}-{agent}.md | Who was spawned, why, what they did. Append-only audit trail. |
| log/{timestamp}-{topic}.md | Scribe's session summary. |
| decisions/inbox/{agent}-{slug}.md | Ephemeral drop-box — agents write decisions here during a session. Scribe merges them into decisions.md afterward. |
| identity/now.md | Updated each session with current focus. Every agent reads this at spawn so they hit the ground running. |

Running-State Files (Modified Across Sessions)

These files accumulate changes but don't track which session changed them:

| File | How It Changes |
| --- | --- |
| agents/*/history.md | Grows each session as agents record learnings. Scribe summarizes when it exceeds ~15 KB. |
| agents/*/charter.md | Updated if an agent's role evolves. No session linkage. |
| skills/{name}/SKILL.md | Created or updated when agents discover reusable patterns. |
| decisions.md | The canonical decision ledger — grows each session, entries are dated. |
| team.md, routing.md | Updated when members join or leave. |
| casting/registry.json | New agent names registered here. Persistent. |

The distinction matters: session files are created per session and disposable. Running-state files are your team's accumulated intelligence — they compound over time.

The Two-Layer Model

Together, .copilot/ and .squad/ form a complete session memory system — like Whatcom County's geology, where buildings sit on the visible surface but the real water supply flows through the aquifer below.

Cross-section of Whatcom County geology — buildings on the surface, aquifer below, a well connecting them

| Layer | Location | Scope | What it tracks per session |
| --- | --- | --- | --- |
| Platform | ~/.copilot/ | Per-user, cross-project | Events, turns, tool calls, model metrics |
| Team | .squad/ (in repo) | Per-project, cross-session | Orchestration logs, agent memory, decisions, focus |

The platform layer is invisible infrastructure — you don't commit it, you query it. The team layer is committed to the repo — it travels with the code and survives across machines, sessions, and team members. Surface and aquifer, both feeding the same ecosystem.

Try This

Ready to explore your own session data? Here are three things you can do right now:

  1. Browse your sessions: Open ~/.copilot/session-state/ and look at the events.jsonl from your most recent session. Search for session.shutdown to see your token usage and cache hit rates.

  2. Query your history: In any Copilot CLI session, try /session to browse your past sessions. Use /resume to jump back into a previous session with full context.

  3. Feed Copilot into Squad: Run /chronicle improve and review the suggestions. Pick one that matches a recurring pattern and say: "Make that a skill" or "Add that to decisions."

If you're not using Squad yet, #1 and #2 still work — they're pure Copilot CLI. The session data is there whether you browse it or not.

If You're Building an Agent on Top of Copilot

This investigation was Squad-specific, but the underlying insight applies to anyone building on Copilot CLI: there's a lake of session data sitting right there in ~/.copilot/ — and most agents ignore it completely.

The good news is the plumbing already exists. The Copilot SDK (@github/copilot-sdk) exposes session listing, full event history, and real-time event subscriptions. You can filter sessions by repo or branch, pull every tool call and assistant response, and subscribe to events as they happen. The data access is there — what's missing is the intelligence layer on top.

Here are three ideas I keep coming back to — none of them Squad-specific:

1. Adaptive Prompt Tuning Based on Tool Failure Rates

I noticed in my own session data that certain tool calls fail repeatedly — grep with regex that doesn't match the codebase's naming conventions, for example. An agent could watch for these patterns and silently adjust its strategy — switching to glob patterns, broadening search terms, adding fallback chains — without me ever asking. Like a fishing guide who notices you keep casting into the wrong current and quietly repositions the boat.
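A back-of-envelope version of that detection is just counting failures per tool across an event stream. The field names here ("type", "name", "ok") are invented for the sketch; the real events.jsonl schema may differ:

```shell
# Rank tools by failure count across an event stream.
# Stand-in data so the sketch runs anywhere; field names are illustrative only.
printf '%s\n' \
  '{"type":"tool.call","name":"grep","ok":false}' \
  '{"type":"tool.call","name":"grep","ok":false}' \
  '{"type":"tool.call","name":"glob","ok":true}' > events.jsonl

# Pull failed calls, extract the tool name, count, and sort by frequency.
grep '"ok":false' events.jsonl | grep -o '"name":"[^"]*"' | sort | uniq -c | sort -rn
```

A repeatedly failing tool at the top of that list is exactly the cue for the agent to reposition the boat.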

2. Cross-Session Onboarding for New Repos

When I open a new repository for the first time, the agent has zero context about how I work. But my session history from other repos is right there — it shows whether I prefer TypeScript or JavaScript, whether I write tests first, which frameworks I reach for. An agent could mine that cross-project session data to bootstrap a developer profile, skipping the cold-start problem. First day in a new codebase, but the agent already knows your habits.

3. Drift Detection Between Intent and Outcome

Session data captures both what I asked for and what the agent actually did — tool calls, file edits, test results. Over time, an agent could spot drift: I keep correcting the same kind of CSS suggestion, or certain requests consistently take multiple follow-up turns. Imagine the agent saying, "You frequently adjust my styling — want me to follow a specific style guide?" That turns passive logs into active self-improvement.

The common thread: session data isn't just a transcript — it's telemetry. Any agent that treats it as a feedback signal rather than a static log has a real advantage, like reading the tides instead of just watching the water.

The Bottom Line

Use both memory systems intentionally:

  • Copilot handles the raw history. Let it. Don't try to replicate session transcripts in Squad files.
  • Squad handles the distilled knowledge. Invest here — decisions, history, and skills are what compound.
  • Feed insights from Copilot back into Squad via /chronicle improve, directives, and skill creation.
  • Start with the default gitignore. The valuable stuff is already being committed. Relax later if you need session trails.

Your agents get smarter not because they remember every conversation, but because the important conclusions persist in the right place. The river keeps flowing — what matters is what settles into the riverbed.

GitHub Account Cleanup: Audit, Archive & Remove Stale Repos

· 5 min read

My end-of-year project is a GitHub account repository-cleanup tool that provides safe, repeatable auditing and cleanup for my GitHub accounts. I also wanted a catalog of my active repos. This repo focuses on repository-level cleanup (archive/delete/catalog), but the same audit run can help you discover candidates for cloud-resource reclamation and CI/workflow maintenance.

When to use this project

  • Periodic account maintenance (end-of-year or scheduled audits).
  • Before publishing a portfolio or transferring repositories.
  • When you want a reproducible audit with a dry-run-first approach.

Functionality included in GitHub account cleanup

My TypeScript project cleans up my account in the following way:

  • Archive stale repositories,
  • Detect and delete empty repositories,
  • Remove forks,
  • Generate repo descriptions and topics with LLM and update repos with that info,
  • Generate a catalog of active repos for publishing.

High level architecture

  1. The primary entry point is scripts/run-all.sh.
  2. The GitHub workflows are optional.
  3. gh-cleanup calls github-rest and llm-completion.
  4. Outputs go to generated/.

The high-level architecture of the npm workspace monorepo:

  • packages/gh-cleanup — CLI commands and orchestration: categorization rules, scoring, reporting, and the runner that coordinates dry-run and apply flows.
  • packages/github-rest — GitHub REST helpers, typed endpoint wrappers, and shared network utilities.
  • packages/llm-completion — LLM/AI utilities: prompt helpers, request wrappers, retries, and response sanitization used by the describe step.
  • generated/ — Example outputs created by dry-run executions: catalog.md, active.json, descriptions.json, summaries, etc.
  • .github — Workflows and prompt files.
  • scripts — Top-level script to clean up the GitHub account, also used by the optional workflows.

Prerequisites

This repo can be opened in Codespaces or locally with .devcontainer/devcontainer.json. The development container has the full developer setup for this project, including Node.js. Once you have the repo open in that environment, create a GitHub token and an OpenAI key, and choose an LLM model. Set these values in the root-level .env.

  • A GitHub token in GH_TOKEN (a classic PAT with repo scope; delete_repo is only required for destructive operations such as deleting a repo).
  • An OpenAI key and an LLM model for generating repo descriptions and topics. I used gpt-4.1-mini from Azure OpenAI.
GH_TOKEN=
GH_USER=YOUR_GITHUB_USER_ACCOUNT
OPENAI_API_KEY=
OPENAI_ENDPOINT=https://RESOURCE-NAME.openai.azure.com/openai/deployments/MODEL_NAME/chat/completions?api-version=API_VERSION
OPENAI_MODEL=gpt-4.1-mini
OPENAI_TEMPERATURE=0.2
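Before a full run, a small pre-flight loop can catch unset values early. This check is my own addition, not part of the tool, and it reads a stand-in demo.env so it doesn't touch your real file:

```shell
# Stand-in env file for the demo; in practice, source your real .env instead.
cat > demo.env <<'EOF'
GH_TOKEN=xxx
GH_USER=someone
OPENAI_API_KEY=
EOF

set -a; . ./demo.env; set +a   # export everything the file defines

# Warn about any variable that is still empty before running the tool.
for v in GH_TOKEN GH_USER OPENAI_API_KEY; do
  eval "val=\$$v"
  [ -n "$val" ] || echo "missing: $v"
done
```

With the stand-in file above, the loop prints "missing: OPENAI_API_KEY".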

Install and build dependencies

The root package.json uses npm workspaces to control and access all the packages in ./packages. Use the root package.json scripts to install and build the tool.

npm install
npm run build

Try out the tool

One-line run examples

  • Remove forks (dry-run):

    npm run start -w gh-cleanup -- remove-forks
  • Archive stale repos older than a year (dry-run):

    npm run start -w gh-cleanup -- archive-stale-repos
  • Delete empty repos, no PRs, no commits, size is 0 KB (dry-run):

    npm run start -w gh-cleanup -- delete-empty-repos
  • Categorize repos (fetch languages + README, output Markdown):

    npm run start -w gh-cleanup -- categorize-repos --fetch --output=md --out=generated/catalog.md
  • Summary (write generated/summary.md):

    This creates the final active list of repos in a markdown table. I'll use this in my dfberry.github.io website to find projects with specific code, configuration, or CI.

    npm run start -w gh-cleanup -- summary --summary-out=generated/summary.md

Full run

Once you understand what the tool does, use ./scripts/run-all.sh to clean up your GitHub account. I used my personal dfberry account for testing while building. Now that it's complete, I'll use it on my diberry work account for Microsoft.

npm run run-all:apply

Apply mode means empty repos are actually deleted and active repos are updated with new descriptions and topics.

AI-assisted development

I use AI tools (Copilot in Ask, Plan, and Agent modes) to speed development while keeping human oversight. I commit frequently and review AI suggestions line by line. I choose models and session types (local, background, cloud) deliberately to control cost and behavior, and I never let AI run unattended on critical changes; dry runs and manual tests come first. I scaffold the repo, add features incrementally, and document decisions so I can return later. Copilot helps with comments and diagrams, but I always review and adjust its output.

Next steps

Cleanup goes beyond removing unused repositories. When tidying a personal or org GitHub account you may also:

  • Reclaim unused cloud resources referenced by projects (e.g., old deployments, test clusters, or storage buckets).
  • Remove or archive unused repositories that are forks, abandoned, or no longer relevant.
  • Find and fix failing or stale GitHub Actions workflows (update action versions or workflow syntax) or remove workflows that are no longer useful.
  • Update CI matrices and runtimes (programming language versions, OS matrix entries) to reduce CI cost and avoid testing very-old combinations.
  • Bump pinned GitHub Action versions and dependencies to address deprecations and security fixes.

Deploy an Azure Functions app from a monorepo with a GitHub Action for Node.js

· 4 min read

Azure Functions apps can be deployed locally from Visual Studio Code using the Azure Functions extension, or you can configure deployment when you create the resource in the Azure portal. Both approaches are straightforward when your app is the only thing in the repo, but they become a little more challenging in monorepos.

Single versus monorepo repositories

When you have a single function app in a repo, the Azure Functions app is built and run from the root-level package.json, which is where hosting platforms look for those files.

- package.json
- package-lock.json
- src
  - functions
    - hello-world.js

In a monorepo, all of these files are pushed down a level or two, and there may or may not be a root-level package.json.

- package.json
- packages
  - products
    - package.json
    - package-lock.json
    - src
      - functions
        - product.js
  - sales
    - package.json
    - package-lock.json
    - src
      - functions
        - sales.js

If there is a root-level package.json, it may control developer tooling across all packages. While you can deploy the entire repo to a hosting platform and configure which package is launched, this isn't necessary and may lead to problems.

Monorepo repositories as a single source of truth

Monorepo repositories let you collect all of your source code, or at least all source code for a project, into a single place. This is ideal for microservices or full-stack apps. Operationalizing this type of repository efficiently does require an extra layer of team education and repository management.

When starting a monorepo, you need to select a workspace-management tool. I use npm workspaces, but others exist. npm workspaces requires a root-level package.json that lists the packages (source code projects).

The syntax for npm workspaces allows you to select what is a package as well as what is not a package.

snippets/2024-04-07-functions-monorepo/package-workspaces.json

Azure Functions apps with Visual Studio Code

When you create a Functions app in Visual Studio Code with the Azure Functions extension, you can choose whether it's created at the root or in a package. As part of that creation process, a .vscode folder is created with files to help find and debug the app.

  • extensions.json: all Visual Studio Code extensions
  • launch.json: debug
  • settings.json: settings for extensions
  • tasks.json: tasks for launch.json

The settings.json includes the azureFunctions.deploySubpath and azureFunctions.projectSubpath properties, which tell the extension where to find the source code. For a monorepo, the correct value for these settings may depend on the version of the extension you use.

As of March 2024, setting the exact path has worked for me, such as packages/sales/.

If you don't set the correct path for these values, the correct package may not be used with the extension or the hosting platform won't find the correct package.json to launch the Node.js Functions app.

  • During development: set the azureFunctions.projectSubpath to the single package path you are developing.
  • During deployment: set the azureFunctions.deploySubpath to the single package path so the hosting platform has the correct path to launch the app.

GitHub actions workflow file for Azure Functions monorepo app

When you create an Azure Functions app in the Azure portal and configure the deployment, the default (and non-editable) workflow file assumes the app's package.json is at the root of the repository.


snippets/2024-04-07-functions-monorepo/single-app-workflow.yml

This workflow sets AZURE_FUNCTIONAPP_PACKAGE_PATH to the root of the project, then pushes into that path to build with pushd './${{ env.AZURE_FUNCTIONAPP_PACKAGE_PATH }}'. The zip step, zip release.zip ./* -r, packages up everything with the root as the base. To use this in a monorepo, these steps need to be altered.

  1. Change the name of the workflow to indicate the package and project.

    name: Build & deploy Azure Function - sales
  2. Create a new global env parameter that sets the package location for the subdirectory source code.

    PACKAGE_PATH: 'packages/sales' 
  3. Change the Resolve Project Dependencies Using Npm to include the new environment variable.

     pushd './${{ env.AZURE_FUNCTIONAPP_PACKAGE_PATH }}/${{ env.PACKAGE_PATH }}'

     The pushd command moves the context into the sales subdirectory.

  4. Change the Zip artifact for deployment to use pushd and popd and to include the new environment variable. The popd command returns the context to the root of the project.

     Because zip now runs inside the package directory, change the location of the generated zip file so it is written to the root directory.

    The result is that the zip file's file structure looks like:

     - package.json
     - src
       - functions
         - sales.js

  5. The final workflow file for a monorepo repository with an Azure functions package is:

snippets/2024-04-07-functions-monorepo/mono-app-workflow.yml