Evidence-first workflow intelligence scaffolding

Create a data moat automatically from the work a business already repeats.

automoat is broader than permits. It is a local-first system for finding recurring operational work, capturing the records, decisions, corrections, and outcomes it produces, and turning that stream into a structured dataset a company can reuse. Permit and inspection data is just the first MVP wedge because it is concrete, local, messy, and easy to evaluate.

Current State

Broad product, permit-data MVP as of May 27, 2026

The product vision is automatic data-moat creation for repeated business workflows. The current repo is still in MVP scaffolding mode, so the proof wedge is intentionally narrow: Dallas residential permit and inspection records. What exists today is an evidence-bearing artifact pipeline with cross-sample checks, not a production app.

2: Dallas intake variants with matching generated discovery runs
3: Normalized Dallas dataset paths (one synthetic scaffold plus imported CSV samples v1 and v2)
13/13: Contract checks passing in generated/contracts/dallas-electrician-contract-summary-v1/
1083 / 536: Generated eval tasks and reviewed label rows in imported v2
  • MVP wedge: Dallas, Texas permit and inspection data, currently narrowed to residential electrical records.
  • What works: deterministic generation of discovery outputs, normalized rows, fixture packs, reviewed labels, eval scaffolds, contract checks, edge-case coverage, and a generated inspection action queue from repo-local inputs.
  • Most current proof: generated/contracts/dallas-electrician-contract-summary-v1/ passes 13/13 checks across one synthetic and two imported Dallas scaffolds.
  • Current gap: the action queue is still a generated artifact, not a browser workflow or production app; the next import-readiness gap is moving from fixture-backed proof to real Dallas records.
  • Operational note: the unattended loop has produced real artifacts, but the repo still needs a cleaner fail-closed supervisor path when child sessions break.
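
The fail-closed behavior called for in the operational note can be sketched as follows. This is a minimal illustration, not the repo's supervisor: `run_child` and `run_iteration` are hypothetical names, and the child commands are placeholders for whatever the loop actually launches.

```python
import subprocess
import sys
from typing import Sequence


def run_child(cmd: Sequence[str]) -> int:
    """Run one child session and return its exit code unchanged."""
    return subprocess.run(list(cmd)).returncode


def run_iteration(child_cmds: Sequence[Sequence[str]]) -> int:
    """Fail closed: stop at the first failing child and return its exit code.

    Returning 0 only when every child succeeded keeps an outer loop from
    treating a broken iteration as a healthy one (hypothetical design).
    """
    for cmd in child_cmds:
        code = run_child(cmd)
        if code != 0:
            return code
    return 0


# Placeholder children: the second one fails, so the third never runs.
example_children = [
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "pass"],
]
```

An outer day loop that calls `run_iteration(example_children)` would see a non-zero code and could stop instead of spinning.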
Moat Candidate

What could become defensible

The moat is the automatic capture of proprietary operating context: the messy records, human decisions, corrections, edge cases, and outcomes that accumulate while work gets done. The permit MVP is just one place to prove the loop before applying it to other recurring business workflows.

1. Workflow-to-dataset capture

Automoat should watch repeated work turn into structured records: inputs, decisions, approvals, exceptions, follow-ups, and outcomes.

2. Human correction memory

Every accepted, rejected, or edited recommendation becomes a label that generic models and public datasets do not have.

3. Reusable eval contracts

Each new domain should prove the same thing: the captured data improves predictions, routing, recommendations, or automation compared with a generic baseline.

4. Compounding operational memory

The more workflows run through the system, the more proprietary examples, edge cases, and outcome-linked playbooks the business owns.
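
The eval-contract idea above (captured data must beat a generic baseline) can be sketched as a simple accuracy comparison. Baseline-versus-moat benchmarking is not built in this repo, so everything here is an assumption: the function names, the `min_lift` threshold, and the choice of accuracy as the metric are illustrative only.

```python
from typing import Dict, Sequence, Union


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match the reviewed labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def eval_contract(
    baseline: Sequence[str],
    moat: Sequence[str],
    labels: Sequence[str],
    min_lift: float = 0.0,
) -> Dict[str, Union[float, bool]]:
    """Pass only if moat-informed predictions beat the baseline by more than min_lift."""
    base_acc = accuracy(baseline, labels)
    moat_acc = accuracy(moat, labels)
    lift = moat_acc - base_acc
    return {
        "baseline_accuracy": base_acc,
        "moat_accuracy": moat_acc,
        "lift": lift,
        "passed": lift > min_lift,
    }
```

The same contract shape could be reused per domain by swapping the label set and the prediction sources while keeping the pass condition fixed.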

Agent Cockpit

A terminal tunnel to an autonomous Codex loop is the product surface

The app now has a real local cockpit path. Run python3 scripts/start_autonomous_cockpit_bridge.py to start a detached autonomous Codex loop, stream its terminal-style log, and watch each bounded iteration publish to main. To expose only a read-only tunnel, run python3 scripts/bridge_mvp_cockpit.py; someone else can then watch the local Codex loop without getting start/stop controls.

codex-loop tunnel
$ automoat loop --goal "create data moat from repeated workflow"
$ python3 scripts/start_autonomous_cockpit_bridge.py
$ python3 scripts/bridge_mvp_cockpit.py
read .pixelbox/handoff.md
inspect generated records, labels, evals, and action queue
codex exec: make one bounded repo improvement
verify, sync, commit, and push to main
sleep until the next autonomous iteration
stream every step to the remote cockpit

Loop runner

Each autonomous Codex iteration is now explicit: a bounded prompt, changed files, checks, a commit, a push, and a recorded next step.

Live terminal stream

The cockpit shows the PTY/log stream so operators can see the agent working instead of trusting a black box.

Artifact feed

Diffs, generated datasets, eval reports, screenshots, decisions, and handoffs become structured product state.

Moat memory

Every prompt, correction, approval, failed check, and accepted recommendation becomes future training and eval material.
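
A minimal sketch of how an operator decision could become a label row, assuming three decision types (accepted, rejected, edited). The `CorrectionLabel` record and its field names are hypothetical and are not the repo's actual label schema.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed decision vocabulary; the real label files may use different values.
VALID_DECISIONS = {"accepted", "rejected", "edited"}


@dataclass(frozen=True)
class CorrectionLabel:
    task_id: str
    recommended_action: str
    decision: str
    final_action: Optional[str]  # None when the recommendation was rejected


def label_from_correction(
    task_id: str,
    recommended_action: str,
    decision: str,
    edited_action: Optional[str] = None,
) -> CorrectionLabel:
    """Turn one operator decision into a reusable supervision row."""
    if decision not in VALID_DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    if decision == "accepted":
        final_action: Optional[str] = recommended_action
    elif decision == "edited":
        if not edited_action:
            raise ValueError("an edited decision needs the edited action")
        final_action = edited_action
    else:
        final_action = None
    return CorrectionLabel(task_id, recommended_action, decision, final_action)
```

Keeping both the original recommendation and the final action in one row is what lets later evals compare what the system suggested against what the operator actually did.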

Live Remote Bridge

Embedded read-only loop feed

This panel fetches the local Codex loop through the read-only bridge as data, not as an iframe, so the landing page can render status, import readiness, contract checks, queue state, and logs inline.
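
To illustrate the "as data, not as an iframe" approach, here is a sketch that renders a status payload into inline text lines. The payload shape is an assumption: the keys `iteration`, `loop`, `contract_checks`, `import_readiness`, and `queue_items` are hypothetical and may not match what the bridge actually returns.

```python
import json
from typing import Any, Dict, List


def render_status(payload: Dict[str, Any]) -> List[str]:
    """Render a (hypothetical) bridge status payload as inline text lines.

    Missing fields fall back to "..." so the page can render before data arrives.
    """
    checks = payload.get("contract_checks") or {}
    if "passed" in checks and "total" in checks:
        checks_text = f"{checks['passed']}/{checks['total']}"
    else:
        checks_text = "..."
    return [
        f"iteration: {payload.get('iteration', '...')}",
        f"loop: {payload.get('loop', '...')}",
        f"contract checks: {checks_text}",
        f"import readiness: {payload.get('import_readiness', '...')}",
        f"queue items: {payload.get('queue_items', '...')}",
    ]


# Example payload with made-up values, parsed the way fetched JSON would be.
sample_payload = json.loads(
    '{"iteration": 7, "loop": "running", '
    '"contract_checks": {"passed": 13, "total": 13}, "queue_items": 530}'
)
```

Because the page receives plain data, it can place each field wherever the layout needs it rather than embedding a second page.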

Artifact Inventory

What exists on disk today

The generated folder shows both sides of the current MVP: business-first discovery outputs and dataset-first eval scaffolding. Every claim below maps to a file in this repo.

Discovery Runs

Two generated Dallas business profiles exist: dallas-electrician-sample-v1 and dallas-electrician-south-dallas-v1. Each includes a business profile, workflow map, moat hypotheses, data-gap plan, eval opportunities, and a short operator summary.

Normalized Sample

Three Dallas normalized datasets now exist: a synthetic scaffold under generated/normalized/dallas-electrician-sample-v1/ plus imported CSV normalization runs under generated/normalized/dallas-electrician-import-sample-v1/ and generated/normalized/dallas-electrician-import-sample-v2/. The wider v2 import currently carries 530 permits, 1072 inspections, 1610 source-record lineage rows, 3 rule documents, and explicit coverage for pass, fail, partial, cancelled, not_ready, and unknown inspection outcomes.

Eval Scaffold

Three generated eval samples now exist. The synthetic scaffold has 14 tasks and 5 reviewed label rows. The imported v1 scaffold has 18 tasks and 7 reviewed label rows, while imported v2 expands to 1083 tasks and 536 reviewed label rows across next_inspection_outcome, failure_reason_classification, recommended_next_action, and pattern_extraction.

Deterministic Writers

The repo has thin local writers for discovery generation, fixture-pack generation, row-derived label reviews, eval generation, and raw CSV normalization. Current scripts are generate_dallas_discovery_artifacts.py, import_dallas_permit_extracts.py, generate_dallas_fixture_pack.py, generate_dallas_label_reviews.py, and generate_dallas_eval_artifacts.py.

Contract Summary

generated/contracts/dallas-electrician-contract-summary-v1/ now compares the synthetic scaffold to imported v1 and v2, confirms 13 passing contract checks, includes optional imported rule_documents.jsonl coverage, keeps the four eval task families stable, and now checks repeated result-state, failure-reason, pattern-slice, and next-action support explicitly.

Edge-Case Coverage

generated/coverage/dallas-electrician-edge-case-coverage-v1/ now makes repeated support visible across result states, failure reasons, pattern slices, and next-action groups. Imported v2 has repeated support for 6/6 result states, 5/5 failure reasons, 5/5 pattern slices, and 6/6 next-action groups.

Inspection Workflow

generated/workflows/dallas-inspection-workflow-v1/ turns reviewed labels into a concrete browser-readable action queue. The current queue has 530 items, including 6 high-priority failed inspections and 524 medium-priority partial or not-ready inspections.
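
The priority split described for the inspection action queue (failed inspections high, partial or not-ready inspections medium) can be sketched like this. The row fields `permit_id` and `outcome` and the function name are assumptions for illustration, not the workflow generator's actual schema.

```python
from typing import Dict, Iterable, List

# Assumed mapping from inspection outcome to queue priority.
PRIORITY_BY_OUTCOME = {
    "fail": "high",
    "partial": "medium",
    "not_ready": "medium",
}


def build_queue(label_rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only rows that need action and order high-priority items first."""
    queue = []
    for row in label_rows:
        priority = PRIORITY_BY_OUTCOME.get(row["outcome"])
        if priority is None:
            continue  # other outcomes produce no queue item in this sketch
        queue.append(
            {
                "permit_id": row["permit_id"],
                "outcome": row["outcome"],
                "priority": priority,
            }
        )
    # False sorts before True, so "high" items come first, then by permit id.
    queue.sort(key=lambda item: (item["priority"] != "high", item["permit_id"]))
    return queue
```

Sorting by a fixed key keeps the output deterministic, which matters when the queue is regenerated from the same repo-local inputs.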
Latest Signal

Freshest generated evidence

Freshest non-page artifacts

The freshest generated data artifacts are generated/contracts/dallas-electrician-contract-summary-v1/summary.md and summary.json, written on May 27, 2026. They compare all three current Dallas scaffolds and confirm that the downstream contract still holds.

Coverage thresholds are now enforced

The contract summary now promotes the most important coverage expectations into checks: repeated current result states, repeated core failure reasons, repeated pattern slices, and repeated key next-action groups.
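
Promoting a coverage expectation into a check can be sketched as follows. This assumes "repeated" means at least two supporting examples per group; that threshold, the `CheckResult` type, and the function names are hypothetical rather than taken from scripts/generate_dallas_contract_summary.py.

```python
from typing import Dict, List, NamedTuple


class CheckResult(NamedTuple):
    name: str
    passed: bool
    detail: str


def repeated_support_check(
    name: str, counts: Dict[str, int], minimum: int = 2
) -> CheckResult:
    """Pass only if every group has at least `minimum` supporting examples."""
    thin = sorted(group for group, n in counts.items() if n < minimum)
    detail = "ok" if not thin else "thin: " + ", ".join(thin)
    return CheckResult(name, not thin, detail)


def summarize(results: List[CheckResult]) -> str:
    """Format results in the 'passed/total' style used for contract summaries."""
    passed = sum(1 for r in results if r.passed)
    return f"{passed}/{len(results)} checks passing"
```

Turning each expectation into a named pass/fail result means a thin slice fails the contract visibly instead of only showing up in a report.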

Imported sample v2 is still the data frontier

The widest normalized dataset remains dallas-electrician-import-sample-v2: 530 permits, 1072 inspections, 3 rule documents, 1610 source-record lineage rows, 1083 eval tasks, 536 reviewed label rows, and coverage for cancelled, fail, not_ready, partial, pass, and unknown.
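
As a sketch of how raw CSV result strings might be mapped onto the six outcomes listed above, assuming a `result` column and a small alias table. The aliases and the column name are illustrative guesses, not the importer's actual rules.

```python
import csv
import io
from collections import Counter

# Assumed alias table; keys are lowercased with "-" and "_" turned into spaces.
ALIASES = {
    "pass": "pass",
    "passed": "pass",
    "fail": "fail",
    "failed": "fail",
    "partial": "partial",
    "cancelled": "cancelled",
    "canceled": "cancelled",
    "not ready": "not_ready",
    "unknown": "unknown",
}


def normalize_outcome(raw: str) -> str:
    """Map a raw result string to one canonical outcome, defaulting to 'unknown'."""
    key = (raw or "").strip().lower().replace("-", " ").replace("_", " ")
    key = " ".join(key.split())  # collapse repeated whitespace
    return ALIASES.get(key, "unknown")


def outcome_coverage(csv_text: str, column: str = "result") -> Counter:
    """Count normalized outcomes in a CSV extract (the column name is hypothetical)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(normalize_outcome(row.get(column) or "") for row in reader)
```

Defaulting unrecognized values to `unknown` keeps the outcome set closed, so downstream eval tasks never see a seventh state.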

Discovery breadth is intentionally small

Business-first discovery currently has two generated variants: dallas-electrician-sample-v1 and dallas-electrician-south-dallas-v1. That proves the contract can vary by profile, but it is still narrow by design.

Coverage now closes the thin spots

The edge-case coverage report now shows repeated latest-import support for every current result state, failure reason, pattern slice, and next-action group, including complete_remaining_work|schedule_reinspection.
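
The "k/n repeated support" figures can be computed with a simple counter. This sketch assumes a group counts as repeated once it has at least two examples; the threshold and function name are assumptions, and the sample values are made up.

```python
from collections import Counter
from typing import Iterable, Tuple


def repeated_support(
    observed: Iterable[str], expected_groups: Iterable[str], minimum: int = 2
) -> Tuple[int, int]:
    """Return (groups with repeated support, total expected groups)."""
    counts = Counter(observed)
    groups = list(expected_groups)
    repeated = sum(1 for group in groups if counts[group] >= minimum)
    return repeated, len(groups)


# Made-up observations: one group seen twice, one seen once.
observed_actions = [
    "complete_remaining_work|schedule_reinspection",
    "complete_remaining_work|schedule_reinspection",
    "ensure_site_access|schedule_reinspection",
]
expected_actions = [
    "complete_remaining_work|schedule_reinspection",
    "ensure_site_access|schedule_reinspection",
]
```

With these inputs, `repeated_support(observed_actions, expected_actions)` returns `(1, 2)`, which a report would print as 1/2.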

There is now a product-shaped output

generated/workflows/dallas-inspection-workflow-v1/index.html shows the first operator-facing workflow: permit, address, contractor, inspection failure context, recommended actions, observed follow-up, and captured operator correction state for 530/530 queue items.

Build Changelog

Current build log

2026-05-27

Widened imported v2 with ELZ-2026-0731, one more Dallas electrical repair sequence that repeats the incomplete-work repair path, captured the matching accepted operator correction, and regenerated the normalized, fixture, eval, coverage, contract, and workflow artifacts. The latest scaffold now has 530 permits, 1072 inspections, 1083 eval tasks, 536 reviewed label rows, 1610 source lineage rows, and repeated support for 6/6 next-action groups.

2026-05-17

Added scripts/generate_dallas_inspection_workflow.py and produced generated/workflows/dallas-inspection-workflow-v1/. The workflow turns reviewed Dallas inspection labels into a 13-item action queue with priority, address, contractor, trigger inspection, recommended actions, and observed follow-up fields, plus a static index.html page that can be opened locally.

2026-05-17

Promoted the main edge-case coverage expectations into scripts/generate_dallas_contract_summary.py. The generated contract summary now passes 13/13 checks, including repeated current result-state support, repeated core failure-reason support, and repeated key next-action support for the latest imported Dallas scaffold.

2026-05-17

Added scripts/generate_dallas_edge_case_coverage.py and produced generated/coverage/dallas-electrician-edge-case-coverage-v1/. The report makes repeated support visible across result states, failure reasons, pattern slices, and next-action groups; latest imported v2 now shows repeated support for 6/6 result states, 5/5 failure reasons, 5/5 pattern slices, and 6/6 next-action groups after the May 27 fixture-widening pass.

2026-05-17

Widened imported v2 with three more Dallas electrician permit sequences: a repeated cancelled/unknown remodel-final path and two repeated service-release failures around panel and disconnect corrections. Regenerated the normalized, fixture, eval, and contract-summary artifacts. Imported v2 now carries 13 permits, 38 inspections, 49 eval tasks, 19 reviewed label rows, 5 repeated pattern slices, and 5 repeated next-action groups while the shared contract remains 10/10.

2026-05-10

Widened imported v2 again with one repeated service-release sequence and one repeated access-blocked final sequence, tightened the importer and next-action hinting so access labels no longer come from accidental panel schedule matches, and regenerated the normalized, fixture, eval, and contract-summary artifacts. Imported v2 now carries 10 permits, 28 inspections, 37 eval tasks, 15 reviewed label rows, and a real repeated ensure_site_access|schedule_reinspection next-action group.

2026-04-26

Widened imported v2 with three more Dallas electrician permit sequences, regenerated the normalized, fixture, eval, and contract-summary artifacts, and pushed recurring remodel, new-install, and repair pattern slices to 2-permit support each. The shared contract now passes 10/10 checks and makes repeated support explicit.

2026-04-26

Refreshed generated/landing.html against the actual April 26 artifact set: the page now leads with the contract summary, keeps the product framing broad, corrects the imported v2 lineage and rule-document counts, and points the next-step language at the real remaining gap instead of inventing broader progress.

2026-04-26

Extended scripts/import_dallas_permit_extracts.py so imported Dallas samples can optionally normalize rule_documents.csv into rule_documents.jsonl plus matching source-lineage rows. Added Dallas electrical rule fixtures to imported v1 and v2, then regenerated the contract summary so the rules path is now checked explicitly.

2026-04-26

The unattended loop exposed a failure-propagation bug in the supervisor path: failed child sessions were breaking inner work but still returning success to the day loop, so the next runtime hardening step is to make that path fail closed instead of spinning.

2026-04-26

Added scripts/generate_dallas_contract_summary.py and produced generated/contracts/dallas-electrician-contract-summary-v1/, making the shared synthetic-versus-imported Dallas contract explicit. The imported rule-document path has since broadened that contract; the next gap is repeated support for the remaining service-release and access-heavy edge cases.

2026-04-26

Refreshed generated/landing.html again so it reflects the current repo state after the contract summary landed: three normalized dataset paths, two discovery variants, three eval scaffolds, exact task and reviewed-label counts, and the real remaining normalization gap instead of stale pre-summary language.

2026-04-26

Added generated/raw/dallas-electrician-import-sample-v2/ and produced the matching -v2 normalized, fixture, and eval artifacts from it, widening the Dallas importer coverage to include pass, fail, partial, cancelled, not_ready, and unknown inspection outcomes while keeping the downstream contracts stable.

2026-04-26

Updated generated/landing.html to act as a truthful landing page and changelog for the repo's current state: broad product framing, exact Dallas artifact counts, and explicit statements about what is still only scaffolding.

2026-04-26

Added scripts/import_dallas_permit_extracts.py plus generated/raw/dallas-electrician-import-sample-v1/, produced generated/normalized/dallas-electrician-import-sample-v1/, and proved that the imported sample can flow through generated/fixtures/dallas-electrician-import-sequences-v1/ and generated/evals/dallas-electrician-import-sample-v1/ without changing the downstream Dallas contracts.

2026-04-26

Added batch discovery generation in scripts/generate_dallas_discovery_artifacts.py, created generated/intake/dallas-electrician-south-dallas-v1/intake.json, and generated a second Dallas discovery run focused on older-home South Dallas and Oak Cliff work.

2026-04-25

Finished the row-backed path for reviewed supervision by adding scripts/generate_dallas_label_reviews.py and wiring scripts/generate_dallas_eval_artifacts.py to emit generated/evals/dallas-electrician-sample-v1/label_reviews.json directly from normalized rows.

2026-04-25

Added generated/normalized/dallas-electrician-sample-v1/ plus scripts/generate_dallas_fixture_pack.py, so the reusable Dallas fixture pack is generated from row-shaped permit and inspection records instead of hand-maintained JSON sequences.

2026-04-25

Added deterministic discovery and eval writers, along with the first generated discovery and eval sample directories, so the repo could produce its key MVP artifacts from structured inputs rather than prose alone.

2026-04-25

Locked the first implementation wedge to Dallas residential electrical permits and inspections and wrote the supporting spec, schema, eval, and discovery contract docs.

Truth In Advertising

What is not built yet

Built

Broad product framing, a narrow permit-data proof wedge, artifact contracts, deterministic writers, two discovery variants, three normalized Dallas dataset paths, three eval scaffolds, and a shared unattended loop for keeping generated status surfaces current.

Not built

No automatic end-to-end moat builder yet, no production workflow runner, no live baseline-versus-moat benchmarking, no dynamic local app route, and no evidence yet that the permit-data wedge generalizes into a durable data moat.