Daily Agent — Roadmap

The Roadmap

Four phases, prototype → priced product

A phased path to react to, not commit to. Each phase carries one primary dependency — the thing that gates it. We design the shared context / hand-off layer from day one, because that's where the compounding value sits.

Phase 0

Concept

Clickable prototypes + cross-functional pressure-test on feasibility, data readiness and cost.

Dependency: done

✓ Complete

Phase 1

Daily risk briefing

Daily Agent only, operational persona, on the daily risks we already detect. Email + in-app delivery.

Dependency: per-container state + daily snapshots

Next up

Phase 2

Personalisation

Onboarding persona / ICP → prioritisation. Threshold inheritance from the Command Center.

Dependency: onboarding capture + ranking logic

Planned

Phase 3

Hand-offs

Escalation Agent drafts outreach; links to Route Planning & Planning agents with shared context.

Dependency: context layer + human-in-the-loop UX

Planned

The System

A small team of first-person agents

Each has one clear job, hands off to the others, and shares context. Not a chatbot, not a faceless alert engine — assistants that curate, draft and propose, with a human in the loop for anything consequential.

🗞️

Daily Agent (focus)

The analyst

Reviews the network overnight, curates what matters, authors the morning briefing.

✉️

Escalation Agent

The action-taker

On a flagged risk, drafts the outreach — carrier challenge, consignee/DC notice — for the user to review & send.

🧭

Route Planning Agent

The contingency planner

Searches & scores live sailings to reroute a disrupted shipment.

📈

Planning Agent

The analyst-on-demand

Carrier & lane reliability analytics, capacity / pre-booking strategy.

🧾

Freight Audit Agent

The auditor

Audits freight invoices, surfaces disputes & recoveries.

🤝

The differentiator

Interaction + context-sharing

A single agent is a feature. Agents that pass context to each other are a system.

The flagship chain. Daily Agent surfaces a risk in the morning brief → user clicks in → Escalation Agent drafts the stakeholder comms and/or hands the lane to the Route Planning Agent, which reroutes → before committing, Planning Agent validates the carrier's reliability on that lane. Each hand-off carries the context, so the user never re-explains. Nothing leaves the building without a Confirm.
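One way to picture the hand-off is a small, immutable context object that each agent extends and that cannot be acted on until the user confirms. This is a minimal sketch, not a design decision: HandoffContext, hand_off, confirm and execute are hypothetical names, and the real mechanism (shared memory, context object, tool-calling) is still an open question for AI / ML.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class HandoffContext:
    """Hypothetical context carried from agent to agent."""
    container_id: str
    lane: str
    risk_summary: str
    trail: Tuple[str, ...] = ()            # agents that have handled this so far
    proposed_action: Optional[str] = None  # latest draft or proposal
    confirmed: bool = False                # set only by an explicit user Confirm


def hand_off(ctx: HandoffContext, agent: str, proposed_action: str) -> HandoffContext:
    """Record the agent's proposal; any new proposal resets the confirmation."""
    return replace(
        ctx,
        trail=ctx.trail + (agent,),
        proposed_action=proposed_action,
        confirmed=False,
    )


def confirm(ctx: HandoffContext) -> HandoffContext:
    """Stand-in for the user pressing Confirm in the UI."""
    return replace(ctx, confirmed=True)


def execute(ctx: HandoffContext) -> str:
    """Refuse to act on anything the user has not confirmed."""
    if not ctx.confirmed:
        raise PermissionError("user has not confirmed this action")
    return f"sent: {ctx.proposed_action}"
```

Because the context is frozen, every hand-off produces a new object, which leaves an auditable trail of who proposed what. The container, lane and action values a caller passes in are illustrative only.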

Phase 1 · How it works

What the Daily Agent does each morning

For each user, on a nightly batch — scoped to the same lanes, BUs and containers the Command Center already enforces.

Scopes to the user's watched lanes / BUs / containers, including region & segment focus.

Pulls current state of every container — location, ETA, vessel, transshipment, D&D clock, last carrier-feed update.

Applies thresholds — the Low / Med / High bands the user already set in the Command Center.

Computes deltas vs. yesterday — what's NEW, what's continuing, what de-escalated.

Filters for consequence — drops non-actionable noise; an ETA that improves never appears.

Ranks — escalations first, then urgency × business impact, personalised to who the user is.

Writes a natural-language briefing + headline, with deep-links into the Command Center.

Summarises activity — containers in the water, arriving, departing, OTP splits & changes since yesterday.

Delivers — email + in-app "Today" card + bundled view matching the email exactly.
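The delta, filter and rank steps above can be sketched as follows. This is an illustration under stated assumptions, not the agreed ranking logic: the Risk record, the numeric severity scale, the impact weight and the rule that "escalations first" means items whose band rose since yesterday are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed numeric order for the Command Center bands; "none" = below threshold.
SEVERITY: Dict[str, int] = {"none": 0, "low": 1, "med": 2, "high": 3}


@dataclass(frozen=True)
class Risk:
    container_id: str
    kind: str        # e.g. "eta_delay", "dd_clock" (illustrative names)
    severity: str    # one of the SEVERITY keys
    impact: float    # hypothetical business-impact weight, 0..1


def classify(today: Risk, yesterday: Optional[Risk]) -> str:
    """Label the delta vs. yesterday's snapshot."""
    if yesterday is None:
        return "new"
    t, y = SEVERITY[today.severity], SEVERITY[yesterday.severity]
    if t > y:
        return "escalated"
    if t < y:
        return "de-escalated"
    return "continuing"


def build_brief(today: List[Risk], yesterday: List[Risk]) -> List[Tuple[str, Risk]]:
    """Return (status, risk) pairs, filtered and ranked for the briefing."""
    prev = {(r.container_id, r.kind): r for r in yesterday}
    items = []
    for r in today:
        if SEVERITY[r.severity] == 0:
            continue  # filter for consequence: below-threshold items never appear
        items.append((classify(r, prev.get((r.container_id, r.kind))), r))
    # Escalations first, then urgency x business impact, highest score first.
    items.sort(
        key=lambda it: (it[0] != "escalated", -(SEVERITY[it[1].severity] * it[1].impact))
    )
    return items
```

Keying yesterday's risks by (container, kind) is what makes the NEW / continuing / de-escalated labels cheap to compute, which is why the daily snapshot dependency gates Phase 1.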

Positioning

Daily Agent vs. standard alerting

Not a replacement — a layer. Real-time alerts still fire for genuine emergencies; the Daily Agent consumes the same thresholds but consolidates and prioritises everything else into one ranked brief instead of fifty pings.

Dimension	Standard alerting	Daily Agent
Trigger	A single rule / threshold breach, per card	A scheduled daily sweep across the user's whole scope
Output	One notification per alert setting	One prioritised, synthesised briefing
Granularity	Per-condition, in isolation	Cross-cutting — ranks & de-dupes across all conditions
Timing	Real-time / event-driven	Once-a-day digest, with NEW-since-yesterday
Reasoning	None — it's a rule	Natural language: explains why it matters & what changed
Answers	"Tell me the moment X happens"	"What deserves my attention today, and why?"

The Crux · Phase 1 gate

Is our data AI-ready?

A read-and-react agent needs more than dashboards built for humans: it needs structured, current, queryable, diff-able data. The green / yellow / red read below (Have / Validate / Gap) is the call DS + Eng owe the group; it decides whether v1 is "assemble what exists" or "build a data layer first."

Have: Command Center cards are effectively structured user intent (scope + thresholds), good raw material if stored per-user and queryable.

Have: We already compute ETAs, transshipment status and D&D clocks for the live product, plus risk / disruption / congestion data.

Validate: Is there a clean, current per-container state record an agent can read in one place, or is state assembled only at render time?

Validate: Are risk evaluations stored with history so we can diff vs. yesterday? If we only have "now," daily diffing needs a snapshot layer.

Validate: Freshness varies by carrier/lane. We know when we detected a change, not when the carrier acted, so the copy must never fabricate timestamps.

Gap: Do we expose a structured context / tool API the agent can call, or would we hand-assemble prompts from raw tables? A "context layer for agents" likely needs building.

Gap: Do we have ground-truth labels ("what actually mattered to this user") to evaluate and tune ranking?
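To make "diff-able" concrete, here is a minimal sketch of a snapshot layer: one row per container per day, and a query that returns what changed since yesterday. The table name, columns and functions are hypothetical, and SQLite is used only so the example is self-contained; it says nothing about where the real state would live.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one row per container per snapshot date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE container_snapshot (
               snapshot_date TEXT NOT NULL,
               container_id  TEXT NOT NULL,
               eta           TEXT,
               risk_band     TEXT,
               detected_at   TEXT,  -- when WE detected the change, not when the carrier acted
               PRIMARY KEY (snapshot_date, container_id)
           )"""
    )
    return conn


def save_snapshot(conn, snapshot_date, rows):
    """rows: iterable of (container_id, eta, risk_band, detected_at)."""
    conn.executemany(
        "INSERT OR REPLACE INTO container_snapshot VALUES (?, ?, ?, ?, ?)",
        [(snapshot_date, *row) for row in rows],
    )


def changed_since(conn, today, yesterday):
    """Containers that are new today or whose band / ETA differs from yesterday."""
    return conn.execute(
        """SELECT t.container_id, y.risk_band, t.risk_band, y.eta, t.eta
             FROM container_snapshot AS t
             LEFT JOIN container_snapshot AS y
               ON y.container_id = t.container_id AND y.snapshot_date = ?
            WHERE t.snapshot_date = ?
              AND (y.container_id IS NULL
                   OR y.risk_band IS NOT t.risk_band
                   OR y.eta IS NOT t.eta)
            ORDER BY t.container_id""",
        (yesterday, today),
    ).fetchall()
```

If state is only assembled at render time today, a nightly job that writes rows like these is the smallest version of the "snapshot layer" the second Validate item refers to.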

Feasibility · Cost · Pricing

What Finance & Eng need to size

Finance can't price until Eng / AI give a per-briefing run-cost estimate. Scale assumption: ~50–70 users per large account at full adoption.

Build (one-time)

State assembly + snapshotting
Ranking + NL generation
Email + in-app delivery
Onboarding UX (Phase 2)

Run (recurring)

LLM tokens / briefing × users × frequency
Compute for the nightly batch
Eval + monitoring
Key number: run cost per user per day

Pricing options

Seat-based add-on to Command Center
Per-container
Tier upgrade
Usage-based

💸 GCP credits: runway, not pricing basis. ~$70–80k of Google Cloud credits run through Oct 2026. Built on Gemini + Google Compute, the agents' cash run-cost is ≈$0 near-term, which takes infrastructure cost off the table for build + pilot with no infra-budget ask. But we must still model unit economics on the true post-credit cost at list rates, or we create a margin cliff when credits lapse. Credits let us build and validate for free; they shouldn't set the price.
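The per-user-per-day number can be framed as simple arithmetic once Eng / AI supply token counts and list rates. The sketch below shows only the shape of that calculation; the function names are hypothetical, and every figure a caller passes in is an input to be measured or looked up, not an estimate from this document.

```python
def run_cost_per_user_day(
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
    briefings_per_day: int = 1,
    other_usd_per_user_day: float = 0.0,
) -> float:
    """Post-credit run cost for one user for one day, at list rates.

    other_usd_per_user_day is a placeholder for batch compute and
    eval / monitoring cost allocated per user.
    """
    llm_per_briefing = (
        input_tokens / 1_000_000 * usd_per_m_input
        + output_tokens / 1_000_000 * usd_per_m_output
    )
    return llm_per_briefing * briefings_per_day + other_usd_per_user_day


def account_cost_per_month(cost_per_user_day: float, users: int, days: int = 30) -> float:
    """Fan-out for one account: per-user-day cost x users x days."""
    return cost_per_user_day * users * days


def margin(price_per_user_month: float, cost_per_user_day: float, days: int = 30) -> float:
    """Gross margin on a seat price, using the post-credit cost."""
    cost = cost_per_user_day * days
    return (price_per_user_month - cost) / price_per_user_month
```

Running account_cost_per_month at both ends of the ~50–70 users-per-account range, with list rates rather than credit-adjusted ones, gives Finance the band to price against and shows where the margin cliff would sit once credits lapse.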

For the call

Open questions, by discipline

What we'd like answered, or at least scoped, on the cross-functional call.

🧠 AI / ML

Right architecture — LLM-authored briefing over structured context, rules-based ranking, or hybrid?
Prioritisation — heuristic (escalations-first, urgency × impact) vs. learned?
Context-sharing / hand-off mechanism — shared memory, context object, tool-calling? Minimal version?
Accuracy & guardrails — hallucination control on numbers/ETAs; how we evaluate briefing quality.

🛠 Engineering

Build effort from prototype to v1, and the shape of the Command Center integration.
Where agents run, on what cadence, and how we snapshot state daily.
Hand-off plumbing + email / in-app delivery (overlap with existing alerting?).
Scale & fan-out cost at ~50–70 users per account.

📊 Data Science

Is our data adequate and AI-ready? (the crux above)
Do we have the per-container fields every risk needs, reliably populated?
Are risk signals pre-computed and historised, or derived ad hoc?
Persona / ICP → prioritisation logic — can we validate it against real behaviour?

💰 Finance

Run cost per user per day + build cost in eng-months.
Which pricing model fits, and what margin the run costs imply.
How to treat existing GCP credits vs. true post-credit cost.