Most "AI in the product" is a chat box bolted onto the corner. The agents that earn their keep are different: they have a personality, a ranked sense of what matters, a memory of the business, real tools to query live data, and the right surfaces to speak through. Here's the playbook I used to build three of them — an AI CEO, an AI CFO and an AI CMO — into one operational dashboard.
Written by Rhea Karuturi · CTO & co-founder, Hoovu Fresh
Before any prompt engineering, get the shape right. A useful agent sits on top of a loop that turns raw data into a point of view, and a point of view into action that creates more data. Build the loop first; the personality is the last 10%.
Your real operational truth — orders, cash, attendance.
A nightly-distilled, agent-readable summary of that data.
Persona + priorities + tools, reasoning over the memory.
Briefings, chat, emails, nudges — where it speaks.
People act → new data → the loop deepens.
Every day it knows the business a little better than the day before — because the loop feeds itself.
This is what separates an assistant from a colleague. An agent that knows who it is, how to speak, and what to care about feels like a member of the team — not a search box.
The single highest-leverage line in the whole system is the identity statement. Our agents don't say "the data shows" — they say "we". They're cast as a specific executive with a specific temperament: the CEO is calm, assured and direct; the CFO is a cash-obsessed realist; the CMO has taste and cultural fluency. Naming the role and the temperament up front colours every sentence that follows.
// One builder. Every prompt in the system flows through it,
// so the voice is identical in a briefing, an email or a chat.
function buildSystemPrompt(mode, staffMap) {
  const base = `You are the AI CEO of Hoovu, a B2B puja-flower
supply chain in 9 cities.
Voice: calm, assured, data-driven, direct.
Use "we" — you are part of this team.
Open with what is on track. Attach a concrete action
to every gap. End with direction, not alarm.`;
  return base + toneFor(mode) + rulesBlock() + staffBlock(staffMap);
}
The same fact needs to be said differently to different people. To a founder, "cash is tight this week" is useful candour. To the warehouse team, it's needless anxiety. So tone is a mode the builder switches on — same brain, different register. We ban specific words in team mode and require honesty in founder mode.
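A minimal sketch of how such a mode switch could look. The mode names, the tone wording, the banned-word list and the `bannedWordsIn` guard are illustrative assumptions, not the production values.

```javascript
// Hypothetical tone table: the wording and the banned list are placeholders.
const TONES = {
  founder: {
    instructions: "Be candid about cash, risk and misses. Do not soften bad news.",
    bannedWords: [],
  },
  team: {
    instructions: "Stay factual and steady. Focus on what each person can act on today.",
    bannedWords: ["crisis", "cash crunch", "panic"],
  },
};

// Returns the tone paragraph that gets appended to the base system prompt.
function toneFor(mode) {
  const tone = TONES[mode];
  if (!tone) throw new Error(`Unknown tone mode: ${mode}`);
  const ban = tone.bannedWords.length
    ? ` Never use these words: ${tone.bannedWords.join(", ")}.`
    : "";
  return `\nTone: ${tone.instructions}${ban}`;
}

// Post-generation guard: lists banned words that slipped into a draft.
function bannedWordsIn(mode, draft) {
  const text = draft.toLowerCase();
  return TONES[mode].bannedWords.filter((word) => text.includes(word));
}
```

Checking the draft after generation, as `bannedWordsIn` does, is one way to catch the cases where the model ignores the instruction; a non-empty result could trigger a retry.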
An agent with no priorities gives you a flat list of everything. An agent with a ranked value system gives you a point of view. The CMO, for instance, is told its priorities in order — and when two pull against each other, it knows which wins. The CEO leads with what's on track; the CFO thinks in cash before anything else. Rank order is how you get judgment instead of a data dump.
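One possible way to express a ranked value system is an ordered array rendered into the prompt with an explicit tie-break rule. This is a sketch only: the priority names below are invented placeholders, since the CMO's real list is not given here, and both function names are hypothetical.

```javascript
// Placeholder priorities, highest first. These are NOT the CMO's real ones.
const CMO_PRIORITIES = ["brand trust", "customer retention", "new reach"];

// Renders a numbered block plus an explicit tie-break rule for the prompt.
function prioritiesBlock(priorities) {
  const ranked = priorities.map((p, i) => `${i + 1}. ${p}`).join("\n");
  return `Priorities, in order:\n${ranked}\nWhen two conflict, the lower number wins.`;
}

// Picks the winner between two competing priorities by rank.
function resolveConflict(priorities, a, b) {
  const ia = priorities.indexOf(a);
  const ib = priorities.indexOf(b);
  if (ia === -1 || ib === -1) throw new Error("Unknown priority");
  return ia <= ib ? a : b;
}
```

Keeping the order in data rather than in prose means the same list can feed the prompt and any code that has to break a tie.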
The fastest way to lose trust is for the agent to flag something that isn't actually a problem. So we encode the business's lived rules directly into the persona: revenue is final by 6 AM, so never say "tracking well"; fill rate reads 0% before 2 PM because invoices aren't entered yet, so don't panic; partial invoices land the next day, so don't flag a 48-hour gap. These rules are applied silently and never narrated.
// Apply silently — never name these rules to the reader.
- Revenue is FINAL by 6 AM IST. Never say "at this pace".
- Fill rate = 0% before 2 PM is normal. Flag only after 4 PM.
- Partial invoices fill NEXT day. Don't flag a <48h gap.
- Procurement 36–45% is fine; flag only if > 45%.
- Benchmarks: fill ≥97% · labour ≤8% · AOV ≥₹35.
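The rules above live in the persona, but the same thresholds could also be checked in code before an alert is raised, so a false alarm does not depend on the model following instructions. A sketch under that assumption; the function names are hypothetical and the hour is passed in to keep the checks testable.

```javascript
// Thresholds copied from the rules above.
const FILL_RATE_TARGET = 0.97;
const FILL_RATE_FLAG_HOUR = 16; // 4 PM IST
const PROCUREMENT_MAX = 0.45;
const INVOICE_GAP_HOURS = 48;

// hourIST is the current hour (0-23) in IST.
function shouldFlagFillRate(fillRate, hourIST) {
  if (hourIST < FILL_RATE_FLAG_HOUR) return false; // invoices may still be pending
  return fillRate < FILL_RATE_TARGET;
}

// Procurement share of revenue: anything up to 45% is treated as normal.
function shouldFlagProcurement(procurementShare) {
  return procurementShare > PROCUREMENT_MAX;
}

// A partial-invoice gap is only worth raising once it is 48 hours old.
function shouldFlagInvoiceGap(gapHours) {
  return gapHours >= INVOICE_GAP_HOURS;
}
```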
You can't paste a whole database into a prompt — it's too big, too slow, too expensive. The trick is to distil your raw data into a compact, human-readable memory the agent reads instead. This is the part everyone skips, and it's the part that makes the agent smart.
Raw data answers "what happened". Memory answers "what does it mean". We walk the whole operation and write a memory layer organised into eleven categories — clients, cities, flowers, cashflow, festivals, labour, wastage, farmers, pricing, tasks and the company's North Stars. Crucially, each category gets both structured metrics and a short narrative in plain language — because the narrative is what the agent reasons on most fluently.
AIMemory/
  clients/narrative:  "Zepto is our largest account but receivables have crept to 40 days. Swiggy steady…"
  clients/zepto:      { daysSinceLastOrder: 0, revenueTrendPct: -8, fillRate: 0.97 }
  flowers/narrative:  "Rose running hot pre-Navaratri…"
  cashflow/summary:   { peakGapDay: "Thu", gap: 420000 }
  cities, festivals, labour, wastage, farmers, pricing, tasks, northstars
  // 11 categories total
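To make the distillation step concrete, here is a rough sketch of how one client entry might be derived from raw orders. The order shape, the week-over-week windows and the templated narrative are all assumptions for illustration; the real narrative may well be written by a model rather than a template.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed order shape: { client, date: "YYYY-MM-DD", revenue, ordered, delivered }.
function buildClientMemory(client, orders, today) {
  const ageDays = (o) => Math.round((Date.parse(today) - Date.parse(o.date)) / DAY_MS);
  const sum = (rows, key) => rows.reduce((total, o) => total + o[key], 0);

  const mine = orders.filter((o) => o.client === client);
  if (mine.length === 0) {
    return { metrics: null, narrative: `${client}: no orders on record.` };
  }

  // Compare the last 7 days with the 7 days before that.
  const thisWeek = mine.filter((o) => ageDays(o) < 7);
  const lastWeek = mine.filter((o) => ageDays(o) >= 7 && ageDays(o) < 14);

  const prevRevenue = sum(lastWeek, "revenue");
  const revenueTrendPct = prevRevenue === 0
    ? null
    : Math.round(((sum(thisWeek, "revenue") - prevRevenue) / prevRevenue) * 100);
  const ordered = sum(thisWeek, "ordered");
  const fillRate = ordered === 0 ? null : sum(thisWeek, "delivered") / ordered;
  const daysSinceLastOrder = Math.min(...mine.map(ageDays));

  // Structured metrics plus a one-line narrative for the agent to read.
  const trend = revenueTrendPct === null
    ? "no prior week to compare"
    : `revenue ${revenueTrendPct >= 0 ? "up" : "down"} ${Math.abs(revenueTrendPct)}% week on week`;
  return {
    metrics: { daysSinceLastOrder, revenueTrendPct, fillRate },
    narrative: `${client}: ${trend}; last order ${daysSinceLastOrder} day(s) ago.`,
  };
}
```

The point of the pairing is that the metrics stay machine-checkable while the narrative carries the interpretation.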
Once the memory exists, every surface reads it — not the database. The morning briefing doesn't re-aggregate a month of orders; it reads the pre-written P&L narrative and the day's numbers. This keeps responses fast, cheap and — most importantly — consistent: every surface tells the same story because they all read the same memory.
The memory is rebuilt every night at 11:30 PM so the morning briefing reads a fresh picture. Scheduled jobs do the heavy lifting off-peak: the nightly memory build, the morning briefing, the end-of-day summary, personal daily plans, and a weekly recompile of platform best practices for the CMO. Each job is idempotent and logs its own failures, so a bad night never corrupts the memory.
23:30      → rebuild AIMemory (all 11 categories)
06:00      → morning briefing (reads last night's memory)
10:30      → personal daily plans (per manager)
18:30      → end-of-day summary
Mon 04:00  → recompile platform skills for the CMO
// every 30 min → anomaly scan → proactive nudges
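A sketch of what "idempotent and logs its own failures" could mean for the nightly build: build off to the side, swap in only on success, and skip a date that is already done. The in-memory `store` and `failures` array stand in for a real database and log, and a real build would likely be asynchronous; all names here are assumptions.

```javascript
// In-memory stand-ins for the real store and failure log.
const store = { memory: null, builtFor: null };
const failures = [];

// Runs the build for a given date key. Re-running the same date is a no-op,
// and a failed build leaves the previous memory untouched.
function runNightlyBuild(dateKey, buildFn) {
  if (store.builtFor === dateKey) return "skipped";
  try {
    const next = buildFn(); // build off to the side
    if (!next || Object.keys(next).length === 0) throw new Error("empty build");
    store.memory = next; // swap in only after the build succeeds
    store.builtFor = dateKey;
    return "built";
  } catch (err) {
    failures.push({ job: "rebuild-memory", dateKey, message: err.message });
    return "failed";
  }
}
```

With this shape, the morning briefing after a failed night simply reads the previous night's memory instead of a half-written one.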
A mind with memory is still trapped until you give it tools to fetch what it doesn't know, and surfaces to speak through. This is where the agent stops being a report and starts being a presence.
The nightly memory can't hold everything, so the chat console gets tools: a set of functions it can call to query live data mid-conversation. The model plans, calls a tool, reads the result, and either answers or calls another — a tool-use loop. The discipline is in the contract: every tool caps its date range, caps its result size, and returns a predictable { summary, data, notes } shape so the model never drowns in rows.
function getCityPnL({ city, fromDate, toDate }) {
  // 1. Cap the range: never let it ask for 2 years.
  const range = clampDays(fromDate, toDate, 90);
  // 2. Compute, then cap the payload (~30KB).
  //    fetchCityRows is a hypothetical stand-in for the real query.
  const rows = fetchCityRows(city, range);
  return {
    summary: "Bangalore: ₹2.1L revenue, 41% procurement…",
    data: rows.slice(0, 200),
    notes: "Fill rate excludes today (invoices pending)."
  };
}
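The tool above relies on a `clampDays` helper that is not shown. One plausible version, together with a small wrapper for the { summary, data, notes } contract, might look like this. The "YYYY-MM-DD" date format, the return shape of `clampDays` and the `toToolResult` name are assumptions.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Keeps toDate fixed and pulls fromDate forward so the window spans
// at most maxDays (inclusive). Dates are "YYYY-MM-DD" strings.
function clampDays(fromDate, toDate, maxDays) {
  const from = Date.parse(fromDate);
  const to = Date.parse(toDate);
  if (Number.isNaN(from) || Number.isNaN(to) || from > to) {
    throw new Error("Invalid date range");
  }
  const earliest = to - (maxDays - 1) * DAY_MS;
  const start = new Date(Math.max(from, earliest)).toISOString().slice(0, 10);
  return { fromDate: start, toDate, clamped: from < earliest };
}

// Wraps rows in the { summary, data, notes } envelope and records any truncation,
// so the model is told when it is looking at a partial result.
function toToolResult(summary, rows, maxRows, notes = []) {
  const extra = rows.length > maxRows
    ? [`Showing ${maxRows} of ${rows.length} rows.`]
    : [];
  return {
    summary,
    data: rows.slice(0, maxRows),
    notes: [...notes, ...extra].join(" "),
  };
}
```

Returning a `clamped` flag and a truncation note lets the agent say "I only looked at the last 90 days" rather than silently answering a narrower question.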
AI calls are slow and cost real money, so nothing recomputes that doesn't have to. The briefing card uses a three-tier cache: it checks the browser first, then a shared server-side cache (so the first teammate to log in pays for everyone), then the API only if both miss. Everything is timestamped — "generated 2h ago" — with a manual refresh button for when you truly want a fresh take.
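The lookup order can be sketched as follows. Two `Map`s stand in for the browser cache and the shared server-side cache, `generate` stands in for the slow model call, and the six-hour TTL is an arbitrary assumption; the `force` option plays the role of the manual refresh button.

```javascript
// Stand-ins for the two cache tiers; the TTL is an assumed value.
const browserCache = new Map();
const sharedCache = new Map();
const TTL_MS = 6 * 60 * 60 * 1000;

const isFresh = (entry, now) => Boolean(entry) && now - entry.generatedAt < TTL_MS;

// Checks browser, then shared cache, then calls generate() only if both miss.
async function getBriefing(key, generate, { now = Date.now(), force = false } = {}) {
  if (!force) {
    const local = browserCache.get(key);
    if (isFresh(local, now)) return { ...local, source: "browser" };

    const shared = sharedCache.get(key);
    if (isFresh(shared, now)) {
      browserCache.set(key, shared); // warm the local tier
      return { ...shared, source: "shared" };
    }
  }
  const entry = { text: await generate(), generatedAt: now };
  sharedCache.set(key, entry); // the first caller pays for everyone
  browserCache.set(key, entry);
  return { ...entry, source: "api" };
}

// Label for the card, in the style of "generated 2h ago".
function ageLabel(generatedAt, now = Date.now()) {
  const hours = Math.floor((now - generatedAt) / (60 * 60 * 1000));
  return hours < 1 ? "generated just now" : `generated ${hours}h ago`;
}
```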
The same agent shows up in four different ways, each suited to a different need. Get this mapping right and the agent meets people where they already are — instead of asking them to come find it.
A few bullets at the top of every home page — the answer before the data. Passive, glanceable, always there.
A tool-using console for the long tail. Ask anything; it runs live queries and cites the numbers. Active, conversational.
Morning briefing, EOD summary, personal daily plans — pushed to the inbox so the dashboard comes to you.
A 30-minute scan turns anomalies into pings — and pings into tasks. The agent reaches out first.
Passive when you're busy, active when you're curious, push when you're away, proactive when it matters. The art is matching the surface to the moment — not forcing everything through a chat box.
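For the proactive surface, a scan that turns anomalies into pings and pings into tasks could be sketched like this. The check list, the metric field names, the de-duplication set and the task shape are all assumptions; the two thresholds reuse the fill and labour benchmarks quoted earlier.

```javascript
// Hypothetical checks against the fill (>= 97%) and labour (<= 8%) benchmarks.
const CHECKS = [
  {
    id: "fill-rate-low",
    test: (m) => m.fillRate < 0.97,
    message: (m) => `Fill rate at ${Math.round(m.fillRate * 100)}%`,
  },
  {
    id: "labour-high",
    test: (m) => m.labourShare > 0.08,
    message: (m) => `Labour at ${Math.round(m.labourShare * 100)}% of revenue`,
  },
];

// Remembers open anomalies so the same ping is not sent every 30 minutes.
const alreadySent = new Set();

function scan(city, metrics) {
  const nudges = [];
  for (const check of CHECKS) {
    const key = `${city}:${check.id}`;
    if (!check.test(metrics)) {
      alreadySent.delete(key); // re-arm once the metric recovers
      continue;
    }
    if (alreadySent.has(key)) continue;
    alreadySent.add(key);
    nudges.push({ key, city, text: check.message(metrics) });
  }
  return nudges;
}

// Turns a nudge into a task someone owns.
function toTask(nudge, owner) {
  return { title: `${nudge.city}: ${nudge.text}`, owner, status: "open", source: nudge.key };
}
```

De-duplication is what keeps a proactive agent tolerable: one ping per problem, not one per scan.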
If you're building your own, this is the order I'd do it in — mind first, memory second, hands and voice last.
Don't build a chatbot. Build a colleague — one with a point of view, a memory, and the manners to speak up only when it helps.
None of this requires a heavy stack. Ours runs on static pages, one database, a handful of scheduled functions, and a couple of model APIs. The hard part was never the infrastructure — it was the judgment: deciding who the agent is, what it values, what it's allowed to say, and where it's allowed to interrupt. Get those right and the technology is almost incidental.