The Pre-Sales Bottleneck
In Salesforce consulting, the pre-sales cycle is where deals are won or lost long before a statement of work gets signed. It is a process that demands deep research into a prospect's business, an honest assessment of their existing Salesforce org, a careful understanding of their pain points, and then the human work of actually reaching out, qualifying the opportunity, and scheduling discovery calls. For a small consulting practice, this cycle is brutal. Every hour spent researching a prospect who turns out to be a poor fit is an hour not spent on billable delivery work.
I had been tracking my own pre-sales metrics for about six months and the numbers were discouraging. On average, I was spending 2.5 hours per prospect on initial research and qualification before even making first contact. Of the prospects I reached out to, roughly 30% converted to a discovery call, and of those, about 40% moved forward to a proposal. Running the math backwards (30% times 40% is roughly 12%, or about one proposal for every eight prospects researched, at 2.5 hours apiece), that meant I was investing roughly 20 hours of pre-sales effort for every signed engagement. For a solo architect trying to balance delivery with business development, that ratio was unsustainable.
The work itself was not intellectually complex. It was repetitive, structured, and followed predictable patterns: look up the company, identify their industry and size, figure out what Salesforce products they were likely using, check for public indicators of their org's maturity, draft a personalized outreach message, make the call, log the result. Each step was straightforward. The problem was the cumulative time cost across dozens of prospects per month, and the cognitive overhead of context-switching between deep technical delivery work and the very different mental mode of sales research.
I started asking myself whether this was a workflow that could be decomposed into discrete, automatable steps coordinated by an intelligent orchestration layer. The answer, it turned out, was yes — but it required thinking about AI agents not as monolithic assistants, but as specialized workers in a coordinated pipeline.
The Multi-Agent Architecture
The key insight that unlocked this project was recognizing that pre-sales is not one task. It is a pipeline of distinct tasks, each requiring different capabilities, different data sources, and different interaction modes. A single AI agent trying to do everything — research, analysis, qualification scoring, voice outreach, CRM updates — would be mediocre at all of them. But a system of specialized agents, each optimized for one stage of the pipeline and coordinated by an orchestration layer, could be genuinely good at each stage.
We designed a four-stage pipeline with three specialized agents coordinated by Claude Code acting as the orchestration brain:
- Research Agent — Gathers company intelligence, identifies Salesforce org signals, maps organizational structure, and produces a structured prospect profile.
- Qualification Agent — Scores and prioritizes prospects based on the research output, applies ideal customer profile criteria, and generates personalized talking points for outreach.
- Voice Agent — Powered by Retell AI, conducts AI voice calls to qualified prospects for initial outreach, lead qualification, and demo scheduling.
- CRM Sync — Updates Salesforce with all gathered intelligence, call outcomes, and next steps, maintaining a complete audit trail.
Claude Code sits at the center as the orchestration layer. It does not perform the research or make the calls itself. Instead, it coordinates the flow of data between agents, decides when each agent should be invoked, handles error states and retry logic, and maintains the overall pipeline state. Think of it as a project manager who delegates work to specialists and makes sure the handoffs between them are clean.
Multi-agent systems outperform monolithic agents when the workflow has distinct stages requiring different capabilities. The orchestration layer's job is not to do the work — it is to coordinate the specialists, manage data flow between them, and handle the edge cases that arise at stage boundaries.
Agent 1: The Research Agent
The Research Agent is the first stage of the pipeline and the foundation everything else builds on. Its job is to take a company name and domain as input and produce a structured prospect profile as output. The profile needs to be comprehensive enough that a human sales professional could walk into a discovery call fully prepared, without having done any manual research.
The agent gathers intelligence across several dimensions. Company fundamentals come first: industry vertical, employee count, revenue range, headquarters location, and recent news or funding events. Then it moves into Salesforce-specific signals: job postings mentioning Salesforce (which indicate active investment or pain), technology stack indicators from public sources, any Salesforce AppExchange reviews or community forum posts from the company's employees, and LinkedIn profiles of their Salesforce administrators or IT leadership. Finally, it looks for org maturity indicators: are they posting about basic administration challenges (suggesting a newer org) or complex integration and governance issues (suggesting a mature org that might need architectural guidance)?
The agent is implemented as a Claude Code tool-use pattern, where the LLM reasons about what information to gather and invokes a set of defined tools — web search, LinkedIn lookup, job board analysis, and a custom AppExchange review scraper — to collect the data. The structured output is a JSON document that feeds directly into the next stage. Here is a simplified version of the orchestration code that invokes the Research Agent:
import anthropic
import json
from datetime import datetime
client = anthropic.Anthropic()
RESEARCH_TOOLS = [
{
"name": "web_search",
"description": "Search the web for company information, news, and public data.",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"},
"num_results": {"type": "integer", "default": 10}
},
"required": ["query"]
}
},
{
"name": "analyze_job_postings",
"description": "Search job boards for Salesforce-related postings from the target company.",
"input_schema": {
"type": "object",
"properties": {
"company_name": {"type": "string"},
"keywords": {
"type": "array",
"items": {"type": "string"},
"default": ["Salesforce", "CRM", "Apex", "Lightning"]
}
},
"required": ["company_name"]
}
},
{
"name": "build_prospect_profile",
"description": "Compile all gathered data into a structured prospect profile.",
"input_schema": {
"type": "object",
"properties": {
"company_name": {"type": "string"},
"domain": {"type": "string"},
"company_data": {"type": "object"},
"sf_signals": {"type": "object"},
"org_maturity_indicators": {"type": "array", "items": {"type": "string"}}
},
"required": ["company_name", "domain", "company_data"]
}
}
]
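# execute_tool is the dispatcher the agent loop below relies on. This stub keeps
# the sketch self-contained; in the real pipeline each branch calls an actual
# integration (search API, job-board scraper, AppExchange review scraper).
def execute_tool(name: str, tool_input: dict) -> dict:
    if name == "web_search":
        return {"results": []}  # placeholder: call a search backend here
    if name == "analyze_job_postings":
        return {"postings": []}  # placeholder: query job boards here
    if name == "build_prospect_profile":
        # The model has already compiled the profile; stamp it and pass it through
        return {**tool_input, "researched_at": datetime.now().isoformat()}
    raise ValueError(f"Unknown tool: {name}")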
def run_research_agent(company_name: str, domain: str) -> dict:
"""Execute the Research Agent for a single prospect."""
messages = [
{
"role": "user",
"content": f"""Research the company '{company_name}' (domain: {domain}) as a
potential Salesforce consulting prospect. Gather company fundamentals,
Salesforce-specific signals, org maturity indicators, and key contacts.
Use the available tools to build a comprehensive prospect profile."""
}
]
    profile = None
    turns = 0  # cap the tool-use loop so a confused run cannot spin forever
    while profile is None and turns < 12:
        turns += 1
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=4096,
tools=RESEARCH_TOOLS,
messages=messages
)
if response.stop_reason == "tool_use":
tool_results = []
for block in response.content:
if block.type == "tool_use":
result = execute_tool(block.name, block.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": json.dumps(result)
})
if block.name == "build_prospect_profile":
profile = result
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": tool_results})
else:
break
return profile
One critical design decision was having the Research Agent output a structured JSON profile rather than free-text analysis. Structured output ensures that downstream agents can parse the data programmatically. The Qualification Agent needs to read specific fields like estimated_employee_count or sf_products_detected without guessing at how to extract them from prose. This is a pattern I now apply to every multi-agent system: the contract between agents is a schema, not a natural language summary.
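To make that contract concrete, here is the rough shape of a profile document. The two fields named above are real; the remaining field names are illustrative of the structure, not the full schema:
{
  "company_name": "Acme Health Tech",
  "domain": "acmehealth.example.com",
  "industry": "healthcare technology",
  "estimated_employee_count": 450,
  "sf_products_detected": ["Sales Cloud", "Service Cloud"],
  "org_maturity_indicators": [
    "Job posting for first dedicated Salesforce Administrator",
    "Community forum questions about basic flow automation"
  ],
  "primary_contact": {
    "name": "Jane Doe",
    "title": "Director of Revenue Operations",
    "phone": "+15555550100"
  }
}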
Agent 2: The Qualification Agent
The Qualification Agent takes the structured prospect profile from the Research Agent and produces two outputs: a numerical qualification score (0-100) and a set of personalized talking points for the voice outreach stage. This is where the system applies business judgment — determining not just who the prospect is, but whether they are a good fit and how to approach them.
The scoring model is built around what I call an Ideal Customer Profile (ICP) matrix. It is not a machine learning model in the traditional sense. It is a set of weighted criteria that reflect the characteristics of my best past engagements. Companies in the 200-2000 employee range score higher than enterprise accounts (where sales cycles are long and procurement is complex) or very small companies (where budgets are tight). Companies showing active Salesforce investment signals — recent Salesforce job postings, AppExchange activity, community engagement — score higher than those with dormant orgs. Companies in industries where I have deep domain expertise (financial services, healthcare technology, professional services) get an industry fit bonus.
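In code, the ICP matrix is just a weighted rubric applied to the research profile. A minimal sketch, with illustrative weights and field names rather than my production values:
def score_prospect(profile: dict) -> int:
    """Apply weighted ICP criteria to a research profile (illustrative weights)."""
    score = 0
    employees = profile.get("estimated_employee_count") or 0
    if 200 <= employees <= 2000:
        score += 30  # sweet spot: real budget, manageable procurement
    elif employees > 2000:
        score += 10  # enterprise: long sales cycles, complex procurement
    signals = profile.get("sf_signals", {})
    if signals.get("recent_sf_job_postings"):
        score += 25  # active Salesforce investment or pain
    if signals.get("appexchange_activity"):
        score += 15
    if profile.get("industry") in {
        "financial services", "healthcare technology", "professional services"
    }:
        score += 20  # industry fit bonus for deep domain expertise
    return min(score, 100)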
The personalized talking points are the more interesting output. Based on the research profile, the Qualification Agent generates three to five conversation starters that are specific to the prospect's situation. If the Research Agent detected that the company recently posted for a Salesforce Administrator, the talking point might reference the challenges of scaling a Salesforce org without dedicated architectural guidance. If public indicators suggest they are running Classic rather than Lightning, the talking point addresses the modernization opportunity. These are not generic sales scripts — they are tailored observations based on real data, and they feed directly into the Voice Agent's conversation framework.
Prospects scoring above 70 are routed to the Voice Agent for immediate outreach. Prospects scoring 50-70 are queued for a secondary review where I manually verify the qualification before deciding on outreach. Prospects below 50 are logged in Salesforce with the full research profile but deprioritized. This tiered approach ensures the Voice Agent's time (and my credibility) is spent on the highest-probability opportunities.
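The tiering itself is only a few lines in the orchestrator. A sketch using the thresholds above:
def route_by_score(score: int) -> str:
    """Map a qualification score to the next pipeline stage."""
    if score > 70:
        return "voice_outreach"  # straight to the Voice Agent
    if score >= 50:
        return "manual_review"  # I verify the qualification before any outreach
    return "log_and_deprioritize"  # full profile saved to Salesforce, no outreach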
Integrating Retell AI for Voice Outreach
This is where the system transitions from behind-the-scenes research to real-world interaction. Retell AI provides the infrastructure for building AI-powered voice agents that can conduct natural phone conversations. The voice agent is not reading a script line by line — it is engaging in a dynamic conversation, responding to what the prospect says, handling objections, answering basic questions, and working toward a specific call outcome: qualifying the prospect's interest and scheduling a follow-up discovery call with me.
Configuring the Retell voice agent required careful thought about conversation design. I built the call flow around three phases. The opening phase introduces the purpose of the call, establishes context (referencing something specific from the prospect's profile to demonstrate this is not a cold spray-and-pray call), and asks for permission to continue the conversation. The discovery phase asks two to three open-ended qualification questions: what their current Salesforce challenges are, whether they are planning any major initiatives in the next quarter, and what their timeline looks like for addressing their pain points. The closing phase either proposes a specific time for a discovery call (if the prospect is interested) or gracefully ends the conversation with an offer to send follow-up materials.
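In Retell terms, that call flow lives in the agent's prompt, with {{...}} placeholders that Retell fills from the dynamic variables passed at call time (shown in the integration code below). This is a condensed, illustrative version, not my production prompt:
VOICE_AGENT_PROMPT = """
You are calling {{prospect_name}} at {{company_name}} on behalf of
{{caller_name}}, a {{caller_title}}.

Phase 1 (opening): State who you are and why you are calling. Reference this
observation to establish relevance: {{personalized_opener}}. Ask permission
to continue before going further.

Phase 2 (discovery): Ask two to three open-ended questions: current Salesforce
challenges, major initiatives planned for the next quarter, and their timeline
for addressing {{sf_pain_points}}.

Phase 3 (closing): If the prospect is interested, propose a specific time for
a discovery call. If they are hesitant, offer to send a brief overview
document and follow up in a few days. Never push for a commitment.
"""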
The Retell AI integration is handled via their API, with webhook callbacks delivering post-call analysis. Here is the core integration code that creates a call and processes the results:
import os
import requests
from dataclasses import dataclass
RETELL_API_KEY = os.environ["RETELL_API_KEY"]  # read from the environment, never hardcoded
RETELL_BASE_URL = "https://api.retellai.com"
@dataclass
class CallOutcome:
call_id: str
prospect_name: str
disposition: str # "interested", "not_interested", "callback", "no_answer"
meeting_scheduled: bool
meeting_time: str | None
call_summary: str
transcript: str
sentiment_score: float
qualification_signals: list[str]
def initiate_voice_outreach(
prospect_profile: dict,
talking_points: list[str],
agent_id: str
) -> str:
"""Launch a Retell AI voice call for a qualified prospect."""
# Build dynamic variables from prospect research
dynamic_vars = {
"prospect_name": prospect_profile["primary_contact"]["name"],
"company_name": prospect_profile["company_name"],
"industry": prospect_profile["industry"],
"personalized_opener": talking_points[0],
"talking_point_2": talking_points[1] if len(talking_points) > 1 else "",
"talking_point_3": talking_points[2] if len(talking_points) > 2 else "",
"sf_pain_points": ", ".join(
prospect_profile.get("detected_pain_points", [])
),
"caller_name": "Azlan Allahwala",
"caller_title": "Salesforce Solution Architect"
}
# Create the call via Retell API
response = requests.post(
f"{RETELL_BASE_URL}/v2/create-phone-call",
headers={
"Authorization": f"Bearer {RETELL_API_KEY}",
"Content-Type": "application/json"
},
json={
"from_number": "+14155550123",
"to_number": prospect_profile["primary_contact"]["phone"],
"override_agent_id": agent_id,
"retell_llm_dynamic_variables": dynamic_vars,
"metadata": {
"prospect_id": prospect_profile["id"],
"qualification_score": prospect_profile["qualification_score"],
"pipeline_run_id": prospect_profile["pipeline_run_id"]
}
}
)
    response.raise_for_status()  # surface API errors instead of parsing an error body
    call_data = response.json()
    return call_data["call_id"]
def process_call_webhook(webhook_payload: dict) -> CallOutcome:
"""Process Retell webhook after call completion."""
call_id = webhook_payload["call_id"]
call_analysis = webhook_payload.get("call_analysis", {})
transcript = webhook_payload.get("transcript", "")
# Extract structured outcomes from Retell's post-call analysis
outcome = CallOutcome(
call_id=call_id,
prospect_name=webhook_payload["metadata"].get("prospect_name", ""),
disposition=call_analysis.get("call_disposition", "unknown"),
meeting_scheduled="meeting_booked" in call_analysis.get("tags", []),
meeting_time=call_analysis.get("scheduled_meeting_time"),
call_summary=call_analysis.get("call_summary", ""),
transcript=transcript,
sentiment_score=call_analysis.get("sentiment_score", 0.0),
qualification_signals=call_analysis.get("qualification_signals", [])
)
return outcome
The retell_llm_dynamic_variables field is what makes each call feel personalized rather than robotic. By injecting the prospect's name, company, industry, and a research-backed opening line directly into the voice agent's context, the conversation starts with relevant specificity. The prospect hears a reference to their actual business situation in the first ten seconds of the call, which immediately differentiates this from a generic sales dial. Retell's post-call analysis webhook provides structured data about the call outcome, sentiment, and any meetings scheduled, which feeds directly back into the orchestration layer.
One lesson learned the hard way: voice agent conversation design requires extensive testing with real call scenarios. The first version of our call flow was too aggressive about pushing for a meeting. Prospects who expressed mild interest but wanted to think about it were being pushed toward commitment, which created friction. We redesigned the closing phase to include a softer path — offering to send a brief overview document and follow up in a few days — which actually increased our meeting booking rate by 15% because it reduced the pressure on the initial call.
The Orchestration Layer: Claude Code as the Brain
Claude Code is the central nervous system of this entire pipeline. It does not merely sequence tasks — it makes decisions about flow control, handles errors gracefully, manages state across the pipeline, and adapts based on the outcomes of each stage. The orchestration logic is where the system's intelligence lives, separate from the specialized capabilities of each individual agent.
The orchestration layer manages several critical concerns. Pipeline state management tracks where each prospect is in the pipeline, what data has been gathered, and what the next step should be. If the Research Agent fails to find sufficient data on a prospect (which happens with smaller private companies), the orchestrator does not blindly push them to the Qualification Agent. Instead, it flags the prospect for manual research supplementation and moves on to the next prospect in the queue. Error handling and retry logic addresses the reality that external API calls fail. If Retell's API returns a rate limit error, the orchestrator implements exponential backoff and re-queues the call. If a webhook callback never arrives, a timeout mechanism triggers a manual review flag after 30 minutes.
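As a concrete example, here is a minimal sketch of the backoff wrapper the orchestrator puts around external POST calls (simplified; the real version also re-queues the prospect and records the failure in pipeline state):
import random
import time
import requests

def post_with_backoff(url: str, max_retries: int = 5, base_delay: float = 2.0, **kwargs) -> requests.Response:
    """POST with exponential backoff on rate limits and transient server errors."""
    for attempt in range(max_retries):
        response = requests.post(url, **kwargs)
        if response.status_code not in (429, 500, 502, 503):
            response.raise_for_status()  # non-retryable errors surface immediately
            return response
        # Exponential backoff with jitter: ~2s, 4s, 8s, ...
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError(f"Gave up on {url} after {max_retries} attempts")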
Conditional routing is where the orchestration adds the most value. Not every prospect follows the same path through the pipeline. A prospect who scores 90 on qualification and has a phone number on file goes straight to voice outreach. A prospect who scores 75 but has no phone number gets routed to an email outreach path instead. A prospect whose research profile reveals they already work with a competing consultancy gets flagged for a different approach entirely — one focused on displacement positioning rather than greenfield opportunity messaging. These routing decisions live in the orchestration layer, not in any individual agent.
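A sketch of that routing decision (the flag and field names here are illustrative):
def choose_outreach_path(profile: dict) -> str:
    """Conditional routing at the orchestration layer."""
    if profile.get("works_with_competitor"):  # flag set by the Research Agent
        return "displacement_positioning"  # entirely different messaging track
    has_phone = bool(profile.get("primary_contact", {}).get("phone"))
    if profile["qualification_score"] > 70:
        return "voice_outreach" if has_phone else "email_outreach"
    return "nurture_queue"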
Concurrency management was a practical concern I did not anticipate. When processing a batch of 50 prospects, the Research Agent can run in parallel across multiple prospects (it is making independent API calls for each), but the Voice Agent has to be serialized — you cannot make 50 simultaneous outbound calls. The orchestrator manages a call queue with configurable concurrency limits and time-of-day windowing (calls only go out during business hours in the prospect's timezone). It also respects rate limits on all external APIs, batching requests to stay within quotas.
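A sketch of the call-queue gating, assuming each profile carries an IANA timezone name; the semaphore bound and the calling window are configurable in the real pipeline:
import asyncio
from datetime import datetime
from zoneinfo import ZoneInfo

CALL_SLOT = asyncio.Semaphore(1)  # outbound calls are strictly serialized

def within_calling_window(tz_name: str) -> bool:
    """True only during weekday business hours in the prospect's local time."""
    local_now = datetime.now(ZoneInfo(tz_name))
    return local_now.weekday() < 5 and 9 <= local_now.hour < 17

async def dispatch_call(prospect: dict, talking_points: list[str], agent_id: str) -> None:
    """Queue one voice call, respecting concurrency and time-of-day limits."""
    while not within_calling_window(prospect["timezone"]):
        await asyncio.sleep(600)  # re-check every ten minutes
    async with CALL_SLOT:
        # initiate_voice_outreach is the synchronous helper shown earlier
        await asyncio.to_thread(initiate_voice_outreach, prospect, talking_points, agent_id)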
The orchestration layer is not just a task sequencer. It is the decision-making brain that handles conditional routing, error recovery, concurrency management, and state tracking. Invest as much design effort in orchestration logic as you do in the individual agents. A chain of excellent agents with poor orchestration will produce mediocre results.
Salesforce CRM Integration: Closing the Loop
Every piece of data generated by this pipeline — from the initial research profile to the voice call transcript — needs to land in Salesforce. This is not just about record-keeping. It is about creating a complete, auditable trail that makes the eventual human touchpoint (the discovery call with me) as productive as possible. When I sit down for a discovery call, I want to open the Lead or Opportunity record and see everything: the research profile, the qualification score, the talking points that resonated on the voice call, the prospect's stated pain points, and any specific questions they asked.
We built the CRM integration using Salesforce's REST API, with a custom set of fields and a related object structure designed to capture pipeline data without cluttering the standard Lead layout. The core Lead record gets standard field updates: company information, contact details, lead source (tagged as "AI Pipeline"), and a lead score mapped from the qualification score. A custom AI_Pipeline_Data__c related object stores the detailed research profile, qualification rationale, and talking points. A second custom object, AI_Call_Log__c, captures the voice call data: duration, disposition, transcript summary, sentiment score, and whether a meeting was scheduled.
The integration also handles the critical handoff moment: when a voice call results in a scheduled discovery meeting, the orchestrator creates a Salesforce Event linked to the Lead, sends me a notification with the complete prospect dossier, and updates the Lead status to "Discovery Scheduled." By the time I open my calendar for the meeting, I have a full briefing document generated entirely by the pipeline — the prospect's business context, their likely Salesforce challenges, what was discussed on the AI call, and recommended discussion topics for the discovery session.
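A condensed sketch of that sync step. The standard-object calls (Event, Lead) follow Salesforce's REST conventions; the custom field names are stand-ins for my actual schema, obtaining the OAuth access token is out of scope here, and error handling is omitted:
import requests

def sync_call_outcome(access_token: str, instance_url: str, lead_id: str, outcome: CallOutcome) -> None:
    """Push a voice call outcome into Salesforce via the REST API."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json"
    }
    base = f"{instance_url}/services/data/v59.0/sobjects"
    # Log the call on the custom call-log object, tagged with its source agent
    requests.post(f"{base}/AI_Call_Log__c", headers=headers, json={
        "Lead__c": lead_id,  # assumed lookup field name
        "Disposition__c": outcome.disposition,
        "Call_Summary__c": outcome.call_summary,
        "Sentiment_Score__c": outcome.sentiment_score,
        "Meeting_Scheduled__c": outcome.meeting_scheduled,
        "Source__c": "voice_agent"
    })
    # On a booked meeting: create the Event and flip the Lead status
    if outcome.meeting_scheduled:
        requests.post(f"{base}/Event", headers=headers, json={
            "WhoId": lead_id,
            "Subject": f"Discovery call: {outcome.prospect_name}",
            "StartDateTime": outcome.meeting_time,
            "DurationInMinutes": 45
        })
        requests.patch(f"{base}/Lead/{lead_id}", headers=headers,
                       json={"Status": "Discovery Scheduled"})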
One design decision I am particularly glad we made: every AI-generated data point in Salesforce is tagged with a Source__c field that identifies which agent produced it and when. This is not just for auditing. It is essential for debugging when something goes wrong. If a prospect tells me during a discovery call that the AI voice agent referenced incorrect information about their company, I can trace that data back to the specific Research Agent run, see what sources it pulled from, and correct both the data and the agent's behavior for future runs.
Results and Lessons Learned
After running this system for three months across approximately 200 prospects, the numbers tell a clear story. Research time per prospect dropped from 2.5 hours to approximately 15 minutes of human review time (the Research Agent does the heavy lifting, I verify and supplement). The qualification stage, which used to be an intuitive gut check, now produces consistent, criteria-based scores that have proven to be better predictors of eventual deal closure than my manual assessments were. And the voice outreach stage — which I was most skeptical about — has been the biggest surprise.
The AI voice agent achieved a 28% conversation rate (meaning 28% of answered calls resulted in a substantive conversation rather than an immediate hang-up or request to be removed from the list). Of those conversations, 34% resulted in a scheduled discovery call. For comparison, my manual cold-calling conversion rate was around 22% for conversations and 38% for meetings-from-conversations. The AI agent is slightly better at getting people to talk (likely because it is unfailingly patient and polished in its delivery) and slightly worse at converting conversations to meetings (likely because it cannot improvise as fluidly as a human when a prospect goes off-script). End to end, that works out to 28% times 34%, or about 9.5%, of answered calls becoming meetings for the AI agent, versus 22% times 38%, or about 8.4%, for my manual calls: roughly equivalent pipeline generation at a fraction of the time investment.
The overall impact on my pre-sales efficiency has been substantial. I went from spending roughly 20 hours of pre-sales effort per signed engagement to approximately 6 hours — and most of those 6 hours are spent on the high-value activities (discovery calls, proposal writing) rather than the research and outreach grind. The pipeline volume has also increased. Because the system can process prospects in parallel, I am evaluating three to four times more prospects per month than I could manually, which has directly increased my pipeline coverage.
The lessons learned fall into a few categories. First, agent specialization matters more than agent sophistication. A Research Agent that is excellent at gathering and structuring data but cannot make a phone call is far more valuable than a general-purpose agent that does both poorly. Second, the orchestration layer is where the real complexity lives. Getting three agents to work together reliably, handling failures gracefully, managing state across a multi-stage pipeline — this is harder than building any individual agent. Third, voice AI is closer to production-ready than most people think. Retell AI's voice quality and conversational fluidity surprised me, and the prospects I have spoken with after their AI call rarely mention that anything felt off about the initial outreach. Fourth, the human-in-the-loop checkpoints are essential. The system works because I review qualification scores before voice outreach happens, I verify research profiles when something looks off, and I personally conduct every discovery call. The AI handles the volume; I handle the judgment.
This system is not replacing the human element of pre-sales. It is eliminating the parts of pre-sales that do not require human judgment — the research, the data gathering, the initial outreach — so that the human effort is concentrated where it actually moves deals forward. For a solo Salesforce architect trying to build a practice while delivering excellent work, that concentration of effort is the difference between a sustainable business and perpetual overwhelm.