Create AI-powered assistants that use tools, learn from feedback, and run on autopilot.
Agents are AI-powered assistants that can use tools to solve open-ended tasks. Unlike workflows that follow predetermined paths, agents make intelligent decisions about which tools to use and when, adapting their approach based on the task at hand.
Think of agents as intelligent assistants that can orchestrate your workflows. You give them a goal, provide them with tools (integrations and workflows), and they figure out how to accomplish the task by deciding which tools to use and when.

Key Characteristics:
Adaptive: Different approaches for different situations
Tool-driven: Use integrations and workflows as needed
Context-aware: Consider your instructions and conversation history
Agents vs. Workflows: Agents are like quarterbacks who read the situation and call the right plays (workflows). Workflows are the plays themselves: reliable, repeatable sequences that execute consistently.
Select integrations and workflows your agent needs.

Start Simple: Begin with 2-3 tools and add more as needed. Too many tools can overwhelm the agent and make behavior less predictable.
If you’re using workflows as tools, use descriptive names like “Get User Activity and Salesforce Status” instead of “User Workflow” so agents understand when to use them.
2. Write Instructions
Create clear, specific instructions:
You're a support operations assistant helping evaluate discount eligibility.

When given a Zendesk ticket number:
1. Retrieve ticket details and display them
2. Ask for confirmation before proceeding
3. Run "Get User Profile" workflow
4. Read discount criteria from Notion page [URL]
5. Evaluate eligibility with clear reasoning

Always respond professionally. If uncertain, ask for clarification.
Your agent is brilliant but new to your business. Be explicit about your preferences and expectations.
3. Test Thoroughly
Use the built-in chat interface to test:
Start Simple: Test basic functionality with straightforward requests
Find Edge Cases: Try unexpected inputs and ambiguous requests
Refine Instructions: When mistakes occur, ask the agent: “What could I add to your instructions to help you handle this correctly next time?”
Document Patterns: Keep notes on successful approaches
Limiting Tool Capabilities (Recommended)
For MCP integrations, you can restrict which specific tools the agent can access. This makes your agent more reliable and prevents unintended actions.

How to limit tools:
Click on any MCP integration you’ve added (Gmail, Salesforce, Slack, etc.)
Toggle off specific tools you don’t want the agent to use
Add this to your system prompt: “If a user asks you to perform an action that’s disabled (like sending an email), explicitly tell them that capability is not available for this agent.”
Agents are equipped with tools to accomplish tasks:
MCP Integrations
Direct connections to external services:
Gmail: Read, search, and send emails
Salesforce: Query records and update data
Notion: Search documentation and databases
Zendesk: Retrieve and manage support tickets
Google Calendar: Check availability and schedule meetings
And many more
Agents use the personal default credentials of whoever is running them (unless team credentials are configured). You’ll be prompted to authenticate if needed.
Gumloop Workflows
Your workflows become powerful tools for agents:
Agents see the workflow name and description
Understand expected inputs and outputs
Call workflows when appropriate
Process results to inform next steps
Example: A workflow that queries BigQuery, pulls Salesforce data, and combines them becomes a single “Get User Profile” tool for your agent.
Custom MCP Servers
Connect your own MCP server for services not available natively:
The Code Sandbox is natively enabled on all agents. Your agent can execute code in a secure, isolated environment:
Run Python code for data analysis, visualizations, and computations
Execute shell commands for file operations and package installation
Read/write files in the sandbox filesystem
Upload/download files between Gumloop storage and the sandbox
The sandbox comes with 80+ pre-installed Python packages including pandas, numpy, matplotlib, scikit-learn, and more. See the Code Sandbox section below for details.
Skill Editing & Creation
This toggle controls whether your agent can create and edit skills on its own during conversations. It’s found in the Tools section of your agent configuration and is enabled by default.
Enabled (default, recommended): Your agent can create new skills, update existing ones, and fix its own mistakes. This is the preferred option because it allows the agent to be iterative and continuously improve its playbooks over time.

Disabled: Your agent can still read and use the skills you’ve already attached, but it cannot create new skills or edit existing ones on the fly. Use this if you want tighter control over your skill library.
Keep this enabled to get the most value from skills. An agent that can refine its own playbooks gets better with every conversation.
The system prompt defines your agent’s personality, behavior, and decision-making:
Define a Role
Good: “You’re an executive assistant for the CFO of a Fortune 500 company.”
Bad: “You help with tasks.”

A specific role sets tone, expertise level, and communication style automatically.
Set Tool Usage Rules
Explain when and how to use specific tools:
When the user provides a ticket number:
1. Retrieve ticket details and ask for confirmation
2. Once confirmed, run the "Get User Profile" workflow
3. Read the discount policy from Notion
4. Assess eligibility and explain your reasoning
Establish Confirmation Rules
Define when the agent should ask before acting:

“Always ask for confirmation before:
Sending emails
Deleting data
Making changes to external systems”
Specify Response Style
Control tone, length, and format:

“Always respond professionally. Keep responses under 100 words. Use bullet points for lists.”
Your agent can update its own system prompt during any conversation. If something goes wrong, you don’t need to leave the chat and manually edit the instructions. Just tell the agent to fix it, right there in the conversation.

Toggle this on in Agent Preferences, right below the system prompt editor. Enabled by default for new agents. Changes take effect immediately and persist across all future conversations.
You tell the agent
“Update your system prompt to never do X again” or “Add a rule that all emails should be under 100 words.” The agent modifies its instructions on the spot.
The agent learns on its own
You correct the agent (“don’t use that tone,” “always check Salesforce first”) and it proactively updates its instructions so the same mistake doesn’t happen again.
How it works under the hood
The agent has access to a file containing its current system prompt in its sandbox. When it modifies that file, changes are automatically detected, saved, and applied to the very next step in the conversation. Every future conversation uses the updated version.

Safety guards: The agent cannot delete its own instructions (empty content is rejected), and there is no version history. If the agent makes an unwanted change, you can manually revert by editing the system prompt.
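The detect-and-apply loop, including the empty-content guard, can be sketched roughly like this (the file name and function are hypothetical illustrations, not Gumloop's actual internal mechanism):

```python
from pathlib import Path

# Hypothetical sketch of the safeguard described above. The file name and
# function are illustrative; Gumloop's real mechanism is internal.
PROMPT_FILE = Path("system_prompt.md")

def apply_prompt_updates(current_prompt: str) -> str:
    """Return the prompt to use for the agent's next step."""
    if not PROMPT_FILE.exists():
        return current_prompt
    candidate = PROMPT_FILE.read_text()
    # Safety guard: empty content is rejected so the agent cannot
    # delete its own instructions.
    if not candidate.strip():
        return current_prompt
    # Any non-empty change takes effect on the very next step.
    return candidate
```

Because there is no version history, the last non-empty write always wins, which is why manual reverts go through the system prompt editor.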
This works hand-in-hand with skills. Skills teach the agent how to do specific tasks; Self-Improve Instructions lets the agent refine its overall personality, tone, and decision-making rules. Together, they create an agent that gets better with every conversation.
Agents can run autonomously without manual interaction. Set up scheduled tasks to run on a recurring schedule or as a one-time task, or create event-based triggers to fire your agent when something happens in an external service.
Scheduled Tasks
Run your agent on a recurring schedule (e.g. every weekday at 9 AM) or as a one-time task (e.g. in 30 minutes). Your agent can even create and manage its own schedules during a conversation.
Event-Based Triggers
Fire your agent when a new email arrives, a Slack message is posted, a database record changes, and more. Supports Gmail, Slack, Teams, Google Sheets, Notion, Airtable, Zendesk, and other integrations.
Your system prompt defines who the agent is. Tools give it the ability to act. But without skills, your agent improvises every time. It might send a decent email, but it won’t follow your outreach sequence, use your templates, or log things the way you want.

Skills fill that gap. They’re reusable knowledge packs that teach your agent how to do specific work your way:
Encode multi-step processes the agent should follow every time (outreach sequences, triage checklists, reporting workflows)
Store templates and domain knowledge too detailed for the system prompt
Load only when relevant, saving tokens compared to stuffing everything into the system prompt
Improve over time as the agent learns from your feedback and even creates new skills on its own
Creating an agent is the 0 to 1. Embedding that agent in a workflow is the 1 to 100.
The Agent node lets you run any of your pre-configured agents directly within your workflows. With Triggers, agents can already run on a schedule or respond to external events on their own. Embedding them in workflows unlocks additional capabilities: webhook triggers, chaining with other nodes, and batch processing.
The section above covers putting an agent inside a workflow, where the workflow triggers and orchestrates the agent. This is the reverse: giving your agent workflows it can call as tools.

When a workflow is attached as a tool, the agent decides when to use it, fills in the inputs, kicks it off, and reads the outputs. The agent is the orchestrator; the workflow just does its job and returns the result.

Tips for building workflows your agent will call:
Use Input and Output Nodes
Critical: Agents identify workflow parameters through Input and Output nodes.
Without clear inputs/outputs, the agent won’t know how to use the workflow properly.
Use Descriptive Names
Workflow names should clearly indicate their purpose:

✅ Good: “Get User Activity and Salesforce Status”, “Enrich Lead from LinkedIn Profile”
❌ Bad: “User Workflow”, “Workflow 1”, “Data Thing”
Add Clear Descriptions
Write descriptions that help agents understand:
What the workflow does
When it should be used
What inputs it expects
What outputs it provides
Example: “Takes a LinkedIn profile URL and returns enriched company data including size, industry, recent funding, and key contacts. Use when researching new leads or companies.”
Keep Workflows Focused
Each workflow should do one thing well:
✅ “Enrich Contact from Email”
✅ “Send Slack Notification”
❌ “Enrich Contact and Send Notification and Update CRM”
Let the agent orchestrate multiple focused workflows rather than building mega-workflows.
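Conceptually, each attached workflow surfaces to the agent as a tool definition assembled from its name, description, and Input/Output nodes. A rough sketch of what that might look like (the field names and schema below are purely illustrative, not Gumloop's actual format):

```python
# Hypothetical tool definition an agent might derive from a focused workflow.
# Field names and structure are illustrative only.
enrich_contact_tool = {
    "name": "enrich_contact_from_email",
    "description": (
        "Takes an email address and returns enriched contact data "
        "including name, company, and role. Use when qualifying new leads."
    ),
    "inputs": {"email": {"type": "string", "required": True}},
    "outputs": {"name": "string", "company": "string", "role": "string"},
}

def can_call(tool: dict, provided: dict) -> bool:
    """Check that the agent supplied every required input before calling."""
    required = [k for k, v in tool["inputs"].items() if v.get("required")]
    return all(k in provided for k in required)
```

This is why Input and Output nodes matter: without them, the `inputs` and `outputs` sections above would be empty and the agent would have to guess.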
The Code Sandbox gives your agent the ability to execute Python code and shell commands in a secure, isolated environment. It’s natively enabled on all agents, so there’s nothing to configure.

What the sandbox can do:
Run Python code for data analysis, visualizations, and computations
Execute shell commands for file operations and package installation
Read and write files in the sandbox filesystem
Upload and download files between Gumloop storage and the sandbox
The sandbox comes pre-loaded with 80+ Python packages including pandas, numpy, matplotlib, scikit-learn, and more. If you need something that isn’t installed, your agent can install it with pip install during the conversation.
The sandbox environment persists across tool calls within the same conversation, allowing your agent to build on previous work and maintain state throughout the interaction.
If you need a package that isn’t pre-installed, your agent can install it using pip install package-name via the shell command tool. However, installed packages only persist for the current conversation.
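As an illustration, an agent asked to summarize support metrics might run a snippet like this in the sandbox (the dataset here is made up; in practice the agent would load a file it downloaded from Gumloop storage):

```python
import statistics

# Illustrative dataset; a real run would read this from a sandbox file.
resolution_hours = [2.5, 4.0, 1.5, 8.0, 3.0, 2.0, 6.5]

summary = {
    "tickets": len(resolution_hours),
    "mean_hours": round(statistics.mean(resolution_hours), 2),
    "median_hours": statistics.median(resolution_hours),
    "stdev_hours": round(statistics.stdev(resolution_hours), 2),
}
print(summary)
```

Because the sandbox persists within a conversation, the agent could follow this up by plotting the same list with matplotlib or writing `summary` to a file for a later step.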
Fine-tune how your agent’s AI model behaves, manages long conversations, and handles failures. Access these settings by clicking the AI Advanced Settings button in your agent configuration.
Defaults are optimized. All settings are pre-configured for the best balance of performance, cost, and reliability. Only adjust these if you have a specific need.
Model Config
Auto Summarization
Fallback
Customize AI model behavior. Available parameters vary by provider (OpenAI, Anthropic, Google). Settings are stored per-provider, so switching models preserves your preferences for each.
OpenAI Parameters

| Parameter | Range | Default | Description |
| --- | --- | --- | --- |
| Reasoning Effort | low, medium, high | medium | Controls computational effort before responding. Higher = more thorough reasoning but slower and more tokens. Only for o-series and GPT-5 models. |
| Max Output Tokens | 1+ | Auto | Upper bound for generated tokens (includes reasoning tokens). If unset, uses model default. |
| Max Tool Calls | 1+ | Auto | Limits total tool calls per response. Useful for cost control. |
| Top P | 0–1 | 1 | Nucleus sampling. Model considers only tokens in the top P probability mass. Only for GPT-4o, GPT-4.1, GPT-5.1, GPT-5.2, o3, o4. |
| Parallel Tool Calls | on/off | on | When on, model can execute multiple tools simultaneously. Disable if tools depend on each other’s results. |
Adjust either Temperature or Top P, not both. Using both produces unpredictable results.
Anthropic (Claude) Parameters

| Parameter | Range | Default | Description |
| --- | --- | --- | --- |
| Extended Thinking | enabled, disabled | enabled | Shows Claude’s reasoning process before the final answer. When enabled, temperature is forced to 1.0. |
| Budget Tokens | 1,024+ | 10,000 | Token budget for thinking. Higher = more thorough reasoning but slower. Only shown when thinking is enabled. Must be less than Max Tokens. |
| Temperature | 0–1 | 1 | Controls output randomness. Note: forced to 1.0 when Extended Thinking is enabled. |
| Max Tokens | 1+ | Auto | Maximum tokens to generate. Defaults: 64,000 (Claude 4.5), 8,192 (older models). |
| Top P | 0–1 | 1 | Nucleus sampling threshold. |
| Top K | 1+ | Auto | Limits sampling to top K most likely tokens. Lower values (10–40) produce more focused outputs by removing long-tail responses. |
| Disable Parallel Tool Use | on/off | off | When on, model outputs at most one tool call per response. Enable for strict sequential execution. |
Google (Gemini) Parameters

| Parameter | Range | Default | Description |
| --- | --- | --- | --- |
| Thinking Level | LOW, HIGH | HIGH | Controls depth of internal reasoning. LOW = faster responses, HIGH = deeper analysis. Only for Gemini 2.5 and 3 models. |
| Temperature | 0–2 | 1 | Controls output randomness. |
| Top P | 0–1 | 0.95 | Nucleus sampling. Note: Google’s default (0.95) is lower than other providers. |
| Top K | 1+ | Auto | Maximum number of tokens considered at each generation step. |
| Max Output Tokens | 1+ | Auto | Maximum tokens to generate. |
Temperature vs Reasoning: Temperature controls randomness in word selection. Reasoning effort/thinking controls how much the model deliberates before responding. For complex analysis, increase reasoning, not temperature.
Manages long conversations by summarizing older messages when context limits are reached.
When a conversation approaches the model’s context limit, the system identifies the oldest messages, generates a structured summary (Goal, Actions Taken, Key Data, Status, Next Steps), and replaces them while keeping recent messages in full detail.
Compaction operates at the part level within messages, not just message level. This enables finer-grained compaction when a single message contains many large parts (e.g., 24 tool call results totaling 150k+ tokens).
| Behavior | Auto Mode (Default) | Override Mode |
| --- | --- | --- |
| When to summarize | System determines automatically | System determines automatically |
| Summary model | Auto-selected (fast, cost-effective) | You choose the model |
| Prune Protection | Optimized default (40,000 tokens) | Configurable |
| Summary Max Tokens | Optimized default (30,000 tokens) | Configurable |
Keep defaults unless you have specific context requirements. Lower Prune Protect = more aggressive summarization. Higher Summary Max = summaries trigger earlier.
Ensures reliability by automatically switching to alternative AI models when your primary model is unavailable.
When errors occur, the system classifies the error, retries based on severity, then falls back to the next model in the chain. Models from the same provider are automatically excluded for true redundancy.
| Error Type | Retries Before Fallback | Rationale |
| --- | --- | --- |
| Rate Limit | 2 | Rate limits often clear quickly |
| Provider 5xx | 1 | Provider unhealthy, fall back faster |
| Network Error | 0 | Immediate fallback (no retries) |
| Timeout | 1 | Server overloaded, try once then fall back |
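The retry-then-fallback policy can be sketched as follows (the error names and call interface are assumptions for illustration, not Gumloop's internals):

```python
# Illustrative sketch of the retry-then-fallback policy described above.
# Error categories and the call/classify interface are assumptions.

RETRIES = {"rate_limit": 2, "provider_5xx": 1, "network": 0, "timeout": 1}

def call_with_fallback(models, call, classify):
    """Try each model in order, retrying per the error-type policy."""
    last_error = None
    for model in models:
        attempts = 0
        while True:
            try:
                return call(model)
            except Exception as exc:
                last_error = exc
                error_type = classify(exc)
                if attempts >= RETRIES.get(error_type, 0):
                    break  # retries exhausted; move to the next model
                attempts += 1
    raise RuntimeError("all models failed") from last_error
```

Note how a network error (0 retries) falls through to the next model immediately, while a rate limit gets two extra attempts before the chain advances.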
Auto Mode (Default): Fallback models are selected based on your primary model’s characteristics:
| Primary Model Type | Fallback Chain |
| --- | --- |
| Expert/Smartest (intelligence rating > 4) | Claude Opus 4.5 → Gemini 3 Pro → GPT-5.2 |
| Fastest (speed rating > 4) | Gemini 3 Flash → Claude Haiku 4.5 → GPT-4.1 |
| Recommended/Balanced | Claude Sonnet 4.5 → Gemini 3 Flash → GPT-5.2 |
Override Mode: Manually select up to 2 fallback models. Drag-and-drop to set priority order. Models from the same provider as your primary are automatically filtered out.
Disabling fallback means your agent will fail if the primary model is unavailable. Keep enabled unless you have specific compliance requirements.
Agents consume credits based on AI model usage, workflow executions, and integration operations.
Credits are charged per AI interaction. Cost depends on message length, model selected, conversation history (each message includes previous context), and number of tools available.
Credit costs are approximate per message. Actual costs vary based on message length, conversation history, tools available, and whether the agent executes workflows. Agents also support Kimi K2.5 and Qwen 3.5 via OpenRouter (exclusive to agents, not available in workflow nodes).
When an agent uses integrations and workflows, it needs credentials to access external services.
Key principle: Agents always use the credentials of the person running the agent, not the agent creator’s credentials. If you run the agent, it uses your credentials. If a teammate runs it, it uses their credentials.
Personal Agents (Recommended)
Personal agents are created in your personal space and always use the personal default credentials of whoever is running the agent.

How it works:
When you chat with your personal agent, it uses your authenticated accounts
When you share the agent via chat link or Slack, each person’s requests use their own personal credentials
No one can access another person’s data through the agent
Each user must authenticate with the required services using their own accounts
Best for: Most use cases, maximum data privacy, individual productivity, testing and development
Always create agents in your personal space unless you specifically need team credentials or access control features.
Team Agents
Team agents are created in teams and have two key differences from personal agents:

1. Access Control: Only members of the team can use the agent. Non-members will receive an “access denied” message.
2. Credential Behavior: If an MCP integration or workflow is set to use “team default” credentials, those team credentials are used instead of personal credentials. Otherwise, the personal default credentials of whoever is running the agent are used.

Best for: Team collaboration requiring shared credentials, controlled access to specific team members

Learn more about teams in the Organizations and Teams documentation.
Begin with 2-3 tools and straightforward instructions. Test thoroughly before adding more capabilities.

Anti-pattern: Creating an agent with 15 tools and a 2000-word system prompt immediately.

Better approach: Start with core functionality, observe how the agent behaves, then incrementally add tools and refine instructions based on real usage.
Treat Agents as Work in Progress
Agents improve over time as you refine instructions based on real interactions:
Review conversation history regularly for patterns
When agents make mistakes, ask: “What could I add to your instructions to prevent this?”
Document edge cases and add explicit handling rules
Celebrate successful patterns and codify them in instructions
Design Workflows for Agent Use
When creating workflows that agents will call:
Always use Input and Output nodes so agents understand parameters
Keep workflows focused on single responsibilities
Use descriptive names that indicate purpose
Write clear descriptions explaining when to use the workflow
Test workflows independently before giving them to agents
Set Clear Boundaries
Define what agents should NOT do to prevent unintended actions:
Never:
- Delete customer data or records
- Send emails without explicit user approval
- Make purchases or financial commitments
- Override manual decisions by team members
- Modify production data without confirmation
Be explicit about destructive actions requiring human oversight.
Monitor and Measure Performance
Track metrics to understand agent effectiveness:
Time saved: Hours saved per week through automation
Success rate: Tasks completed successfully vs requiring intervention
Tool usage: Which workflows/integrations are used most frequently
Credit efficiency: Cost per completed task or interaction
Agent Uses Wrong Tool
Symptoms: Agent calls inappropriate tools or skips necessary steps

Solutions:
Make tool names more descriptive and explicit
Add specific “When to use” guidance in system prompt
Reduce number of similar tools that might confuse the agent
Provide examples: “When user asks X, use tool Y”
Check if workflow descriptions clearly explain their purpose
Agent Doesn't Understand Workflow Inputs/Outputs
Symptoms: Agent fails to call workflows correctly or misinterprets results

Solutions:
Ensure workflows have Input and Output nodes defined
Use clear, descriptive names for input/output fields
Add workflow descriptions explaining parameters
Test workflow independently to verify inputs/outputs work
Simplify complex workflows into smaller, focused ones
Authentication Errors
Symptoms: Agent reports missing credentials or authentication failures

Solutions: