Prompt-Injection Risks Specific To ChatGPT Plugins And Connectors
When ChatGPT can read your email, browse the web, or call APIs, attackers can hide instructions inside that content. The risk is real and the defenses are mostly hygiene.
9 min · Reviewed 2026
What prompt injection is in this context
Direct prompt injection is when a user types adversarial instructions into ChatGPT. Indirect prompt injection is when ChatGPT reads content from a tool — a webpage, an email, a calendar invite — and that content contains instructions intended to override the system prompt. The model has no reliable way to tell instructions from data. That is the whole problem.
Where the risk concentrates in ChatGPT
Browser tools — a webpage can include hidden text targeting agents.
Email connectors — an inbound email can contain instructions to forward content.
Document Q&A — a malicious uploaded file can carry an injection payload.
Calendar invites — descriptions are user-controlled and reach the agent.
Custom GPT actions — data returned from your API can contain hostile text from third-party sources.
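Hidden payloads often ride in zero-width or otherwise invisible characters that a human reviewer never sees. A minimal sanitizer sketch in Python (the function name and character set are ours, not any product's API); this is hygiene, not a complete defense, because visible injected text passes straight through:

```python
import unicodedata

# Zero-width and invisible formatting characters commonly used to hide
# injected instructions from human reviewers while staying readable to
# the model. All of these fall in Unicode category "Cf" (format).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

def sanitize(text: str) -> str:
    """Drop zero-width characters and other invisible format controls
    before a document is indexed for Q&A. Visible injected text is NOT
    caught here; this only removes the hidden-character trick."""
    return "".join(
        ch for ch in text
        if ch not in ZERO_WIDTH and unicodedata.category(ch) != "Cf"
    )
```

Run every untrusted upload through a pass like this before it reaches the index; visible adversarial text still needs the approval gates described below.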
Capability surface · Worst-case if injection succeeds · Mitigation
Browser / Operator · Agent visits an attacker site and takes action · Approval gate on every navigation
Email connector · A sensitive email is forwarded to the attacker · No 'send' action without explicit human approval
Document Q&A · Hidden instructions exfiltrate other docs · Strip / sanitize untrusted documents before indexing
Custom GPT action · The action calls an attacker-controlled endpoint · Allowlist domains; never echo arbitrary URLs
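The "allowlist domains" mitigation can be sketched as a strict hostname check. The hostnames below are placeholders, and this is a sketch of the idea rather than a complete URL-validation library:

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts a Custom GPT action may call.
ALLOWED_HOSTS = {"api.example.com", "status.example.com"}

def is_allowed(url: str) -> bool:
    """Return True only if the URL's full hostname is on the allowlist.

    Comparing the whole hostname (not a substring) rejects lookalike
    tricks such as "api.example.com.evil.net", which is a different
    host even though it contains the allowed name.
    """
    host = urlparse(url).hostname or ""
    return host.lower() in ALLOWED_HOSTS
```

An injected instruction asking the agent to fetch an attacker URL then fails the check before any request is made.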
Practical defenses for non-engineers
Treat anything a tool returns (a webpage, an email, a search result) as potentially hostile. Approve reads and sends explicitly.
Never let an agent take an irreversible action from data it pulled in by itself.
Scope connectors to the minimum needed. Revoke scope when the project ends.
Watch for surprise actions — an agent that suddenly wants to email someone is a tell.
Log everything your agent does. The audit trail is your only forensic tool.
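The approval gate mentioned above can be sketched in a few lines. The action names and the shape of the gate are illustrative, not part of any OpenAI API:

```python
# Actions that can have irreversible, externally visible effects.
SENSITIVE_ACTIONS = {"send_email", "create_event", "post_message"}

def execute_tool(action: str, args: dict, approve) -> str:
    """Run a tool call, but hold sensitive actions until `approve`
    (a callable that asks a human) returns True. Read-only actions
    pass through; anything that writes or sends is gated."""
    if action in SENSITIVE_ACTIONS and not approve(action, args):
        return "blocked: human approval denied"
    return f"executed: {action}"
```

The key property is that the gate sits outside the model: no matter what injected text the model has read, the send cannot happen without a human click.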
Applied exercise
List every connector and Custom GPT action your account has live.
For each, write the worst-case outcome of a successful injection.
Disable any whose worst-case is unacceptable.
Set a 60-day reminder to repeat this audit.
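The four steps above amount to keeping a small inventory and acting on it. One way to record the audit (the entries here are invented examples, not recommendations about specific connectors):

```python
# Each live connector gets a worst-case note and a keep/disable verdict.
connectors = [
    {"name": "email", "worst_case": "sensitive mail forwarded out",
     "acceptable": False},
    {"name": "calendar", "worst_case": "invite text reaches the agent",
     "acceptable": True},
]

# Anything whose worst case is unacceptable goes on the disable list.
to_disable = [c["name"] for c in connectors if not c["acceptable"]]
```

Rewriting the list every 60 days forces you to notice connectors whose scope quietly grew or whose project already ended.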
The big idea: every tool you give the model expands the attack surface. Defense is mostly hygiene — minimum scope, explicit approvals, regular audits.
End-of-lesson check
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-openai-prompt-injection-risks-creators
Why is the model fundamentally unable to reliably distinguish instructions from data?
The model lacks sufficient training on security concepts
The model prioritizes user requests over system prompts
The model's context window is too small to track instruction origins
The model processes all text as tokens without inherent markers separating commands from content
A user enables a browser tool that lets ChatGPT visit websites. What is the primary injection risk?
Webpages can cause the model to generate incorrect factual answers
The model might download malware that executes on the user's computer
Webpages can contain hidden text designed to instruct the agent to take unauthorized actions
The browser tool exposes the user's browsing history to the model
What is the worst-case outcome if an attacker successfully injects instructions via an email connector?
The email account becomes locked due to failed login attempts
Sensitive emails are forwarded to the attacker without the user's knowledge
The model deletes all emails in the inbox
The model sends spam to all contacts in the address book
What specific attack was demonstrated by security researchers involving a poisoned document in a connected drive?
The document caused ChatGPT to crash and lose conversation history
The document created a backdoor account in the user's cloud storage
The document instructed the model to search for credentials and exfiltrate them via a markdown image URL
The document triggered an automatic software update that disabled security features
What is the recommended defense against malicious content in document Q&A tools?
Disable document Q&A entirely and use manual search instead
Use only PDF documents, never Word files
Strip or sanitize untrusted documents before indexing them for search
Require the model to confirm each sentence in a document before answering questions
Why are calendar invites considered a prompt injection risk?
The model cannot accurately parse calendar time formats
Calendar invite descriptions are user-controlled and reach the agent with potential instructions
Calendar invites use a proprietary format that confuses language models
Calendar APIs have known vulnerabilities that attackers exploit
What is an approval gate in the context of ChatGPT tool use?
A mandatory pause where the model must explicitly get human permission before executing sensitive actions
A technical barrier that prevents the model from accessing certain data
A setting that automatically approves all requests from verified sources
A password required to enable any connector
What does 'least privilege' mean when applied to ChatGPT connectors?
Limiting the number of queries a connector can process per day
Requiring users to prove they are at least 18 years old
Giving each connector only the minimum permissions needed for its current task
Only granting temporary access to premium features
What is the recommended practice for 'send' or 'write' actions in connected ChatGPT setups?
Allow the model to send automatically but log for review later
Require explicit human approval before any send/write action executes
Never allow send/write actions under any circumstances
Only allow sending to addresses already in the contact list
What is the purpose of conducting a 60-day audit of active connectors and Custom GPT actions?
To reset the model's memory and clear potential injection artifacts
To receive new feature updates from OpenAI
To comply with legal requirements for data retention
To verify that permissions haven't expanded and that active tools remain necessary
The lesson compares treating retrieved content to how a good editor treats a press release. What does this mean in practice?
The content requires legal review before the model can use it
The content should always be verified by a second human editor
The content should be published immediately without modification
The content is useful as input but should never be treated as instructions to follow
If you paste a webpage into a ChatGPT session that has connectors enabled, what risk are you creating?
The webpage will be shared with other ChatGPT users
You import potentially adversarial instructions into a context where the model can act
The webpage will slow down the model's response time
The model will remember the webpage forever
What is the community consensus among security researchers regarding prompt injection in connected ChatGPT setups?
Prompt injection has been completely solved by current AI models
Prompt injection is the dominant practical risk and no model-side defense is sufficient on its own
Prompt injection only affects enterprise accounts, not individual users
Prompt injection risks are negligible compared to traditional hacking
What warning sign indicates a possible prompt injection attack in progress?
The model asks for clarification more frequently
The agent suddenly wants to take an unexpected action like emailing someone it hasn't mentioned before
The model begins responses with more hedged language
The model responds more slowly than usual
What is the purpose of logging everything an agent does?
To train future AI models on user behavior
To improve the model's language capabilities
To charge the user for API usage
To create an audit trail that serves as the primary forensic tool if something goes wrong