LangGraph Agent Security · Part 11 of 13

11. Authentication and Authorization: Who's Allowed to Do What to Your Agent


We’ve spent several posts building out the defensive architecture — input validation, tool security, state protection, multi-agent trust, output guardrails. All of that is about what happens once something is inside the system. Authentication and authorization are about controlling who gets in and what they’re allowed to do once they’re there.

In conventional web applications, this is well-trodden ground. OAuth 2.0, JWTs, RBAC — mature technologies with established patterns, solid libraries, and decades of production experience. Applying them to LangGraph agents is familiar in some ways and genuinely different in others.

The ways it’s different are worth understanding carefully, because they’re where deployment teams tend to underinvest.


Three Ways Agents Complicate Authentication

Multi-party relationships. A LangGraph agent simultaneously interacts with the human user, the operator’s system, external APIs, downstream agents, and potentially third parties reached through communication tools. Each relationship requires its own authentication model. A conventional web app has one primary authentication relationship (user to server). Agents have five or six.

Credentials in dangerous places. Agents hold credentials for every external system they connect to — database connection strings, API keys, OAuth tokens. These credentials must be secured not just at rest but throughout execution, where they’re at perpetual risk of being exposed to the LLM’s context window and potentially exfiltrated. The conventional answer (“store secrets securely”) is necessary but insufficient.

LLM-mediated authorization. In a conventional application, authorization is deterministic — the code checks a permission, allows or denies. In an agent, the LLM interprets what the user wants and decides what actions to take. That interpretation can be manipulated. A gateway authorization check that establishes what a user can do doesn’t prevent a manipulated LLM from doing something the user never intended.

Each of these requires specific architectural responses.


User Authentication: Infrastructure Does It, Not the Agent

The foundational principle: user identity must be established by the infrastructure layer, not by the agent itself. An agent that accepts claimed identities from user input — “I am user ID 12345, please respond accordingly” — is trivially exploitable. Identity must come from a verified token that the infrastructure validates independently of anything the user says.

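Here’s what that looks like with JWT tokens and RS256 signing. RS256 is asymmetric: the verifying side needs only the issuer’s public key, so the agent infrastructure never holds a secret capable of minting tokens, which is the weakness of shared-secret HS256: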

import jwt  # PyJWT
from fastapi import HTTPException  # assumption: a FastAPI/Starlette-style HTTPException

# VerifiedUserIdentity and _get_jwks() (fetches and caches the issuer's JWKS)
# are assumed to be defined elsewhere.


class JWTAuthenticator:
    def __init__(self, jwks_uri, expected_audience, expected_issuer):
        self.jwks_uri = jwks_uri
        self.expected_audience = expected_audience
        self.expected_issuer = expected_issuer

    async def verify_token(self, token: str) -> VerifiedUserIdentity:
        try:
            header = jwt.get_unverified_header(token)
        except jwt.InvalidTokenError:
            raise HTTPException(status_code=401, detail="Malformed token")
        jwks = await self._get_jwks()

        # Find the matching public key by kid
        public_key = next(
            (jwt.algorithms.RSAAlgorithm.from_jwk(k)
             for k in jwks.get('keys', [])
             if k.get('kid') == header.get('kid')),
            None
        )
        if not public_key:
            raise HTTPException(status_code=401, detail="Token signing key not found")

        # Verify signature, expiration, audience, and issuer.
        # Pinning algorithms to RS256 rejects tokens that claim any other alg.
        try:
            payload = jwt.decode(
                token, public_key,
                algorithms=["RS256"],
                audience=self.expected_audience,
                issuer=self.expected_issuer,
                options={"require": ["exp", "iat", "sub", "aud", "iss"]}
            )
        except jwt.InvalidTokenError:
            # Covers bad signatures, expiry, wrong audience/issuer, missing claims
            raise HTTPException(status_code=401, detail="Invalid token")

        return VerifiedUserIdentity(
            user_id=payload['sub'],
            email=payload.get('email', ''),
            roles=payload.get('roles', []),
            permissions=frozenset(payload.get('permissions', [])),
            tenant_id=payload.get('tenant_id', 'default'),
            mfa_verified='mfa' in payload.get('amr', []),
            session_id=payload.get('sid', ''),
        )

The verified identity gets propagated into the agent’s TrustedContext at session initialization — not into the user input layer, not into external content, into the immutable trusted context that we discussed in Section 8. Once it’s there, it cannot be modified by anything the agent processes.
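To make "immutable" concrete, here is a minimal sketch of what such a context could look like. The field names and the `build_trusted_context` helper are assumptions for illustration; the TrustedContext from Part 8 may be shaped differently. It uses a frozen dataclass, which rejects ordinary attribute assignment after construction.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class TrustedContext:
    """Hypothetical shape; the series' real TrustedContext may differ."""
    user_id: str
    tenant_id: str
    roles: tuple[str, ...]
    permissions: frozenset[str]
    mfa_verified: bool
    session_id: str


def build_trusted_context(claims: dict) -> TrustedContext:
    """Build the context from claims that have ALREADY been verified upstream."""
    return TrustedContext(
        user_id=claims["sub"],
        tenant_id=claims.get("tenant_id", "default"),
        roles=tuple(claims.get("roles", [])),          # tuple, not list: no in-place edits
        permissions=frozenset(claims.get("permissions", [])),
        mfa_verified="mfa" in claims.get("amr", []),
        session_id=claims.get("sid", ""),
    )


ctx = build_trusted_context({"sub": "u-123", "roles": ["analyst"], "amr": ["pwd", "mfa"]})

try:
    ctx.user_id = "u-999"   # a later attempt to swap identity is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

A frozen dataclass stops accidental or LLM-driven reassignment through normal code paths. It is not a hard boundary against arbitrary code running in the same process, so the context should still live outside anything a tool can write to.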


Role-Based Access Control at Multiple Levels

RBAC for agents needs to operate at several levels simultaneously:

class Permissions:
    # Task permissions
    INITIATE_RESEARCH  = Permission("agent:task:research")
    INITIATE_REPORT    = Permission("agent:task:report")  # identifier assumed; used by the analyst role below
    INITIATE_ADMIN     = Permission("agent:task:admin")

    # Tool permissions
    USE_WEB_SEARCH     = Permission("agent:tool:web_search")
    USE_DATABASE_READ  = Permission("agent:tool:db_read")
    USE_DATABASE_WRITE = Permission("agent:tool:db_write")
    USE_EMAIL          = Permission("agent:tool:email")
    USE_CODE_EXECUTION = Permission("agent:tool:code_exec")

    # Data permissions
    ACCESS_INTERNAL_DOCS  = Permission("agent:data:internal_docs")  # identifier assumed; used by the viewer role below
    ACCESS_CUSTOMER_PII   = Permission("agent:data:customer_pii")
    ACCESS_FINANCIAL_DATA = Permission("agent:data:financial")
    ACCESS_ALL_USERS      = Permission("agent:data:all_users")

ROLE_PERMISSIONS: dict[str, frozenset[Permission]] = {
    "viewer": frozenset({
        Permissions.INITIATE_RESEARCH,
        Permissions.USE_WEB_SEARCH,
        Permissions.ACCESS_INTERNAL_DOCS,
    }),
    "analyst": frozenset({
        Permissions.INITIATE_RESEARCH,
        Permissions.INITIATE_REPORT,
        Permissions.USE_WEB_SEARCH,
        Permissions.USE_DATABASE_READ,
        Permissions.ACCESS_CUSTOMER_PII,
        Permissions.ACCESS_FINANCIAL_DATA,
    }),
    "admin": frozenset({
        # ... full permissions
    }),
}
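A role map is only useful once something resolves it and refuses on a miss. The sketch below shows one way to do that, using plain strings in place of the Permission objects so it runs on its own; `effective_permissions` and `check_permission` are illustrative names, not part of any library.

```python
ROLE_PERMISSIONS = {
    "viewer": frozenset({"agent:task:research", "agent:tool:web_search"}),
    "analyst": frozenset({
        "agent:task:research",
        "agent:tool:web_search",
        "agent:tool:db_read",
    }),
}


def effective_permissions(roles):
    """Union of the permissions granted by each role the user holds."""
    perms = set()
    for role in roles:
        # Unknown roles grant nothing rather than raising or defaulting upward
        perms |= ROLE_PERMISSIONS.get(role, frozenset())
    return frozenset(perms)


def check_permission(roles, required):
    """Raise PermissionError unless one of the roles grants `required`."""
    if required not in effective_permissions(roles):
        raise PermissionError(f"missing permission: {required}")


check_permission(["analyst"], "agent:tool:db_read")   # passes silently
```

The important property is the default: a role the map doesn't know about contributes an empty set, so a typo in a token's roles claim can only reduce access.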

Data scope filtering is particularly important and often missed: even after checking that a user has the ACCESS_CUSTOMER_PII permission, you need to ensure they can only see their own tenant’s customers, not another tenant’s:

def filter_data_by_permission(self, data, user, data_type):
    if data_type == "customer_records":
        if Permissions.ACCESS_ALL_USERS not in user.permissions:
            # Tenant-scoped: only see your own tenant's records
            return [r for r in data if r.get('tenant_id') == user.tenant_id]

    if data_type == "financial_data":
        if Permissions.ACCESS_FINANCIAL_DATA not in user.permissions:
            # Strip financial fields entirely
            return [
                {k: v for k, v in r.items()
                 if k not in ('revenue', 'cost', 'margin', 'balance')}
                for r in data
            ]
    return data

This prevents the “over-fetching” attack: a manipulated agent retrieves more data than the user should see, and that data ends up in the context window where it can be exfiltrated.
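Filtering after the fetch still means the rows were retrieved. A complementary approach is to push the tenant scope into the query itself, so out-of-scope rows never leave the database. This is a sketch under assumptions: the `customers` table, its columns, and `fetch_customers` are hypothetical, and sqlite3 stands in for whatever database the agent actually uses.

```python
import sqlite3


def fetch_customers(conn, tenant_id, can_access_all=False):
    """Return customer rows, scoped to tenant_id unless explicitly unrestricted."""
    if can_access_all:
        cur = conn.execute(
            "SELECT id, tenant_id, name FROM customers ORDER BY id"
        )
    else:
        # tenant_id comes from the verified identity, never from LLM output,
        # and is bound as a parameter rather than formatted into the SQL
        cur = conn.execute(
            "SELECT id, tenant_id, name FROM customers "
            "WHERE tenant_id = ? ORDER BY id",
            (tenant_id,),
        )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, tenant_id TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO customers (tenant_id, name) VALUES (?, ?)",
    [("acme", "Ada"), ("acme", "Grace"), ("globex", "Linus")],
)

acme_rows = fetch_customers(conn, "acme")
all_rows = fetch_customers(conn, "acme", can_access_all=True)
```

Both layers are worth having: the query scope keeps data from being fetched, and the post-fetch filter catches anything a tool author forgot to scope.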


The Critical Boundary: Authorization at Action Execution Time

Here’s the control that I think is most commonly missing in agent deployments.

A gateway authorization check is necessary — it establishes that this user is permitted to use this agent for this task type. But it’s not sufficient. The LLM interprets the user’s request and decides what actions to take, and that interpretation can be manipulated. An authorization check at the gateway doesn’t prevent a manipulated LLM from deciding to take an action the user was never authorized for.

The solution: enforce authorization independently at the point where actions are actually taken, regardless of what the LLM decided.

class LLMAuthorizationBoundary:
    def pre_execution_check(self, proposed_action, action_args,
                             user, session_context) -> None:
        """
        Verify authorization for a proposed action BEFORE it executes,
        regardless of what the LLM decided.
        """
        action_permission_map = {
            "query_database_read":  Permissions.USE_DATABASE_READ,
            "query_database_write": Permissions.USE_DATABASE_WRITE,
            "send_email":           Permissions.USE_EMAIL,
            "execute_code":         Permissions.USE_CODE_EXECUTION,
            "access_financial_data": Permissions.ACCESS_FINANCIAL_DATA,
        }

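        action_type = self._classify_action(proposed_action, action_args)
        required_permission = action_permission_map.get(action_type)

        # Fail closed: an action with no mapped permission is denied,
        # not silently allowed through
        if required_permission is None:
            raise PermissionError(f"No permission mapping for action: {action_type!r}")

        # This may raise PermissionError; that's the intent
        self.enforcer.check_permission(user, required_permission)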

        # MFA check for sensitive operations
        sensitive_op = self._classify_sensitive_operation(proposed_action, action_args)
        if sensitive_op:
            self.enforcer.check_mfa_required(user, sensitive_op)

        # Data scope check: does this action only touch data the user can see?
        self._check_data_scope(proposed_action, action_args, user)

This check runs between “the LLM decided to do something” and “that thing actually happens.” It doesn’t care about the LLM’s reasoning. It looks at the proposed action and asks: does this user have the permission required to take this action? If the answer is no, the action is blocked regardless of how the LLM reached that decision.

Requiring MFA for sensitive operations — database writes, email sending, code execution, financial data access — adds a layer that’s particularly resistant to manipulation. The LLM can’t bypass an MFA requirement by being clever about how it phrases a request.
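One way to make this boundary hard to forget is to attach it to the tool itself, so there is no code path that reaches the tool body without passing the check. The decorator below is a sketch of that idea; `guarded`, the `User` dataclass, and the stub `send_email` are hypothetical names, and the permission strings mirror the ones defined earlier.

```python
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class User:
    user_id: str
    permissions: frozenset
    mfa_verified: bool = False


def guarded(required_permission, requires_mfa=False):
    """Wrap a tool so authorization runs before the tool body, every time."""
    def decorator(tool_fn):
        @wraps(tool_fn)
        def wrapper(user, *args, **kwargs):
            # `user` is the verified identity, supplied by infrastructure
            if required_permission not in user.permissions:
                raise PermissionError(f"missing permission: {required_permission}")
            if requires_mfa and not user.mfa_verified:
                raise PermissionError("MFA required for this operation")
            return tool_fn(user, *args, **kwargs)
        return wrapper
    return decorator


@guarded("agent:tool:email", requires_mfa=True)
def send_email(user, to, body):
    # Stub: a real tool would hand off to the mail infrastructure here
    return f"queued email to {to}"


alice = User("alice", frozenset({"agent:tool:email"}), mfa_verified=True)
result = send_email(alice, "ops@example.com", "weekly report")
```

The LLM chooses the tool and its arguments, but it never supplies `user`. That argument is bound by the execution layer from the trusted context, which is what keeps the check independent of the model's reasoning.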


System Prompt Integrity: Verifying the Operator

System prompts are the operator’s primary configuration channel. If one can be modified in transit or replaced by an attacker, the agent’s entire security posture is undermined from the start.

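The solution is to sign system prompts at build/deploy time and verify the signature at initialization. The example uses an HMAC, which means the runtime holds the same key that produced the signature, so that key belongs in the secrets vault with the other credentials: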

class SystemPromptIntegrityVerifier:
    def sign_system_prompt(self, system_prompt, operator_id,
                            agent_version, valid_from, valid_until) -> str:
        """Called at build/deploy time."""
        metadata = {
            "operator_id": operator_id,
            "agent_version": agent_version,
            "valid_from": valid_from.isoformat(),
            "valid_until": valid_until.isoformat(),
            "prompt_hash": hashlib.sha256(system_prompt.encode()).hexdigest(),
        }
        metadata_json = json.dumps(metadata, sort_keys=True)
        signature = hmac.new(
            self.signing_key,
            f"{metadata_json}:{system_prompt}".encode(),
            hashlib.sha256
        ).hexdigest()
        return json.dumps({"metadata": metadata, "prompt": system_prompt,
                          "signature": signature})

    def verify_and_extract(self, signed_bundle_str, expected_operator_id,
                            expected_agent_version) -> str:
        """Called at agent initialization."""
        bundle = json.loads(signed_bundle_str)
        metadata, system_prompt, signature = (
            bundle['metadata'], bundle['prompt'], bundle['signature']
        )

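        # Verify operator identity, version, temporal validity, hash, and signature
        # Any failure raises ValueError and prevents the agent from starting
        if metadata.get('operator_id') != expected_operator_id:
            raise ValueError("System prompt operator mismatch")
        if metadata.get('agent_version') != expected_agent_version:
            raise ValueError("System prompt agent version mismatch")

        # Timestamps are assumed to be timezone-aware ISO 8601 strings
        now = datetime.now(timezone.utc)
        if now < datetime.fromisoformat(metadata['valid_from']):
            raise ValueError("System prompt is not yet valid")
        if now > datetime.fromisoformat(metadata['valid_until']):
            raise ValueError("System prompt has expired")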

        expected_hash = hashlib.sha256(system_prompt.encode()).hexdigest()
        if metadata.get('prompt_hash') != expected_hash:
            raise ValueError("System prompt hash mismatch — possible tampering")

        # HMAC signature verification
        expected_sig = hmac.new(
            self.signing_key,
            f"{json.dumps(metadata, sort_keys=True)}:{system_prompt}".encode(),
            hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(expected_sig, signature):
            raise ValueError("System prompt signature verification failed")

        return system_prompt

If the signature fails, the agent initialization fails. That’s the correct behavior — a tampered system prompt is as dangerous as a compromised agent.


Credential Management: Never in the Context Window

The overarching principle for managing the credentials agents hold: credentials must never be visible to the LLM. Once a credential enters the context window, it’s at risk. The safeguards come at the tool infrastructure level, injecting authenticated clients directly rather than passing credentials through the LLM.

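import httpx
import psycopg2


class CredentialInjector:
    """
    Injects credentials from secrets manager directly into tool
    implementations, bypassing the LLM context window entirely.
    Tools receive ready-to-use authenticated clients, not credentials.
    """
    def __init__(self, secrets_manager):
        # Any vault client exposing get_secret(path, field=...)
        self.secrets = secrets_manager
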
    def get_database_client(self, db_name: str):
        # Connection string is fetched from vault and used directly
        # It never appears in agent state or LLM context
        connection_string = self.secrets.get_secret(
            f"agent/database/{db_name}", field="connection_string"
        )
        return psycopg2.connect(connection_string)

    def get_http_client_with_auth(self, service_name: str) -> httpx.Client:
        api_key = self.secrets.get_secret(
            f"agent/api/{service_name}", field="api_key"
        )
        # Key injected at transport level — LLM only sees the client
        return httpx.Client(
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=10.0,
        )

The LLM sees the tool: “Here’s a database client you can query.” It doesn’t see the credential that backs the client. There’s nothing to exfiltrate.
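On the tool side, the same idea can be expressed with a closure: the authenticated client is bound when the tool is built, and the function the LLM can call exposes only business parameters. The sketch below uses an in-memory sqlite3 connection as a stand-in for an injected client; `make_lookup_tool` and the `orders` table are hypothetical.

```python
import inspect
import sqlite3


def make_lookup_tool(conn):
    """Bind an already-authenticated connection into a tool via closure."""
    def lookup_order_status(order_id: int) -> str:
        row = conn.execute(
            "SELECT status FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        return row[0] if row else "not found"
    return lookup_order_status


# Stand-in for a client handed out by the credential injector
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'shipped')")

tool = make_lookup_tool(conn)

# What a tool schema generated from the signature would expose to the LLM
visible_params = list(inspect.signature(tool).parameters)
```

Because the connection is not a parameter, there is no argument the model could fill with a credential, and nothing credential-shaped for it to echo back.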

Credential Rotation

Credentials that never rotate accumulate risk. The rotation policy I’ve settled on:

Credential Type        Maximum Age
LLM API key            30 days
Other API keys         30 days
Database password      90 days
SMTP credentials       90 days
OAuth client secret    180 days
JWT signing key        365 days

Rotation needs to be zero-downtime: generate the new credential, update the vault, invalidate the cache, verify the new credential works, then revoke the old one. Rotation failures should trigger security alerts — they’re a meaningful operational risk.
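The ordering of those steps is what makes rotation safe, so it helps to see it spelled out. The sketch below follows the same sequence against an in-memory stand-in; `InMemoryVault`, `rotate`, and the `verify` and `alert` callbacks are all hypothetical, and for most real credentials the "generate" step also involves the target system rather than just a random string.

```python
import secrets


class InMemoryVault:
    """Hypothetical stand-in for a real secrets manager."""

    def __init__(self):
        self.current: dict[str, str] = {}
        self.revoked: set[str] = set()


def rotate(vault, cache, name, verify, alert) -> bool:
    """Rotate one credential; return True on success, False after rollback."""
    old = vault.current.get(name)
    new = secrets.token_urlsafe(32)   # 1. generate the new credential
    vault.current[name] = new         # 2. update the vault
    cache.pop(name, None)             # 3. invalidate any cached copy

    if not verify(new):               # 4. verify the new credential works
        # Roll back. The old credential was never revoked, so it still works.
        if old is None:
            del vault.current[name]
        else:
            vault.current[name] = old
        cache.pop(name, None)
        alert(f"credential rotation failed: {name}")
        return False

    if old is not None:
        vault.revoked.add(old)        # 5. only now revoke the old credential
    return True
```

Revoking last is the point: at every moment before step 5 there is at least one working credential, which is what keeps the rotation zero-downtime and the failure path recoverable.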

LLM Provider API Key Specifically

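The LLM provider API key deserves particular attention. Its compromise allows an attacker to either impersonate the agent or run up unbounded costs. Per-user and per-session token budgets cap what any single account can spend through the agent (a key stolen outright bypasses them, so provider-side spend limits still matter):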

def configure_usage_limits(self, user_id, user_role) -> dict:
    budgets = {
        "viewer":   {"max_tokens_per_session": 50_000,  "max_sessions_per_hour": 10},
        "analyst":  {"max_tokens_per_session": 200_000, "max_sessions_per_hour": 50},
        "operator": {"max_tokens_per_session": 500_000, "max_sessions_per_hour": 100},
        "admin":    {"max_tokens_per_session": 1_000_000, "max_sessions_per_hour": 200},
    }
    return budgets.get(user_role, budgets["viewer"])
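Limits like these only matter if something refuses the call when they are hit. Here is a minimal sketch of the per-session half, assuming the limit comes from a table like the one above; `SessionBudget` and `TokenBudgetExceeded` are illustrative names, and a real deployment would keep the counter in shared storage rather than in process memory.

```python
class TokenBudgetExceeded(RuntimeError):
    """Raised when a session would exceed its token allowance."""


class SessionBudget:
    def __init__(self, max_tokens_per_session: int):
        self.max_tokens = max_tokens_per_session
        self.used = 0

    def charge(self, tokens: int) -> int:
        """Record usage and return tokens remaining; refuse if over the cap."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.max_tokens:
            # Refuse BEFORE the LLM call is made, so the cost is never incurred
            raise TokenBudgetExceeded(
                f"session budget of {self.max_tokens} tokens exceeded"
            )
        self.used += tokens
        return self.max_tokens - self.used


budget = SessionBudget(max_tokens_per_session=50_000)   # the "viewer" limit
remaining = budget.charge(30_000)
```

Charging an estimate before the call and reconciling with actual usage afterwards is a common refinement, since output length isn't known up front.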

OAuth for User-Delegated Access

When agents access user resources on their behalf — calendars, email, cloud storage — OAuth 2.0 delegated authorization applies. A few non-negotiable security requirements:

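Use PKCE (Proof Key for Code Exchange) for all authorization code flows. PKCE prevents authorization code interception attacks: the authorization request carries only a hash of a random code verifier, and the verifier itself is sent only in the back-channel token request, so an intercepted code can’t be redeemed without it.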

Validate the state token on the callback to prevent CSRF. The state token must be generated server-side, stored server-side, and verified server-side on callback. It necessarily round-trips through the browser, so the callback handler compares the returned value against the stored one rather than trusting it.

Store tokens in encrypted vault storage, not in agent state. OAuth access tokens are credentials. They belong in the credential management infrastructure, not in the agent’s state object where they’d appear in every checkpoint.

Refresh transparently when access tokens expire, but expose only the refresh result to the tool implementation — not the refresh token itself.
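The PKCE and state requirements above are small enough to sketch with the standard library. The challenge derivation follows RFC 7636 (S256); everything else here is an assumption for illustration, including the `_pending` dict (a stand-in for a real server-side session store with expiry) and the function names.

```python
import base64
import hashlib
import hmac
import secrets

# Hypothetical server-side store keyed by session ID
_pending: dict[str, dict[str, str]] = {}


def code_challenge_s256(verifier: str) -> str:
    """BASE64URL(SHA256(verifier)) without padding, per RFC 7636."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def begin_authorization(session_id: str) -> dict[str, str]:
    """Create state and PKCE values; return only what goes in the redirect."""
    state = secrets.token_urlsafe(32)
    verifier = secrets.token_urlsafe(32)   # 43 chars, within the 43-128 range
    _pending[session_id] = {"state": state, "verifier": verifier}
    return {
        "state": state,
        "code_challenge": code_challenge_s256(verifier),
        "code_challenge_method": "S256",
    }


def complete_authorization(session_id: str, returned_state: str) -> str:
    """Check the callback's state; return the verifier for the token request."""
    pending = _pending.pop(session_id, None)   # single use: replay finds nothing
    if pending is None or not hmac.compare_digest(
        pending["state"].encode(), returned_state.encode()
    ):
        raise PermissionError("OAuth state mismatch")
    return pending["verifier"]
```

Note what never appears in the redirect: the verifier. Only its hash travels through the browser, and the stored state is popped on first use so a replayed callback fails.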


The Quick Self-Check

Before going live with any agent deployment, I run through these questions:

  • Can a user claim a different identity by sending text to the agent? (Answer should be: no, identity comes from verified tokens)
  • Is authorization enforced at the point of action, or only at the gateway?
  • Are credentials in the secrets vault, or in environment variables or config files?
  • Do tool implementations receive authenticated clients, or credential values?
  • Is there an MFA requirement for sensitive operations?
  • How often do credentials rotate, and is rotation tested?
  • Are OAuth tokens stored in vault, or in agent state?

The places where the answers aren’t what they should be are where the next security investment goes.


Authentication and Authorization Checklist

User authentication:

  • All endpoints require authenticated requests — no anonymous access
  • JWT tokens validated with signature verification (RS256), not just parsing
  • Token expiration, audience, and issuer claims verified
  • Verified identity propagated into TrustedContext at session start
  • MFA required for high-sensitivity operations

Authorization:

  • RBAC defined with explicit permission sets per role
  • Authorization enforced at action execution time, not only at the gateway
  • Data scope filtering prevents over-fetching
  • Cross-tenant access prevented at authorization layer
  • LLM-inferred authorization decisions never honored without independent verification

Operator authentication:

  • System prompts signed and integrity-verified at initialization
  • Expired or tampered system prompts cause initialization failure

Credential management:

  • All credentials in secrets vault, not in code or config files
  • Credentials injected into tool implementations — never into LLM context
  • Rotation schedules defined and enforced for every credential type
  • LLM provider API keys have per-user and per-session token budgets
  • OAuth tokens stored in encrypted vault, not in agent state

This is Part 11 of an ongoing series on LangGraph agent security. Previous posts: Part 1: Introduction · Part 2: Architecture Primer · Part 3: Attack Surface Analysis · Part 4: Core Threat Categories · Part 5: Threat Modeling · Part 6: Input Validation · Part 7: Tool Security · Part 8: State and Memory Security · Part 9: Multi-Agent Trust Boundaries · Part 10: Output Guardrails. Next: Part 12: Observability and Monitoring.