Skip to content

Images

Secure images given to or produced by your agentic system.

At the core of computer vision agents is the ability to perceive their environment through images, typically by taking screenshots to assess the current state. This visual perception allows agents to understand interfaces, identify interactive elements, and make decisions based on what they "see."

Additionally, some systems may allow users to submit images, posing additional risks.

Image Risks
Images may be produced by, or provided to, an agentic system, presenting potential security risks. For example, an insecure agent could:

  • Capture personally identifiable information (PII) like names or addresses.

  • View credentials such as passwords, API keys, or access tokens present in passport images or other documents.

  • Get prompt injected or jailbroken from text in an image.

  • Generate images with explicit or harmful content.

Guardrails provide a powerful way to enforce visual security policies, and to limit the agent's perception to only the visual information that is necessary and appropriate for the task at hand.

ocr

def ocr(
    data: str | list[str],
    config: dict | None = None
) -> list[str]
Given an image as input, this parser extracts and returns the text in the image using Tesseract.

Parameters

Name Type Description
data str | list[str] A single base64 encoded image or a list of base64 encoded images.

Returns

Type Description
list[str] A list of extracted pieces of text from data.

Analyzing Text in Images

The ocr function is a so it returns the data found from parsing its content; in this case, any text present in an image will be extracted. The extracted text can then be used for further detection, for example detecting a prompt injection in an image, like the example below.

Example: Image prompt injection detection.

from invariant.detectors import prompt_injection
from invariant.parsers import ocr

raise "Found Prompt Injection in Image" if:
    (msg: Image)
    ocr_results := ocr(msg)
    prompt_injection(ocr_results)

The text extracted from the image can be checked using, for example, detectors.

image

def image(
    content: Content | list[Content]
) -> list[ImageContent]
Given some Content, this extracts all ImageContent. This is useful when messages may contain mixed content.

Parameters

Name Type Description
content Content | List[Content] A single instance of Content or a list of Content, possibly with mixed types.

Returns

Type Description
List[Image] A list of extracted Images from content.

Extracting Images

Some policies may wish to check images and text in specific ways. Using image and text we can create a policy that detects prompt injection attacks in user input, even when we allow users to submit images.

Example: Prompt Injection Detection in Both Images and Text

from invariant.detectors import prompt_injection
from invariant.parsers import ocr

raise "Found Prompt Injection" if:
    (msg: Message)

    # Only check user messages
    msg.role == 'user'

    # Use the image function to get images
    ocr_results := ocr(image(msg))

    # Check both text and images
    prompt_injection(text(msg))
    prompt_injection(ocr_results)

Extract specific content types from mixed-content messages.