
feat(pydantic-ai): Support ImageUrl content type in span instrumentation#5629

Open: ericapisani wants to merge 5 commits into master from ep/pydantic-ai-support-base64-images-in-url-46s

Conversation

@ericapisani ericapisani commented Mar 10, 2026

Fixes PY-2129 and #5627

Add handling for the pydantic-ai ImageUrl message content type in span instrumentation.

Previously, only BinaryContent was handled for non-text message parts. With recent pydantic-ai versions, users can pass ImageUrl objects as part of their prompts. Without handling this type, ImageUrl items would fall through to safe_serialize, losing structured information about the content.

Behavior:

  • ImageUrl with a data URL (e.g. data:image/png;base64,...) — the base64 content is redacted and replaced with a placeholder ([Filtered]), similar to how BinaryContent is handled. The MIME type is extracted from the data URL and included.
  • ImageUrl with a regular HTTP URL — the URL string is preserved as-is, since it does not contain inline binary data.

Also refactors the binary content serialization into shared helpers (_serialize_binary_content_item and _serialize_image_url_item) in spans/utils.py, removing duplication between ai_client.py and invoke_agent.py.
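
A minimal sketch of the two behaviors listed above, assuming the URL is available as a plain string. The function name `serialize_image_url`, the dictionary keys, and the string-splitting approach are illustrative assumptions and not the SDK's actual `_serialize_image_url_item`; only the `[Filtered]` placeholder is taken from the description.

```python
from typing import Any, Dict

# Placeholder named in the PR description for redacted base64 content.
PLACEHOLDER = "[Filtered]"


def serialize_image_url(url: str) -> Dict[str, Any]:
    """Hypothetical helper: redact base64 data URLs, keep regular URLs as-is."""
    if url.startswith("data:"):
        # A data URL looks like "data:<mime>[;params];base64,<payload>".
        header, comma, _payload = url.partition(",")
        if comma and header.endswith(";base64"):
            # The MIME type is everything between "data:" and the first ";".
            mime_type = header[len("data:"):].split(";", 1)[0]
            return {
                "type": "image",  # assumed key names, for illustration only
                "mime_type": mime_type,
                "content": PLACEHOLDER,
            }
    # Regular HTTP(S) URLs carry no inline binary data, so keep them unchanged.
    return {"type": "image", "url": url}
```

Splitting on the first comma keeps the payload out of any further processing, so a large base64 body is never copied into the span data in this sketch.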

Add handling for the pydantic-ai `ImageUrl` message content type in the
pydantic-ai integration. For data URLs containing base64-encoded images,
the content is redacted and replaced with a placeholder to avoid sending
large binary payloads to Sentry. For regular HTTP URLs, the URL string is
preserved as-is.

Refactor binary content serialization into shared helper functions
`_serialize_binary_content_item` and `_serialize_image_url_item` in
`spans/utils.py` to remove duplication between `ai_client.py` and
`invoke_agent.py`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

linear-code bot commented Mar 10, 2026

github-actions bot commented Mar 10, 2026

Semver Impact of This PR

🟡 Minor (new features)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


New Features ✨

Pydantic Ai

  • Support ImageUrl content type in span instrumentation by ericapisani in #5629
  • Add tool description to execute_tool spans by ericapisani in #5596

Other

  • (crons) Add owner field to MonitorConfig by julwhitney13 in #5610

Bug Fixes 🐛

  • (celery) Propagate user-set headers by sentrivana in #5581
  • (utils) Avoid double serialization of strings in safe_serialize by ericapisani in #5587

Documentation 📚

  • (openai-agents) Remove inapplicable comment by alexander-alderman-webb in #5495
  • Add AGENTS.md by sentrivana in #5579
  • Add set_attribute example to changelog by sentrivana in #5578

Internal Changes 🔧

Openai Agents

  • Do not fail on new tool fields by alexander-alderman-webb in #5625
  • Stop expecting a specific function name by alexander-alderman-webb in #5623
  • Set streaming header when library uses with_streaming_response() by alexander-alderman-webb in #5583
  • Replace mocks with httpx for streamed responses by alexander-alderman-webb in #5580
  • Replace mocks with httpx in non-MCP tool tests by alexander-alderman-webb in #5602
  • Replace mocks with httpx in MCP tool tests by alexander-alderman-webb in #5605
  • Replace mocks with httpx in handoff tests by alexander-alderman-webb in #5604
  • Replace mocks with httpx in API error test by alexander-alderman-webb in #5601
  • Replace mocks with httpx in non-error single-response tests by alexander-alderman-webb in #5600
  • Remove test for unreachable state by alexander-alderman-webb in #5584
  • Expect namespace tool field for new openai versions by alexander-alderman-webb in #5599

Other

  • (httpx) Resolve type checking failures by alexander-alderman-webb in #5626
  • (pyramid) Support alpha suffixes in version parsing by alexander-alderman-webb in #5618
  • Normalize dots in package names in populate_tox.py by alexander-alderman-webb in #5574
  • Do not run actions on potel-base by sentrivana in #5614

🤖 This preview updates automatically when you update the PR.

github-actions bot commented Mar 10, 2026

Codecov Results 📊

13 passed | Total: 13 | Pass Rate: 100% | Execution Time: 7.83s

All tests are passing successfully.

❌ Patch coverage is 0.00%. Project has 13966 uncovered lines.

Files with missing lines (4)

| File            | Patch % | Lines          |
|-----------------|---------|----------------|
| ai_client.py    | 0.00%   | ⚠️ 147 Missing |
| invoke_agent.py | 0.00%   | ⚠️ 81 Missing  |
| utils.py        | 0.00%   | ⚠️ 27 Missing  |
| consts.py       | 0.00%   | ⚠️ 3 Missing   |

Generated by Codecov Action

@ericapisani ericapisani marked this pull request as ready for review March 10, 2026 15:49
@ericapisani ericapisani requested a review from a team as a code owner March 10, 2026 15:49
The regex used to detect and redact base64 data URLs only allowed
alphabetic characters in MIME types, causing it to fail for types like
`image/svg+xml`, `application/vnd.ms-excel`, or `font/woff2`.

When the match failed, the full raw data URL (including base64 content)
was passed through to Sentry instead of being redacted with
BLOB_DATA_SUBSTITUTE, resulting in unintended data leakage.

Expand the MIME type character class to include digits, `.`, `+`, and
`-` to match all common MIME types per RFC 2045.

Co-Authored-By: Claude <noreply@anthropic.com>
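
The failure mode in the commit message above can be reproduced with a short sketch. Both patterns below are assumptions written for illustration (the SDK's real `DATA_URL_BASE64_REGEX` may differ), `SUBSTITUTE` stands in for `BLOB_DATA_SUBSTITUTE`, and the redacted output shape is hypothetical.

```python
import re

# Stand-in for BLOB_DATA_SUBSTITUTE; the PR description names "[Filtered]".
SUBSTITUTE = "[Filtered]"

# Assumed "before" pattern: letters only on both sides of the slash.
LETTERS_ONLY = re.compile(r"^data:([a-zA-Z]+/[a-zA-Z]+);base64,")

# Assumed "after" pattern: digits, ".", "+", and "-" are also allowed.
EXPANDED = re.compile(r"^data:([a-zA-Z0-9.+\-]+/[a-zA-Z0-9.+\-]+);base64,")


def redact(url: str, pattern: "re.Pattern[str]") -> str:
    """Replace the base64 payload if the pattern matches, else return the URL unchanged."""
    match = pattern.match(url)
    if match is None:
        # No match means the raw data URL, payload included, passes through.
        return url
    return f"data:{match.group(1)};base64,{SUBSTITUTE}"


svg = "data:image/svg+xml;base64,PHN2Zy8+"
print(redact(svg, LETTERS_ONLY))  # unchanged: the "+" breaks the narrow match
print(redact(svg, EXPANDED))      # payload replaced by the placeholder
```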
Cover the case where data URLs include optional parameters between the
MIME type and base64 encoding, e.g. `data:image/png;name=file.png;base64,...`
and `data:text/plain;charset=utf-8;name=hello.txt;base64,...`. These should
be matched and redacted by DATA_URL_BASE64_REGEX.

Co-Authored-By: Claude <noreply@anthropic.com>
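
One way to accept optional parameters between the MIME type and the `base64` marker is sketched below. The pattern and the helper `parse_data_url_header` are hypothetical and only illustrate the cases named in the commit message; they are not the SDK's `DATA_URL_BASE64_REGEX`.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical pattern: MIME type, then zero or more ";key=value" parameters,
# then the ";base64," marker that introduces the payload.
DATA_URL_WITH_PARAMS = re.compile(
    r"^data:(?P<mime>[a-zA-Z0-9.+\-]+/[a-zA-Z0-9.+\-]+)"
    r"(?P<params>(?:;[^;,=]+=[^;,]*)*)"
    r";base64,"
)


def parse_data_url_header(url: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (mime_type, parameters) for a base64 data URL, or None otherwise."""
    match = DATA_URL_WITH_PARAMS.match(url)
    if match is None:
        return None
    # Split ";k=v;k2=v2" into pairs, skipping the empty piece before the first ";".
    params = dict(
        part.split("=", 1) for part in match.group("params").split(";") if part
    )
    return match.group("mime"), params
```

Because the payload itself is never captured, a caller can redact everything after the matched header without inspecting the base64 content.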

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Unused imports of BinaryContent and ImageUrl in utils
    • Removed unused BinaryContent and ImageUrl imports from spans/utils.py as they were never referenced in the file.

Push these changes by commenting:

@cursor push 85791c9981
Preview (85791c9981)
diff --git a/sentry_sdk/integrations/pydantic_ai/spans/utils.py b/sentry_sdk/integrations/pydantic_ai/spans/utils.py
--- a/sentry_sdk/integrations/pydantic_ai/spans/utils.py
+++ b/sentry_sdk/integrations/pydantic_ai/spans/utils.py
@@ -13,13 +13,7 @@
     from typing import Union, Dict, Any, List, Optional
     from pydantic_ai.usage import RequestUsage, RunUsage  # type: ignore
 
-try:
-    from pydantic_ai.messages import BinaryContent, ImageUrl  # type: ignore
-except ImportError:
-    BinaryContent = None
-    ImageUrl = None
 
-
 def _serialize_image_url_item(item: "Any") -> "Dict[str, Any]":
     """Serialize an ImageUrl content item for span data.

Remove unused imports (BinaryContent, ImageUrl, Optional, List) from
utils.py and add explicit assertion in test to ensure image content is
actually found in messages data rather than silently passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
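
The "silently passing" problem mentioned above arises when a test only asserts inside a loop that may never find an image item. A sketch of the stricter pattern follows; the message structure, the `"image"` type value, and both helper names are assumptions for illustration, not the actual test code in this PR.

```python
from typing import Any, Dict, List


def find_image_items(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect every content item marked as an image (assumed message shape)."""
    found = []
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for item in content:
            if isinstance(item, dict) and item.get("type") == "image":
                found.append(item)
    return found


def check_images_redacted(messages: List[Dict[str, Any]]) -> None:
    images = find_image_items(messages)
    # Explicit guard: without it, an empty result would skip the loop below
    # and the check would pass without verifying anything.
    assert images, "expected at least one image item in messages data"
    for item in images:
        assert item.get("content") == "[Filtered]"
```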