feat(pydantic-ai): Support ImageUrl content type in span instrumentation#5629
Open
ericapisani wants to merge 5 commits intomasterfrom
Open
feat(pydantic-ai): Support ImageUrl content type in span instrumentation#5629ericapisani wants to merge 5 commits intomasterfrom
ericapisani wants to merge 5 commits intomasterfrom
Conversation
Add handling for the pydantic-ai `ImageUrl` message content type in the pydantic-ai integration. For data URLs containing base64-encoded images, the content is redacted and replaced with a placeholder to avoid sending large binary payloads to Sentry. For regular HTTP URLs, the URL string is preserved as-is. Refactor binary content serialization into shared helper functions `_serialize_binary_content_item` and `_serialize_image_url_item` in `spans/utils.py` to remove duplication between `ai_client.py` and `invoke_agent.py`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
Semver Impact of This PR🟡 Minor (new features) 📋 Changelog PreviewThis is how your changes will appear in the changelog. New Features ✨Pydantic Ai
Other
Bug Fixes 🐛
Documentation 📚
Internal Changes 🔧Openai Agents
Other
🤖 This preview updates automatically when you update the PR. |
Contributor
Codecov Results 📊✅ 13 passed | Total: 13 | Pass Rate: 100% | Execution Time: 7.83s All tests are passing successfully. ❌ Patch coverage is 0.00%. Project has 13966 uncovered lines. Files with missing lines (4)
Generated by Codecov Action |
The regex used to detect and redact base64 data URLs only allowed alphabetic characters in MIME types, causing it to fail for types like `image/svg+xml`, `application/vnd.ms-excel`, or `font/woff2`. When the match failed, the full raw data URL (including base64 content) was passed through to Sentry instead of being redacted with BLOB_DATA_SUBSTITUTE, resulting in unintended data leakage. Expand the MIME type character class to include digits, `.`, `+`, and `-` to match all common MIME types per RFC 2045. Co-Authored-By: Claude <noreply@anthropic.com>
Cover the case where data URLs include optional parameters between the MIME type and base64 encoding, e.g. `data:image/png;name=file.png;base64,...` and `data:text/plain;charset=utf-8;name=hello.txt;base64,...`. These should be matched and redacted by DATA_URL_BASE64_REGEX. Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Unused imports of BinaryContent and ImageUrl in utils
- Removed unused BinaryContent and ImageUrl imports from spans/utils.py as they were never referenced in the file.
Or push these changes by commenting:
@cursor push 85791c9981
Preview (85791c9981)
diff --git a/sentry_sdk/integrations/pydantic_ai/spans/utils.py b/sentry_sdk/integrations/pydantic_ai/spans/utils.py
--- a/sentry_sdk/integrations/pydantic_ai/spans/utils.py
+++ b/sentry_sdk/integrations/pydantic_ai/spans/utils.py
@@ -13,13 +13,7 @@
from typing import Union, Dict, Any, List, Optional
from pydantic_ai.usage import RequestUsage, RunUsage # type: ignore
-try:
- from pydantic_ai.messages import BinaryContent, ImageUrl # type: ignore
-except ImportError:
- BinaryContent = None
- ImageUrl = None
-
def _serialize_image_url_item(item: "Any") -> "Dict[str, Any]":
"""Serialize an ImageUrl content item for span data.This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
Remove unused imports (BinaryContent, ImageUrl, Optional, List) from utils.py and add explicit assertion in test to ensure image content is actually found in messages data rather than silently passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Fixes PY-2129 and #5627
Add handling for the pydantic-ai
ImageUrlmessage content type in span instrumentation.Previously, only
BinaryContentwas handled for non-text message parts. With recent pydantic-ai versions, users can passImageUrlobjects as part of their prompts. Without handling this type,ImageUrlitems would fall through tosafe_serialize, losing structured information about the content.Behavior:
ImageUrlwith a data URL (e.g.data:image/png;base64,...) — the base64 content is redacted and replaced with a placeholder ([Filtered]), similar to howBinaryContentis handled. The MIME type is extracted from the data URL and included.ImageUrlwith a regular HTTP URL — the URL string is preserved as-is, since it does not contain inline binary data.Also refactors the binary content serialization into shared helpers (
_serialize_binary_content_itemand_serialize_image_url_item) inspans/utils.py, removing duplication betweenai_client.pyandinvoke_agent.py.