fix(anthropic): fix with_raw_response wrapper consistency and re-enable beta API instrumentation #3297
Conversation
…le beta API instrumentation

This fixes inconsistent response data handling that was causing issues with with_raw_response wrapper methods. The original PR 3250 was partially reverted due to response object corruption, but the fix was incomplete.

Key changes:
- Updated ashared_metrics_attributes and shared_metrics_attributes to properly handle with_raw_response objects by calling parse() when needed
- Fixed _aset_token_usage and _set_token_usage to use consistent safe attribute access with getattr() instead of dict-style access
- Re-enabled beta API instrumentation methods that were temporarily disabled
- Enabled all bedrock with_raw_response tests that were previously skipped
- Ensured both dictionary responses (streaming) and object responses work consistently

All 67 tests now pass, including the previously failing bedrock with_raw_response tests.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Walkthrough

Adds beta Messages endpoints (Anthropic and Bedrock) to sync/async instrumentation; refactors response, model, and token-usage extraction to handle coroutines, with_raw_response wrappers, and varied response shapes using getattr/parse with added logging; enables previously skipped async Bedrock tests (still gated by fixtures).
Sequence Diagram(s)

sequenceDiagram
participant Caller
participant Instrumentation
participant SDK as Anthropic SDK
participant Parser as with_raw_response.parse()
Caller->>Instrumentation: invoke Messages.create/stream(...)
Instrumentation->>SDK: call underlying SDK method
alt SDK returns coroutine
Instrumentation->>SDK: await coroutine
Note right of Instrumentation: debug log on await failure
end
alt SDK/Wrapper provides parse()
Instrumentation->>Parser: parse()
Parser-->>Instrumentation: parsed response
Note right of Instrumentation: debug log on parse failure
end
Instrumentation->>Instrumentation: extract usage/model via getattr or fallback compute
Instrumentation-->>Caller: span attributes + original response
sequenceDiagram
participant Caller
participant Utils
participant Resp
Caller->>Utils: shared_metrics_attributes(response)
alt async and response is coroutine
Utils->>Resp: await coroutine (log on failure)
else sync and response is coroutine
Utils->>Utils: warn and skip processing
end
alt response has parse()
Utils->>Resp: parse()
Resp-->>Utils: parsed object
end
Utils->>Utils: model = getattr(..., "model", None) or resp.get("model")
Utils-->>Caller: attributes dict (model, tokens, stop_reason)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Caution
Changes requested ❌
Reviewed everything up to 955cee8 in 1 minute and 14 seconds.

- Reviewed 421 lines of code in 3 files
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments; view those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:85
   - Draft comment: Beta API instrumentation methods are now re-enabled. Ensure that documentation/changelog is updated to reflect the support for these beta methods.
   - Reason this comment was not posted: confidence changes required (0%) <= threshold (50%)
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py:151
   - Draft comment: The response parsing logic for with_raw_response objects is repeated in ashared_metrics_attributes and shared_metrics_attributes. Consider centralizing this pattern into a helper function to reduce duplication.
   - Reason this comment was not posted: marked as duplicate.
3. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:28
   - Draft comment: Tests previously marked as skipped are now enabled. Ensure these tests remain stable and do not become flaky in CI.
   - Reason this comment was not posted: comment did not seem useful (usefulness confidence 0% <= threshold 50%). The comment asks the PR author to ensure that the tests remain stable, which is a form of asking them to double-check something; this violates the rule against asking the author to ensure or verify things.
Workflow ID: wflow_RjiWhbde1CanBdfp
from opentelemetry.instrumentation.anthropic.utils import _aextract_response_data
import inspect

# If we get a coroutine, await it
Both _aset_token_usage and _set_token_usage duplicate the logic for handling with_raw_response responses (checking for a coroutine and parsing via 'parse'). Consider extracting this common logic into a helper to improve maintainability.
Actionable comments posted: 3
🧹 Nitpick comments (6)
packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (3)

Lines 28-81: Unskipping this test looks good; add an assertion for stop_reason.

Great to see this test enabled and guarded by the AsyncAnthropicBedrock fixture. Since instrumentation now captures stop_reason, assert it to improve coverage.

Apply this diff to assert stop_reason:

@@
     assert (
         anthropic_span.attributes.get(f"{SpanAttributes.LLM_COMPLETIONS}.0.role")
         == "assistant"
     )
+    # Validate stop reason is present (common values: "end_turn", "max_tokens", "stop_sequence")
+    assert anthropic_span.attributes.get(SpanAttributes.LLM_RESPONSE_STOP_REASON)
Lines 83-131: Unskipping the regular create test looks good; add a stop_reason assertion for parity.

Mirror the stop_reason assertion here to keep parity with the raw-response test and to catch potential regressions in non-raw paths.

Apply this diff:

@@
     assert (
         anthropic_span.attributes.get(f"{SpanAttributes.LLM_COMPLETIONS}.0.role")
         == "assistant"
     )
+    assert anthropic_span.attributes.get(SpanAttributes.LLM_RESPONSE_STOP_REASON)
Lines 132-187: Beta with_raw_response test re-enabled; add a stop_reason assertion.

Enabling this path is valuable. Add a stop_reason check to validate the new attribute consistently across beta endpoints too.

Apply this diff:

@@
     assert (
         anthropic_span.attributes.get(f"{SpanAttributes.LLM_COMPLETIONS}.0.role")
         == "assistant"
     )
+    assert anthropic_span.attributes.get(SpanAttributes.LLM_RESPONSE_STOP_REASON)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (3)

Lines 210-242: Completion token fallback should also support dict responses.

If a dict response slips through (e.g., certain wrappers or future SDK changes), completion tokens will be stuck at 0. Add a dict-aware fallback, mirroring the choices logic below.

Apply this diff:

-        if hasattr(anthropic, "count_tokens"):
-            completion_attr = getattr(response, "completion", None)
-            content_attr = getattr(response, "content", None)
+        if hasattr(anthropic, "count_tokens"):
+            if isinstance(response, dict):
+                completion_attr = response.get("completion")
+                content_attr = response.get("content")
+            else:
+                completion_attr = getattr(response, "completion", None)
+                content_attr = getattr(response, "content", None)
             if completion_attr:
                 completion_tokens = await anthropic.count_tokens(completion_attr)
-            elif content_attr and len(content_attr) > 0:
-                completion_tokens = await anthropic.count_tokens(
-                    content_attr[0].text
-                )
+            elif isinstance(content_attr, list) and len(content_attr) > 0:
+                first_item = content_attr[0]
+                text = first_item.get("text") if isinstance(first_item, dict) else getattr(first_item, "text", "")
+                if text:
+                    completion_tokens = await anthropic.count_tokens(text)
Lines 344-356: Mirror the dict-aware completion token fallback in the sync path.

Keep sync/async logic symmetric to reduce surprises.

Apply this diff:

-        if hasattr(anthropic, "count_tokens"):
-            completion_attr = getattr(response, "completion", None)
-            content_attr = getattr(response, "content", None)
+        if hasattr(anthropic, "count_tokens"):
+            if isinstance(response, dict):
+                completion_attr = response.get("completion")
+                content_attr = response.get("content")
+            else:
+                completion_attr = getattr(response, "completion", None)
+                content_attr = getattr(response, "content", None)
             if completion_attr:
-                completion_tokens = anthropic.count_tokens(completion_attr)
-            elif content_attr and len(content_attr) > 0:
-                completion_tokens = anthropic.count_tokens(
-                    content_attr[0].text
-                )
+                completion_tokens = anthropic.count_tokens(completion_attr)
+            elif isinstance(content_attr, list) and len(content_attr) > 0:
+                first_item = content_attr[0]
+                text = first_item.get("text") if isinstance(first_item, dict) else getattr(first_item, "text", "")
+                if text:
+                    completion_tokens = anthropic.count_tokens(text)
Lines 146-149: Minor: centralize logging instead of per-branch imports.

You already import logging at module scope. Consider defining a module-level logger and using it to reduce repetition.

Add once near the top of this file:

logger = logging.getLogger(__name__)

Then replace the local imports/usages in these blocks with logger.debug(...).

Also applies to: 160-163, 193-200
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)

- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (8 hunks)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1 hunks)
- packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (3 hunks)
🧰 Additional context used

📓 Path-based instructions (1): **/*.py
📄 CodeRabbit Inference Engine (CLAUDE.md) for **/*.py:
- Store API keys only in environment variables/secure vaults; never hardcode secrets in code
- Use Flake8 for code linting and adhere to its rules

Files:
- packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py
🧬 Code Graph Analysis (2)

packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (3)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/config.py: Config (6-13)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py: dont_throw (123-151)
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/utils.py: dont_throw (17-39)

packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
- packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py: SpanAttributes (64-257)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Lint
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (6)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (3)

Lines 141-150: Async coroutine handling in ashared_metrics_attributes: LGTM.

Awaiting coroutine responses with a defensive try/except is correct and avoids propagating failures into metrics computation.

Lines 171-174: Good: LLM_RESPONSE_MODEL now derived via consistent safe access.

Aligns with the PR objective to avoid response object corruption and with_raw_response inconsistencies.

Also applies to: 207-211
Lines 139-167: Keep _extract_response_data and _aextract_response_data; they're actively used.

The helpers are referenced in span_utils.py, so they aren't dead code:
- imports at lines 11, 173, 309, 335
- calls at lines 175-176, 243-247, 311-314, 335-337

Likely an incorrect or invalid review comment.
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (3)

Lines 84-109: Re-enabling Beta API (regular and Bedrock) in WRAPPED_METHODS: LGTM.

This restores coverage for beta messages create/stream and aligns tests with instrumented paths. Exceptions during wrapping are handled gracefully.

Lines 185-206: Token usage extraction via getattr and parse(): LGTM.

This avoids dict-only assumptions and fixes the corruption issues tied to earlier extraction paths. The fallback to request-based counting is appropriate.

Also applies to: 321-333

Lines 260-273: Nice touch: choices counting and stop_reason attribute.

Counting choices based on content/completion and recording stop_reason aligns with semconv and improves downstream analytics.

Also applies to: 372-387
    # Beta API async methods (regular Anthropic SDK)
    {
        "package": "anthropic.resources.beta.messages.messages",
        "object": "AsyncMessages",
        "method": "create",
        "span_name": "anthropic.chat",
    },
    {
        "package": "anthropic.resources.beta.messages.messages",
        "object": "AsyncMessages",
        "method": "stream",
        "span_name": "anthropic.chat",
    },
    # Beta API async methods (Bedrock SDK)
    {
        "package": "anthropic.lib.bedrock._beta_messages",
        "object": "AsyncMessages",
        "method": "create",
        "span_name": "anthropic.chat",
    },
    {
        "package": "anthropic.lib.bedrock._beta_messages",
        "object": "AsyncMessages",
        "method": "stream",
        "span_name": "anthropic.chat",
    },
]
💡 Verification agent
🧩 Analysis chain
Async Beta wrappers added: verify stream wrapping strategy for AsyncMessages.stream.
For non-beta Messages, AsyncMessages.stream is wrapped with the sync wrapper because it's an async context manager (used with “async with” and not awaited). Beta AsyncMessages.stream likely follows the same pattern. If so, wrapping it via WRAPPED_AMETHODS (async wrapper) might be incorrect and the sync-wrapper approach should be used instead.
Consider this change if verification confirms AsyncMessages.stream is an async context manager for beta as well:
@@ WRAPPED_METHODS @@
{
"package": "anthropic.resources.beta.messages.messages",
"object": "Messages",
"method": "stream",
"span_name": "anthropic.chat",
},
+ # Beta AsyncMessages.stream is an async context manager; use the sync wrapper
+ {
+ "package": "anthropic.resources.beta.messages.messages",
+ "object": "AsyncMessages",
+ "method": "stream",
+ "span_name": "anthropic.chat",
+ },
@@
{
"package": "anthropic.lib.bedrock._beta_messages",
"object": "Messages",
"method": "stream",
"span_name": "anthropic.chat",
},
+ {
+ "package": "anthropic.lib.bedrock._beta_messages",
+ "object": "AsyncMessages",
+ "method": "stream",
+ "span_name": "anthropic.chat",
+ },
@@ WRAPPED_AMETHODS @@
- {
- "package": "anthropic.resources.beta.messages.messages",
- "object": "AsyncMessages",
- "method": "stream",
- "span_name": "anthropic.chat",
- },
@@
- {
- "package": "anthropic.lib.bedrock._beta_messages",
- "object": "AsyncMessages",
- "method": "stream",
- "span_name": "anthropic.chat",
- },
To confirm via docs:
🌐 Web query:
In the Anthropic Python SDK, is beta AsyncMessages.stream used as an async context manager (async with) for both regular and Bedrock clients, returning AsyncMessageStreamManager?
💡 Result:
Short answer: Yes.
Details:
- The async streaming helper is used as an async context manager — e.g. async with client.messages.stream(...) — for the async Anthropic client, and the stream object exposes .text_stream, .get_final_message(), .close(), etc. (mmacy.github.io, docs.anthropic.com)
- The Bedrock variant exposes the same streaming interface (the Bedrock client/classes provide the same streaming helpers and can be used the same way). Examples and docs for the Anthropic Bedrock clients show equivalent streaming usage. (github.com, anthropic-bedrock.com)
- The SDK’s API docs show the messages.stream helper returns a MessageStreamManager (sync) / the async equivalent (AsyncMessageStreamManager) which is the context-manager type that yields the stream object you iterate over. Wrappers and examples in the ecosystem also refer to AsyncMessageStreamManager/AsyncMessageStream. (github.com, docs.smith.langchain.com)
If you want, I can paste a minimal async-with example for your exact SDK version (or check which installed version you have and show the correct import / beta vs non-beta call).
Citations:
- 1: https://mmacy.github.io/anthropic-sdk-python/helpers.html?utm_source=chatgpt.com
- 2: https://docs.anthropic.com/claude/reference/messages-streaming?utm_source=chatgpt.com
- 3: https://github.com/anthropics/anthropic-sdk-python/issues/866?utm_source=chatgpt.com
- 4: https://www.anthropic-bedrock.com/python/streaming?utm_source=chatgpt.com
- 5: https://github.com/anthropics/anthropic-sdk-python/blob/main/api.md?utm_source=chatgpt.com
- 6: https://docs.smith.langchain.com/reference/python/_modules/langsmith/wrappers/_anthropic?utm_source=chatgpt.com
Correct AsyncMessages.stream instrumentation

The beta AsyncMessages.stream helper in both the regular and Bedrock SDKs is an async context manager (it returns an AsyncMessageStreamManager used with async with), so it should be wrapped via the sync-wrapper list (WRAPPED_METHODS), not the async-wrapper list (WRAPPED_AMETHODS).
Locations to update:
- In WRAPPED_METHODS, add entries for AsyncMessages.stream under both beta packages.
- In WRAPPED_AMETHODS, remove the existing AsyncMessages.stream entries under both beta packages.
Suggested diff:
@@ WRAPPED_METHODS @@
{
"package": "anthropic.resources.beta.messages.messages",
"object": "Messages",
"method": "stream",
"span_name": "anthropic.chat",
},
+ # Beta AsyncMessages.stream is an async context manager; use the sync wrapper
+ {
+ "package": "anthropic.resources.beta.messages.messages",
+ "object": "AsyncMessages",
+ "method": "stream",
+ "span_name": "anthropic.chat",
+ },
@@ WRAPPED_METHODS @@
{
"package": "anthropic.lib.bedrock._beta_messages",
"object": "Messages",
"method": "stream",
"span_name": "anthropic.chat",
},
+ {
+ "package": "anthropic.lib.bedrock._beta_messages",
+ "object": "AsyncMessages",
+ "method": "stream",
+ "span_name": "anthropic.chat",
+ },
@@ WRAPPED_AMETHODS @@
- {
- "package": "anthropic.resources.beta.messages.messages",
- "object": "AsyncMessages",
- "method": "stream",
- "span_name": "anthropic.chat",
- },
@@ WRAPPED_AMETHODS @@
- {
- "package": "anthropic.lib.bedrock._beta_messages",
- "object": "AsyncMessages",
- "method": "stream",
- "span_name": "anthropic.chat",
- },
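To illustrate why the sync wrapper fits here: calling a stream() helper returns the async context-manager object synchronously, so there is no coroutine for an async wrapper to await. A minimal sketch with stand-in classes (not the real SDK types):

```python
import asyncio
import inspect


class AsyncMessageStreamManager:
    """Stand-in for the SDK's async stream manager; note it is not a coroutine."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc):
        return False


def stream():
    # Like beta AsyncMessages.stream: returns the manager object directly,
    # for use as `async with client.messages.stream(...) as s:`
    return AsyncMessageStreamManager()


# The return value is not awaitable, so a sync wrapper is the right choice:
print(inspect.iscoroutine(stream()))    # → False
print(hasattr(stream(), "__aenter__"))  # → True


async def main():
    async with stream() as s:
        return type(s).__name__


print(asyncio.run(main()))  # → AsyncMessageStreamManager
```

An async wrapper that awaited the return value would fail on the non-awaitable manager, which is why the diff above moves these entries to WRAPPED_METHODS.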
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# === WRAPPED_METHODS ===
WRAPPED_METHODS = [
    # Beta API sync methods (regular Anthropic SDK)
    {
        "package": "anthropic.resources.beta.messages.messages",
        "object": "Messages",
        "method": "stream",
        "span_name": "anthropic.chat",
    },
    # Beta AsyncMessages.stream is an async context manager; use the sync wrapper
    {
        "package": "anthropic.resources.beta.messages.messages",
        "object": "AsyncMessages",
        "method": "stream",
        "span_name": "anthropic.chat",
    },
    # Beta API sync methods (Bedrock SDK)
    {
        "package": "anthropic.lib.bedrock._beta_messages",
        "object": "Messages",
        "method": "stream",
        "span_name": "anthropic.chat",
    },
    {
        "package": "anthropic.lib.bedrock._beta_messages",
        "object": "AsyncMessages",
        "method": "stream",
        "span_name": "anthropic.chat",
    },
    # ... other entries ...
]

# === WRAPPED_AMETHODS ===
WRAPPED_AMETHODS = [
    # Beta API async methods (regular Anthropic SDK)
    {
        "package": "anthropic.resources.beta.messages.messages",
        "object": "AsyncMessages",
        "method": "create",
        "span_name": "anthropic.chat",
    },
    # (AsyncMessages.stream removed)
    # Beta API async methods (Bedrock SDK)
    {
        "package": "anthropic.lib.bedrock._beta_messages",
        "object": "AsyncMessages",
        "method": "create",
        "span_name": "anthropic.chat",
    },
    # (AsyncMessages.stream removed)
    # ... other entries ...
]
    if isinstance(response, dict):
        model = response.get("model")
    else:
        # Handle with_raw_response wrapped responses first
        if response and hasattr(response, "parse") and callable(response.parse):
            try:
                response = response.parse()
            except Exception as e:
                import logging
                logger = logging.getLogger(__name__)
                logger.debug(f"Failed to parse with_raw_response: {e}")
                response = None

        # Safely get model attribute without extracting the whole object
        model = getattr(response, "model", None) if response else None
🛠️ Refactor suggestion
Model extraction may miss dicts returned by parse(); normalize before branching.
If response.parse() returns a dict, the current code flows to getattr(dict, "model", None) and loses the model. Normalize to a parsed object first, then branch on dict vs attr.
Apply this diff:
- # If it's already a dict (e.g., from streaming), use it directly
- if isinstance(response, dict):
- model = response.get("model")
- else:
- # Handle with_raw_response wrapped responses first
- if response and hasattr(response, "parse") and callable(response.parse):
- try:
- response = response.parse()
- except Exception as e:
- import logging
- logger = logging.getLogger(__name__)
- logger.debug(f"Failed to parse with_raw_response: {e}")
- response = None
-
- # Safely get model attribute without extracting the whole object
- model = getattr(response, "model", None) if response else None
+ # Normalize to parsed object (if applicable), then extract model from dict or attr
+ resp_obj = response
+ if resp_obj and hasattr(resp_obj, "parse") and callable(resp_obj.parse):
+ try:
+ resp_obj = resp_obj.parse()
+ except Exception as e:
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug(f"Failed to parse with_raw_response: {e}")
+ resp_obj = None
+
+ if isinstance(resp_obj, dict):
+ model = resp_obj.get("model")
+ else:
+ # Safely get model attribute without extracting the whole object
+ model = getattr(resp_obj, "model", None) if resp_obj else None
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    # Normalize to parsed object (if applicable), then extract model from dict or attr
    resp_obj = response
    if resp_obj and hasattr(resp_obj, "parse") and callable(resp_obj.parse):
        try:
            resp_obj = resp_obj.parse()
        except Exception as e:
            import logging
            logger = logging.getLogger(__name__)
            logger.debug(f"Failed to parse with_raw_response: {e}")
            resp_obj = None

    if isinstance(resp_obj, dict):
        model = resp_obj.get("model")
    else:
        # Safely get model attribute without extracting the whole object
        model = getattr(resp_obj, "model", None) if resp_obj else None
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py
around lines 152 to 167, the code calls response.parse() but then immediately
uses getattr on the original/parsed value which can be a dict, losing model
extraction; update the flow to first attempt to parse (catching/logging
exceptions as currently done) and assign the parsed result back to response,
then branch: if isinstance(response, dict) use response.get("model"), else use
getattr(response, "model", None). Ensure the existing exception logging remains
and only falls back to None if parse fails.
    if isinstance(response, dict):
        model = response.get("model")
    else:
        # Handle with_raw_response wrapped responses first
        if response and hasattr(response, "parse") and callable(response.parse):
            try:
                response = response.parse()
            except Exception as e:
                import logging
                logger = logging.getLogger(__name__)
                logger.debug(f"Failed to parse with_raw_response: {e}")
                response = None

        # Safely get model attribute without extracting the whole object
        model = getattr(response, "model", None) if response else None
🛠️ Refactor suggestion
Same dict-after-parse issue in shared_metrics_attributes; align with async fix.
Mirror the normalization before model extraction to avoid losing the model when parse() returns a dict.
Apply this diff:
- # If it's already a dict (e.g., from streaming), use it directly
- if isinstance(response, dict):
- model = response.get("model")
- else:
- # Handle with_raw_response wrapped responses first
- if response and hasattr(response, "parse") and callable(response.parse):
- try:
- response = response.parse()
- except Exception as e:
- import logging
- logger = logging.getLogger(__name__)
- logger.debug(f"Failed to parse with_raw_response: {e}")
- response = None
-
- # Safely get model attribute without extracting the whole object
- model = getattr(response, "model", None) if response else None
+ # Normalize to parsed object (if applicable), then extract model from dict or attr
+ resp_obj = response
+ if resp_obj and hasattr(resp_obj, "parse") and callable(resp_obj.parse):
+ try:
+ resp_obj = resp_obj.parse()
+ except Exception as e:
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug(f"Failed to parse with_raw_response: {e}")
+ resp_obj = None
+
+ if isinstance(resp_obj, dict):
+ model = resp_obj.get("model")
+ else:
+ # Safely get model attribute without extracting the whole object
+ model = getattr(resp_obj, "model", None) if resp_obj else None
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py
around lines 189 to 204, the code extracts model after attempting to parse
with_raw_response but doesn’t normalize the parsed result, so if parse() returns
a dict the model is lost; change the logic so that when response has a callable
parse() you call response.parse(), assign its return value back to response, and
then normalize by checking if response is a dict to extract model via
response.get("model") else use getattr(response, "model", None) — keep the
try/except logging behavior for parse() failures and ensure response is only set
to None on exception, mirroring the async fix.
…pic client initialization

The Bedrock instrumentation was creating an Anthropic client at module import time, which could fail with "Client.__init__() got an unexpected keyword argument 'proxies'" in certain environments.

Changes:
- Changed anthropic_client from global initialization to lazy initialization
- Added proper error handling when Anthropic client creation fails
- Removed try-catch from init_bedrock_instrumentor to allow proper error debugging
- Token counting now gracefully degrades to 0 if client creation fails

This fixes the sample app crash and makes the Bedrock instrumentation more robust.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Important
Looks good to me! 👍
Reviewed 9d81e36 in 1 minute and 26 seconds.

- Reviewed 79 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below.
1. packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py:15
- Draft comment: Good use of lazy initialization for the Anthropic client, but be aware that using a global mutable client may lead to race conditions in multi-threaded contexts.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 20% vs. threshold = 50%). While the comment correctly identifies a potential thread-safety issue with the global mutable state, several factors make me hesitant: 1) the client is only used for token counting, which is not critical functionality; 2) the code already handles failures gracefully by returning 0; 3) even in a race condition, the worst case is multiple client initializations or failed counts; 4) making this thread-safe would add complexity for minimal benefit; 5) the comment doesn't provide clear actionable guidance on how to fix it. The comment does identify a real technical concern, and thread-safety issues can be subtle and cause hard-to-debug problems in production. But in this specific case the risk and impact are very low since this is just for token counting, and failures are already handled gracefully; the complexity cost of fixing it likely outweighs the benefit. The comment should be deleted: while technically correct, it raises concerns about a low-risk issue where the added complexity of fixing it would likely outweigh the benefits.
2. packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py:888
- Draft comment: Removing the try/except block in init_bedrock_instrumentor reduces error resilience compared to similar instrumentors. Consider restoring exception handling for consistency and to prevent unhandled exceptions.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 30% vs. threshold = 50%). While most other instrumentor functions do have try/except blocks, the bedrock instrumentor still handles exceptions through the exception_logger parameter. The removed try/except was likely redundant since exceptions are already being caught and logged, and the function is quite simple with minimal points of failure. I could be wrong about the exception handling: perhaps there are specific exceptions that need to be caught at the function level rather than passed to the instrumentor, and the consistency argument also has merit. But while consistency is valuable, duplicating error handling isn't necessarily better; the current implementation still handles errors appropriately through the instrumentor's exception_logger. The comment raises a valid point about consistency, but the current error handling through exception_logger is sufficient and the removed try/except block was likely redundant.
Workflow ID: wflow_qwkWZxjogBFot8gy
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py (2)
15-15: Lazy client init is good; consider a module-level logger and (optional) lock for thread safety

Setting anthropic_client = None and lazily initializing is a solid improvement. Two follow-ups:
- Use a module-level logger instead of repeatedly creating loggers inside functions.
- If token counting may be invoked concurrently, guard client initialization with a simple Lock to avoid duplicate inits.

Add these at module scope:

```python
# at top-level (outside functions)
import logging
import threading

logger = logging.getLogger(__name__)
_anthropic_client_lock = threading.Lock()
```

Then change the lazy init inside _count_anthropic_tokens:

```python
# inside _count_anthropic_tokens
if anthropic_client is None:
    with _anthropic_client_lock:
        if anthropic_client is None:
            try:
                anthropic_client = anthropic.Anthropic()
            except Exception as e:
                logger.debug(f"Failed to initialize Anthropic client for token counting: {e}")
                return 0
```
275-299: Avoid zeroing the entire token count due to a single bad entry; filter None/empty and continue per-message

Currently, any exception while counting tokens for one message returns 0 for the whole batch, which can hide useful partial counts. Also, None values may slip in (e.g., content parts without "text"), triggering exceptions. Filter falsy entries and continue on per-message errors to preserve partial counts.

Proposed patch:

```diff
 def _count_anthropic_tokens(messages: list[str]):
     global anthropic_client
-
+    import logging
+    logger = logging.getLogger(__name__)
     # Lazy initialization of the Anthropic client
     if anthropic_client is None:
         try:
             anthropic_client = anthropic.Anthropic()
         except Exception as e:
-            import logging
-            logger = logging.getLogger(__name__)
             logger.debug(f"Failed to initialize Anthropic client for token counting: {e}")
             # Return 0 if we can't create the client
             return 0
-
-    count = 0
-    try:
-        for message in messages:
-            count += anthropic_client.count_tokens(text=message)
-    except Exception as e:
-        import logging
-        logger = logging.getLogger(__name__)
-        logger.debug(f"Failed to count tokens with Anthropic client: {e}")
-        return 0
-
-    return count
+    total = 0
+    for msg in messages or []:
+        if not msg:
+            continue  # skip None/empty
+        try:
+            total += anthropic_client.count_tokens(text=msg)
+        except Exception as e:
+            # Continue counting other messages; log at debug level
+            logger.debug(f"Failed to count tokens for a message with Anthropic client: {e}")
+            continue
+    return total
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py (2 hunks)
- packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py (1 hunk)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py (📄 CodeRabbit Inference Engine, CLAUDE.md):
- Store API keys only in environment variables/secure vaults; never hardcode secrets in code
- Use Flake8 for code linting and adhere to its rules

Files:
- packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py
- packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py
🧬 Code Graph Analysis (2)
packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py (1)
- packages/opentelemetry-instrumentation-anthropic/tests/conftest.py (1): anthropic_client (73-74)

packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py (3)
- packages/traceloop-sdk/traceloop/sdk/utils/package_check.py (1): is_package_installed (14-15)
- packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/__init__.py (1): BedrockInstrumentor (557-657)
- packages/traceloop-sdk/traceloop/sdk/telemetry.py (2): Telemetry (12-94), log_exception (75-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Lint
- GitHub Check: Build Packages (3.11)
```diff
 def init_bedrock_instrumentor(should_enrich_metrics: bool):
-    try:
-        if is_package_installed("boto3"):
-            from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
-
-            instrumentor = BedrockInstrumentor(
-                exception_logger=lambda e: Telemetry().log_exception(e),
-                enrich_token_usage=should_enrich_metrics,
-            )
-            if not instrumentor.is_instrumented_by_opentelemetry:
-                instrumentor.instrument()
-            return True
-    except Exception as e:
-        logging.error(f"Error initializing Bedrock instrumentor: {e}")
-        Telemetry().log_exception(e)
+    if is_package_installed("boto3"):
+        from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
+
+        instrumentor = BedrockInstrumentor(
+            exception_logger=lambda e: Telemetry().log_exception(e),
+            enrich_token_usage=should_enrich_metrics,
+        )
+        if not instrumentor.is_instrumented_by_opentelemetry:
+            instrumentor.instrument()
+        return True
     return False
```
💡 Verification agent
🧩 Analysis chain
Restore defensive try/except for Bedrock instrumentor init to match other instrumentors and avoid breaking init on import/runtime errors
Every other init_*_instrumentor() in this file swallows exceptions and logs via Telemetry to prevent SDK init from crashing. This function now propagates exceptions (import errors, constructor issues, instrument() failures), which can bring down Traceloop.init() in environments where boto3 is present but Bedrock deps/configs are misaligned. Recommend restoring the same defensive pattern and adding a Telemetry capture for parity.
Proposed patch:

```diff
-def init_bedrock_instrumentor(should_enrich_metrics: bool):
-    if is_package_installed("boto3"):
-        from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
-
-        instrumentor = BedrockInstrumentor(
-            exception_logger=lambda e: Telemetry().log_exception(e),
-            enrich_token_usage=should_enrich_metrics,
-        )
-        if not instrumentor.is_instrumented_by_opentelemetry:
-            instrumentor.instrument()
-        return True
-    return False
+def init_bedrock_instrumentor(should_enrich_metrics: bool):
+    try:
+        if is_package_installed("boto3"):
+            Telemetry().capture("instrumentation:bedrock:init")
+            from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
+
+            instrumentor = BedrockInstrumentor(
+                exception_logger=lambda e: Telemetry().log_exception(e),
+                enrich_token_usage=should_enrich_metrics,
+            )
+            if not instrumentor.is_instrumented_by_opentelemetry:
+                instrumentor.instrument()
+            return True
+    except Exception as e:
+        logging.error(f"Error initializing Bedrock instrumentor: {e}")
+        Telemetry().log_exception(e)
+    return False
```
🏁 Script executed:

```shell
#!/bin/bash
# List all init_*_instrumentor definitions with surrounding context to verify error handling patterns
rg -nP -C5 "def init_.*instrumentor" -n packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py
```

Length of output: 13490
Restore defensive error handling for Bedrock instrumentor init

The init_bedrock_instrumentor function currently skips the try/except wrapper and Telemetry capture used by all other instrumentors, so any import/runtime error will bubble up and crash Traceloop.init(). Please align it with the existing pattern:
• File: packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py
• Lines: ~889–900
Proposed patch:

```diff
 def init_bedrock_instrumentor(should_enrich_metrics: bool):
-    if is_package_installed("boto3"):
-        from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
-
-        instrumentor = BedrockInstrumentor(
-            exception_logger=lambda e: Telemetry().log_exception(e),
-            enrich_token_usage=should_enrich_metrics,
-        )
-        if not instrumentor.is_instrumented_by_opentelemetry:
-            instrumentor.instrument()
-        return True
-    return False
+    try:
+        if is_package_installed("boto3"):
+            Telemetry().capture("instrumentation:bedrock:init")
+            from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
+
+            instrumentor = BedrockInstrumentor(
+                exception_logger=lambda e: Telemetry().log_exception(e),
+                enrich_token_usage=should_enrich_metrics,
+            )
+            if not instrumentor.is_instrumented_by_opentelemetry:
+                instrumentor.instrument()
+            return True
+    except Exception as e:
+        logging.error(f"Error initializing Bedrock instrumentor: {e}")
+        Telemetry().log_exception(e)
+    return False
```
This ensures init failures are logged and swallowed, preserving parity with other instrumentors.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def init_bedrock_instrumentor(should_enrich_metrics: bool):
    try:
        if is_package_installed("boto3"):
            Telemetry().capture("instrumentation:bedrock:init")
            from opentelemetry.instrumentation.bedrock import BedrockInstrumentor

            instrumentor = BedrockInstrumentor(
                exception_logger=lambda e: Telemetry().log_exception(e),
                enrich_token_usage=should_enrich_metrics,
            )
            if not instrumentor.is_instrumented_by_opentelemetry:
                instrumentor.instrument()
            return True
    except Exception as e:
        logging.error(f"Error initializing Bedrock instrumentor: {e}")
        Telemetry().log_exception(e)
    return False
```
🤖 Prompt for AI Agents
In packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py around lines 889–900,
the Bedrock instrumentor init is missing the try/except wrapper used by other
instrumentors so import/runtime errors can crash Traceloop.init; wrap the import
and initialization in a try/except, on exception call
Telemetry().log_exception(e) and return False, otherwise proceed to create the
BedrockInstrumentor with the same exception_logger and enrich_token_usage when
installed, instrument if not already instrumented, and return True.
Fixed whitespace-only blank lines that were causing flake8 W293 errors.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Important
Looks good to me! 👍
Reviewed f083133 in 44 seconds. Click for details.
- Reviewed 31 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py:277
- Draft comment: Removed extraneous whitespace after the 'global anthropic_client' declaration for consistency.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%. None.
2. packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py:285
- Draft comment: Removed unnecessary blank line after handling the exception block to improve readability.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%. None.
3. packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/span_utils.py:295
- Draft comment: Removed surplus whitespace before the final return statement for cleaner formatting.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%. None.
Workflow ID: wflow_gq4nysS7UklNnhFF
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Summary
- Fix inconsistent response data handling in with_raw_response wrapper methods

Background
PR #3250 introduced support for the with_raw_response wrapper but was partially reverted due to response object corruption issues. While some functions were updated to use safe getattr() access, others still used the problematic _extract_response_data approach, creating inconsistency.

Changes
- Updated ashared_metrics_attributes and shared_metrics_attributes to properly handle with_raw_response objects by calling response.parse() when needed
- Fixed _aset_token_usage and _set_token_usage to use consistent safe attribute access with getattr() instead of dict-style access
- Re-enabled the beta API instrumentation methods in WRAPPED_METHODS and WRAPPED_AMETHODS
- Removed @pytest.mark.skip in bedrock with_raw_response tests

Test plan
- All 67 tests pass, including the previously failing bedrock with_raw_response tests

🤖 Generated with Claude Code
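The getattr()-based access pattern listed in the changes can be sketched as follows. This is a hedged illustration, not the PR's actual _set_token_usage code; the helper name and the "usage"/"input_tokens"/"output_tokens" field names are assumptions based on the description above:

```python
def usage_tokens(response):
    # Illustrative sketch: read token usage the same way for dict
    # responses (streaming) and object responses, never assuming
    # dict-style access on an object. Field names are assumptions.
    if isinstance(response, dict):
        usage = response.get("usage")
    else:
        usage = getattr(response, "usage", None)
    if usage is None:
        return 0, 0
    if isinstance(usage, dict):
        return usage.get("input_tokens", 0), usage.get("output_tokens", 0)
    return getattr(usage, "input_tokens", 0), getattr(usage, "output_tokens", 0)
```

The point of the pattern is that both shapes fall through the same code path, so a with_raw_response object that was already parsed into a dict cannot corrupt attribute lookups.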
Important
Fixes with_raw_response handling, re-enables beta API instrumentation, and updates tests for consistency.
- Fixes with_raw_response methods, ensuring safe attribute access with getattr() in _aset_token_usage and _set_token_usage.
- Re-enables beta API instrumentation in __init__.py by uncommenting methods in WRAPPED_METHODS and WRAPPED_AMETHODS.
- Updates ashared_metrics_attributes and shared_metrics_attributes in utils.py to handle with_raw_response objects consistently.
- Adds lazy initialization of anthropic_client in span_utils.py for token counting.
- Re-enables tests in test_bedrock_with_raw_response.py by uncommenting @pytest.mark.skip decorators.

This description was created by for f083133. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores