fix: use .get() for optional fields in User.__init__ and Client.request#418
fix: use .get() for optional fields in User.__init__ and Client.request#418PawiX25 wants to merge 1 commit into
Conversation
Twitter/X API does not always return all expected fields in user legacy
data or error response objects, causing KeyError crashes.
Changes:
- twikit/user.py: Replace ~30 direct dict accesses in User.__init__
with .get() and appropriate type defaults (str='', int=0, bool=False,
list=[])
- twikit/client/client.py: Change response_data['errors'][0]['code']
to .get('code') to handle error responses without a code field
Fixes d60#417
Reviewer's GuideMakes User model initialization and client error handling resilient to missing fields in Twitter/X API responses by replacing direct dict indexing with safe .get() calls and appropriate defaults. Class diagram for updated User model initializationclassDiagram
class User {
+Client _client
+str id
+str created_at
+str name
+str screen_name
+str profile_image_url
+str profile_banner_url
+str url
+str location
+str description
+list description_urls
+list urls
+list pinned_tweet_ids
+bool is_blue_verified
+bool verified
+bool possibly_sensitive
+bool can_dm
+bool can_media_tag
+bool want_retweets
+bool default_profile
+bool default_profile_image
+bool has_custom_timelines
+int followers_count
+int fast_followers_count
+int normal_followers_count
+int following_count
+int favourites_count
+int listed_count
+int media_count
+int statuses_count
+bool is_translator
+str translator_type
+list withheld_in_countries
+bool protected
+User(Client client, dict data)
}
class Client {
}
Client <.. User : uses
File-Level Changes
Assessment against linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
📝 WalkthroughWalkthroughThe pull request adds defensive programming measures to prevent Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- Using
data.get('rest_id', '')and defaulting many fields to''/0/Falsemay silently hide malformed responses; consider making these attributesOptional[...]and defaulting toNone(or raising) for fields that are expected to be present in normal operation. - In
Client.request, you guard for'errors' in response_databut still assumeresponse_data['errors'][0]exists; consider handling an emptyerrorsarray or non-list value to avoid a potentialIndexError/TypeError.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Using `data.get('rest_id', '')` and defaulting many fields to `''`/`0`/`False` may silently hide malformed responses; consider making these attributes `Optional[...]` and defaulting to `None` (or raising) for fields that are expected to be present in normal operation.
- In `Client.request`, you guard for `'errors' in response_data` but still assume `response_data['errors'][0]` exists; consider handling an empty `errors` array or non-list value to avoid a potential `IndexError`/`TypeError`.
## Individual Comments
### Comment 1
<location path="twikit/user.py" line_range="102-104" />
<code_context>
- self.withheld_in_countries: list[str] = legacy['withheld_in_countries']
+ self.location: str = legacy.get('location', '')
+ self.description: str = legacy.get('description', '')
+ self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', [])
+ self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls')
+ self.pinned_tweet_ids: list[str] = legacy.get('pinned_tweet_ids_str', [])
+ self.is_blue_verified: bool = data.get('is_blue_verified', False)
</code_context>
<issue_to_address>
**suggestion:** Align `urls` default with `description_urls` to consistently return a list.
`description_urls` always returns a list (defaulting to `[]`), but `urls` may be `None` when `entities`/`url` is missing. This forces consumers to handle both `None` and `list` for similar fields. To keep the API consistent and predictable, consider defaulting `urls` to an empty list as well (e.g. `...get('urls', [])`).
```suggestion
self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', [])
self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls', [])
self.pinned_tweet_ids: list[str] = legacy.get('pinned_tweet_ids_str', [])
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', []) | ||
| self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls') | ||
| self.pinned_tweet_ids: list[str] = legacy.get('pinned_tweet_ids_str', []) |
There was a problem hiding this comment.
suggestion: Align urls default with description_urls to consistently return a list.
description_urls always returns a list (defaulting to []), but urls may be None when entities/url is missing. This forces consumers to handle both None and list for similar fields. To keep the API consistent and predictable, consider defaulting urls to an empty list as well (e.g. ...get('urls', [])).
| self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', []) | |
| self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls') | |
| self.pinned_tweet_ids: list[str] = legacy.get('pinned_tweet_ids_str', []) | |
| self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', []) | |
| self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls', []) | |
| self.pinned_tweet_ids: list[str] = legacy.get('pinned_tweet_ids_str', []) |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@twikit/user.py`:
- Around line 102-103: The code uses chained dict.get calls on legacy which can
be None (e.g., legacy.get('entities', {}) returns None if entities exists but is
None), causing AttributeError; change accesses in the User initializer to
defensively coalesce with or {} (e.g., use (legacy.get('entities') or {}) then
.get('description') or {} and .get('url') or {}) and ensure both
self.description_urls and self.urls default to empty lists ([]) not None to keep
types stable; also consider hardening the earlier legacy extraction in
build_user_data by using (data.get('legacy') or {}) so legacy is never None.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 17b1673c-6f40-49ff-a80c-1b847f1a4d0e
📒 Files selected for processing (2)
twikit/client/client.pytwikit/user.py
| self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', []) | ||
| self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls') |
There was a problem hiding this comment.
Chained .get(..., {}) still breaks when intermediate values are explicitly None.
dict.get(key, default) only returns default when the key is missing; it returns None if the key is present with a None value. Looking at build_user_data in twikit/utils.py (used by follow_user/block_user/mute_user/etc.), legacy['entities'] is set via raw_data.get('entities'), so it can be None. In that case:
legacy.get('entities', {}) # -> None
.get('description', {}) # -> AttributeError: 'NoneType' object has no attribute 'get'
This re-introduces the same class of crash the PR is trying to eliminate. Same concern applies to entities.description and entities.url being None.
Also, there is an inconsistency on Line 103: self.urls defaults to None while the docstring/type annotation says list, and self.description_urls on Line 102 defaults to []. Prefer [] for both to keep the attribute type stable for downstream consumers.
🛡️ Proposed fix
- self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', [])
- self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls')
+ entities = legacy.get('entities') or {}
+ description_entities = entities.get('description') or {}
+ url_entities = entities.get('url') or {}
+ self.description_urls: list = description_entities.get('urls', [])
+ self.urls: list = url_entities.get('urls', [])The same or {} pattern also covers the data.get('legacy', {}) on Line 91 being None if you want to harden that further:
- legacy = data.get('legacy', {})
+ legacy = data.get('legacy') or {}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| self.description_urls: list = legacy.get('entities', {}).get('description', {}).get('urls', []) | |
| self.urls: list = legacy.get('entities', {}).get('url', {}).get('urls') | |
| entities = legacy.get('entities') or {} | |
| description_entities = entities.get('description') or {} | |
| url_entities = entities.get('url') or {} | |
| self.description_urls: list = description_entities.get('urls', []) | |
| self.urls: list = url_entities.get('urls', []) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@twikit/user.py` around lines 102 - 103, The code uses chained dict.get calls
on legacy which can be None (e.g., legacy.get('entities', {}) returns None if
entities exists but is None), causing AttributeError; change accesses in the
User initializer to defensively coalesce with or {} (e.g., use
(legacy.get('entities') or {}) then .get('description') or {} and .get('url') or
{}) and ensure both self.description_urls and self.urls default to empty lists
([]) not None to keep types stable; also consider hardening the earlier legacy
extraction in build_user_data by using (data.get('legacy') or {}) so legacy is
never None.
|
thanks |
- PATCHES.md documents the cherry-picks that diverge this branch from upstream d60/twikit:main. Anyone landing here can see at a glance which commits are ours and why each exists. - .github/workflows/drift-check.yml runs weekly, checks whether any of the upstream PRs listed in PATCHES.md have been merged, and opens an issue here when one has so we know to retire the cherry-pick. Cherry-picks currently carried (see PATCHES.md for detail): - d60#419 (7 commits) — SearchTimeline queryId refresh + defensive parsing + 429 recursion guard + ondemand.s extractor. - d60#418 (1 commit) — .get() for optional fields in User.__init__ and Client.request.
|
Landed here from #421 (now closed as duplicate). One small thing worth folding into this PR — the errors = response_data.get('errors') if isinstance(response_data, dict) else None
if errors and isinstance(errors, list):
first_error = errors[0] if isinstance(errors[0], dict) else {}
error_code = first_error.get('code')
error_message = first_error.get('message')Also flags malformed-error cases (where |
|
Stop copy-pasting AI suggestions you don't understand. You opened a duplicate PR, closed it, and now you're trying to pad your contribution by pushing bot-generated edge cases onto my fix. This PR addresses real crashes — if you have a real bug to fix, open your own PR and own it. |
|
|
Fixes #417
Twitter/X API does not always return all expected fields in user legacy data or error response objects, causing
KeyErrorcrashes.Changes
twikit/user.pylegacy['key']) inUser.__init__with.get('key', default)using appropriate type defaults:strfields →''intfields →0boolfields →Falselistfields →[]entities.description.urlswith chained.get()calls.get()(profile_banner_url,url,protected) left unchangedtwikit/client/client.pyresponse_data['errors'][0]['code']to.get('code')on line 158 to handle error responses without acodefieldTesting
KeyErrorcrashes observed in production (KeyError: 'urls',KeyError: 'withheld_in_countries',KeyError: 'code').get()returns the same value when the key is presentSummary by Sourcery
Make user and error parsing resilient to missing fields in Twitter API responses.
Bug Fixes:
Enhancements:
Summary by CodeRabbit