I'm using the following tweepy function:
https://docs.tweepy.org/en/stable/client.html#search-tweets
Each response may or may not include a field called next_token in its meta. It works as a pagination cursor: passing it to the next request resumes the search from where the previous one stopped.
requests_list = []
tweets = client.search_all_tweets(query=query,
                                  start_time=start_time,
                                  end_time=end_time,
                                  max_results=max_results,
                                  expansions=expansions,
                                  tweet_fields=tweet_fields,
                                  user_fields=user_fields,
                                  place_fields=place_fields)
requests_list.append(tweets)
while True:
    if 'next_token' in tweets.meta:
        tweets = client.search_all_tweets(query=query,
                                          start_time=start_time,
                                          end_time=end_time,
                                          max_results=max_results,
                                          expansions=expansions,
                                          tweet_fields=tweet_fields,
                                          user_fields=user_fields,
                                          place_fields=place_fields,
                                          next_token=tweets.meta['next_token'])
        requests_list.append(tweets)
    else:
        break
So, here is what I did:
- I make the first request outside the loop, because I don't have a next_token to pass to the function yet.
- Then I loop: if next_token is available, I pass it along and append the results to a list; if it isn't, I break out of the loop.
How can I improve this (with less duplicated code)?
Can I 'merge' the results (they have the same structure) instead of appending them to a list?
CodePudding user response:
You can use argument unpacking to pass keyword arguments to a function, adding next_token only when you have one.
tweets = None  # initial case
requests_list = []
next_token = None
while tweets is None or next_token:
    kwargs = {}
    if next_token:
        kwargs['next_token'] = next_token
    tweets = client.search_all_tweets(query=query,
                                      start_time=start_time,
                                      end_time=end_time,
                                      max_results=max_results,
                                      expansions=expansions,
                                      tweet_fields=tweet_fields,
                                      user_fields=user_fields,
                                      place_fields=place_fields,
                                      **kwargs)
    requests_list.append(tweets)
    next_token = tweets.meta.get('next_token')
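As for merging: a minimal sketch of the same loop wrapped in a function, plus a helper that flattens the tweets from every page into one list. The names fetch_all_pages, merge_tweets and fake_search are hypothetical and only for illustration. fake_search stands in for client.search_all_tweets so the example runs offline, and it assumes each response exposes .data and .meta the way the question's code uses them.

```python
from types import SimpleNamespace


def fetch_all_pages(search, **params):
    """Call `search` repeatedly, following meta['next_token'] until it is absent."""
    pages = []
    next_token = None
    while True:
        # Only pass next_token once a previous response has supplied one.
        extra = {"next_token": next_token} if next_token else {}
        page = search(**params, **extra)
        pages.append(page)
        next_token = page.meta.get("next_token")
        if not next_token:
            return pages


def merge_tweets(pages):
    """Flatten the per-page `data` lists into one list (assumes data may be None)."""
    return [tweet for page in pages for tweet in (page.data or [])]


# Hypothetical stand-in for client.search_all_tweets (an assumption, not tweepy code).
_FAKE_PAGES = {
    None: SimpleNamespace(data=["t1", "t2"], meta={"next_token": "a"}),
    "a": SimpleNamespace(data=["t3"], meta={"next_token": "b"}),
    "b": SimpleNamespace(data=None, meta={}),
}


def fake_search(query, next_token=None, **_ignored):
    return _FAKE_PAGES[next_token]


pages = fetch_all_pages(fake_search, query="python")
all_tweets = merge_tweets(pages)
```

With the real client you would pass client.search_all_tweets and the same keyword arguments as in the question. Note that this only merges the tweets themselves; if you also need the expansions from each page, those would have to be combined separately.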
CodePudding user response:
You can use a dictionary to pass your arguments, something like this:
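args = {'query': query, ....}  # remaining keyword arguments elided
flag = False  # becomes True once the first request has been made
while True:
    if not flag or 'next_token' in tweets.meta:
        if flag:
            args['next_token'] = tweets.meta['next_token']
        tweets = client.search_all_tweets(**args)
        flag = True
    else:
        break  # no next_token in the last response, so we are done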
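The dictionary approach can also be written without the flag, for example as a generator that yields one response at a time. This is only a sketch: iter_pages and fake_search are hypothetical names, and fake_search is a stand-in for client.search_all_tweets so that the example runs without network access.

```python
from types import SimpleNamespace


def iter_pages(search, args):
    """Yield responses one at a time, following meta['next_token']."""
    call_args = dict(args)  # copy, so the caller's dictionary is left untouched
    while True:
        page = search(**call_args)
        yield page
        token = page.meta.get("next_token")
        if token is None:
            break  # last page reached
        call_args["next_token"] = token


# Hypothetical stand-in for client.search_all_tweets (an assumption, not tweepy code).
def fake_search(query, next_token=None):
    if next_token is None:
        return SimpleNamespace(data=[1, 2], meta={"next_token": "p2"})
    return SimpleNamespace(data=[3], meta={})


args = {"query": "python"}
collected = [page.data for page in iter_pages(fake_search, args)]
```

Because the first request always happens before the token check, no flag is needed, and the caller decides whether to keep every page or stop early.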
