I have a list, json_response, that contains 28 tweet objects mentioning two different politicians: two elements (each_dict in my loop) with 14 tweets each. I am trying to retrieve author_id, created_at, tweet_id, text, bio and image_url. Not all 28 tweets have a bio or image_url, hence the if-else statements in loops #2 and #3. My code retrieves author_id, created_at, tweet_id and text as expected. However, the bio and image_url columns repeat the value from the last object for each politician across all 14 rows, as can be seen in the print(df) output.
I have tried different indentations, but that does not solve the issue. Is there a fundamental error in the way I loop over json_response?
Here is the structure of json_response. NOTE: due to the character limit I have deleted some of the objects in json_response.
print(json.dumps(json_response, indent=4, sort_keys=True)) # look at json_response object.
[
{
"data": [
{
"author_id": "737885223858384896",
"created_at": "2021-03-26T21:56:02.000Z",
"id": "1375567243082338314",
"text": "@hogan_1969 @LindseyGrahamSC LOL She Blocked me.. could not admit the truth could she now. okay so where is her source for the shirts? and that is what he said. I (quote) We immediately surge the border all those seeking asylum. What about his lie about the cages? no Answer lol."
},
{
"author_id": "27327319",
"created_at": "2021-03-02T11:53:16.000Z",
"id": "1366718245521211393",
"text": "@fedupinNHtoo @LindseyGrahamSC Exactly. I asked that question of a Republican on Facebook last night and she blocked me"
},
{
"author_id": "917634626247647232",
"created_at": "2021-02-28T18:16:45.000Z",
"id": "1366089974907432961",
"text": "@gop this is for you! @tedcruz @LindseyGrahamSC @MittRomney @mikepompeo\n#BitchyMcC blocked me!\ud83d\udc4d\nWatch \"Jack Off Jill - Hypocrite lyrics\" on YouTube [link here]"
},
{
"author_id": "1231059979844456448",
"created_at": "2021-02-26T04:25:49.000Z",
"id": "1365156089554067459",
"text": "@KelleyALynch1 @marwilliamson @therecount @LindseyGrahamSC She's fine with that just as she's fine with Biden's Nazis in Ukraine. She wants war with Russia, too. She blocked me for this tweet because she couldn't even condemn Biden's Nazis in Ukraine. She's a fauxgressive warmonger, a wolf in sheep's clothing. \n[link here]"
},
{
"author_id": "1315477593303310336",
"created_at": "2021-02-23T00:00:41.000Z",
"id": "1364002202843451399",
"text": "@MistyKitty3 @BlairMurray83 @FrankAmari2 @LindseyGrahamSC \ud83e\udd23 Someone didn\u2019t like what I said and blocked me."
},
{
"author_id": "79007230",
"created_at": "2021-02-13T13:10:54.000Z",
"id": "1360577189750702080",
"text": "@Fuzzy_Fuzzbutt @DearAuntCrabby @LindseyGrahamSC I know you blocked me, poor little boo-boo feels attacked by a rhetorical question. But the reality is that cartoon is WILDLY homophobic."
}
],
"includes": {
"media": [
{
"media_key": "3_1361344652264280068",
"type": "photo",
"url": "[link here]"
}
],
"users": [
{
"created_at": "2016-06-01T05:55:21.000Z",
"description": "Biden Inflation the worst in 30 years. His Handlers trying to Rebrand Brandon is Hilarious.",
"id": "737885223858384896",
"name": "Biden is a complete mess and you know it.",
"username": "zelda3024"
},
{
"created_at": "2017-03-31T00:54:05.000Z",
"description": "Love God, Love Family, Love Country, Love Freedom - if we put those things first everything else will be great. MAGA",
"id": "847612931487416323",
"name": "Joey Bagadonuts",
"username": "AmericanGr8ness"
},
{
"created_at": "2009-01-05T15:25:55.000Z",
"description": "small & local garlic farmer; independent American; old surfer dude; working to find and speak truth to power; \ud83c\uddfa\ud83c\uddf8; mahalo and Maluhia",
"id": "18634205",
"name": "MacGregorGarlic",
"username": "MacGregorGarlic"
},
{
"created_at": "2009-03-28T22:53:28.000Z",
"description": "Let's Go Darwin!",
"id": "27327319",
"name": "Karen Kennedy",
"username": "KayKay68"
},
{
"created_at": "2017-10-10T06:15:18.000Z",
"description": "Mom\ud83d\udc95Cannactivist\ud83c\udf3fSecularHumanist\ud83c\udf10 BLM\u270a\ud83c\udfff\ud83c\udf08Ally\ud83e\udd8bCPTSD\u2695\ufe0f FTD\ud83e\udd14MeToo\ud83c\udf38ProChoice\ud83d\udc93CRPS\ud83d\ude23ClimateChange\ud83c\udf0e DACA\ud83c\uddfa\ud83c\uddf2AdoptDontShop\ud83d\udc3e#Steelers \ud83d\udda4\ud83d\udc9b #Vaxxed2TheMax\u270a\ud83d\udc9a",
"id": "917634626247647232",
"name": "Raven The Hemptress #LegalizeGlobally\ud83d\udc9a\ud83c\udf3f\u267f",
"username": "Kraven_Raven24"
},
{
"created_at": "2020-02-22T03:35:56.000Z",
"description": "Monetarism is the underlying cause of our disease; human progress and peace through development is the cure. Eurasian integration will benefit all of humanity!",
"id": "1231059979844456448",
"name": "\ud83c\udd70pocalypsis \ud83c\udd70pocalypseos \u2014 BRI Is The Future",
"username": "apocalypseos"
},
{
"created_at": "2020-10-12T02:21:21.000Z",
"description": "Father of two beautiful boys. Believer in the Constitution of the United States. Protector of my own rights. #Meatatarian",
"id": "1315477593303310336",
"name": "\ud83e\udd85 Steven Duggin \u2665\ufe0f \ud83c\uddfa\ud83c\uddf8\ud83d\uddfd",
"username": "itsStevenDuggin"
},
{
"created_at": "2018-12-02T06:25:16.000Z",
"description": "",
"id": "1069115263671562240",
"name": "Barhag",
"username": "TheBarhag"
},
{
"created_at": "2020-09-08T13:19:17.000Z",
"description": "Not the liberals cup of tea",
"id": "1303321972227690496",
"name": "Christy",
"username": "Christy54177764"
},
{
"created_at": "2009-03-31T19:34:24.000Z",
"description": "NY-grown, FL-tanned, scribe, word nerd, TV junkie, game show champ, yenta, wife, twin mama, hot sauce collector, Bloody Mary maven &, says @NYPost, savvy gadfly",
"id": "27943005",
"name": "Lesley Abravanel",
"username": "lesleyabravanel"
},
{
"created_at": "2019-05-08T22:15:51.000Z",
"description": "\ud83c\udf37 Wholesome account dedicated to Yuuko Aioi. \ud83c\udf37\n\nI also enjoy making new friends and posting about games, my everyday life, cats, NASCAR, memes and fumos!",
"id": "1126249378279297027",
"name": "Vaxen #DailyYuuko \u2603\ufe0f",
"username": "YuukoEnjoyer"
},
{
"created_at": "2019-12-18T22:47:10.000Z",
"description": "The Republican party is bad for America. The Conservatives are Trump bootlickers who are afraid to stand up to him. This great nation is in serious trouble.",
"id": "1207432044390699008",
"name": "Angry Patriot",
"username": "AngryPatriot20"
},
{
"created_at": "2012-11-05T05:19:37.000Z",
"description": "Employment lawyer. Represent employers and employees. 30 years ago, my mentor told me to seek the truth as a lawyer. Still do that. Tweets are not legal advice.",
"id": "926909484",
"name": "Alfred Southerland",
"username": "TexasEEOLaw"
},
{
"created_at": "2009-10-01T21:17:18.000Z",
"description": "Knitter, IT PM, Mom/Wife, Dem, not-observant Jewish Boomer (Gen Jones). If you seem like a troll, have a locked acct or put me on a list, I'll \ud83d\udeab you.\n\nShe/her.",
"id": "79007230",
"name": "\u2721\ufe0f MG \u2721\ufe0f #DoYourPart #GetBoosted #MaskUp",
"username": "knitvspurl"
}
]
},
"meta": {
"newest_id": "1375567243082338314",
"next_token": "b26v89c19zqg8o3fosnr8o52h2bpczky08inrqbn3ad8d",
"oldest_id": "1360577189750702080",
"result_count": 14
}
},
{
"data": [
{
"author_id": "1248251899884814336",
"created_at": "2021-03-27T13:36:45.000Z",
"id": "1375803982409576450",
"text": "@gavinjeffries0 @steven86026859 @MSNBC @SenBooker Uh Oh our friend Steve blocked me, I guess not being able to answer your simple question and being asked to was too much for him."
},
{
"author_id": "752266160352010241",
"created_at": "2021-02-06T20:34:06.000Z",
"id": "1358152008948195328",
"text": "@fattypinner @tkbone32221 @SenSchumer @SenBooker @RonWyden He blocked me \ud83e\udd23\ud83d\ude2d\ud83e\udd23\ud83e\udd23\ud83e\udd23\ud83d\ude2d"
},
{
"author_id": "70127580",
"created_at": "2021-01-20T21:35:26.000Z",
"id": "1352006847226671114",
"text": "@HUMAN1TY_tweets @KMJeezy @fancytomboy Chile @SenBooker @CoryBooker is not playing. Gurl he really trying hard to pull off straight. @rosariodawson blocked me. . . . Guess I touched a nerve. . . . I thought she was Bi? LOL \ud83d\ude02"
},
{
"author_id": "386603314",
"created_at": "2021-01-07T02:18:04.000Z",
"id": "1347004547844096001",
"text": "Lmao it was a few hours I just posted Cory Booker memes that nobody BUT HIM really saw and he blocked me that\u2019s what it was [link here]"
},
{
"attachments": {
"media_keys": [
"3_1347003932875358217"
]
},
"author_id": "386603314",
"created_at": "2021-01-07T02:15:39.000Z",
"id": "1347003935916052480",
"text": "Remember when Cory Booker blocked me lol [link here]"
},
{
"author_id": "1108691472214155265",
"created_at": "2020-12-31T06:38:50.000Z",
"id": "1344533452956295168",
"text": "@DemToRose @CarolenaMatus @SavageJoyMarie1 @KamalaHarris @SenSchumer @SenKamalaHarris @SenBooker @SenFeinstein He blocked me - what\u2019d he say?"
},
{
"attachments": {
"media_keys": [
"3_1340532486812549121",
"3_1340532486808297475"
]
},
"author_id": "2706897311",
"created_at": "2020-12-20T05:40:26.000Z",
"id": "1340532491594153984",
"text": "@NeaminZeleke @SenWarren @SenBooker @CornelWest It\u2019s very interesting. He(smgebru) posted the following on Instagram & I (si_feam) challenged him. He blocked me. He doesn\u2019t want to talk truth but only being propagandist for TPLF. [link here]"
},
{
"author_id": "1293905938131251200",
"created_at": "2020-12-19T02:13:30.000Z",
"id": "1340118027937955841",
"text": "@heatherklus @AwakenedJoyce @data_nerd @FriedrichHayek @RepYvetteClarke @alexleavitt @RonWyden @Google @sundarpichai @timnitGebru @RepAnnaEshoo @SenBooker @NydiaVelazquez @RepBillFoster @RepMcNerney @SenWarren Naturally, this know-nothing dilettante blocked me now. Which is the standard response of the #woke when they run out of arguments."
},
{
"author_id": "1100925481803804673",
"created_at": "2020-11-19T08:28:39.000Z",
"id": "1329340801915228161",
"text": "Not calling @HerSpiritistheTruth45 a Democrat. She misread my meaning but then blocked me so if someone would let her know I was referring to Cory Booker, not her!!! [link here]"
},
{
"author_id": "309190582",
"created_at": "2020-11-17T17:45:40.000Z",
"id": "1328756204626186242",
"text": "@HillaryClinton See you soon I tried really hard to make the cut I couldn't do the election and Cory Booker sorry madam. He was the issue with your campaign. He blocked me so hard. He killed my entire family. \nMommy said she was dying so sorry I really hustle. \n\nRobinMichelle"
},
{
"author_id": "1254067781621997572",
"created_at": "2020-11-15T02:15:30.000Z",
"id": "1327797342997766145",
"text": "Reminds me of that time Cory Booker hopped in my DMs to argue about charter schools. He blocked me shortly there after."
}
],
"includes": {
"media": [
{
"media_key": "3_1358448920632909825",
"type": "photo",
"url": "[link here]"
},
{
"media_key": "3_1347003932875358217",
"type": "photo",
"url": "[link here]"
},
{
"media_key": "3_1340532486812549121",
"type": "photo",
"url": "[link here]"
},
{
"media_key": "3_1340532486808297475",
"type": "photo",
"url": "[link here]"
}
],
"users": [
{
"created_at": "2020-04-09T14:11:04.000Z",
"description": "",
"id": "1248251899884814336",
"name": "Firstcomm",
"username": "Firstcomm1"
},
{
"created_at": "2011-05-04T19:26:22.000Z",
"description": "Cinephile, balletomane, book lover, tennis fan, K-Drama fanatic, Jang Na-ra fangirl, USC School of Cinematic Arts alumna, Hillary Clinton and Nancy Pelosi Dem.",
"id": "293104735",
"name": "Joyce Tyler",
"username": "joyce_tyler"
},
{
"created_at": "2011-09-27T14:50:37.000Z",
"description": "Spelman College, BA, George Washington University MA, University of South Florida Ph.D. in Political Science, proud Ted Kennedy, Obama, Biden/Harris Democrat!",
"id": "380970864",
"name": "Stephanie L. Williams, Ph.D.",
"username": "slwilliams1101"
},
{
"created_at": "2016-10-31T19:37:19.000Z",
"description": "Loves: life, fam, cats, cars, tattoos, reality TV; collector of t-shirts & Volkswagen\u2019s. Hates: Oxford commas. #CombatVet #Medic #BidenHarris2020 #Resist",
"id": "793175035322171397",
"name": "Que Sarah Sarah \ud83d\udda4",
"username": "sarahalli13"
},
{
"created_at": "2016-07-10T22:20:03.000Z",
"description": "3x Hollywood Video Street Fighter 2 Champion",
"id": "752266160352010241",
"name": "Sugarcoder",
"username": "TheSugarCoder"
},
{
"created_at": "2009-08-30T14:15:02.000Z",
"description": "",
"id": "70127580",
"name": "Michael J. Stratton",
"username": "mjstra"
},
{
"created_at": "2011-10-07T15:28:05.000Z",
"description": "host of @WeDidntDoIt true crime/comedy podcast | future true crime legend | audio engineer | comedian | retired mma fighter | bjj purple belt | #ReMeMber",
"id": "386603314",
"name": "\u02b0\u1d58\u207f\u1d4d \u1d34\u1d2c\u1d3a\u1d37 \uea00",
"username": "yunghank"
},
{
"created_at": "2019-03-21T11:27:00.000Z",
"description": "",
"id": "1108691472214155265",
"name": "lu",
"username": "lubean13"
},
{
"created_at": "2014-07-13T12:10:52.000Z",
"description": "Scie-Techie|Centrist|Prefer_Equilibrium|",
"id": "2706897311",
"name": "Mesthio\ud83c\uddea\ud83c\uddf9",
"username": "MEsthio"
},
{
"created_at": "2020-08-13T13:43:19.000Z",
"description": "Trump stans are just as idiotic as the woke.",
"id": "1293905938131251200",
"name": "Pen\u00e9lope",
"username": "Penlope56181829"
},
{
"created_at": "2019-02-28T01:07:43.000Z",
"description": "An earthen vessel sojourning through the adventures & challenges of life. Singer/songwriter, poet, artist. Intercessor, worshiper, mystic. Mom, grannyX4.",
"id": "1100925481803804673",
"name": "AVessel",
"username": "AVessel2"
},
{
"created_at": "2011-06-01T17:48:10.000Z",
"description": "Publicist, Educator, Public Relations butterfly. Newsroom & Communications Specialist dabbling in Politics and Entertainment rt not necessarily endorsement",
"id": "309190582",
"name": "RMC Communications PR",
"username": "RobinMCouch"
},
{
"created_at": "2020-04-25T15:21:03.000Z",
"description": "\u201cAll labor has dignity.\u201d- MLK Ronald Reagan is the devil. Bill Clinton screwed us all. I\u2019m just a blue collar dude from the D. @jbrous14",
"id": "1254067781621997572",
"name": "Ella Septima-Hamer",
"username": "jbrous41"
}
]
},
"meta": {
"newest_id": "1375803982409576450",
"next_token": "b26v89c19zqg8o3fosbv3bjxj42w0zkak6xf94932g3ul",
"oldest_id": "1327797342997766145",
"result_count": 14
}
}
]
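Looking at the structure above, each tweet's author_id appears to match the id of one entry in includes.users, and a tweet's attachments.media_keys appear to match media_key values in includes.media. A minimal sketch of using those keys to pair each tweet with its own bio and image URL might look like the following. The function name flatten_response and the sample values are hypothetical, and keeping only the first image per tweet is an assumption, not something the data requires.

```python
def flatten_response(json_response):
    """Return one row dict per tweet, joining users and media by their ids."""
    rows = []
    for each_dict in json_response:
        includes = each_dict.get("includes", {})
        # Lookup tables keyed by the ids that the tweets refer to.
        bio_by_user_id = {
            user["id"]: user.get("description")
            for user in includes.get("users", [])
        }
        url_by_media_key = {
            media["media_key"]: media.get("url")
            for media in includes.get("media", [])
        }

        for tweet in each_dict.get("data", []):
            # Tweets without media have no "attachments" key at all.
            media_keys = tweet.get("attachments", {}).get("media_keys", [])
            urls = [url_by_media_key[k] for k in media_keys if k in url_by_media_key]
            rows.append({
                "author_id": tweet["author_id"],
                "created_at": tweet["created_at"],
                "tweet_id": tweet["id"],
                "text": tweet["text"],
                # None when no user entry matches this author_id
                "bio": bio_by_user_id.get(tweet["author_id"]),
                # Assumption: keep only the first image; None when there is none
                "image_url": urls[0] if urls else None,
            })
    return rows


# Tiny made-up sample shaped like the response above (hypothetical values).
sample = [
    {
        "data": [
            {"author_id": "1", "created_at": "2021-01-01T00:00:00.000Z",
             "id": "10", "text": "no media"},
            {"author_id": "2", "created_at": "2021-01-02T00:00:00.000Z",
             "id": "11", "text": "with media",
             "attachments": {"media_keys": ["3_a"]}},
        ],
        "includes": {
            "users": [
                {"id": "1", "description": "first bio"},
                {"id": "2", "description": ""},
            ],
            "media": [
                {"media_key": "3_a", "type": "photo",
                 "url": "https://example.com/a.jpg"},
            ],
        },
    }
]

rows = flatten_response(sample)
for row in rows:
    print(row["tweet_id"], repr(row["bio"]), row["image_url"])
```

The resulting list of dicts can be passed to pd.DataFrame(rows) in a single call, which also avoids DataFrame.append (deprecated in pandas 1.4 and removed in 2.0).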
Code to retrieve data from json_response and save it in df:
import dateutil.parser
import pandas as pd

# append data (author_id, created_at, tweet_id, text, bio, image_url) from json_response to a data frame
df = pd.DataFrame()
for each_dict in json_response:
    # 1. loop for data object
    for tweet in each_dict['data']:
        row = {}  # empty dict for data
        row["author_id"] = tweet['author_id']  # 1. id of user
        row["created_at"] = dateutil.parser.parse(tweet['created_at'])  # 2. time of tweet
        row["tweet_id"] = tweet['id']  # 3. tweet id
        row["text"] = tweet['text']  # 4. tweet
        # 2. loop for user object
        for user in each_dict['includes']['users']:
            # 5. user bio
            if 'description' in user:
                row['bio'] = user.get('description')  # if user has bio get bio
            else:
                row['bio'] = user.get(' ')  # if user has no bio fill row with NaN
        # 3. loop for media object
        for media in each_dict['includes']['media']:
            # 6. image url
            if 'url' in media:
                row['image_url'] = media.get('url')  # if user tweet url get url
            else:
                row['image_url'] = media.get(' ')  # if user does not tweet url fill row with NaN
        df = df.append(row, ignore_index=True)  # append data to empty df
The result from the code above. The issue is that the bio and image_url columns repeat the values of the last object in each each_dict:
author_id created_at tweet_id text bio image_url
0 737885223858384896 2021-03-26 21:56:02 00:00 1375567243082338314 @hogan_1969 @LindseyGrahamSC LOL She Blocked me.. could not admit the truth could she now. okay so where is her source for the shirts? and that is what he said. I (quote) We immediately surge the border all those seeking asylum. What about his lie about the cages? no Answer lol. Knitter, IT PM, Mom/Wife, Dem, not-observant Jewish Boomer (Gen Jones). If you seem like a troll, have a locked acct or put me on a list, I'll 