I have a BigQuery table that I want to populate using pandas-gbq. The table has a predefined schema that includes nullable int and string fields. I currently build a dict with one list per data field and put pandas.NA or None (I've tried both) wherever a value is missing. One of my nullable int fields has missing values, e.g.:
import pandas
df_dict = {'ints': [1, 2, None, 3], 'strings': ['a', 'b', 'c', 'd']}
df = pandas.DataFrame(df_dict)
df.astype({"ints": "int", "strings": "object"})  # throws error on None in ints
Ultimately, I want to upload this to a BigQuery table with a pre-existing schema, so I need the null values in a format that both pandas-gbq and BigQuery will accept. Any ideas?
CodePudding user response:
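You can cast the column to pandas' nullable integer type, "Int64" (note the capital I), instead of the NumPy-backed "int":
df = df.astype({"ints": "Int64", "strings": "object"})
"Int64" stores missing values as pandas.NA, so it should be able to handle None or null values. Note that astype returns a new DataFrame rather than modifying df in place, so the result has to be assigned. Apart from that, you must ensure that the corresponding columns in BigQuery are defined as NULLABLE.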
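As a minimal sketch of the whole flow, the snippet below builds the frame from the question, casts it to the nullable dtype, and wraps the upload in a helper. The helper name `upload_frame` and its arguments are hypothetical placeholders, and the sketch assumes the existing table should be appended to rather than replaced; the helper is defined but not called, since calling it requires valid credentials and a real table.

```python
import pandas as pd

# Same data as in the question: one missing value in the integer column.
df = pd.DataFrame({"ints": [1, 2, None, 3], "strings": ["a", "b", "c", "d"]})

# With a None present, pandas infers float64 for "ints" (None becomes NaN).
# Casting to "Int64" gives the nullable integer dtype; the missing entry
# becomes pd.NA. astype returns a new DataFrame, so the result is assigned.
df = df.astype({"ints": "Int64", "strings": "object"})

print(df.dtypes)
print(df)


def upload_frame(frame, destination_table, project_id):
    """Hypothetical helper: append `frame` to an existing BigQuery table.

    `destination_table` is expected in "dataset.table" form. Both arguments
    are placeholders to be filled in with your own values.
    """
    import pandas_gbq  # imported here so the rest of the sketch runs without it

    # if_exists="append" is an assumption: it adds rows to the existing
    # table instead of failing or replacing it.
    pandas_gbq.to_gbq(
        frame,
        destination_table,
        project_id=project_id,
        if_exists="append",
    )
```

A call would then look something like upload_frame(df, "my_dataset.my_table", "my-project"), with both names replaced by your own.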
