I am using Python to convert a .dat file to CSV so that I can read it later with NumPy or the csv reader.
import csv
# read i2019.dat into a list of lists
datContent = [i.strip().split() for i in open("./i2019.dat").readlines()]
# write it as a new CSV file
with open("./i2019.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(datContent)
But this results in the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8d in position 68: invalid start byte
Any help would be appreciated!
CodePudding user response:
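It seems like your .dat file uses the Shift JIS (Japanese) encoding, which open does not try by default; the error message shows it was decoding the file as UTF-8.
So you can pass shift_jis as the encoding argument to the open function: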
datContent = [i.strip().split() for i in open("./i2019.dat", encoding='shift_jis').readlines()]
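For reference, here is a minimal sketch of the whole conversion with that fix applied. It assumes the file really is Shift JIS, which is a guess based on the 0x8d byte, and that the columns are whitespace-separated as in the question. The helper name dat_to_csv and its parameters are made up for illustration. It also opens the output in text mode with newline="", because on Python 3 csv.writer raises a TypeError when the file is opened with "wb".

```python
import csv


def dat_to_csv(src_path, dst_path, src_encoding="shift_jis"):
    """Convert a whitespace-delimited text file to CSV.

    Hypothetical helper: the name and the default encoding are assumptions,
    not something confirmed by the original file.
    """
    # Decode the input explicitly instead of relying on the default (UTF-8 here).
    with open(src_path, encoding=src_encoding) as src:
        # str.split() with no arguments also drops the trailing newline.
        rows = [line.split() for line in src]

    # Python 3: the csv module wants a text-mode file opened with newline="".
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(rows)

    return rows
```

Calling dat_to_csv("./i2019.dat", "./i2019.csv") would then write a UTF-8 CSV next to the original file. If shift_jis still fails on some bytes, passing src_encoding="cp932" (the Windows variant of Shift JIS) may be worth a try.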
