\ufeff when writing to and reading from a CSV file in Visual Studio

Time:01-08

I'm trying to write a field to a CSV file and then print it to the console. However, when I print the first line after writing, it shows the number 10 (which I'm not adding to the field). Then, when I only read the file and print it, the second line shows what I was supposed to write on the first attempt, but prefixed with \ufeff.

Here is the code

from csv import DictReader

with open('loopbacks2.csv', mode='r+', encoding='utf-8-sig') as csv_file:
    client_details = DictReader(csv_file)
    for client in client_details:
        client['State'] = csv_file.write('Completed')
        print(client)
        

Original csv file looks like this

Name,Loopback,Object,IP,SSH,State
Device1,192.168.1.2,host1,8.7.6.5/32,2022
Device2,192.168.1.3,host2,8.7.7.2/32,2222

When running the script, this is the output

{'Name': 'Device1', 'Loopback': '192.168.1.2', 'Object': 'host1', 'IP': '8.6.5.6/32', 'SSH': '2022', 'State': 10}
 

Then I open the file and it looks like this

Name,Loopback,Object,IP,SSH,State
Device1,192.168.1.2,host1,8.7.6.5/32,2022
Device2,192.168.1.3,host2,8.7.7.2/32,2222,Completed

Then, if I comment out the writing part and just print the rows, like this:

from csv import DictReader

with open('loopbacks2.csv', mode='r+', encoding='utf-8-sig') as csv_file:
    client_details = DictReader(csv_file)
    for client in client_details:
        #client['State'] = csv_file.write('Completed')
        print(client)
        

Then the prompt shows this

{'Name': 'Device1', 'Loopback': '192.168.1.2', 'Object': 'host1', 'IP': '8.6.5.6/32', 'SSH': '2022', 'State': ''}

{'Name': 'Device1', 'Loopback': '192.168.1.2', 'Object': 'host1', 'IP': '8.6.5.6/32', 'SSH': '2022', 'State': '\ufeffCompleted'}

What am I missing or confusing?

Thanks in advance

CodePudding user response:

You cannot edit a file by reading a line, modifying the line, then writing that same line back in the original file.

What's happening in your code:

client['State'] = csv_file.write('Completed')
print(client)

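is that you are telling Python to write a string to the end of the file. write() returns the number of characters it wrote, not the text itself, and that return value is the 10 that gets stored in client['State'] and shown by the print() statement. (Note that 'Completed' alone is 9 characters; a return value of 10 matches ',Completed', which is also what appears at the end of your file.) After the script has run and the file is closed, the file has that text at the end.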
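To make the return value of write() concrete, here is a minimal sketch using in-memory streams instead of a real file. The variable names are only illustrative. It shows that the count is in characters rather than bytes, and that ',Completed' (the text seen at the end of the file in the question) is the 10-character case:

```python
import io

# An in-memory text stream stands in for the CSV file.
buffer = io.StringIO()
count_plain = buffer.write('Completed')    # 9 characters
count_comma = buffer.write(',Completed')   # 10 characters, including the comma

# The count is characters, not encoded bytes: 'é' is 1 character, 2 bytes in UTF-8.
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding='utf-8', newline='')
count_accent = text.write('\u00e9')
text.flush()
byte_length = len(raw.getvalue())

print(count_plain, count_comma, count_accent, byte_length)  # prints: 9 10 1 2
```

So assigning the result of write() to a dictionary key stores that integer, never the string that was written.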

To edit a file:

  1. read the original file
  2. find the data you want to modify
  3. modify that data
  4. write the modified data to a new file

and then if you want to overwrite the original with the new, do that.

Here's how that looks with Python's DictReader and DictWriter:

import csv

data = []
with open('input.csv', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        row['State'] = 'Completed'
        data.append(row)

with open('output.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=data[0].keys())
    writer.writeheader()
    writer.writerows(data)
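If you do want to overwrite the original, one possible approach is to write the new file next to it and then swap it into place. This is only a sketch: the function name set_state is hypothetical, and it assumes the file is non-empty, has a State column in its header, and has no rows longer than the header.

```python
import csv
import os
import tempfile


def set_state(path, value='Completed'):
    """Rewrite the CSV at `path` with every row's State column set to `value`."""
    # utf-8-sig strips a leading BOM if the file has one, and is harmless if not.
    with open(path, newline='', encoding='utf-8-sig') as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames  # assumed to include 'State'
        rows = []
        for row in reader:
            row['State'] = value
            rows.append(row)

    # Write to a temporary file in the same directory as the original.
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile('w', newline='', encoding='utf-8',
                                     dir=directory, suffix='.tmp',
                                     delete=False) as tmp:
        writer = csv.DictWriter(tmp, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
        tmp_name = tmp.name

    # Replace the original only after the new file is fully written and closed.
    os.replace(tmp_name, path)
```

Calling set_state('loopbacks2.csv') would then update the file from the question. Writing to a separate file first means a failure partway through leaves the original untouched.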

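Also, \ufeff is the BOM (wiki: UTF-8 Byte Order Mark), which is written before the first data when you write to a file opened with utf-8-sig. Your first call to csv_file.write(...) added it. Because that write landed at the end of the existing data rather than at the start of the file, the BOM ended up in the middle of the file, where utf-8-sig does not strip it on reading, so it shows up as \ufeff in front of 'Completed'.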
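The BOM behaviour can be checked with a short experiment. This is a sketch with a throwaway file name (bom_demo.csv) in a temporary directory; it shows that utf-8-sig writes the BOM at the start of a new file, that reading with plain utf-8 exposes it as \ufeff, and that utf-8-sig only strips a BOM at the very start of the data, not one in the middle:

```python
import codecs
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'bom_demo.csv')

# Writing with utf-8-sig puts the three BOM bytes at the start of the file.
with open(path, 'w', encoding='utf-8-sig', newline='') as f:
    f.write('Name,State\n')

with open(path, 'rb') as f:
    has_bom = f.read().startswith(codecs.BOM_UTF8)

# Plain utf-8 keeps the BOM as the character U+FEFF; utf-8-sig removes it.
with open(path, encoding='utf-8', newline='') as f:
    as_utf8 = f.read()
with open(path, encoding='utf-8-sig', newline='') as f:
    as_utf8_sig = f.read()

# A BOM that is not at the start of the data is left in place when decoding.
mid_stream = (b'2222,' + codecs.BOM_UTF8 + b'Completed').decode('utf-8-sig')

print(has_bom, repr(as_utf8), repr(as_utf8_sig), repr(mid_stream))
```

The last value corresponds to the situation in the question, where the BOM sits in front of 'Completed' inside the file rather than at its beginning.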
