Hi, I am trying to use the csv library to convert my CSV file into a new one.
The code I wrote is the following:
import csv
import re

file_read = r'C:\Users\Comarch\Desktop\Test.csv'
file_write = r'C:\Users\Comarch\Desktop\Test_new.csv'

def find_txt_in_parentheses(cell_txt):
    # Collect the distinct "(...)" fragments found in the cell text
    pattern = r'\(.+\)'
    return set(re.findall(pattern, cell_txt))

with open(file_write, 'w', encoding='utf-8-sig') as file_w:
    csv_writer = csv.writer(file_w, lineterminator="\n")
    with open(file_read, 'r', encoding='utf-8-sig') as file_r:
        csv_reader = csv.reader(file_r)
        for row in csv_reader:
            cell_txt = row[0]
            txt_in_parentheses = find_txt_in_parentheses(cell_txt)
            if len(txt_in_parentheses) == 1:
                txt_in_parentheses = txt_in_parentheses.pop()
                # Move the parenthesised text to its own first line in the cell
                cell_txt_new = cell_txt.replace(' ' + txt_in_parentheses, '')
                cell_txt_new = txt_in_parentheses + '\n' + cell_txt_new
                row[0] = cell_txt_new
            csv_writer.writerow(row)
The only problem is that in the resulting file (Test_new.csv), the line breaks inside cells are CRLF instead of LF.
Here is a sample image of:
- the read file on the left
- the written file on the right:
As a result, when I copy the CSV column into a Google Docs spreadsheet, I get a blank line after each row that contains CRLF.
Is it possible to write my code with the csv library so that LF, not CRLF, is kept inside a cell?
CodePudding user response:
From the documentation of csv.reader:
If csvfile is a file object, it should be opened with newline='' [1]
[...]
Footnotes
[1] If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
This is precisely the issue you're seeing. So...
with open(file_read, 'r', encoding='utf-8-sig', newline='') as file_r, \
     open(file_write, 'w', encoding='utf-8-sig', newline='') as file_w:
    csv_reader = csv.reader(file_r, dialect='excel')
    csv_writer = csv.writer(file_w, dialect='excel')
    # ...
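To check the effect without touching the real files, here is a minimal sketch that writes two made-up rows to a temporary file and inspects the raw bytes. The helper names (write_rows, read_raw) and the sample data are assumptions for illustration, not part of the original code. With newline='' the LF embedded in a cell is written unchanged; only the row terminator depends on the lineterminator setting ('\r\n' for the excel dialect).

```python
import csv
import os
import tempfile

def write_rows(path, rows, lineterminator="\r\n"):
    # newline='' turns off the file object's own newline translation,
    # so the csv module alone decides which line endings are written.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f, lineterminator=lineterminator).writerows(rows)

def read_raw(path):
    # Read the file as bytes to see the exact line endings on disk.
    with open(path, "rb") as f:
        return f.read()

rows = [["(a)\nfoo", "1"], ["(b)\nbar", "2"]]  # hypothetical sample data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")

    write_rows(path, rows)                       # excel default: rows end in CRLF
    raw_default = read_raw(path)

    write_rows(path, rows, lineterminator="\n")  # rows end in LF as well
    raw_lf = read_raw(path)

    # Reading back with newline='' keeps the embedded LF intact.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        round_trip = list(csv.reader(f))

print(raw_default)
print(raw_lf)
print(round_trip == rows)
```

If only LF should appear anywhere in the output, keep lineterminator="\n" on the writer in addition to newline='' on open().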
CodePudding user response:
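You are on Windows, and you open the file in text mode 'w' without newline='', so every '\n' you write is translated to a Windows-style '\r\n'. In Python 2 the fix was to use mode 'wb'. In Python 3 the csv module needs a text-mode file (and binary mode does not accept an encoding argument), so pass newline='' to open() to get the preferred behaviour.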
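The translation can be reproduced on any platform with an in-memory stream. This is a rough sketch, assuming Python 3: newline='\r\n' stands in for the Windows text-mode default, and csv_bytes is a hypothetical helper. It also shows that handing csv.writer a binary stream fails in Python 3, so newline='' is the route to take there.

```python
import csv
import io

def csv_bytes(newline):
    # Write one row through a text wrapper and return the resulting bytes.
    buf = io.BytesIO()
    text = io.TextIOWrapper(buf, encoding="utf-8", newline=newline)
    csv.writer(text, lineterminator="\n").writerow(["(a)\nfoo", "1"])
    text.flush()
    data = buf.getvalue()
    text.detach()  # release the buffer without closing it
    return data

# newline='\r\n' mimics the default text-mode behaviour on Windows:
# every '\n' written is turned into '\r\n', including the one inside the cell.
translated = csv_bytes("\r\n")

# newline='' disables the translation, so the embedded LF survives.
untouched = csv_bytes("")

# A binary stream is rejected in Python 3 because csv.writer writes str.
try:
    csv.writer(io.BytesIO()).writerow(["x"])
    binary_mode_error = None
except TypeError as exc:
    binary_mode_error = exc

print(translated)
print(untouched)
print(type(binary_mode_error).__name__)
```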


