I am creating an Excel file and writing some rows to it. Here is what I have written:
import string

import pandas as pd
import xlsxwriter
from nltk.tokenize import word_tokenize

workbook = xlsxwriter.Workbook('DataSet.xlsx')
worksheet = workbook.add_worksheet()
df2 = pd.read_csv('d.csv', low_memory=False)

count = 0
for index, row in df2.iterrows():
    if row['source_id'] == 'EN':
        count += 1
        print(count)
        text = row['text']
        # Strip punctuation, then split the text into word tokens
        new_string = text.translate(str.maketrans('', '', string.punctuation))
        new_string = word_tokenize(new_string)
        sentence = ''
        tokens = ''
        sample_len = len(new_string)
        count_len = 0
        for word in new_string:
            count_len += 1
            sentence += word
            sentence += ' '
            tokens += word
            # Add a separator after every token except the last one
            if count_len != sample_len:
                tokens += ', '
        worksheet.write(count, 3, tokens)
        worksheet.write(count, 2, sentence)
        worksheet.write(count, 1, 'Discrimination')
        worksheet.write(count, 0, count)
workbook.close()
However, after row 94165 it raises the following error and stops:
Traceback (most recent call last):
  File "/Users/PycharmProjects/pythonProject/venv/lib/python3.9/site-packages/xlsxwriter/workbook.py", line 323, in close
    self._store_workbook()
  File "/Users/PycharmProjects/pythonProject/venv/lib/python3.9/site-packages/xlsxwriter/workbook.py", line 745, in _store_workbook
    raise e
  File "/Users/PycharmProjects/pythonProject/venv/lib/python3.9/site-packages/xlsxwriter/workbook.py", line 739, in _store_workbook
    xlsx_file.write(os_filename, xml_filename)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 1761, in write
    with open(filename, "rb") as src, self.open(zinfo, 'w') as dest:
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 1505, in open
    return self._open_to_write(zinfo, force_zip64=force_zip64)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 1597, in _open_to_write
    self._writecheck(zinfo)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 1712, in _writecheck
    raise LargeZipFile(requires_zip64
zipfile.LargeZipFile: Filesize would require ZIP64 extensions

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/PycharmProjects/pythonProject/Python file.py", line 64, in <module>
    workbook.close()
  File "/Users/PycharmProjects/pythonProject/venv/lib/python3.9/site-packages/xlsxwriter/workbook.py", line 327, in close
    raise FileSizeError("Filesize would require ZIP64 extensions. "
xlsxwriter.exceptions.FileSizeError: Filesize would require ZIP64 extensions. Use workbook.use_zip64().
Does anyone know why this has occurred and how it can be solved?
Answer:
The error occurs because the resulting file, or one of its components, is larger than 4 GB. To write it, xlsxwriter has to pass an additional parameter to the Python standard library's zipfile.py so that it supports larger zip file sizes.
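To see that parameter in isolation, here is a minimal standard-library sketch. It assumes the relevant switch is zipfile's `allowZip64` argument, which is presumably what xlsxwriter toggles internally. The helper name `try_open_large_entry` and the entry name `sheet1.xml` are made up for illustration, and the 5 GiB figure is only a declared size hint, so no large data is actually written.

```python
import io
import zipfile


def try_open_large_entry(allow_zip64):
    """Start a zip entry whose declared size exceeds the classic zip limit.

    Returns None on success, or the LargeZipFile message on failure.
    """
    buffer = io.BytesIO()  # in-memory archive, nothing touches the disk
    info = zipfile.ZipInfo("sheet1.xml")  # hypothetical entry name
    info.file_size = 5 * 1024**3  # declare 5 GiB up front (size hint only)
    with zipfile.ZipFile(buffer, "w", allowZip64=allow_zip64) as archive:
        try:
            with archive.open(info, "w") as entry:
                entry.write(b"placeholder")  # tiny payload; only the hint is large
        except zipfile.LargeZipFile as exc:
            return str(exc)
    return None


without_zip64 = try_open_large_entry(allow_zip64=False)
with_zip64 = try_open_large_entry(allow_zip64=True)
print(without_zip64)
print(with_zip64)
```

With `allowZip64=False` the call should come back with a message about ZIP64 extensions, much like the one in the traceback above, while with `allowZip64=True` the entry is accepted and the helper returns None.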
The answer/solution is buried in the exception message:
xlsxwriter.exceptions.FileSizeError: Filesize would require ZIP64 extensions.
Use workbook.use_zip64().
You can enable it either as a constructor option or by calling the workbook method:
workbook = xlsxwriter.Workbook(filename, {'use_zip64': True})
# Same as:
workbook = xlsxwriter.Workbook(filename)
workbook.use_zip64()
See the docs on the Workbook Constructor and workbook.use_zip64() including the following Note:
Note:
When using the use_zip64() option, the zip file created by the Python standard library zipfile.py may cause Excel to issue a warning about repairing the file. This warning is annoying but harmless. The “repaired” file will contain all of the data written by XlsxWriter, only the zip container will be changed.
