I am a complete beginner, so forgive the probably-obvious question. I have a list of roughly 800,000 items that I am trying to run through Counter. When I try to open the script in IDLE, it stops responding, and when I try to run the script through PowerShell, it throws an error on Line 9 (the line where the large list is populated). Is there a cap on the number of items that Counter can handle?
For brevity's sake, I am not including my whole list here of course, but this is my basic script:
#!/usr/bin/env python3
import json
from itertools import count
from urllib.request import urlopen
from collections import Counter
list1 = [list, items, here, et cetera]
print(Counter(list1))
This is the complete script -- Full script with list data.
CodePudding user response:
Given that the recommended maximum line length for Python is 79 characters (https://www.python.org/dev/peps/pep-0008/#maximum-line-length), your expectations should have been moderate at best.
It's generally a bad practice to keep your data in your code. If you must, you should at least properly quote and escape each string in the list, e.g.:
list1 = ['sherlock holmes', 'something\\else', 'a \'quote\' here', ...]
But it's a lot easier and more robust to just put your data in a text file:
sherlock holmes
star wars
star wars sequel trilogy
...
ya lit
books
The text file needs no escaping, although you may need something to deal with line endings, which appear to have been escaped as \xa0 in your data (strictly speaking, \xa0 is the escape for a non-breaking space, U+00A0).
And then read the file from code:
with open('myfile.txt') as f:
    list1 = f.read().splitlines()
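Putting the pieces together, a minimal sketch of the whole flow might look like this. The file name myfile.txt comes from the snippet above. The temporary directory, the sample contents, and the encoding='utf-8' argument are assumptions added only so the example runs on its own.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Assumption: write a tiny sample file into a temporary directory so the
# sketch is self-contained; a real myfile.txt would hold one item per line.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.txt'
    path.write_text(
        'sherlock holmes\nstar wars\nstar wars\nya lit\n', encoding='utf-8'
    )

    # Read the file back as a list of lines, without trailing newlines
    with open(path, encoding='utf-8') as f:
        list1 = f.read().splitlines()

# Count how often each item occurs
counts = Counter(list1)
print(counts.most_common(1))
```

With the data in a file, the script itself stays a few short lines no matter how many items there are.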
From the partial escaping, it seems likely something generated your 'code' to begin with - you may want to generate it again without the escaping, and just output a clean text file, and only deal with the line endings in a sensible way.
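If regenerating the data is an option, one way to write a clean text file could look like the sketch below. The raw_items list and the clean helper are hypothetical stand-ins for whatever produced the original list, and treating \xa0 as a stray non-breaking space to be replaced with a plain space is an assumption about the data.

```python
import tempfile
from pathlib import Path

# Hypothetical raw items, standing in for the output of the original generator
raw_items = ['sherlock\xa0holmes', 'star wars\xa0', 'ya lit']


def clean(item):
    # Assumption: '\xa0' is an unwanted non-breaking space; replace it with
    # a plain space and trim leading/trailing whitespace.
    return item.replace('\xa0', ' ').strip()


cleaned = [clean(item) for item in raw_items]

# Write one item per line, then read it back to confirm the round trip
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / 'myfile.txt'
    out_path.write_text('\n'.join(cleaned) + '\n', encoding='utf-8')
    round_trip = out_path.read_text(encoding='utf-8').splitlines()

print(round_trip)
```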
CodePudding user response:
The full code took 20-30 seconds to load. IDLE stops responding because the text widget of the tk GUI framework it uses freezes on extremely long lines. 10,000 characters is enough to bring it almost to a stop. Lines of hundreds of thousands or millions of characters should freeze it completely.
The error in Powershell has nothing to do with IDLE. Posting a sample list with a few unquoted items would have been enough to expose that error.
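As a rough illustration of both points, here is a small sketch. The exact PowerShell error message is not shown in the question, so attributing it to unquoted items is an assumption, and the sample strings and generated items are made up for the example. It checks that a list of unquoted multi-word items is rejected as invalid Python, and that Counter copes with 800,000 items (it is a dict subclass, so the practical limit is available memory rather than a fixed cap).

```python
from collections import Counter

# A short sample with unquoted items is already invalid Python source
sample = "list1 = [sherlock holmes, star wars]"
try:
    compile(sample, '<sample>', 'exec')
    outcome = 'compiled'
except SyntaxError:
    outcome = 'SyntaxError'
print(outcome)

# Made-up data: 800,000 items drawn from 1,000 distinct strings
items = [f'item {i % 1000}' for i in range(800_000)]
counts = Counter(items)
print(len(counts), counts['item 0'])
```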
