I have a text file containing words, numbers, and other characters. I want to delete every line that contains words or characters and keep only the lines with numbers. I noticed that all of the lines I want to remove contain the letter "r", so I based my code on that.
The text file contains lines like these:
-- for example
-- 7 Febraury 2022
5 7 1 5 3.0 2
3*2 3 5 7.0 3
I want to keep only these two lines:
5 7 1 5 3.0 2
3*2 3 5 7.0 3
This is the code I wrote:
textfile = open('test.txt', 'r')
A = textfile.readlines()
L = []
for index, name in enumerate(A):
    if 'r' in name:
        L.append(index)
for idx in sorted(L, reverse=True):
    del A[idx]
I know this is not a good way to do it. Can anyone suggest a better approach?
CodePudding user response:
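You can pick out lines with a regex. The pattern below matches any non-empty line that does not consist solely of digits:
import re
# Read every line first, because the file is overwritten below
with open(r'text_file.txt', 'r') as f:
    data = f.readlines()
with open(r'text_file.txt', 'w') as f:
    for line in data:
        # Write the line back only if it matches the pattern
        if re.findall(r"(?!^\d+$)^.+$", line):
            f.write(line)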
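For the sample lines in the question, checking each whitespace-separated token may be easier to reason about than a negative lookahead. The following is a minimal sketch, not the answer's own method. It assumes that a token is a number such as 5 or 3.0, optionally prefixed by a repeat count as in 3*2; that grammar is a guess from the sample data, and the names TOKEN and is_numeric_line are hypothetical.

```python
import re

# Assumption: a token is a number such as 5 or 3.0, optionally prefixed
# by a repeat count as in 3*2. This grammar is guessed from the sample.
TOKEN = re.compile(r"(?:\d+\*)?\d+(?:\.\d+)?")

def is_numeric_line(line):
    """Return True if the line has at least one token and all tokens are numeric."""
    tokens = line.split()
    # Require at least one token so that blank lines are dropped.
    return bool(tokens) and all(TOKEN.fullmatch(t) for t in tokens)

sample = [
    "-- for example\n",
    "-- 7 Febraury 2022\n",
    "5 7 1 5 3.0 2\n",
    "3*2 3 5 7.0 3\n",
]
kept = [line for line in sample if is_numeric_line(line)]
print(kept)
```

Because the check uses fullmatch on each token, a line such as "-- 7 Febraury 2022" is rejected even though some of its tokens are numbers.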
CodePudding user response:
If you want to do this without importing anything (e.g., re) then you could do this:
keep_these = []
def is_valid(t):
    try:
        float(t.replace('*', '0'))
        return True
    except ValueError:
        pass
    return False
with open('test.txt', encoding='utf-8') as infile:
    for line in infile:
        if all(is_valid(t) for t in line.strip().split()):
            keep_these.append(line)
print(keep_these)
The keep_these list will then contain the lines you want to keep, which you could use, for example, to rewrite the file.
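One way the rewrite step could look, as a minimal sketch rather than a definitive implementation. The function name rewrite_numeric_only is hypothetical, and the demo runs on a temporary file holding the sample lines from the question. The function overwrites the file in place, so it may be worth trying it on a copy first.

```python
import os
import tempfile

def is_valid(token):
    """Return True if the token parses as a float once '*' is replaced by '0'."""
    try:
        float(token.replace('*', '0'))
        return True
    except ValueError:
        return False

def rewrite_numeric_only(path):
    """Keep only the lines whose tokens all pass is_valid, then overwrite path."""
    with open(path, encoding='utf-8') as infile:
        keep_these = [
            line for line in infile
            if all(is_valid(t) for t in line.split())
        ]
    # Reopen in write mode only after reading has finished.
    with open(path, 'w', encoding='utf-8') as outfile:
        outfile.writelines(keep_these)
    return keep_these

# Demo on a temporary file containing the sample lines from the question.
sample = "-- for example\n-- 7 Febraury 2022\n5 7 1 5 3.0 2\n3*2 3 5 7.0 3\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    kept = rewrite_numeric_only(path)
    with open(path, encoding="utf-8") as f:
        result = f.read()
print(result)
```

Note that a blank line has no tokens, so all() returns True for it and the line is kept; add an explicit check if blank lines should be dropped as well.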
CodePudding user response:
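You can use the regex library re. One way to do that is to loop through the lines and keep a line only if re.match("[^0-9 ]", line) is None. Note that re.match only tests at the start of the string, so this check looks at the first character of each line: a line is kept if it begins with a digit or a space.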
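A sketch of that loop on the sample lines from the question. The first function applies the check as described. The second, which uses re.search and also allows '.', '*', and whitespace anywhere in the line, is an assumption about the intent rather than part of the answer. Both function names are hypothetical, and the last sample line is an invented extra that does not appear in the question.

```python
import re

def keep_by_first_char(lines):
    """Keep lines where re.match finds no disallowed character at position 0."""
    return [line for line in lines if re.match(r"[^0-9 ]", line) is None]

def keep_if_all_allowed(lines):
    """Keep lines with no character outside digits, whitespace, '.' and '*'."""
    return [line for line in lines if re.search(r"[^0-9\s.*]", line) is None]

sample = [
    "-- for example\n",
    "-- 7 Febraury 2022\n",
    "5 7 1 5 3.0 2\n",
    "3*2 3 5 7.0 3\n",
    "7 Febraury 2022\n",  # hypothetical extra line, not in the question
]
print(keep_by_first_char(sample))   # also keeps the hypothetical extra line
print(keep_if_all_allowed(sample))  # drops it
```

On the four lines from the question both functions give the same result; they differ only for lines that start with a digit but contain words later on.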
