Python how to cut a certain number of lines from a file and write them into another

Time:01-10

I'm starting to learn Python for everyday tasks, and so far I have only dealt with simple file processing. Today, though, I had to process a large number of heavy files, and this is where I ended up.

I have a large file with the following contents:

TRACE:1
10000.000000;-4.316597;
10050.000000;-4.686951;
10100.000000;-5.178696;
10150.000000;-4.356827;
10200.000000;-4.620125;
.....
TRACE 2:
10000.000000;-50.371719;
10050.000000;-52.572052;
10100.000000;-60.795563;
10150.000000;-50.679413;
10200.000000;-62.036072;
.....
TRACE 3:
10000.000000;-10.796394;
10050.000000;-10.879318;
10100.000000;-11.238129;
10150.000000;-10.811073;
10200.000000;-10.627502;
10250.000000;-10.825951;
10300.000000;-11.240158;
TRACE 4:
Nope;Nope;

I need to split the trace data into separate files, one per trace:

Input: data.DAT --> Output: Trace1.DAT, Trace2.DAT, Trace3.DAT

I tried to do it this way:

import os
os.chdir('C:\path\to\work\dir')
with open('data.DAT') as dataFile, \
        open('Trace1.DAT', "w+") as T1,\
        open('Trace2.DAT', "w+") as T2,\
        open('Trace3.DAT', "w+") as T3:
    a = []
    for num, line in enumerate(dataFile, 1):
        if 'TRACE 1:' in line:
            a.append(num)
        if 'TRACE 2:' in line:
            a.append(num)
        if 'TRACE 3:' in line:
            a.append(num)
        if 'TRACE 4:' in line:
            a.append(num)
    print(a)
    [dataFile.readline() for _ in range(a[0], a[1])]
    T1.writelines(dataFile)
    [dataFile.readline() for _ in range(a[1], a[2])]
    T2.writelines(dataFile)
    [dataFile.readline() for _ in range(a[2], a[3])]
    T3.writelines(dataFile)

This is the Python output I got:

a = [42, 1999847, 3999652, 5999457]

Created files:

drwxrwxrwx 1 user user       512 Jan 10 17:31 .
drwxrwxrwx 1 user user       512 Jan 10 17:31 ..
-rwxrwxrwx 1 user user         0 Jan 10 16:46 Trace1.DAT
-rwxrwxrwx 1 user user         0 Jan 10 16:47 Trace2.DAT
-rwxrwxrwx 1 user user         0 Jan 10 16:47 Trace3.DAT
-rwxrwxrwx 1 user user 173082112 Jan 11  2019 data.DAT

Thanks for the help or advice

CodePudding user response:

You can try:

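with open('data.DAT') as inp:
    out = None
    try:
        for line in inp:
            if line.startswith('TRACE'):
                # A header line starts a new trace: close the previous output file.
                if out is not None:
                    out.close()
                # Keep only the digits, so 'TRACE 2:' and 'TRACE:1' both work.
                number = ''.join(ch for ch in line if ch.isdigit())
                out = open(f'Trace{number}.DAT', 'w')
            elif out is not None:
                # Data line: copy it to the current trace file.
                # Lines before the first header are skipped.
                out.write(line)
    finally:
        if out is not None:
            out.close()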
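
As for why the files in the question come out empty: the enumerate() loop reads dataFile all the way to the end, so every later readline() and writelines(dataFile) call finds nothing left to read. The minimal sketch below shows that behaviour; it uses io.StringIO with made-up contents as a stand-in for data.DAT, which is an assumption for illustration only.

```python
import io

# Stand-in for data.DAT (hypothetical contents); io.StringIO behaves like an open text file.
data_file = io.StringIO("TRACE 1:\n1;2;\nTRACE 2:\n3;4;\n")

# First pass: iterating over the file consumes it up to the end.
header_lines = [num for num, line in enumerate(data_file, 1)
                if line.startswith("TRACE")]

# The position is now at the end of the file, so a further read returns ''.
after_loop = data_file.readline()

# seek(0) rewinds to the start, after which the lines can be read again.
data_file.seek(0)
after_seek = data_file.readline()
```

So the original script would need a dataFile.seek(0) after the loop before reading again, or it can write the lines during a single pass as in the code above.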

CodePudding user response:

A bit lengthy, but it works :)

with open('data.DAT', 'r') as filereader:
    read = filereader.readlines()  # loads the whole file into memory

trace_one = None
trace_two = None
trace_three = None
# enumerate() gives the index directly; read.index(i) would rescan the list each time.
for index, line in enumerate(read):
    # Accept both header spellings for the first trace.
    if line.startswith(("TRACE:1", "TRACE 1:")):
        trace_one = index
    elif line.startswith("TRACE 2:"):
        trace_two = index
    elif line.startswith("TRACE 3:"):
        trace_three = index

# The lines already end with "\n", so they are written as they are.
# Each slice starts at its header line; the last one runs to the end of the file.
with open('TRACE1.DAT', 'w') as write_trace_one:
    write_trace_one.writelines(read[trace_one:trace_two])

with open('TRACE2.DAT', 'w') as write_trace_two:
    write_trace_two.writelines(read[trace_two:trace_three])

with open('TRACE3.DAT', 'w') as write_trace_three:
    write_trace_three.writelines(read[trace_three:])
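
If the header line numbers are already known, as with the list printed in the question, the ranges can also be copied without loading the whole file. This is only a sketch under that assumption: copy_line_ranges, its parameters, and the Trace<N>.DAT naming are hypothetical names chosen for illustration, and itertools.islice is used to pull a fixed number of lines from the open file.

```python
from itertools import islice
from pathlib import Path


def copy_line_ranges(source, boundaries, out_dir):
    """Copy the lines between consecutive header lines to Trace<N>.DAT.

    boundaries holds 1-based header line numbers in ascending order.
    Returns the list of files that were written.
    """
    out_dir = Path(out_dir)
    written = []
    with open(source) as src:
        # Skip everything up to and including the first header line.
        for _ in islice(src, boundaries[0]):
            pass
        pairs = zip(boundaries, boundaries[1:])
        for index, (start, stop) in enumerate(pairs, 1):
            target = out_dir / f"Trace{index}.DAT"
            with open(target, "w") as out:
                # Data lines of this trace are lines start+1 .. stop-1.
                out.writelines(islice(src, stop - start - 1))
            # Consume the next header line so it is not copied.
            next(src, None)
            written.append(target)
    return written
```

Called as copy_line_ranges('data.DAT', [42, 1999847, 3999652, 5999457], '.'), it would write Trace1.DAT to Trace3.DAT, each without its header line.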