Trouble looping through a folder of txt files to create new dataframe of averages

Time:02-04

I have a folder of text files, with each file in the following format. File_0:

X_0 Y_0
1 3
2 4

File_1

X_1 Y_1
5 6
6 7

etc.

I'm trying to loop through the folder, read each file into a dataframe, and reduce that dataframe to its column averages. I then want to combine the results into a single dataframe in which each row holds the averages of one file. So something similar to this:

X_avg Y_avg
1.5 3.5
5.5 6.5
etc etc

This is the code I have so far:

#create empty df to concatenate the rows to.
df = pd.DataFrame([[]], columns=['x_means', 'y_means'])
directory_in_str = "my_path/data"
dir = os.fsencode(directory_in_str)
for file in os.listdir(dir):
    filename = os.fsencode(file)
    data_frame = pd.read_table(dir + "/" + filename, sep = ' ')
    #this part accurately gets the file read in as a df and i get stuck after this
    x_means = data_frame['x'].mean()
    y_means = data_frame['y'].mean()
    df2 = pd.DataFrame([[x_means, y_means]], columns=['x_means', 'y_means'])
    #here I try to concatenate the new row to the old rows
    pd.concat([df2, df])

Is there a different approach I should be taking to do this? Thanks

CodePudding user response:

You can try:

import pandas as pd
import pathlib

directory_in_str = 'my_path/data'

dfs = []
for filename in pathlib.Path(directory_in_str).iterdir():
    dfs.append(pd.read_table(filename, sep=' ').mean())
df = pd.concat(dfs, axis=1).T.add_suffix('_mean')

Output:

>>> df
   X_mean  Y_mean
0     5.5     6.5
1     1.5     3.5
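
A few notes on why the loop in the question produced nothing, plus a slightly more defensive sketch of the same idea. pd.concat returns a new object rather than modifying its inputs, so calling it without assigning the result discards the new row. The snippet above also only lines the columns up when every file uses the same header names; with headers like X_0/Y_0 and X_1/Y_1, as in the question, the columns would likely need renaming first. The version below assumes each file has exactly two space-separated columns (X first, then Y) and a .txt extension. The function name folder_means and the file names file_0.txt and file_1.txt are made up for illustration.

```python
import pathlib
import tempfile

import pandas as pd


def folder_means(directory):
    """Return a dataframe with one row of column means per .txt file."""
    rows = []
    # sorted() gives a deterministic row order; iterdir()/glob() alone
    # make no promise about ordering.
    for path in sorted(pathlib.Path(directory).glob("*.txt")):
        frame = pd.read_csv(path, sep=" ")
        # Assumption: two columns, X then Y. Renaming by position makes
        # X_0/Y_0, X_1/Y_1, ... share the same labels across files.
        frame.columns = ["X", "Y"]
        # Name each Series after its file so the final rows are labelled.
        rows.append(frame.mean().rename(path.stem))
    # pd.concat returns a new object, so the result must be kept.
    return pd.concat(rows, axis=1).T.add_suffix("_avg")


if __name__ == "__main__":
    # Build two throwaway files shaped like the ones in the question.
    with tempfile.TemporaryDirectory() as tmp:
        folder = pathlib.Path(tmp)
        (folder / "file_0.txt").write_text("X_0 Y_0\n1 3\n2 4\n")
        (folder / "file_1.txt").write_text("X_1 Y_1\n5 6\n6 7\n")
        print(folder_means(folder))
```

Collecting the per-file means in a list and concatenating once at the end also avoids growing a dataframe inside the loop, which is the pattern the question was attempting.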