I have a folder of text files, with each file having this format. File_0:
| X_0 | Y_0 |
|---|---|
| 1 | 3 |
| 2 | 4 |
File_1:
| X_1 | Y_1 |
|---|---|
| 5 | 6 |
| 6 | 7 |
etc.
I'm trying to loop through the files, read each one into a dataframe, reduce that dataframe to its column averages, and then combine the results so that each row holds the averages of one file. So something similar to this:
| X_avg | Y_avg |
|---|---|
| 1.5 | 3.5 |
| 5.5 | 6.5 |
| etc | etc |
This is the code I have so far:
# create an empty df to concatenate the rows to
df = pd.DataFrame([[]], columns=['x_means', 'y_means'])
directory_in_str = "my_path/data"
dir = os.fsencode(directory_in_str)
for file in os.listdir(dir):
    filename = os.fsencode(file)
    data_frame = pd.read_table(dir + "/" + filename, sep=' ')
    # this part accurately gets the file read in as a df and I get stuck after this
    x_means = data_frame['x'].mean()
    y_means = data_frame['y'].mean()
    df2 = pd.DataFrame([[x_means, y_means]], columns=['x_means', 'y_means'])
    # here I try to concatenate the new row to the old rows
    pd.concat([df2, df])
Is there a different approach I should be taking to do this? Thanks
CodePudding user response:
You can try:
import pandas as pd
import pathlib
directory_in_str = 'my_path/data'
dfs = []
for filename in pathlib.Path(directory_in_str).iterdir():
    dfs.append(pd.read_table(filename, sep=' ').mean())
df = pd.concat(dfs, axis=1).T.add_suffix('_mean')
Output:
>>> df
   X_mean  Y_mean
0     5.5     6.5
1     1.5     3.5
