I have many CSVs with the same fields and the same name ("data.csv"). Each CSV has a header followed by multiple lines, and each sits in a different folder.
E.g. folder1 has a csv called data.csv:
NAME, COUNTRY
JOHN, USA
MARY, Panama
folder2 has a csv called data.csv:
NAME, COUNTRY
James, UK
Jim, India
folder3 has a csv called data.csv:
NAME, COUNTRY
James, UK
Jim, India
Now I want to combine all the CSVs into one, but without repeating the headers.
So far I am doing:
find . -name "data.csv" | xargs cat > mergedCSV
Which works fine, except for the repeated headers.
CodePudding user response:
You can use csvstack from the handy csvkit package to concatenate multiple CSV files with the same layout:
find . -name data.csv | xargs csvstack > mergedCSV
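If csvkit is not installed, the same header-skipping idea can be sketched with plain awk. This is only a rough equivalent, not what csvstack does internally: it assumes every file has exactly one header line, that no quoted field contains a newline, and that your find, sort and xargs support -print0, -z and -0. The temporary directory and sample files below are made up for the demo.

```shell
# Hypothetical demo layout: three folders, each holding a data.csv
workdir=$(mktemp -d)
mkdir -p "$workdir/folder1" "$workdir/folder2" "$workdir/folder3"
printf 'NAME, COUNTRY\nJOHN, USA\nMARY, Panama\n' > "$workdir/folder1/data.csv"
printf 'NAME, COUNTRY\nJames, UK\nJim, India\n' > "$workdir/folder2/data.csv"
printf 'NAME, COUNTRY\nJames, UK\nJim, India\n' > "$workdir/folder3/data.csv"

# NR == 1 keeps the very first line read (the header of the first file);
# FNR > 1 keeps every line after the header in each file.
# -print0 / -z / -0 keep paths containing spaces intact; sort -z fixes the order.
find "$workdir" -name data.csv -print0 | sort -z |
  xargs -0 awk 'NR == 1 || FNR > 1' > "$workdir/mergedCSV"
```

One caveat: with a very large number of files, xargs may split them across several awk invocations, and each invocation would print its own header again.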
CodePudding user response:
You can do this easily with Miller, using its built-in "cat" verb:
find . -name data.csv | xargs mlr --csv cat
If you want pretty-printed output, here with three files as input:
mlr --opprint --barred --icsv cat a/data.csv b/data.csv c/data.csv
+-------+---------+
| NAME  | COUNTRY |
+-------+---------+
| JOHN  | USA     |
| MARY  | Panama  |
| James | UK      |
| Jim   | India   |
| James | UK      |
| Jim   | India   |
+-------+---------+
