I have a CSV file that contains a bunch of data, with one of the columns being a date. I am trying to extract all lines that have dates in a specific year and save them into a new file.
The format of the file is like this, with the date and time in the second column:
000000000,10/04/2021 02:10:15 AM,.....
So far I tried:
grep -E ^2020 data.csv >> temp.csv
But it just produced an empty temp.csv file. Any ideas on how I can do this?
CodePudding user response:
One potential solution is with awk:
awk -F"," '$2 ~ /\/2020 /' data.csv > temp.csv
Another potential option is with grep:
grep "/2020 " data.csv > temp.csv
However, the grep solution may detect "/2020 " elsewhere in the file, rather than in column 2.
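To make that caveat concrete, here is a small sketch on made-up data (the rows and the temporary file are assumptions for illustration, not the asker's real file). The second row has a 2021 date in column 2 but mentions a 2020 date in a later column:

```shell
# Hypothetical sample data: row 2 contains "/2020 " only in column 3.
tmpdir=$(mktemp -d)
cat > "$tmpdir/data.csv" <<'EOF'
000000001,10/04/2020 02:10:15 AM,first
000000002,10/04/2021 02:10:15 AM,renewed 01/01/2020 by phone
000000003,11/05/2021 03:00:00 PM,third
EOF

# awk tests only the second field, so it keeps row 1 only.
awk_out=$(awk -F"," '$2 ~ /\/2020 /' "$tmpdir/data.csv")

# grep searches the whole line, so it also keeps row 2.
grep_out=$(grep "/2020 " "$tmpdir/data.csv")

printf 'awk:\n%s\n\ngrep:\n%s\n' "$awk_out" "$grep_out"
```

If no other column in your data can contain a date, both commands should give the same result.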
CodePudding user response:
Although the awk solution is best here, e.g.
awk -F, 'index($2, "/2021 ")' file
grep can also be used here:
grep '^[^,]*,[^,]*/2021 ' file
Notes:
- awk -F, 'index($2, "/2021 ")' splits the lines (records) into fields with a comma (see -F,), and if there is a "/2021 " substring (/2021 followed by a space) in the second field ($2), the line is printed.
- The '^[^,]*,[^,]*/2021 ' pattern in the grep command matches:
  ^ - start of string
  [^,]* - zero or more non-comma chars
  ,[^,]* - a , and zero or more non-comma chars
  /2021 followed by a space - a literal substring.
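Since the question asks about "a specific year", both commands can also take the year from a shell variable. The following is only a sketch: the variable name year, the awk variable y, and the sample rows are assumptions for illustration.

```shell
# Hypothetical sample data; row 2 mentions 2021 only in the third column.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file" <<'EOF'
000000001,10/04/2021 02:10:15 AM,first
000000002,10/04/2020 02:10:15 AM,moved to 01/02/2021 slot
000000003,11/05/2021 03:00:00 PM,third
EOF

year=2021   # assumed variable name, not part of the original answers

# Pass the year into awk with -v and build the "/YYYY " substring there.
awk_rows=$(awk -F, -v y="$year" 'index($2, "/" y " ")' "$tmpdir/file")

# Same filter with grep: the anchored prefix pins the match to column 2.
grep_rows=$(grep "^[^,]*,[^,]*/$year " "$tmpdir/file")

printf '%s\n' "$awk_rows"
```

On this sample, both commands should select rows 1 and 3 and skip row 2, because neither looks past the second comma-separated field. Note that this simple field splitting assumes the first two columns never contain quoted commas.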
