The date column in my dataframe has the following format:
typeof(DateTime("2021-12-17T06:00:00"))
Feather.Arrow.Timestamp{Microsecond}
I want to filter the data by date, but I can't because of the type. I tried to chop it, but that also failed because of the type:
MethodError: no method matching chop(::Feather.Arrow.Timestamp{Microsecond}; head=10, tail=2)
Closest candidates are:
chop(::AbstractString; head, tail) at strings/util.jl:184
So I tried to change the type using parse, but that is not allowed either. In R I use filter and have no problem. What can I do?
CodePudding user response:
Going by the title (date from string), here's one way:
julia> using Dates
julia> dt = DateTime("2021-12-17T06:00:00")
2021-12-17T06:00:00
julia> typeof(dt)
DateTime
julia> Dates.day(dt)
17
julia> Dates.monthabbr(dt)
"Dec"
# etc...
Or in a data frame:
julia> using DataFrames
julia> df = DataFrame(date=["2021-12-17T06:00:00","2021-12-18T06:00:00","2021-12-19T06:00:00","2021-12-20T06:00:00","2021-12-21T06:00:00"])
5×1 DataFrame
Row │ date
│ String
─────┼─────────────────────
1 │ 2021-12-17T06:00:00
2 │ 2021-12-18T06:00:00
3 │ 2021-12-19T06:00:00
4 │ 2021-12-20T06:00:00
5 │ 2021-12-21T06:00:00
julia> df.date = DateTime.(df[:,:date])
julia> df
5×1 DataFrame
Row │ date
│ DateTime
─────┼─────────────────────
1 │ 2021-12-17T06:00:00
2 │ 2021-12-18T06:00:00
3 │ 2021-12-19T06:00:00
4 │ 2021-12-20T06:00:00
5 │ 2021-12-21T06:00:00
julia> Dates.day.(df.date)
5-element Vector{Int64}:
17
18
19
20
21
CodePudding user response:
To filter a DataFrame's rows by value, see the Subsetting section of the DataFrames manual.
julia> using DataFrames, Dates
julia> df = DataFrame(n = 1:16, dates = DateTime("2021-12-17T06:00:00"):Day(1):DateTime("2022-01-01T06:00:00"));
julia> summary(df)
"16×2 DataFrame"
julia> df[DateTime("2021-12-25") .<= df.dates .<= DateTime("2021-12-31"), :]
6×2 DataFrame
Row │ n dates
│ Int64 DateTime
─────┼────────────────────────────
1 │ 9 2021-12-25T06:00:00
2 │ 10 2021-12-26T06:00:00
3 │ 11 2021-12-27T06:00:00
4 │ 12 2021-12-28T06:00:00
5 │ 13 2021-12-29T06:00:00
6 │ 14 2021-12-30T06:00:00
julia> #OR:
julia> datefilter(dates) = DateTime("2021-12-25") .<= dates .<= DateTime("2021-12-31")
datefilter (generic function with 1 method)
julia> subset(df, :dates => datefilter)
6×2 DataFrame
Row │ n dates
│ Int64 DateTime
─────┼────────────────────────────
1 │ 9 2021-12-25T06:00:00
2 │ 10 2021-12-26T06:00:00
3 │ 11 2021-12-27T06:00:00
4 │ 12 2021-12-28T06:00:00
5 │ 13 2021-12-29T06:00:00
6 │ 14 2021-12-30T06:00:00
For the specifics of what you want to accomplish, it would help most if you showed us the actual code you've tried and what it is meant to do. (chop in Julia operates on strings. A tidyr-style chop can often be replaced by groupby instead, but that again depends on what your end goal is.)
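To illustrate the point about chop, here is a minimal sketch using only the Dates standard library. The variable names and the date bounds are made up for illustration, and it assumes the values are already real DateTime objects (it does not cover the Feather timestamp type from the question):

```julia
using Dates

# chop works on strings: it drops `head` leading and `tail` trailing characters.
s = "2021-12-17T06:00:00"
date_part = chop(s; head = 0, tail = 9)   # strips the 9-character "T06:00:00" suffix

# With a real DateTime, use the Dates accessors rather than string slicing.
dt = DateTime(s)
d = Date(dt)                              # calendar date without the time of day

# Filter a plain vector of DateTime values by a (hypothetical) date range.
stamps = collect(DateTime(2021, 12, 17, 6):Day(1):DateTime(2021, 12, 21, 6))
kept = filter(t -> Date(2021, 12, 18) <= Date(t) <= Date(2021, 12, 20), stamps)
```

Comparing on Date(t) rather than on the raw DateTime keeps the whole of the last day in range, regardless of the time of day.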
CodePudding user response:
The problem I have is that my DataFrame looks like this:
julia> df
5×1 DataFrame
Row │ date
│ DateTime
─────┼─────────────────────
1 │ DateTime(2021-12-17T06:00:00)
2 │ DateTime(2021-12-18T06:00:00)
3 │ DateTime(2021-12-19T06:00:00)
4 │ DateTime(2021-12-20T06:00:00)
5 │ DateTime(2021-12-21T06:00:00)
Rather than:
Row │ date
│ DateTime
─────┼─────────────────────
1 │ 2021-12-17T06:00:00
2 │ 2021-12-18T06:00:00
3 │ 2021-12-19T06:00:00
4 │ 2021-12-20T06:00:00
5 │ 2021-12-21T06:00:00
So that's why I can't filter the data: the types differ. The first has type Vector{Feather.Arrow.Timestamp{Microsecond}}, while the second has type DateTime.
