I have a dataframe like:
| Month | id_product |
|---|---|
| 2020_05 | 0_0 |
| 2020_06 | 0_22 |
| 2020_06 | 0_22 |
| 2021_01 | 2_52 |
| 2023_02 | 0_0 |
| 2023_02 | 0_12 |
I would like to get the amount of each product sold each month, like:
| Month | 0_0 | 0_12 | 0_22 | 2_52 |
|---|---|---|---|---|
| 2020_05 | 1 | 0 | 0 | 0 |
| 2020_06 | 0 | 0 | 2 | 0 |
| 2021_01 | 0 | 0 | 0 | 0 |
| 2023_02 | 1 | 1 | 0 | 0 |
So I can find the best/worst sellers per month, quarters, years, etc.
I have tried several methods (aggregate, etc.), but I did not find the way to do it with the 70 000 transactions and 3 000 id products of my database. Maybe I am not taking the issue by the right side. I am looking for the right strategy.
CodePudding user response:
in Base R:
a <- table(df)
b <- cbind(Month = rownames(a), as.data.frame.matrix(a))
rownames(b) <- NULL
b
Month 0_0 0_12 0_22 2_52
1 2020_05 1 0 0 0
2 2020_06 0 0 2 0
3 2021_01 0 0 0 1
4 2023_02 1 1 0 0
CodePudding user response:
df <- structure(list(Month = c("2020_05", "2020_06", "2020_06", "2021_01",
"2023_02", "2023_02"), id_product = c("0_0", "0_22", "0_22",
"2_52", "0_0", "0_12")), row.names = c(NA, -6L), class = "data.frame")
library(data.table)
setDT(df)
dcast(df, Month ~ id_product, fun.aggregate = length)
#> Key: <Month>
#> Month 0_0 0_12 0_22 2_52
#> <char> <int> <int> <int> <int>
#> 1: 2020_05 1 0 0 0
#> 2: 2020_06 0 0 2 0
#> 3: 2021_01 0 0 0 1
#> 4: 2023_02 1 1 0 0
Created on 2022-01-27 by the reprex package (v2.0.1)
CodePudding user response:
library(data.table)
dummy data
df <- data.table(month = c('2020_01', '2020_01', '2020_01', '2021_02', '2021_02', '2021_02')
, id_product = c(1,2,2,1,1,2)
); df
month id_product
1: 2020_01 1
2: 2020_01 2
3: 2020_01 2
4: 2021_02 1
5: 2021_02 1
6: 2021_02 2
summary - each product's quantity sold per month
x <- df[, .(q = .N), .(month, id_product)][]
month id_product q
1: 2020_01 1 1
2: 2020_01 2 2
3: 2021_02 1 2
4: 2021_02 2 1
max q sold - best selling product per month
x[x[, .I[q == max(q)], .(month)]$V1]
month id_product q
1: 2020_01 2 2
2: 2021_02 1 2
