Impute missing variable id's into a time series panel

Time:01-08

For a time series analysis, I want to use a data frame that looks like this:

data <- data.frame(
  Store_ID = as.character(c(seq(1, length.out = 10),
                            seq(1, length.out = 9),
                            c(1, 2, 3, 4, 6, 7, 8, 9))),
  amount_sold = seq(1, 9, length.out = 27),
  date = c(rep(as.Date("2015-01-01"), 10),
           rep(as.Date("2015-01-02"), 9),
           rep(as.Date("2015-01-03"), 8))
)

As you can see, there are 10 Store_IDs for the first date (2015-01-01), but only 9 for the second date and 8 for the last date.

For my analysis I need to add the Store_IDs that are missing on the second and third dates. The result should have 30 rows, with amount_sold set to 0 for each Store_ID that was missing.

CodePudding user response:

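Try complete() from tidyr. It adds a row for every Store_ID/date combination that is missing from the data and fills amount_sold with 0 in those rows: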

library(tidyr)

# Rebuild the example data, then complete the panel
data <- data.frame(
  Store_ID = as.character(c(seq(1, length.out = 10),
                            seq(1, length.out = 9),
                            c(1, 2, 3, 4, 6, 7, 8, 9))),
  amount_sold = seq(1, 9, length.out = 27),
  date = c(rep(as.Date("2015-01-01"), 10),
           rep(as.Date("2015-01-02"), 9),
           rep(as.Date("2015-01-03"), 8))
) %>%
  # add every missing Store_ID/date pair and set its amount_sold to 0
  complete(Store_ID, date, fill = list(amount_sold = 0))
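
If you would rather not depend on a package, roughly the same result can be built in base R. This is only a sketch of the idea behind complete(): build every Store_ID/date combination, join the observed rows onto it, and replace the resulting NAs with 0. The names `grid` and `full` are illustrative and not part of the question, and the row order may differ from the tidyr result.

```r
# Rebuild the example data from the question.
data <- data.frame(
  Store_ID = as.character(c(seq(1, length.out = 10),
                            seq(1, length.out = 9),
                            c(1, 2, 3, 4, 6, 7, 8, 9))),
  amount_sold = seq(1, 9, length.out = 27),
  date = c(rep(as.Date("2015-01-01"), 10),
           rep(as.Date("2015-01-02"), 9),
           rep(as.Date("2015-01-03"), 8))
)

# Every Store_ID/date pair that should exist (10 stores x 3 dates).
grid <- expand.grid(
  Store_ID = unique(data$Store_ID),
  date = unique(data$date),
  stringsAsFactors = FALSE
)

# Left join: keep all pairs from the grid; unobserved pairs get NA.
full <- merge(grid, data, by = c("Store_ID", "date"), all.x = TRUE)

# Replace the NAs introduced by the join with 0.
full$amount_sold[is.na(full$amount_sold)] <- 0

nrow(full)                  # 30
sum(full$amount_sold == 0)  # 3 (store 10 on 2015-01-02, stores 5 and 10 on 2015-01-03)
```

Either way, checking nrow() and the number of zero rows is a quick way to confirm that the panel is balanced before running the time series analysis.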