Data too large to do calculation after calculation

Time:01-29

I have a df that looks like this:

   TailNum  Year year 
   <chr>   <dbl> <chr>
 1 N605AW   2006 1997 
 2 N309AW   2006 1990 
 3 N104UW   2006 1999 
 4 N162UW   2006 2001 
 5 N665AW   2006 2001 
 6 N659AW   2006 2000 
 7 N837AW   2006 2005 
 8 N751UW   2006 2000 
 9 N770UW   2006 2000 
10 N746UW   2006 2000

I am trying to subtract the year column from the Year column to get an Age column.

I've tried:

#convert years to numeric  
df_tailnum <- df_tailnum %>%
  unlist(df_tailnum$year) %>% 
  as.numeric(df_tailnum$year)

#find age of planes
df_tailnum$Age <- df_tailnum["Year"] - df_tailnum["year"]

But the result turns out to be too large for me to do anything else with it (it shows up as: large list, 1.GB).

Am I doing something wrong? Or is there a simpler way for me to get the result?

Steps so far:

> #find manufacturing date of planes 
> df_tailnum <- inner_join(df_tailnum, select(df_planes, tailnum, year), by = c(TailNum = "tailnum")) %>%
    drop_na() %>% 
    filter(year != "0") %>% 
    filter(year != "0000") %>% 
    filter(year != "none") %>% 
    filter(year != "None")
> head(df_tailnum)
# A tibble: 6 × 3
  TailNum  Year year 
  <chr>   <dbl> <chr>
1 N605AW   2006 1997 
2 N309AW   2006 1990 
3 N104UW   2006 1999 
4 N162UW   2006 2001 
5 N665AW   2006 2001 
6 N659AW   2006 2000 
> dim(df_tailnum)
[1] 5951272       3
> #convert years to numeric  
> df_tailnum <- df_tailnum %>%
    unlist(df_tailnum$year) %>% 
    as.numeric(df_tailnum$year)
Warning message:
In df_tailnum %>% unlist(df_tailnum$year) %>% as.numeric(df_tailnum$year) :
  NAs introduced by coercion
> head(df_tailnum)
[1] NA NA NA NA NA NA
> dim(df_tailnum)
NULL

CodePudding user response:

Please delete this conversion step. The pipe passes the whole data frame to unlist() as its first argument, so df_tailnum is overwritten with a flat vector, which is why head() shows NA and dim() returns NULL afterwards:

df_tailnum <- df_tailnum %>%
    unlist(df_tailnum$year) %>% 
    as.numeric(df_tailnum$year)
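To see what that conversion step does, here is a minimal sketch on a toy data frame. It assumes a base R data.frame named `small` (a hypothetical name) holding the first three rows from the question, and it calls unlist() directly on the data frame, since that is what the data frame ends up being piped into.

```r
# Toy data with the same column types as in the question (first three rows).
small <- data.frame(
  TailNum = c("N605AW", "N309AW", "N104UW"),
  Year    = c(2006, 2006, 2006),
  year    = c("1997", "1990", "1999"),
  stringsAsFactors = FALSE
)

# `small %>% unlist(small$year)` is evaluated as `unlist(small, small$year)`:
# the pipe supplies the whole data frame as the first argument.
flat <- unlist(small)                       # all columns flattened into one character vector
nums <- suppressWarnings(as.numeric(flat))  # tail numbers cannot be parsed, so they become NA

length(nums)   # 9: three columns times three rows
dim(nums)      # NULL: this is no longer a data frame
head(nums, 3)  # NA NA NA: the TailNum entries come first
```

On the real data the same thing happens with 5951272 rows, so the flattened vector has three times that many elements.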

And use this:

df_tailnum$Age <- df_tailnum$Year - as.numeric(df_tailnum$year)
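A small self-contained check of that line, as a sketch: it rebuilds the first six rows from the question as a base R data.frame (the question uses a tibble, but the `$` column access works the same way) and applies the subtraction.

```r
# First six rows from the question; year is stored as character, Year as double.
df_tailnum <- data.frame(
  TailNum = c("N605AW", "N309AW", "N104UW", "N162UW", "N665AW", "N659AW"),
  Year    = rep(2006, 6),
  year    = c("1997", "1990", "1999", "2001", "2001", "2000"),
  stringsAsFactors = FALSE
)

# Convert only the year column, element by element, and subtract.
df_tailnum$Age <- df_tailnum$Year - as.numeric(df_tailnum$year)

str(df_tailnum)  # still a data frame, now with a numeric Age column
```

Because `$` extracts a plain vector, the result is a numeric column of the same length as the data frame rather than a nested list.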

Alternative (convert the column in place first, then subtract):

df_tailnum$year <- df_tailnum$year %>% 
    as.numeric()
df_tailnum$Age <- df_tailnum$Year - df_tailnum$year
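If placeholder values such as "None" or "0000" are still present when the column is converted, as.numeric() turns unparseable strings into NA with a warning. The sketch below is one possible way to handle that in base R by converting first and filtering on the numeric value. The data frame `planes` and the last three tail numbers are hypothetical, made up for illustration.

```r
# Hypothetical sample that includes the placeholder values filtered out in the question.
planes <- data.frame(
  TailNum = c("N605AW", "N309AW", "N000XX", "N111YY", "N222ZZ"),
  Year    = c(2006, 2006, 2006, 2006, 2006),
  year    = c("1997", "1990", "None", "0000", NA),
  stringsAsFactors = FALSE
)

# Convert once; "None" cannot be parsed and becomes NA (warning silenced on purpose).
planes$year <- suppressWarnings(as.numeric(planes$year))

# Keep only rows with a usable manufacturing year: drops NA and 0 ("0000" parses to 0).
planes <- planes[!is.na(planes$year) & planes$year > 0, ]

planes$Age <- planes$Year - planes$year
planes
```

Only the two rows with real years remain, so the Age column is computed on clean numeric data.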
