Home > Enterprise >  Pivot_longer and Pivot wider syntax
Pivot_longer and Pivot wider syntax

Time:01-30

I want to ask for ideas on creating a syntax to pivot_longer given on this.

I've already tried researching in the internet but I can't seem to find any examples that is similar to my data given where it has a Metric column which is also seperated in 3 different columns of months.

My desire final output is to have seven columns consisting of (regions,months, and the five Metrics)

How to formulate the pivot_longer and pivot_wider syntax to clean my data in order for me to visualize it?

CodePudding user response:

The tricky part isn't pivot_longer. You first have to clean your Excel spreadsheet, i.e. get rid of empty rows and merge the two header rows containing the names of the variables and the dates.

One approach to achieve your desired result may look like so:

library(readxl)
library(tidyr)
library(janitor)
library(dplyr)

x <- read_excel("data/Employment.xlsx", skip = 3, col_names = FALSE) %>% 
  # Get rid of empty rows and cols
  janitor::remove_empty()

# Make column names
col_names <- data.frame(t(x[1:2,])) %>% 
  fill(1) %>% 
  unite(name, 1:2, na.rm = TRUE) %>% 
  pull(name)

x <- x[-c(1:2),]
names(x) <- col_names

# Convert to long and values to numerics
x %>% 
  pivot_longer(-Region, names_to = c(".value", "months"), names_sep = "_") %>% 
  separate(months, into = c("month", "year")) %>% 
  mutate(across(!c(Region, month, year), as.numeric))
#> # A tibble: 6 × 8
#>   Region     month year  `Total Population … `Labor Force Part… `Employment Rat…
#>   <chr>      <chr> <chr>               <dbl>              <dbl>            <dbl>
#> 1 Philippin… April 2020f              73722.               55.7             82.4
#> 2 Philippin… Janu… 2021p              74733.               60.5             91.3
#> 3 Philippin… April 2021p              74971.               63.2             91.3
#> 4 National … April 2020f               9944.               54.2             87.7
#> 5 National … Janu… 2021p              10051.               57.2             91.2
#> 6 National … April 2021p              10084.               60.1             85.6
#> # … with 2 more variables: Unemployment Rate <dbl>, Underemployment Rate <dbl>
  •  Tags:  
  • Related