Vern Greenholt

1594967160

Five Tidyverse Tricks You May Not Know About

It struck me recently, while collaborating with a number of other tidyverse users, that many people are not aware of everything this collection of packages offers for day-to-day data wrangling. In particular, two critical packages have had major updates in the past year and have introduced new features which I regard as transformative, allowing users to step up a gear in the control of their data and in the efficiency of their code.

In late 2019, tidyr 1.0.0 was released. Of its many updates, the key ones were the new functions pivot_longer() and pivot_wider(), which give better control over transforming dataframes between wide and long form, one of the most common data-wrangling tasks. Replacing gather() and spread(), these functions offer more ways to manage the specifics of the transformation, cutting the time users spend tailoring their outputs.
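As a rough illustration of the change, the sketch below reshapes the same small table with the older gather() and with pivot_longer(), then reverses it with pivot_wider(). The scores data frame and its column names are hypothetical toy data, not taken from the article.

```r
library(tidyr)

# Hypothetical toy data (an assumption for illustration):
# one row per student, one column per term.
scores <- data.frame(
  student = c("a", "b"),
  term1 = c(70, 80),
  term2 = c(75, 85),
  stringsAsFactors = FALSE
)

# Older interface: gather() collects the term columns into key/value pairs.
long_old <- gather(scores, key = "term", value = "score", term1, term2)

# Newer interface: pivot_longer() names the same pieces more explicitly.
long_new <- pivot_longer(
  scores,
  cols = c(term1, term2),
  names_to = "term",
  values_to = "score"
)

# pivot_wider() reverses the reshaping, as spread() used to.
wide_again <- pivot_wider(long_new, names_from = term, values_from = score)
```

Both long tables hold the same values, although the rows may come back in a different order.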

In early 2020, dplyr 1.0.0 was released. The release brought a vast amount of new functionality, but in particular the introduction of across() and c_across() as adverbs to be used with summarise() and mutate() reduced the number of scoped variants that users need to work with and, like the tidyr changes, allowed greater control over what the output looks like.
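A minimal sketch of what this looks like in practice, using the built-in mtcars data. The column choices are illustrative assumptions, and the {.col} placeholder in .names assumes a reasonably recent dplyr version.

```r
library(dplyr)

# Before dplyr 1.0.0, a scoped variant such as summarise_at() was the usual route.
old_style <- mtcars %>%
  group_by(cyl) %>%
  summarise_at(vars(mpg, hp), mean)

# With across(), plain summarise() covers the same case, and .names
# controls the output column names using glue syntax.
new_style <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(mpg, hp), mean, .names = "mean_{.col}"))

# c_across() is the row-wise counterpart: it combines several columns
# into a single vector within each row.
row_means <- mtcars %>%
  rowwise() %>%
  mutate(avg = mean(c_across(c(mpg, hp)))) %>%
  ungroup()
```

Here new_style holds the same group means as old_style, only with the column names chosen through .names.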

Both of these updates took advantage of major innovations in the R ecosystem, including updates in rlang, vctrs and glue, among others.

So if you haven’t checked out these updates, it’s a good time to check back in with the tidyverse packages. In this article I want to show you how they can make your life significantly easier and how you can use them to wrangle data with less code. I’ll do this by showing five simple examples of things you can do which you may not know about.

1. Combine column names however you want in tidyr::pivot_wider()

The whole idea of pivot_wider() is that you want to take data that is in long form and transform it to wide form. For example, let’s say your data looks like this:

library(dplyr)  # provides the storms dataset and the %>% pipe

# Mean and median pressure per year and storm status, 1975 to 1977
storms_sum <- storms %>%
  dplyr::filter(year %in% 1975:1977) %>%
  dplyr::group_by(year, status) %>%
  dplyr::summarise(mean = mean(pressure, na.rm = TRUE),
                   median = median(pressure, na.rm = TRUE))

storms_sum

#> # A tibble: 9 x 4
#> # Groups:   year [3]
#>    year status               mean median
#>   <dbl> <chr>               <dbl>  <dbl>
#> 1  1975 hurricane            977.   984
#> 2  1975 tropical depression 1011.  1012.
#> 3  1975 tropical storm       992.   993
#> 4  1976 hurricane            975.   975
#> 5  1976 tropical depression 1007.  1006.
#> 6  1976 tropical storm       995.   994.
#> 7  1977 hurricane            978.   987
#> 8  1977 tropical depression 1011.  1010
#> 9  1977 tropical storm      1002.  1001

Now let’s say that you are interested in seeing the mean and median pressure by year for each storm status. You can use pivot_wider(), which by default pastes the name of each value column and the year together to form the new column names:

storms_sum %>%
  tidyr::pivot_wider(names_from = "year", values_from = c("mean", "median"))

#> # A tibble: 3 x 7
#>   status       mean_1975 mean_1976 mean_1977 median_1975 median_1976 median_1977
#>   <chr>            <dbl>     <dbl>     <dbl>       <dbl>       <dbl>       <dbl>
#> 1 hurricane         977.      975.      978.        984         975          987
#> 2 tropical de…     1011.     1007.     1011.       1012.       1006.        1010
#> 3 tropical st…      992.      995.     1002.        993         994.        1001

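You also have the benefit of the names_glue argument, which lets you structure the combined column names however you wish using glue syntax, which is simple and intuitive. In the template, {.value} stands for the name of each values_from column and {year} for the value taken from the names_from column: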

wide_storms <- storms_sum %>%
  tidyr::pivot_wider(names_from = "year", values_from = c("mean", "median"),
                     names_glue = "{.value}_of_{year}")

wide_storms

#> # A tibble: 3 x 7
#>   status mean_of_1975 mean_of_1976 mean_of_1977 median_of_1975 median_of_1976
#>   <chr>         <dbl>        <dbl>        <dbl>          <dbl>          <dbl>
#> 1 hurri…         977.         975.         978.           984            975
#> 2 tropi…        1011.        1007.        1011.          1012.          1006.
#> 3 tropi…         992.         995.        1002.           993            994.
#> # … with 1 more variable: median_of_1977 <dbl>
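Going the other way is just as flexible. The sketch below is not part of the original examples: it takes a small, hand-built table shaped like the wide output above (the values are made up for illustration) and uses the special ".value" name in pivot_longer() to split the glued column names back into separate mean and median columns.

```r
library(tidyr)

# Hypothetical miniature of a wide table; the values are illustrative only.
wide <- data.frame(
  status = c("hurricane", "tropical storm"),
  mean_of_1975 = c(977, 992),
  mean_of_1976 = c(975, 995),
  median_of_1975 = c(984, 993),
  median_of_1976 = c(975, 994),
  stringsAsFactors = FALSE
)

# ".value" tells pivot_longer() that the first part of each column name
# (mean or median) should become an output column, while the second part
# is stored in a new "year" column. names_sep gives the separator to split on.
long <- pivot_longer(
  wide,
  cols = -status,
  names_to = c(".value", "year"),
  names_sep = "_of_"
)
```

The result has one row per status and year, with mean and median as ordinary columns. Note that year comes back as a character column unless it is converted afterwards.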

#data-science #learning #analytics #data #data analysis

Ray Patel

1619518440

Top 30 Python Tips and Tricks for Beginners

Welcome to my blog. In this article, you are going to learn the top 10 Python tips and tricks.

1) Swap two numbers.

2) Reversing a string in Python.

3) Create a single string from all the elements in a list.

4) Chaining Of Comparison Operators.

5) Print The File Path Of Imported Modules.

6) Return Multiple Values From Functions.

7) Find The Most Frequent Value In A List.

8) Check The Memory Usage Of An Object.

#python #python hacks tricks #python learning tips #python programming tricks #python tips #python tips and tricks #python tips and tricks advanced #python tips and tricks for beginners #python tips tricks and techniques #python tutorial #tips and tricks in python #tips to learn python #top 30 python tips and tricks for beginners

Seven CSS Tricks You Don't Know About!

In this video, I am going to teach you how to create seven really simple but cool things with CSS3. The codepen links will be available on my blog.

What you will learn:

  1. Disable Link Click
  2. Image Rendering
  3. Target Specific File Extension
  4. Background Clipping
  5. Sticky Bar
  6. Image Filters
  7. Unicode Classes

Website: www.raddy.co.uk
Blog: www.raddy.co.uk/blog

#css #tricks #css tricks you don't know about

Paula Hall

1623389988

4 Pandas GroupBy Tricks You Should Know

Use Pandas GroupBy more flexibly and creatively

As one of the most popular libraries in Python, Pandas is widely used, especially for EDA (Exploratory Data Analysis). Typically, it is used to filter and transform datasets, much as we would with SQL queries. The two share many concepts, such as joining tables. However, some of their features have the same name but work differently. “Group By” is one of them.

In this article, I’ll introduce some tricks for the Pandas group by function that could improve our productivity in EDA jobs. Hopefully at least one of them is new to you and proves helpful.

I’m sure that you know how to import Pandas in Python, but still, let me put it here. All the rest of the code in this article assumes Pandas has been imported as follows.

import pandas as pd

#python #technology #data-science #programming #4 pandas groupby tricks you should know #pandas groupby tricks

Lessie Fisher

1626870600

CSS Tricks You May Not Have Known

Some CSS tricks that will be useful to you.

#css #html #tricks