
Review of the RStudio Workshop — Building Tidy Tools | by Arafath Hossain | Aug, 2022



Photo by Kvalifik on Unsplash

Building Tidy Tools is one of the many workshops offered annually by RStudio at its RStudio Conference. I participated in the workshop at the most recent conference, which took place in Washington, DC, from July 25 to July 26, 2022. This blog post summarizes what was taught in the workshop and gives my take on it overall. If you are thinking about attending this workshop, I hope this post gives you some useful information.

Before jumping into the details, here are some quick facts about the workshop to help you decide whether this is the right content for you:

What is it?

The workshop was a two-day, hands-on training on how to build packages in R using “tidy” design and coding principles, as demonstrated by the packages that make up the tidyverse.

Who Is the Primary Audience?

The official site for the workshop says that anyone with experience writing R functions who is ready to make the jump to distributing their work should be the right candidate. I think that is a pretty accurate requirement, but to make it more concrete, I would say you will get the most out of this workshop if you have already tried building an R package before. But that’s just my opinion 🙂

So now let’s dig into the details.

As said earlier, it’s a two-day affair. The way I saw it, the contents of these two days target two different sides of the package development journey: the process of creating a package, and best practices for writing R code. The ultimate goal is to take the trainees through the journey of building and improving a package in real time. In fact, you can check out the final product, an R package, that you’ll be producing in the workshop here: ussie.

To keep my review organized, I will share the summary and comments broken down by session.

Day 01: Build a POC Package

If you attend this workshop wanting to know how to wrap a bunch of your scripts in an R package and start using it, the materials from day 01 will be sufficient for that! You will be taken from understanding what an R package is to building your first R package and even creating a site for your package in just one day! And all that, in my opinion, in possibly the most painless way.

The day’s contents are divided into four sections:

Session 01: The Whole Game

After setting up some ground rules and discussing the workshop’s agenda, you will get a gentle introduction to “what is an R package”. This session will get you started with the R project and lay the groundwork for building an R package.

You will start your R package development project, set up version control, add your first functions to the package, and learn about the basic package development workflow and licensing. Unless you run into some technical anomaly, this session should go very smoothly and set you up to have a great experience building an R package.

Session 02: Documentation – Minimum

As the name suggests, this section is all about documentation, and the “minimum” part signifies that the discussion covers only the bare minimum: filling out the DESCRIPTION file and writing minimal descriptions for the functions, that is, the help file that you get if you run help("function_name") in R.

You will learn about the DESCRIPTION file, one of the defining files of an R package, including what to put in it and what to leave out.

Pay special attention to the section on package dependencies. Not declaring the dependencies properly can cause a lot of headaches down the line.

For function documentation, you will learn how to use the roxygen2 and devtools packages to automatically generate documentation from comments in the function scripts.
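To make the roxygen2 workflow a bit more concrete, here is a minimal sketch of a documented function. The function `celsius_to_fahrenheit()` is a hypothetical example of mine, not something from the workshop materials; the `#'` comments use standard roxygen2 tags. Inside a package project, running `devtools::document()` turns comments like these into the help file.

```r
#' Convert temperatures from Celsius to Fahrenheit
#'
#' @param celsius A numeric vector of temperatures in degrees Celsius.
#'
#' @return A numeric vector of the same length, in degrees Fahrenheit.
#' @export
#'
#' @examples
#' celsius_to_fahrenheit(c(0, 100))
celsius_to_fahrenheit <- function(celsius) {
  # Hypothetical example function; the roxygen block above is what
  # devtools::document() would convert into man/celsius_to_fahrenheit.Rd.
  celsius * 9 / 5 + 32
}
```

Outside a package the `#'` lines are ordinary comments, so the function can be sourced and tried out as-is.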

Session 03: Unit Testing

Probably the most challenging section of day 01. You will learn how to use the testthat package to set up tests that can be rerun automatically after each iteration or modification to the functions. Setting up test cases for different use cases will cut down on a lot of hiccups over the lifetime of a package. This section will introduce you to the expect_*() family of functions, which makes it easier to set up different test scenarios.
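As a rough illustration of what such tests look like, here is a small sketch using `test_that()` with `expect_equal()` and `expect_error()` from testthat. The function under test, `celsius_to_fahrenheit()`, is a hypothetical example of mine and not part of the workshop package.

```r
library(testthat)

# Hypothetical function under test (not from the workshop materials).
celsius_to_fahrenheit <- function(celsius) {
  if (!is.numeric(celsius)) {
    stop("`celsius` must be numeric.")
  }
  celsius * 9 / 5 + 32
}

# Each test_that() block groups expectations about one behaviour.
test_that("celsius_to_fahrenheit() converts known values", {
  expect_equal(celsius_to_fahrenheit(0), 32)
  expect_equal(celsius_to_fahrenheit(c(-40, 100)), c(-40, 212))
})

# expect_error() checks that bad input fails with a matching message.
test_that("celsius_to_fahrenheit() rejects non-numeric input", {
  expect_error(celsius_to_fahrenheit("hot"), "must be numeric")
})
```

In a package, tests like these would normally live under `tests/testthat/` and be run with `devtools::test()`.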

Session 04: Documentation – Sharing

This section covers the documentation that you should have to make your package useful for others: things like setting up a README file for the GitHub page and creating a vignette that describes the package and its contents in detail.

In addition to this documentation, you will also set up a CI/CD pipeline using GitHub Actions to make sure that, whenever a change in the code is pushed to GitHub, all the tests are run and the documents are updated.

The most exciting part — you will have a functioning package along with a cool website at the end of the day!

Overall, day 01 went great, building early excitement about how it feels to have a real package. It demonstrated how easily one can package code using packages such as usethis and devtools.

Day 02: Setting you up for long term success — at least that’s how I saw it.

If day 01 was about getting you excited about building an R package, then day 02 was about setting you up for long-term success, meaning it went beyond strictly package building and delved more into some of R’s inner workings, best practices, and so on. Like day 01, day 02 had four sessions.

Session 01: Design

This section starts with a brief discussion of how tidy eval works, then goes into some of the best practices for writing functions, such as naming, casing, the order of arguments, and separating pure functions from the ones with side effects. The second session goes into much more detail on pure functions versus functions with side effects.

Session 02: Managing Side Effects

This section discusses the ways one can test and manage the side effects generated by functions. You will practice validating input, aborting functions programmatically when needed, and writing user-friendly error messages. You will also learn how to write functions that do not alter the global environment.
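Here is a small sketch of these ideas, under my own assumptions: it validates input with `rlang::abort()` and uses base R’s `on.exit()` to restore a global option that the function changes temporarily. The helper `print_rounded()` is hypothetical, and the workshop may well have used different tools for the same purpose (the withr package, for example, offers similar helpers).

```r
# Hypothetical helper: print a number with a given number of significant
# digits without permanently changing the user's global options.
print_rounded <- function(x, digits = 3) {
  # Validate input early and fail with a clear, user-friendly message.
  if (!is.numeric(x)) {
    rlang::abort(paste0(
      "`x` must be a numeric vector, not an object of class <",
      class(x)[1], ">."
    ))
  }

  # options() returns the previous values, so they can be restored on exit,
  # even if the function errors partway through.
  old <- options(digits = digits)
  on.exit(options(old), add = TRUE)

  print(x)
  invisible(x)
}

print_rounded(pi, digits = 3)
```

After the call, `getOption("digits")` is back to whatever it was before, so the caller’s session is left untouched.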

Session 03: Tidy Eval

If you have been coding in R for some time and have also dabbled in other languages like Python, you have probably already noticed that, unlike in Python, you can pass the name of a column to an R function as a non-string value. Tidy eval makes it possible for R to understand the bare name as a column name. This is a handy feature, though not my favorite, and this section is all about understanding this functionality in further detail.

The specific topics covered are: using the three dots (...) as input to a tidy function, using the splicing operator (!!!), using double curly braces ({{ }}), the utility of some very specific tidy functions such as any_of(), all_of(), and across(), and comparing code snapshots.
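To give a feel for a couple of these tools, here is a minimal sketch using dplyr and the built-in `mtcars` data. The wrapper `group_mean()` is a hypothetical function of mine, not from the workshop; it only illustrates `{{ }}`, `...`, and `all_of()`.

```r
library(dplyr)

# Hypothetical wrapper: mean of one column within groups.
# {{ value }} forwards a bare column name to dplyr;
# ... forwards any number of bare grouping columns.
group_mean <- function(data, value, ...) {
  data %>%
    group_by(...) %>%
    summarise(mean_value = mean({{ value }}), .groups = "drop")
}

# Column names are passed bare, without quotes.
group_mean(mtcars, mpg, cyl)

# all_of() selects the columns named in a character vector and
# errors if any of them is missing from the data.
cols <- c("mpg", "cyl")
mtcars %>% select(all_of(cols))
```

Without `{{ }}`, `mean(value)` would look for an object literally called `value` instead of the column the caller named.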

Overall, this section introduced some very specific tools and functionality that make functions tidier and easier to maintain.

Session 04: Functional and Object Oriented Programming (OOP)

The day 02 sessions covered a lot of ground that needed more focus and attention to understand properly. And being the last session, this part got a bit rushed due to the time limit. This session discussed how functional programming is implemented in R.

Using functions as arguments is a foundation of functional programming. — quoted from the session slide.

I didn’t find much about OOP in this session and to be honest I don’t think R is the right language to practice OOP either. If you like OOP then you would be much better served by Python.

There was also a quick discussion and a hands-on exercise on using the purrr package to do functional programming in R.
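As a small illustration of passing functions as arguments, here is a sketch using `purrr::map_dbl()` on the built-in `mtcars` data. The helper `summarise_with()` is a hypothetical example of mine, not taken from the session.

```r
library(purrr)

# map_dbl() applies a function to each element (here, each column of a
# data frame) and returns a numeric vector.
column_means <- map_dbl(mtcars, mean)
column_means

# Your own functions can take functions as arguments too.
# `summarise_with()` is a hypothetical helper.
summarise_with <- function(x, f) {
  f(x)
}

summarise_with(c(1, 2, 3, 10), median)
summarise_with(c(1, 2, 3, 10), max)
```

The same `summarise_with()` call works with any function that accepts a numeric vector, which is the flexibility the quoted slide is pointing at.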

Overall, in addition to the defined list of tasks, day 02 was full of useful bits and pieces of advice from Ian, the instructor, drawn from his vast programming experience. Also, the slides were loaded with additional resources for learning more about how R works internally and about general coding best practices.

In total, I was very satisfied with the workshop: it was well designed and well thought out, and I felt like it had something for people at different levels of expertise. For someone just curious about building an R package, day 01 was a great motivator. And for a passionate R programmer, day 02 was full of advice and tips to make their journey into R smoother.

Also, the course materials are open sourced. Here is the course website and here’s the GitHub repo of the workshop.


