Data Workflow Using R: Sogang University Workshop, 2023 Fall

A series of slides (in Korean) offering a principled crash course in R.

  1. Defining a data workflow and basic R syntax
  2. Importing and transforming data with the Tidyverse
  3. R functions and functional programming

I. Beginner-friendly Learning Materials


II. What If I Have Coding Questions? What If Something is Not Working?

Before you start Googling or seeking help:

  1. Take a deep breath and accept that learning to debug is very likely a painful, grueling process with a steep learning curve. You will probably go through many frustrating minutes, hours, or even days! It gets better, but only after some heavy investment.
  2. Create a Stack Overflow account.

Now that you’ve braced yourself,

  • Google the error message. 90%+ of the time, the question has already been asked on Stack Overflow.
  • [New!] Ask ChatGPT to provide documentation and example code.
  • [New!] Similarly, RTutor.ai can help translate natural language into R scripts.

Note: AI-generated answers may not always be accurate, especially for complex queries. Use at your own peril.


III. Advanced Steps

  • Comment generously! Without ample comments, you will not remember what you were doing, nor will other people be able to understand your code.

  • Never save/restore .RData. In fact, run usethis::use_blank_slate().
  • Organize each data analysis as a project, use here::here() rather than setwd(), and be mindful of the project-oriented workflow and reproducibility.
  • Please use the styler package, which makes it very easy to style your code consistently.
  • Using assertthat, insert unit tests/sanity checks on your dataset so that you catch mistakes quickly. For example, you could preemptively detect and warn about duplicates, missing values, a wrong number of rows, proportions going below 0 or over 1, a wrong class or implicit coercions, …
  • Read The R Inferno by Patrick Burns, along with the summary slides by Maya Gans.
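
The sanity-check advice above could look roughly like the following. This is only a minimal sketch: the `survey` data frame, its column names, and the `check_survey()` helper are hypothetical and made up for illustration; only `assert_that()` and `noNA()` come from the assertthat package.

```r
library(assertthat)

# Hypothetical toy dataset; the column names are illustrative only.
survey <- data.frame(
  id       = c(1L, 2L, 3L, 4L),
  prop_yes = c(0.10, 0.55, 0.80, 1.00)
)

# Hypothetical helper that bundles the checks.
# Each assert_that() call stops with an error as soon as a check fails.
check_survey <- function(df, expected_rows) {
  assert_that(is.data.frame(df))
  # Wrong number of rows (e.g. after a join that silently duplicated records)
  assert_that(nrow(df) == expected_rows, msg = "Unexpected number of rows")
  # Duplicated identifiers
  assert_that(anyDuplicated(df$id) == 0, msg = "Duplicate ids found")
  # Missing values
  assert_that(noNA(df$prop_yes), msg = "Missing values in prop_yes")
  # Wrong class, e.g. a proportion that was read in as character
  assert_that(is.numeric(df$prop_yes), msg = "prop_yes is not numeric")
  # Proportions outside [0, 1]
  assert_that(
    all(df$prop_yes >= 0 & df$prop_yes <= 1),
    msg = "prop_yes has values below 0 or above 1"
  )
  invisible(TRUE)
}

# Passes silently when every check holds.
check_survey(survey, expected_rows = 4)
```

Calling such a helper right after loading or merging data means a broken assumption surfaces at that step rather than much later in the analysis.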