Resources for learning R and best practices for data analysis
There are many resources available that help both with learning R and with developing appropriate practices. To start with, you can visit my website where I collate these, with a particular emphasis on practical examples that you can work through online (i.e., not textbooks, which often run to hundreds of pages and are cumbersome to work through). But try not to get stuck in tutorial hell!
Papers for conducting data analysis
Here are some papers which provide a good overview of best practices for data analysis, often using practical examples.
- Balaban, G., Grytten, I., Rand, K. D., Scheffer, L., & Sandve, G. K. (2021). Ten simple rules for quick and dirty scientific programming. PLoS Computational Biology, 17(3), e1008549. link
- Gentzkow, M., & Shapiro, J. M. (2014). Code and data for the social sciences: A practitioner’s guide. Working Paper, University of Chicago. link
- Roth, J., Duan, Y., Mahner, F. P., Kaniuth, P., Wallis, T. S., & Hebart, M. N. (2025). Ten principles for reliable, efficient, and adaptable coding in psychology and cognitive neuroscience. Communications Psychology, 3(1), 62. link
- Wilson, G., Aruliah, D. A., Brown, C. T., Chue Hong, N. P., Davis, M., Guy, R. T., Haddock, S. H., Huff, K. D., Mitchell, I. M., Plumbley, M. D., Waugh, B., White, E. P., & Wilson, P. (2014). Best practices for scientific computing. PLoS Biology, 12(1), e1001745. link
Guides/courses on programming and data science
Complementing the papers above, there are many guides and courses available which provide a good overview of best practices for data analysis.
The Good Research Code Handbook. A handbook for organising code with an emphasis on project management. Created by Patrick Mineault, Amaranth Foundation.
Friends Don’t Let Friends Make Bad Graphs. A self-described ‘opinionated essay about good and bad practices in data visualization’ with examples demonstrated through R plots. Created by Chenxin Li, University of Georgia.
Research Data Management (RDM) Workshop. A workshop designed to give a generic overview of RDM principles and practices, including OSF, data management, pre-registration, project and data organisation, version control, data storage and sharing, and copyright and licenses. Created by Julia-Katharina Pfarr, Philipps-Universität Marburg.
Computational and Inferential Thinking: The Foundations of Data Science. An online course in data science originally developed for the UC Berkeley course Data 8: Foundations of Data Science by Ani Adhikari, John DeNero and David Wagner.
Coding for data. An introduction to data science by Matthew Brett, borrowing from the Berkeley textbook above.
Online courses for programming in R
And finally, there are many online courses available which provide a good overview of programming in R and of incorporating best practices.
Hands-On Programming with R. The online (and free) version of Garrett Grolemund’s R textbook, written for non-programmers using hands-on examples.
PsyTeachR Courses. A whole range of courses covering different capabilities of R, created by the psyTeachR team at the University of Glasgow.
R for Reproducible Scientific Analysis. Software Carpentries’ 2-day workshop on R, with a theme on open and reproducible research. Run by the University of Reading.
R, Open Research, and Reproducibility. Andrew Stewart’s 12-week workshop course on R, Open Research, and Reproducibility, taught to students at the University of Manchester.
R, Git and bash. Software Carpentries’ 3-day workshop on git, bash and R, with a theme on open and reproducible research. Run by the University of Reading.
An introduction to R. An online interactive book detailing R for beginners, including: data manipulation, plotting with ggplot2, basic statistics, functions, markdown and reproducibility with git/GitHub. Written by Alex Douglas, Deon Roos, Francesca Mancini, Ana Couto & David Lusseau.
Introduction to R 2021. A basic introduction to R, covering data types, functions and plotting. Created by Sarah Bonnin.
Data Science for Psychologists. An introduction to data science that is tailored to the needs of students in psychology, but is also suitable for students of the humanities and other biological or social sciences. Created by Hansjörg Neth, University of Konstanz.
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. The electronic version of the data science book which covers data science with tidyverse, data modeling with moderndive and statistical inference with infer. Created by Chester Ismay (Flatiron School) and Albert Y. Kim (Smith College).
An Introduction to Data Analysis. Basic reading material for an introduction to data analysis with R, covering the use of R for data wrangling and plotting, and data analysis from a Bayesian and a frequentist tradition. Created by Michael Franke.
Just Enough R. Working with data, models (regression, ANOVA, linear models), confidence intervals, multiple comparisons, fixed/random effects. Created by Ben Whalley.
easystats. ‘A collection of R packages, which aims to provide a unifying and consistent framework to tame, discipline, and harness the scary R statistics and their pesky models.’ Developed by Daniel Lüdecke, Dominique Makowski, Mattan S. Ben-Shachar, Indrajeet Patil, Brenton M. Wiernik, Etienne Bacher and Rémi Thériault.