The CNIO Bioinformatics Unit is organising a Third Hands-on Introduction to R, to be held on Wednesday, July 3rd 2013, at the Legado Clotilde Jiménez room, 3rd floor. The course is limited to 15 people. Please contact us in advance if you wish to attend the course.

This is a crash introduction to R, an "environment for statistical computing and graphics". At the end of this part of the course, you should have an introductory understanding of how to work with R, and will be ready to continue learning on your own at a fast pace.

For instance, you should feel confident handling the following scenarios:

The authors of a paper claim there is a weak relationship between levels of protein A and growth. However, you know that some of the samples are from males and some are from females, and you suspect the correlation is present only in males and would like to check it.
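As a rough idea of what checking this might look like in R, here is a minimal sketch. The data are simulated purely for illustration, and the names `sex`, `protein_a` and `growth` are hypothetical; they do not come from any real study or from the course material.

```r
# Simulated data for illustration only; all variable names are hypothetical.
set.seed(1)
n <- 40
sex <- factor(rep(c("male", "female"), each = n / 2))
protein_a <- rnorm(n)
# By construction, growth depends on protein A only in males
growth <- ifelse(sex == "male", 0.8 * protein_a, 0) + rnorm(n, sd = 0.5)
d <- data.frame(sex, protein_a, growth)

# Correlation in the whole data set
cor_all <- cor(d$protein_a, d$growth)

# Correlation computed separately within each sex
cor_by_sex <- sapply(split(d, d$sex),
                     function(g) cor(g$protein_a, g$growth))

cor_all
cor_by_sex
```

Comparing `cor_all` with the two entries of `cor_by_sex` shows whether the overall relationship is driven by one of the groups.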

You've been working on a microarray study. For 100 subjects (50 of them with leukemia, 50 of them healthy) you have the Cy3/Cy5 intensity ratios for 300,000 spots. In less than five minutes you'd like to get a quick idea of what the data look like: maximum and minimum values for all spots, the average for 5 specific control spots (corresponding to probes 10, 23, 56, 10004 and 20000), and a quick-and-dirty statistical test of differences for two specific probes (probes 7000 and 99000, which correspond to two well-known genes).
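A quick look of this kind could be sketched as follows. This is only an illustration on simulated numbers: the matrix `ratios`, the factor `group` and the helper `test_probe` are hypothetical names, the matrix has 100,000 rows instead of 300,000 to keep memory use modest, and "maximum and minimum values" is read here as the overall range of the data.

```r
# Simulated ratios for illustration only; names and sizes are assumptions.
set.seed(2)
n_probes <- 100000    # the scenario has 300,000 spots; reduced to save memory
n_subjects <- 100
ratios <- matrix(rlnorm(n_probes * n_subjects), nrow = n_probes)
group <- factor(rep(c("leukemia", "healthy"), each = 50))

# Overall minimum and maximum of the data
overall_range <- range(ratios)

# Average ratio for each of the five control probes
control <- c(10, 23, 56, 10004, 20000)
control_means <- rowMeans(ratios[control, ])

# Quick-and-dirty two-sample t-test for a single probe (row i)
test_probe <- function(i) t.test(ratios[i, ] ~ group)$p.value
p_values <- sapply(c(7000, 99000), test_probe)

overall_range
control_means
p_values
```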

For a set of 20 selected probes you will want to: a) take a look at the mean of the intensity, variance of intensity, and the mean of the intensity in each of the two groups; b) plot the intensity vs. the age of the subject; c) plot the log of the intensity vs. the age of the subject.
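One possible way to approach these three tasks is sketched below, again on simulated data. The objects `intensity`, `group` and `age` are hypothetical stand-ins for the real measurements, and only the first probe is plotted to keep the example short.

```r
# Simulated data for illustration only; all names are hypothetical.
set.seed(3)
intensity <- matrix(rlnorm(20 * 100), nrow = 20)   # 20 probes x 100 subjects
group <- factor(rep(c("leukemia", "healthy"), each = 50))
age <- round(runif(100, min = 20, max = 80))

# a) Per-probe mean, variance, and mean within each group
probe_mean <- rowMeans(intensity)
probe_var <- apply(intensity, 1, var)
group_means <- t(apply(intensity, 1, function(x) tapply(x, group, mean)))

# b) Intensity vs. age for the first probe
plot(age, intensity[1, ], xlab = "Age", ylab = "Intensity")

# c) Log of the intensity vs. age for the same probe
plot(age, log(intensity[1, ]), xlab = "Age", ylab = "log(Intensity)")
```

Here `group_means` has one row per probe and one column per group, so it can be inspected next to `probe_mean` and `probe_var`.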

In this course we will provide a hands-on introduction to R. This is not an introduction to statistics with R, nor an introduction to Bioconductor. Rather, the objective of the course is to make you familiar with the syntax of R and its use from the command line.

  • Ramón Díaz Uriarte
    (Departamento de Bioquímica, Instituto de Investigaciones Biomédicas "Alberto Sols", UAM-CSIC)

Course Agenda (Legado Clotilde Jiménez room, 3rd floor)
  • 10:00 Introduction: An example to see the effects of multiple testing   
  • 10:15 Interacting with R: notes on editors and GUIs. Interactive calculations. The help system. A first interactive session.   
  • 11:30 Script files and data files
  • 12:00 Coffee break
  • 12:15 The R package system  
  • 12:40 Vectors: indexing, logical operations. Factors
  • 13:00 Lunch
  • 14:00 A simple hypothesis test: the t-test   
  • 15:00 Matrices, data frames. More indexing
  • 15:45 Coffee break
  • 16:00 Flow control and defining functions. The apply family  
  • 16:30 The introductory example again  
  • 16:45 End

[Course material: presentations & exercises]

All UBio Training Courses materials are licensed under a Creative Commons Attribution-Noncommercial 3.0 Unported License. That means that anyone is free to copy, distribute, display, and use the work for educational purposes, on the following conditions: The training materials must be attributed in their original form, and "CNIO Bioinformatics Unit" has to be clearly labeled as author and provider of the work. The training materials may not be used for commercial purposes, and derivative works can be distributed only under an identical license.