Using R to improve data analyses Python workflows
From Gridkaschool
Software Environment
The courses are implemented as Jupyter/IPython notebooks running Python and R. We provide a VM with a preconfigured Jupyter Notebook server for each student.
You can also set up the software environment on your own computer.
Minimum Requirement
- Current web browser, such as Firefox, Chrome or Safari.
- You will receive a server address and password at the start of the course.
Running the Course on your own Computer
We have only tested the course on Linux and OSX! Installation on Windows may differ!
Required Software
- Python3 (>= 3.4) - A sufficiently recent release of Python
  - Commonly available via package managers, such as
    apt-get install python3
    or
    brew install python3
  - Also available from the Python Homepage
- Python3 pip - Python Package Manager
  - Several package managers do not install the Python package manager together with Python
  - In this case it is available as a separate package, e.g.
    apt-get install python3-pip
- Jupyter - Evaluates and renders the notebooks
  - Available via pip:
    pip3 install jupyter
- RISE (optional) - Provides the interactive presentation view
  - Consult the RISE Readme
  - Available via pip:
    pip3 install RISE && jupyter-nbextension install rise --py --sys-prefix && jupyter-nbextension enable rise --py --sys-prefix
- R (we use 3.3.1, but older versions also work)
  - On Unix-based systems, make sure R is compiled with the flag --enable-R-shlib (this is required to get Rserve running)
- Rserve
    wget https://www.rforge.net/Rserve/snapshot/Rserve_1.8-5.tar.gz --no-check-certificate
    R CMD INSTALL Rserve_1.8-5.tar.gz
- pyRserve
  - Available via pip:
    pip3 install pyRserve
- PypeR
  - Available via pip:
    pip3 install PypeR
- rpy2
  - Available via pip:
    pip3 install rpy2
- numpy
  - Available via pip:
    pip3 install numpy
- pandas
  - Available via pip:
    pip3 install pandas
- From within R, please also execute the commands below (they install the R kernel for Jupyter)
  - Please ensure the following packages are installed: devtools, ggplot2, dplyr
    devtools::install_github('IRkernel/IRkernel')
    IRkernel::installspec(user = FALSE)
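Once everything in the list above is installed, a quick check can confirm that the pieces are visible from Python. The following is only a minimal sketch: the import names (for example `pyper` for the PypeR package) and the command names are assumptions and may differ on your system, and the helper functions are hypothetical, not part of the course material.

```python
import importlib.util
import shutil

# Assumed import names of the pip packages from the list; adjust if they differ.
PYTHON_MODULES = ["pyRserve", "pyper", "rpy2", "numpy", "pandas"]
# Assumed command names that should be reachable via PATH.
EXECUTABLES = ["python3", "pip3", "jupyter", "R", "git"]


def missing_modules(names):
    """Return the top-level module names the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_executables(names):
    """Return the command names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    checks = (
        ("Python modules", missing_modules(PYTHON_MODULES)),
        ("executables", missing_executables(EXECUTABLES)),
    )
    for label, missing in checks:
        print("Missing {}: {}".format(label, ", ".join(missing) or "none"))
```

The check only reports whether a module or command can be located; it does not verify versions or whether R was built with --enable-R-shlib.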
Setting up the Environment
- Check out the exercise repositories
  https://bitbucket.org/teamkseta/gks_2016_pyr.git
  and
  https://bitbucket.org/teamkseta/gks_2016_python.git
- Set PYTHONPATH to include the directory of the gks_2016_pyr checkout.
- Change to the gks_2016_python directory and run the notebook server.
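GKSDIR="$HOME/gks2016" # change me ($HOME is used because ~ does not expand inside quotes)
# clone the exercise library and the notebooks
git clone https://bitbucket.org/teamkseta/gks_2016_pyr.git "$GKSDIR/gks_lib"
git clone https://bitbucket.org/teamkseta/gks_2016_python.git "$GKSDIR/gks_2016_python"
# make the exercise library importable from the notebooks
export PYTHONPATH="$GKSDIR/gks_lib:$PYTHONPATH"
# start the notebook server from the notebook directory
cd "$GKSDIR/gks_2016_python"
jupyter-notebook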
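To confirm that the PYTHONPATH setting took effect, you can check from a notebook cell or a Python prompt whether the library directory appears on the module search path. This is a minimal sketch, assuming the checkout lives in gks2016/gks_lib under your home directory as in the commands above; the helper `is_on_search_path` is hypothetical and not part of the course material.

```python
import os
import sys


def is_on_search_path(directory, search_path=None):
    """Return True if `directory` is an entry of the module search path."""
    entries = sys.path if search_path is None else search_path
    target = os.path.realpath(os.path.expanduser(directory))
    # An empty entry in sys.path stands for the current directory.
    return any(
        os.path.realpath(os.path.expanduser(entry or os.curdir)) == target
        for entry in entries
    )


if __name__ == "__main__":
    # Assumed location; change it if you picked a different GKSDIR.
    lib_dir = os.path.join("~", "gks2016", "gks_lib")
    print("gks_lib on sys.path:", is_on_search_path(lib_dir))
```

If this prints False, the export was most likely done in a different shell than the one that started the notebook server.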