Using R to improve data analyses Python workflows: Difference between revisions
From Gridkaschool
Latest revision as of 22:52, 29 August 2016
Using R to improve data analyses Python workflows
Software Environment
The courses are implemented as Jupyter/IPython notebooks, running Python and R. We provide each student with a VM running a preconfigured Jupyter Notebook server.
You can also set up the software environment on your own computer.
Minimum Requirements
- Current web browser, such as Firefox, Chrome or Safari.
- You will receive a server address and password at the start of the course.
Running the Course on your own Computer
We have only tested the course on Linux and OSX! Installation on Windows may differ!
Required Software
- Python3 (>= 3.4) - A sufficiently recent release of Python
  - Commonly available via package managers, e.g. apt-get install python3 or brew install python3.
  - Also available from the Python Homepage
- Python3 pip - Python Package Manager
  - Several package managers do not install the Python package manager together with Python.
  - In this case it is available as a separate package, e.g. apt-get install python3-pip.
- Jupyter - Evaluates and renders the notebooks
  - Available via pip: pip3 install jupyter
- RISE (optional) - Provides the interactive presentation view
  - Consult the RISE Readme
  - Available via pip: pip3 install RISE && jupyter-nbextension install rise --py --sys-prefix && jupyter-nbextension enable rise --py --sys-prefix
- R (we are using 3.3.1, but older versions are also fine)
  - On Unix-based systems, R must be compiled with the flag --enable-R-shlib (this is needed to get Rserve running)
- Rserve - R Compute Service
  - wget https://www.rforge.net/Rserve/snapshot/Rserve_1.8-5.tar.gz --no-check-certificate
  - R CMD INSTALL Rserve_1.8-5.tar.gz
- pyRserve - Rserve client for Python
  - Available via pip: pip3 install pyRserve
- PypeR - Pipe to an R subprocess
  - Available via pip: pip3 install PypeR
- rpy2 - Low level bindings to R
  - Available via pip: pip3 install rpy2
- numpy
  - Available via pip: pip3 install numpy
- pandas
  - Available via pip: pip3 install pandas
- Install the R kernel for Jupyter
  - From within R please execute:
  - Please ensure the following packages are installed: install.packages(c('devtools', 'ggplot2', 'dplyr', 'readr', 'magrittr'))
  - devtools::install_github('IRkernel/IRkernel') (this will install the R kernel for Jupyter)
  - IRkernel::installspec(user = FALSE)
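Before moving on, it can help to check which of the required pieces are already present. The following is a minimal sketch, not part of the course material. It only looks up modules and executables by name and does not check versions. The import names are assumptions, since pip package names and import names can differ (PypeR is assumed to be importable as pyper). The helper names missing_modules and missing_executables are made up for this example.

```python
import importlib.util
import shutil

# Assumed import names for the pip packages listed above.
PYTHON_MODULES = ["pyRserve", "pyper", "rpy2", "numpy", "pandas"]
# Executables expected on PATH once Python, pip, R and Jupyter are installed.
EXECUTABLES = ["python3", "pip3", "R", "jupyter-notebook"]


def missing_modules(names):
    """Return the module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_executables(names):
    """Return the executable names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    checks = (
        ("Python modules", missing_modules(PYTHON_MODULES)),
        ("executables", missing_executables(EXECUTABLES)),
    )
    for label, missing in checks:
        # str.format keeps the script usable on Python 3.4.
        print("Missing {}: {}".format(label, ", ".join(missing) or "none"))
```

Run it with python3. Anything reported as missing can be installed with the corresponding command from the list above.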
Setting up the Environment
- Check out the exercise repositories https://bitbucket.org/teamkseta/gks_2016_pyr.git and https://bitbucket.org/teamkseta/gks_2016_python.git.
- Set PYTHONPATH to include the gks_2016_pyr directory.
- Change to the gks_2016_python directory and run the notebook server.
GKSDIR="$HOME/gks2016" # change me ($HOME is used because ~ is not expanded inside quotes)
git clone https://bitbucket.org/teamkseta/gks_2016_pyr.git "$GKSDIR/gks_lib"
git clone https://bitbucket.org/teamkseta/gks_2016_r.git "$GKSDIR/gks_2016_r"
export PYTHONPATH="$GKSDIR/gks_lib:$PYTHONPATH"
jupyter-notebook
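To confirm that the PYTHONPATH setting reached Python, you can check the module search path from a notebook cell or a python3 session. This is a minimal sketch: the function name is_on_sys_path is made up for this example, and the directory in the usage part is an assumption based on the GKSDIR default in the commands above.

```python
import os
import sys


def is_on_sys_path(directory, search_path=None):
    """Return True if directory is one of the entries of the module search path."""
    if search_path is None:
        search_path = sys.path
    target = os.path.realpath(os.path.expanduser(directory))
    # An empty entry in sys.path stands for the current working directory.
    return any(
        os.path.realpath(os.path.expanduser(entry or os.curdir)) == target
        for entry in search_path
    )


if __name__ == "__main__":
    # Assumed location; adjust it if you changed GKSDIR.
    lib_dir = os.path.join(os.path.expanduser("~"), "gks2016", "gks_lib")
    print("gks_lib on sys.path:", is_on_sys_path(lib_dir))
```

If this prints False, the export was probably made in a different shell than the one that started the notebook server.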
- To also start Rserve, execute:
  R CMD Rserve
- Make sure not to run this with root privileges! By default, Rserve listens on localhost:6311.
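Whether Rserve actually came up can be checked by trying to open a TCP connection to its default address. The sketch below uses only the Python standard library and tests that something accepts connections on the port. It does not verify that the listener is really Rserve. The function name port_is_open is made up for this example.

```python
import socket


def port_is_open(host="localhost", port=6311, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:6311 (presumably Rserve).")
    else:
        print("Nothing is listening on localhost:6311; start Rserve with: R CMD Rserve")
```

Once the port is reachable, a client such as pyRserve should be able to connect to the same host and port.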