Advanced Python Programming for Data Science

From Gridkaschool

In recent years, Python has been adopted in many fields, scientific, commercial, and otherwise. For most new users, Python is easy to learn and makes it possible to quickly write small but powerful scripts. At the same time, it is also feasible to create web services, distributed processing systems, and other complex applications.

Beginners have an easy start with Python: little code is required for many tasks as the language has "batteries included". Yet, data-driven science often relies on custom workflows and algorithms. Writing clean, efficient, and comprehensible code is vital; perhaps not for the next deadline, but for the one after. Luckily, Python provides many features to manage even complex tasks with ease.

The course focuses on advanced concepts of Python programming that are useful for data science. It covers features of the language, how to use them, and best practices. We present this as an interactive hands-on tutorial, which gives you a feeling for the capabilities of Python.

Doing the Course on your Own

For all exercises, sample solutions have been added to the course material. You can find both the notebooks and the library in the git repositories listed below.

If you want to (re)do the course on your own, follow the instructions in the section Running the Course on your own Computer.

Software Environment

The courses are implemented as Jupyter/IPython notebooks. We provide a VM with a preconfigured Jupyter Notebook server for each student.

You can also set up the software environment on your own computer. If you want to run the course after the school, a setup on your own computer is required.

Minimum Requirement

  • Current web browser, such as Firefox, Chrome or Safari.
    • You will receive a server address and password at the start of the course.

Running the Course on your own Computer

We have only tested the course on Linux and OSX! Installation on Windows may differ!

Required Software

  • Python3 (>= 3.4) - A sufficiently recent release of Python
    • Commonly available via package managers such as apt-get install python3 or brew install python3.
    • Also available from the Python Homepage
  • Python3 pip - Python Package Manager
    • Several package managers do not install the Python package manager along with Python
    • It will be available as a separate package in this case, e.g. apt-get install python3-pip.
  • Jupyter - Evaluates and renders the notebooks
    • Available via pip: pip3 install jupyter
  • RISE (optional) - provides the interactive presentation view
    • Consult the RISE Readme
    • Available via pip: pip3 install RISE && jupyter-nbextension install rise --py --sys-prefix && jupyter-nbextension enable rise --py --sys-prefix
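
The requirements above can be checked from Python itself. The following is a minimal sketch, not part of the course material: the helper names is_installed and check_environment are hypothetical, and it assumes the packages are importable under the module names pip, jupyter and rise.

```python
import importlib.util
import sys

# Minimum interpreter version named in the requirements list.
MIN_PYTHON = (3, 4)

# Assumption: the packages are importable under these module names.
REQUIRED = ("pip", "jupyter")
OPTIONAL = ("rise",)  # RISE is optional


def is_installed(module_name):
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_environment():
    """Collect human-readable problems; an empty list means nothing is missing."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(
            "Python >= %d.%d required, found %d.%d"
            % (MIN_PYTHON + tuple(sys.version_info[:2]))
        )
    for name in REQUIRED:
        if not is_installed(name):
            problems.append("missing required package: " + name)
    for name in OPTIONAL:
        if not is_installed(name):
            problems.append("missing optional package: " + name)
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks complete"]:
        print(line)
```

Run the script with the same python3 interpreter that will start Jupyter. A different interpreter may see a different set of installed packages.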

Setting up the Environment

GKSDIR="$HOME/gks2016" # change me; use $HOME because ~ is not expanded inside quotes

git clone https://bitbucket.org/teamkseta/gks_2016_pyr.git "$GKSDIR/gks_lib"

git clone https://bitbucket.org/teamkseta/gks_2016_python.git "$GKSDIR/gks_2016_python"

export PYTHONPATH="$GKSDIR/gks_lib:$PYTHONPATH"

jupyter-notebook
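
If the notebooks cannot import the course library, the PYTHONPATH setting is a likely place to look. The sketch below is one possible check under stated assumptions: missing_entries is a hypothetical helper, not part of the course repositories. It lists PYTHONPATH entries that do not point to an existing directory. Python does not expand a literal ~ in such entries, so a path that still starts with ~ would be reported here.

```python
import os


def missing_entries(environ=None):
    """Return PYTHONPATH entries that are not existing directories.

    `environ` defaults to the process environment; passing a dict
    makes the function easy to try out in isolation.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("PYTHONPATH", "")
    # Empty entries (e.g. from a trailing separator) are skipped.
    entries = [entry for entry in raw.split(os.pathsep) if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    problems = missing_entries()
    if problems:
        for entry in problems:
            print("not a directory:", entry)
    else:
        print("all PYTHONPATH entries exist")
```

Run it in the same shell session in which PYTHONPATH was exported, since the variable is only visible to processes started from that shell.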
