Advanced Python Programming for Data Science
In recent years, Python has been adopted in many fields, both scientific and commercial. For most new users, Python is easy to learn and makes it possible to quickly write small but powerful scripts. At the same time, it is also feasible to create web services, distributed processing systems, and other complex applications.
Beginners have an easy start with Python: little code is required for many tasks, as the language has "batteries included". Yet data-driven science often relies on custom workflows and algorithms. Writing clean, efficient, and comprehensible code is vital; perhaps not for the next deadline, but for the one after. Luckily, Python provides many features to manage even complex tasks with ease.
The course focuses on advanced concepts of Python programming that are useful for data science. This covers features of the language, how to use them, and best practices. We present this as an interactive, hands-on tutorial, which gives you a feeling for the capabilities of Python.
Doing the Course on your Own
For all exercises, sample solutions have been added to the course material. You can find both the notebooks and the library in the git repositories listed below.
If you want to (re)do the course on your own, follow the instructions in the section Running the Course on your own Computer.
Software Environment
The courses are implemented as Jupyter/IPython notebooks. We provide a VM with a preconfigured Jupyter Notebook server for each student.
You can also set up the software environment on your own computer. If you want to run the course after the school, a setup on your own computer is required.
Minimum Requirement
- Current web browser, such as Firefox, Chrome or Safari.
- You will receive a server address and password at the start of the course.
Running the Course on your own Computer
We have only tested the course on Linux and OSX! Installation on Windows may differ!
Required Software
- Python3 (>= 3.4) - A sufficiently recent release of Python
  - Commonly available via package managers, e.g. apt-get install python3 or brew install python3
  - Also available from the Python Homepage
- Python3 pip - Python Package Manager
  - Several package managers do not install the Python package manager together with Python
  - In this case it is available as a separate package, e.g. apt-get install python3-pip
- Jupyter - Evaluates and renders the notebooks
  - Available via pip: pip3 install jupyter
- RISE (optional) - Provides the interactive presentation view
  - Consult the RISE Readme
  - Available via pip: pip3 install RISE && jupyter-nbextension install rise --py --sys-prefix && jupyter-nbextension enable rise --py --sys-prefix
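The requirements listed above can also be checked from Python itself. The following is a minimal sketch and not part of the course material: the helper names are hypothetical, and the import names notebook and rise are assumptions about how the Jupyter and RISE packages expose themselves after installation.

```python
import importlib.util
import sys

# Minimum Python version stated in the requirements list.
MIN_PYTHON = (3, 4)
# Assumed import names (not confirmed by the course material).
REQUIRED_MODULES = ("pip", "notebook")
OPTIONAL_MODULES = ("rise",)


def python_is_recent(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def missing_modules(names):
    """Return the names for which no importable module can be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version ok:", python_is_recent())
    print("Missing required modules:", missing_modules(REQUIRED_MODULES))
    print("Missing optional modules:", missing_modules(OPTIONAL_MODULES))
```

An empty list of missing required modules suggests that the pip installs went through for the interpreter that ran the check. It does not verify that the notebook extensions for RISE are enabled.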
Setting up the Environment
- Check out the exercise repositories https://bitbucket.org/teamkseta/gks_2016_pyr.git and https://bitbucket.org/teamkseta/gks_2016_python.git.
- Set PYTHONPATH to include the gks_2016_pyr checkout (cloned as gks_lib in the commands below).
- Change to the gks_2016_python directory and run the notebook server.
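GKSDIR="$HOME/gks2016"  # change me; $HOME is used because ~ is not expanded inside quotes
git clone https://bitbucket.org/teamkseta/gks_2016_pyr.git "$GKSDIR/gks_lib"
git clone https://bitbucket.org/teamkseta/gks_2016_python.git "$GKSDIR/gks_2016_python"
# make the course library importable from the notebooks
export PYTHONPATH="$GKSDIR/gks_lib:$PYTHONPATH"
# start the notebook server from the notebook directory
cd "$GKSDIR/gks_2016_python"
jupyter-notebook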
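Once PYTHONPATH is exported as above, a notebook cell or Python session can confirm that the library checkout is visible to the interpreter. This is a minimal sketch that assumes the gks_lib directory name from the commands above; the helper name is hypothetical.

```python
import os
import sys


def find_path_entries(dirname, paths=None):
    """Return the entries of `paths` whose last path component is `dirname`.

    Defaults to sys.path, which includes the directories from PYTHONPATH.
    """
    if paths is None:
        paths = sys.path
    return [
        entry for entry in paths
        if os.path.basename(os.path.normpath(entry)) == dirname
    ]


if __name__ == "__main__":
    # "gks_lib" is the clone target used in the setup commands.
    matches = find_path_entries("gks_lib")
    print("gks_lib on sys.path:", matches if matches else "not found")
```

If nothing is found, the notebook server was most likely started from a shell in which PYTHONPATH had not been exported.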