Difference between revisions of "Multi-threaded Programming"

From Gridkaschool
(Tutorial Material)
 
(15 intermediate revisions by 2 users not shown)
Line 1: Line 1:
  +
===[[Multithreaded|Technical specification/requirements]]===
  +
 
= Introduction =
 
= Introduction =
   
  +
Intel Threading Building Blocks (TBB) is an open-source C++ library to support
OpenCL is a standard which defines a framework, an API and a programming language for parallel computation on heterogeneous systems like client computer systems, high-performance computing servers as well as hand-held devices. The standard is maintained by the Khronos Group and supported by a large consortium of industry leaders including Apple, Intel, AMD, NVIDIA and ARM. Influenced by NVIDIA’s CUDA from the GPU side and by OpenMP, which originates from the classical CPU side, the open OpenCL standard is characterized by a formulation which is abstract enough to support both CPU and GPU computing resources. This is an ambitious goal, since providing an abstract interface together with peak performance is a challenging task. OpenCL employs a strict isolation of the computation work into fundamental units, the kernels. These kernels can be developed in the OpenCL C programming language, a subset of the C99 language with some additional OpenCL-specific keywords. In general, these kernels are hardware independent and compiled by the OpenCL runtime when they are loaded. To fully exploit the parallel execution of the kernel code, several kernel instances, the work items, are started to process a set of input values. The actual number of concurrently running work items is determined by the OpenCL system. How a concrete algorithm can be partitioned into work items has to be decided by the programmer.
 
  +
the development of multi-threaded applications which can exploit the available
  +
processing cores in modern CPUs. To this end, various constructs to support
  +
parallelism in applications are provided by TBB. Low-level constructs are
  +
available to partition loops to run them distributed over many CPU cores.
  +
High-level constructs like a task scheduler and a graph-based execution model
  +
make it possible to express dependencies and relations between computing tasks, and TBB can
  +
distribute these items among the available cores.
   
  +
Furthermore, C++ classes are included that guarantee thread-safe access to
= Reference Material =
 
  +
containers like lists and maps. Explicit locking constructs like mutexes
  +
are also available.
   
  +
The TBB library supports the Windows, Mac OS and Linux operating systems and the
<ul>
 
  +
Visual C++, GCC and Intel compilers.
<li><p>Khronos Group OpenCL</p>
 
<p>http://www.khronos.org/opencl/</p></li>
 
<li><p>OpenCL 1.2 Quick Reference Card</p>
 
<p>http://www.khronos.org/files/opencl-1-2-quick-reference-card.pdf</p></li>
 
<li><p>OpenCL 1.2 Full Documentation</p>
 
<p>http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/</p></li>
 
<li><p>Intel SDK for OpenCL</p>
 
<p>http://software.intel.com/en-us/articles/vcsource-tools-opencl-sdk/</p></li>
 
<li><p>AMD OpenCL Zone</p>
 
<p>http://developer.amd.com/zones/OpenCLZone/</p></li>
 
<li><p>NVIDIA OpenCL</p>
 
<p>http://www.nvidia.com/object/cuda_opencl_1.html</p></li></ul>
 
   
= Project: Bootstrapping OpenCL and Vector Addition =
 
   
  +
= Tutorial Material =
TODO: give overview of what to do
 
   
  +
<ul>
== Compiling and running the test program ==
 
  +
<li><p>Tasks</p>
  +
<p>http://hauth.web.cern.ch/hauth/tutorial_mcore.pdf</p></li>
  +
<li><p>Slides</p>
  +
<p>http://hauth.web.cern.ch/hauth/mcore_introduction.pdf</p></li>
  +
<li><p>Source code</p>
  +
<p>http://hauth.web.cern.ch/hauth/GridKa_Multicore_2013.tar.gz</p>
  +
Use:
  +
wget http://hauth.web.cern.ch/hauth/GridKa_Multicore_2013.tar.gz
  +
tar xzf GridKa_Multicore_2013.tar.gz
  +
to download and extract the source code on your GridKa machine.
  +
</li>
  +
</ul>
   
  +
= Tips & Tricks =
Open the folder '''project_vectoradd''', create the build files using CMake and compile the application.
 
  +
Connect to the workshop machine using
  +
ssh -p 24 <gks username>@<full machine address>
   
  +
If you want to view the picture output of project3, run this command in the project3_taylor folder (preferably in a second ssh connection):
  +
python -m SimpleHTTPServer 8080
   
  +
Now you can open this URL in your regular browser (replace the host name with that of your workshop machine):
  +
http://gks-219.scc.kit.edu:8080/
   
  +
= Reference Material =
<pre>$ cd project_vectoradd/
 
[hauth@vdt-corei7avx project_vectoradd]$ cmake .
 
[hauth@vdt-corei7avx project_vectoradd]$ make
 
[100%] Built target vectoradd
 
[hauth@vdt-corei7avx project_vectoradd]$ ./vectoradd </pre>
 
Once the application has compiled successfully, run it. The output should be along the following lines:
 
 
<pre>$ ./vectoradd
 
Testing Platform : Intel(R) OpenCL
 
&gt; Selected Compute Device : Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
 
Transferring data to device memory took 2e-06 s
 
Running vectorAdd kernel took 2e-06 s
 
Transferring data to host memory took 0 s
 
All done</pre>
 
== Running an OpenCL Kernel ==
 
 
The first task is to run a simple OpenCL kernel. To do so, edit the file <tt>vectoradd.cpp</tt> in your favorite text editor; use <tt>nano</tt> if you are not sure which tool to use.
 
 
<pre>$ nano vectoradd.cpp</pre>
 
Take your time to familiarize yourself with the source code which is already in the file. Some of the initial steps of setting up the OpenCL system are already provided:
 
   
 
<ul>
 
<ul>
<li><p>Creating the OpenCL compute context</p>
+
<li><p>Intel Threading Building Blocks - Online Reference</p>
  +
<p>http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm#reference/reference.htm</p></li>
<p>An OpenCL platform is automatically selected, depending on the required device type. You can change the required device type by modifying the constant <tt>devType</tt>:</p>
 
  +
<li><p>Intel Threading Building Blocks - Tutorial</p>
<pre>// Desired Device type.
 
  +
<p>http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm#tbb_userguide/title.htm</p></li>
// can be CL_DEVICE_TYPE_GPU or CL_DEVICE_TYPE_CPU in this example
 
  +
<li><p>Intel Threading Building Blocks - Design Patterns</p>
const cl_device_type devType = CL_DEVICE_TYPE_CPU;</pre></li>
 
  +
<p>http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/tbb_userguide/Design_Patterns/Design_Patterns.htm</p></li>
<li></li></ul>
 
  +
<li><p>Intel Threading Building Blocks - General Documentation</p>
 
  +
<p>http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm</p></li>
== Modifications to play around ==
 
  +
</ul>
 
* Switch to double. How does the runtime change for CPU/GPU?
 
* Switch to the float4 vector type. Can you perform the same addition operations?
 
 
= Project: N-Body Simulation =
 
 
===[[Internals:Multi-threaded|Technical specification/requirements]]===
 

Latest revision as of 16:53, 19 August 2013

Technical specification/requirements

Introduction

Intel Threading Building Blocks (TBB) is an open-source C++ library that supports the development of multi-threaded applications which can exploit the available processing cores in modern CPUs. To this end, TBB provides various constructs for expressing parallelism in applications. Low-level constructs are available to partition loops and distribute their iterations over many CPU cores. High-level constructs like a task scheduler and a graph-based execution model make it possible to express dependencies and relations between computing tasks, and TBB can distribute these tasks among the available cores.
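
To make the idea of loop partitioning concrete, here is a minimal sketch that deliberately uses only the C++11 standard library rather than TBB itself. It splits an index range into contiguous chunks by hand and runs each chunk on its own thread, which is roughly the bookkeeping that TBB's tbb::parallel_for together with tbb::blocked_range takes over. The names parallel_for_chunks and vector_add are hypothetical and do not come from TBB or from the tutorial sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not part of TBB): splits [0, n) into contiguous
// chunks and runs body(begin, end) for each chunk on its own thread.
template <typename Body>
void parallel_for_chunks(std::size_t n, unsigned num_threads, Body body) {
    if (n == 0) return;
    if (num_threads == 0) num_threads = 1;  // hardware_concurrency() may return 0
    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // ceiling division
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, n);
        workers.emplace_back(body, begin, end);
    }
    for (std::thread& w : workers) w.join();  // wait for all chunks to finish
}

// Element-wise vector addition. Each thread writes a disjoint index range
// of c, so no locking is needed.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    parallel_for_chunks(a.size(), std::thread::hardware_concurrency(),
        [&](std::size_t begin, std::size_t end) {
            for (std::size_t i = begin; i < end; ++i) c[i] = a[i] + b[i];
        });
    return c;
}
```

Calling vector_add(a, b) on two equally sized vectors returns their element-wise sum. A library such as TBB additionally balances the load between threads and reuses a thread pool, which this sketch does not attempt.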

Furthermore, C++ classes are included that guarantee thread-safe access to containers like lists and maps. Explicit locking constructs like mutexes are also available.
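
The following sketch illustrates what explicit locking around a shared container looks like. It is an assumption-laden example built on std::mutex and std::map from the C++11 standard library, not on TBB; TBB ships its own counterparts such as tbb::concurrent_hash_map and tbb::spin_mutex. The class LockedCounterMap and the function count_concurrently are hypothetical names chosen for this illustration.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical wrapper (not a TBB class): a map whose accesses are
// serialized by a mutex.
class LockedCounterMap {
public:
    void increment(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);  // released at end of scope
        ++counts_[key];
    }
    int get(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = counts_.find(key);
        return it == counts_.end() ? 0 : it->second;
    }
private:
    mutable std::mutex mutex_;
    std::map<std::string, int> counts_;
};

// Starts num_threads threads that each increment the same key
// increments_per_thread times, then returns the final count.
int count_concurrently(unsigned num_threads, int increments_per_thread) {
    LockedCounterMap counters;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counters, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counters.increment("events");
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return counters.get("events");
}
```

Without the lock, concurrent calls to increment would race on the map and the final count would be unpredictable. A concurrent container avoids the single global lock used here, which is one reason to prefer it when many threads update the same structure.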

The TBB library supports the Windows, Mac OS and Linux operating systems and the Visual C++, GCC and Intel compilers.


Tutorial Material

Tips & Tricks

Connect to the workshop machine using

 ssh -p 24 <gks username>@<full machine address>

If you want to view the picture output of project3, run this command in the project3_taylor folder (preferably in a second ssh connection):

 python -m SimpleHTTPServer 8080

Now you can open this URL in your regular browser (replace the host name with that of your workshop machine):

 http://gks-219.scc.kit.edu:8080/

Reference Material