Course Overview:
This course (HBCS102) is offered face-to-face (on-site) as well as online in real time. It is divided into three modules: Level "A", Level "B" and Level "C". Level "C" is the most advanced module and consists mainly of practical applications.
Level "A" is an introductory course on parallel programming, with about 20% of the time devoted to CUDA programming. This level does not require any prior knowledge of parallel computing; only a course at the level of data structures is required. Some exposure to image processing is also given in this module. The course starts from the C programming language and covers the details of graphics card hardware (GPU architecture, DRAM, PCIe, etc.). We also cover elementary concepts of CUDA programming in Windows and Linux environments. The aim is for the trainee to understand how to write simple CUDA programs, such as one that squares the first 10000 integers. In short, the candidate learns to write simple CUDA programs and to understand basic hardware and software details, without yet worrying about performance.
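To make the squaring exercise concrete, here is a minimal sketch in plain C (the language the course starts from). It is an illustration only, not the course's own code. It assumes "the first 10000 integers" means 1 to 10000, and the names N, square_element, square_all and fill_first_integers are hypothetical. The per-element function plays the role that a kernel body plays in CUDA, where each thread typically computes its own index as blockIdx.x * blockDim.x + threadIdx.x instead of receiving it from a loop.

```c
#include <assert.h>
#include <stddef.h>

#define N 10000  /* assumed problem size: the integers 1..10000 */

/* Work done for one element. In a CUDA kernel this would be the body
   executed by a single thread. */
static void square_element(const long *in, long *out, size_t i, size_t n)
{
    if (i < n) {                      /* bounds guard, as a kernel would use */
        out[i] = in[i] * in[i];
    }
}

/* CPU driver: the loop stands in for the grid of GPU threads. */
void square_all(const long *in, long *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        square_element(in, out, i, n);
    }
}

/* Fill in[] with the first n positive integers 1..n. */
void fill_first_integers(long *in, size_t n)
{
    assert(in != NULL);
    for (size_t i = 0; i < n; ++i) {
        in[i] = (long)(i + 1);
    }
}
```

A caller would declare two arrays of N long values, call fill_first_integers on the input, and then call square_all. Because every element is computed independently of the others, this kind of loop is a natural first candidate for a GPU.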
Level "B" discusses parallel programming concepts in detail, with a specific focus on CUDA programming. You are exposed to the following topics: performance metrics (speedup, utilization, efficiency, scalability); models of parallel computation, namely SIMD (Single Instruction Multiple Data) and MIMD (Multiple Instruction Multiple Data); GPU compute architecture; CUDA; memory organization in CUDA; memory optimization; coalesced access; occupancy; transparent scalability; and performance guidelines. Finally, the trainee learns different algorithms for fast matrix multiplication and implements them in CUDA, obtaining significant performance benefits.
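Two of the performance metrics listed above can be sketched in a few lines of C. This uses the common textbook definitions, speedup S = T_serial / T_parallel and efficiency E = S / p for p processing elements. The function names and parameters are hypothetical, and the course may define the metrics with different notation.

```c
#include <assert.h>

/* Speedup: how many times faster the parallel run is than the serial one. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Efficiency: speedup divided by the number of processing elements p.
   A value of 1.0 means every processing element is fully useful. */
double efficiency(double t_serial, double t_parallel, int p)
{
    assert(p > 0);
    return speedup(t_serial, t_parallel) / (double)p;
}
```

For example, a program that takes 8 seconds serially and 2 seconds on 8 processing elements has a speedup of 4 and an efficiency of 0.5.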
Level "C" is the advanced course and is mainly concerned with practical implementation. It is a hands-on course involving significant parallel programming on many-core GPUs, primarily CUDA-compatible NVIDIA GPUs. Specifically, we will work on accelerating a NIDS (Network Intrusion Detection System) on GPUs. This requires core knowledge of networking fundamentals as well as CUDA programming skills.
Target Audience:
___________________________________________________________________
Prerequisites:
For Level "A", you should be familiar with the concepts of the C programming language. Parallel programming is taught within Level "A" itself, but some prior exposure to it will help you grasp the concepts quickly.
___________________________________________________________________
Reference Books:
Introduction to Parallel Computing by Ananth Grama, George Karypis, Vipin Kumar and Anshul Gupta (Pearson)
CUDA Programming Guide and CUDA Best Practices Guide (download from nvidia.com)
GPU Gems 3, edited by Hubert Nguyen
Reading material from the Internet
For any specific information or queries, contact us at info@hbeonlabs.com