CUDA范例精解

ISBN: 9787302239956
Authors: Jason Sanders / Edward Kandrot
Publisher: Tsinghua University Press
Publication date: 2010-10
Binding: Paperback
Price: 39.00 yuan
Pages: 289

General-Purpose GPU Programming (Reprint Edition)

Summary

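CUDA is a computing architecture designed to help programmers develop parallel programs. Combined with a broad software platform, the CUDA architecture lets programmers draw on the power of graphics processing units (GPUs) to build high-performance applications. GPUs have, of course, long been used for complex graphics and game applications. CUDA now brings this valuable resource to programmers developing applications in other fields, including science, engineering, and finance. These programmers need no knowledge of graphics programming; they only need to be able to program in a suitably extended version of C.
The book is written by two senior members of the CUDA software platform team. They show programmers how to use this new technology and introduce each area of CUDA development through a large number of working examples. After a brief introduction to the CUDA platform and architecture and a quick-start guide to CUDA C, the book details the techniques associated with each key CUDA feature and the trade-offs involved in using them. By reading it, you will learn when to use each CUDA C extension and how to write CUDA software with excellent performance.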

Contents

Foreword
Preface
Acknowledgments
About the Authors

1 WHY CUDA? WHY NOW?
  1.1 Chapter Objectives
  1.2 The Age of Parallel Processing
    1.2.1 Central Processing Units
  1.3 The Rise of GPU Computing
    1.3.1 A Brief History of GPUs
    1.3.2 Early GPU Computing
  1.4 CUDA
    1.4.1 What Is the CUDA Architecture?
    1.4.2 Using the CUDA Architecture
  1.5 Applications of CUDA
    1.5.1 Medical Imaging
    1.5.2 Computational Fluid Dynamics
    1.5.3 Environmental Science
  1.6 Chapter Review

2 GETTING STARTED
  2.1 Chapter Objectives
  2.2 Development Environment
    2.2.1 CUDA-Enabled Graphics Processors
    2.2.2 NVIDIA Device Driver
    2.2.3 CUDA Development Toolkit
    2.2.4 Standard C Compiler
  2.3 Chapter Review

3 INTRODUCTION TO CUDA C
  3.1 Chapter Objectives
  3.2 A First Program
    3.2.1 Hello, World!
    3.2.2 A Kernel Call
    3.2.3 Passing Parameters
  3.3 Querying Devices
  3.4 Using Device Properties
  3.5 Chapter Review

4 PARALLEL PROGRAMMING IN CUDA C
  4.1 Chapter Objectives
  4.2 CUDA Parallel Programming
    4.2.1 Summing Vectors
    4.2.2 A Fun Example
  4.3 Chapter Review

5 THREAD COOPERATION
  5.1 Chapter Objectives
  5.2 Splitting Parallel Blocks
    5.2.1 Vector Sums: Redux
    5.2.2 GPU Ripple Using Threads
  5.3 Shared Memory and Synchronization
    5.3.1 Dot Product
    5.3.2 Dot Product Optimized (Incorrectly)
    5.3.3 Shared Memory Bitmap
  5.4 Chapter Review

6 CONSTANT MEMORY AND EVENTS
  6.1 Chapter Objectives
  6.2 Constant Memory
    6.2.1 Ray Tracing Introduction
    6.2.2 Ray Tracing on the GPU
    6.2.3 Ray Tracing with Constant Memory
    6.2.4 Performance with Constant Memory
  6.3 Measuring Performance with Events
    6.3.1 Measuring Ray Tracer Performance
  6.4 Chapter Review

7 TEXTURE MEMORY
  7.1 Chapter Objectives
  7.2 Texture Memory Overview
  7.3 Simulating Heat Transfer
    7.3.1 Simple Heating Model
    7.3.2 Computing Temperature Updates
    7.3.3 Animating the Simulation
    7.3.4 Using Texture Memory
    7.3.5 Using Two-Dimensional Texture Memory
  7.4 Chapter Review

8 GRAPHICS INTEROPERABILITY
  8.1 Chapter Objectives
  8.2 Graphics Interoperation
  8.3 GPU Ripple with Graphics Interoperability
    8.3.1 The GPUAnimBitmap Structure
    8.3.2 GPU Ripple Redux
  8.4 Heat Transfer with Graphics Interop
  8.5 DirectX Interoperability
  8.6 Chapter Review

9 ATOMICS
  9.1 Chapter Objectives
  9.2 Compute Capability
    9.2.1 The Compute Capability of NVIDIA GPUs
    9.2.2 Compiling for a Minimum Compute Capability
  9.3 Atomic Operations Overview
  9.4 Computing Histograms
    9.4.1 CPU Histogram Computation
    9.4.2 GPU Histogram Computation
  9.5 Chapter Review

10 STREAMS
  10.1 Chapter Objectives
  10.2 Page-Locked Host Memory
  10.3 CUDA Streams
  10.4 Using a Single CUDA Stream
  10.5 Using Multiple CUDA Streams
  10.6 GPU Work Scheduling
  10.7 Using Multiple CUDA Streams Effectively
  10.8 Chapter Review

11 CUDA C ON MULTIPLE GPUS
  11.1 Chapter Objectives
  11.2 Zero-Copy Host Memory
    11.2.1 Zero-Copy Dot Product
    11.2.2 Zero-Copy Performance
  11.3 Using Multiple GPUs
  11.4 Portable Pinned Memory
  11.5 Chapter Review

12 THE FINAL COUNTDOWN
  12.1 Chapter Objectives
  12.2 CUDA Tools
    12.2.1 CUDA Toolkit
    12.2.2 CUFFT
    12.2.3 CUBLAS
    12.2.4 NVIDIA GPU Computing SDK
    12.2.5 NVIDIA Performance Primitives
    12.2.6 Debugging CUDA C
    12.2.7 CUDA Visual Profiler
  12.3 Written Resources
    12.3.1 Programming Massively Parallel Processors: A Hands-On Approach
    12.3.2 CUDA U
    12.3.3 NVIDIA Forums
  12.4 Code Resources
    12.4.1 CUDA Data Parallel Primitives Library
    12.4.2 CULAtools
    12.4.3 Language Wrappers
  12.5 Chapter Review

A ADVANCED ATOMICS
  A.1 Dot Product Revisited
    A.1.1 Atomic Locks
    A.1.2 Dot Product Redux: Atomic Locks
  A.2 Implementing a Hash Table
    A.2.1 Hash Table Overview
    A.2.2 A CPU Hash Table
    A.2.3 Multithreaded Hash Table
    A.2.4 A GPU Hash Table
    A.2.5 Hash Table Performance
  A.3 Appendix Review

Index
