
GPU emulator for CUDA programming without the hardware

hot-time 2020. 8. 12. 20:34



Question: Is there an emulator for a Geforce card that would let me program and test CUDA without having the actual hardware?


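Info: I want to speed up a few of my simulations with CUDA, but I am not always at my desktop to do this development. I would like to do some of the work on my netbook instead, but the netbook has no GPU. As far as I know, you need a CUDA-capable GPU to run CUDA. Is there a way around this? The only option seems to be a GPU emulator (obviously slow, but it would work). Whatever way there is to do this, I would like to hear it. I'm programming on Ubuntu 10.04 LTS.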


For those who are looking for an answer in 2016 (and even 2017) ...


Disclaimer

  • I failed to emulate a GPU after all.
  • It might be possible to use gpuocelot if you satisfy its list of dependencies.

I tried to get an emulator for BunsenLabs (Linux 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt20-1+deb8u4 (2016-02-29) i686 GNU/Linux). I'll tell you what I've learned.


  1. nvcc used to have a -deviceemu option back in CUDA Toolkit 3.0

    I downloaded CUDA Toolkit 3.0, installed it, and tried to run a simple program:

    #include <stdio.h>
    
    __global__ void helloWorld() {
        printf("Hello world! I am %d (Warp %d) from %d.\n",
            threadIdx.x, threadIdx.x / warpSize, blockIdx.x);
    }
    
    int main() {
        int blocks, threads;
        scanf("%d%d", &blocks, &threads);
        helloWorld<<<blocks, threads>>>();
        cudaDeviceSynchronize();
        return 0;
    }

    Note that in CUDA Toolkit 3.0 nvcc was in /usr/local/cuda/bin/.

    It turned out that I had difficulties compiling it:

    NOTE: device emulation mode is deprecated in this release
          and will be removed in a future release.
    
    /usr/include/i386-linux-gnu/bits/byteswap.h(47): error: identifier "__builtin_bswap32" is undefined
    
    /usr/include/i386-linux-gnu/bits/byteswap.h(111): error: identifier "__builtin_bswap64" is undefined
    
    /home/user/Downloads/helloworld.cu(12): error: identifier "cudaDeviceSynchronize" is undefined
    
    3 errors detected in the compilation of "/tmp/tmpxft_000011c2_00000000-4_helloworld.cpp1.ii".

    I found on the Internet that if I used gcc-4.2 or a similarly ancient version instead of gcc-4.9.2, the errors might disappear. I gave up. (The third error is unrelated to gcc: cudaDeviceSynchronize was only introduced in CUDA 4.0; earlier toolkits provide cudaThreadSynchronize instead.)

  2. gpuocelot

    The answer by Stringer has a link to a very old gpuocelot project website, so at first I thought the project had been abandoned in 2012 or so. Actually, it was abandoned a few years later.

    Here are some up-to-date websites:

    I tried to install gpuocelot following the guide. I ran into several errors during installation, though, and gave up again. gpuocelot is no longer supported and depends on a set of very specific versions of libraries and software.

    You might try to follow this tutorial from July 2015, but I don't guarantee it will work. I have not tested it.

  3. MCUDA

    The MCUDA translation framework is a Linux-based tool designed to effectively compile the CUDA programming model to a CPU architecture.

    It might be useful. Here is a link to the website.

  4. CUDA Waste

    It is an emulator for Windows 7 and 8. I have not tried it, though. It doesn't seem to be developed anymore (the last commit is dated Jul 4, 2013).

    Here's the link to the project's website: https://code.google.com/archive/p/cuda-waste/

  5. CU2CL

    As dashesy pointed out in the comments, CU2CL seems to be an interesting project. It appears to be able to translate CUDA code to OpenCL code. So if your GPU can run OpenCL code, the CU2CL project might be of interest to you.

    Links:

Last update: 12.03.2017

This response may be too late, but it's worth noting anyway. GPU Ocelot (of which I am one of the core contributors) can be compiled without CUDA device drivers (libcuda.so) installed if you wish to use the Emulator or LLVM backends. I've demonstrated the emulator on systems without NVIDIA GPUs.

The emulator attempts to faithfully implement the PTX 1.4 and PTX 2.1 specifications which may include features older GPUs do not support. The LLVM translator strives for correct and efficient translation from PTX to x86 that will hopefully make CUDA an effective way of programming multicore CPUs as well as GPUs. -deviceemu has been a deprecated feature of CUDA for quite some time, but the LLVM translator has always been faster.

Additionally, several correctness checkers are built into the emulator to verify: aligned memory accesses, accesses to shared memory are properly synchronized, and global memory dereferencing accesses allocated regions of memory. We have also implemented a command-line interactive debugger inspired largely by gdb to single-step through CUDA kernels, set breakpoints and watchpoints, etc... These tools were specifically developed to expedite the debugging of CUDA programs; you may find them useful.

Sorry about the Linux-only aspect. We've started a Windows branch (as well as a Mac OS X port) but the engineering burden is already large enough to stress our research pursuits. If anyone has any time and interest, they may wish to help us provide support for Windows!

Hope this helps.


You can also check out the gpuocelot project, which is a true emulator in the sense that PTX (the bytecode that CUDA code is compiled to) is emulated.

There's also an LLVM translator; it would be interesting to test whether it is faster than -deviceemu.


The CUDA toolkit had one built into it until the CUDA 3.0 release cycle. If you use one of these very old versions of CUDA, make sure to use -deviceemu when compiling with nvcc.


https://github.com/hughperkins/cuda-on-cl lets you run NVIDIA® CUDA™ programs on OpenCL 1.2 GPUs (full disclosure: I'm the author)


Be careful when you're programming using -deviceemu as there are operations that nvcc will accept while in emulation mode but not when actually running on a GPU. This is mostly found with device-host interaction.

And as you mentioned, prepare for some slow execution.


GPGPU-Sim is a GPU simulator that can run CUDA programs without using a GPU. I created a Docker image with GPGPU-Sim installed for my own use, in case that is helpful.

Reference URL: https://stackoverflow.com/questions/3087361/gpu-emulator-for-cuda-programming-without-the-hardware
