猿代码: research / AI models / high-performance computing

Tianhe supercomputer, GPU, OpenCV, gpu-basics-similarity

1)
gpu-basics-similarity]$ nvcc -O3 gpu-basics-similarity.cpp -I /THL5/home/te33334.16/include -L /THL5/ho33334.16/lib64  -lopencv_cudabgsegm -lopencv_cudaobjdetect -lopencv_cudastereo -lopencv_stitching -lopencv_cudafeatures2d -lopencv_superres -lopencv_cudacodec -lopencv_videostab -lopencv_cudaoptflow -lopencv_cudalegacy -lopencv_cudawarping -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_dnn_objdetect -lopencv_dpm -lopencv_highgui -lopencv_videoio -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_hfs -lopencv_img_hash -lopencv_line_descriptor -lopencv_optflow -lopencv_reg -lopencv_rgbd -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_surface_matching -lopencv_tracking -lopencv_datasets -lopencv_text -lopencv_dnn -lopencv_plot -lopencv_xfeatures2d -lopencv_shape -lopencv_video -lopencv_ml -lopencv_ximgproc -lopencv_xobjdetect -lopencv_objdetect -lopencv_calib3d -lopencv_imgcodecs -lopencv_features2d -lopencv_flann -lopencv_xphoto -lopencv_photo -lopencv_cudaimgproc -lopencv_cudafilters -lopencv_imgproc -lopencv_cudaarithm -lopencv_core -lopencv_cudev

2)
a.out  gpu-basics-similarity.cpp  lena.jpg  lena_tmpl.jpg

3)
./a.out lena.jpg lena_tmpl.jpg

--------------------------------------------------------------------------
This program shows how to port your CPU code to CUDA or write that from scratch.
You can see the performance improvement for the similarity check methods (PSNR and SSIM).
Usage:
./gpu-basics-similarity referenceImage comparedImage numberOfTimesToRunTest(like 10).
--------------------------------------------------------------------------

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
The run fails. My first guess: the local node has no GPU.

4)
yhrun -p TH_GPU -N 1 ./a.out lena.jpg lena_tmpl.jpg

--------------------------------------------------------------------------
This program shows how to port your CPU code to CUDA or write that from scratch.
You can see the performance improvement for the similarity check methods (PSNR and SSIM).
Usage:
./gpu-basics-similarity referenceImage comparedImage numberOfTimesToRunTest(like 10).
--------------------------------------------------------------------------

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
yhrun: error: gn10: task 0: Aborted

Same error again.
Debugging with gdb:

#11 0x00000000004063ec in main(int, char**) (argv=0x7fffffffda48)
    at gpu-basics-similarity.cpp:81

    int TIMES = 10;
    stringstream sstr(argv[3]);
    sstr >> TIMES;
    double time, result = 0;
The program needs three arguments. My mistake.
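
The crash happens because `stringstream sstr(argv[3])` is reached with only two arguments on the command line. In that case `argv[3]` is `argv[argc]`, which is a null pointer, and building a `std::string` from a null `char*` is what libstdc++ reports as `basic_string::_M_construct null not valid`. A minimal sketch of a guard, assuming a hypothetical helper `parseTimes` and a fallback of 10 runs (neither is part of the OpenCV sample):

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not in the OpenCV sample): returns the run count taken
// from argv[3], or `fallback` when that argument is missing or is not a
// positive integer.
int parseTimes(int argc, char** argv, int fallback = 10)
{
    if (argc < 4 || argv[3] == nullptr)
        return fallback;              // never pass a null pointer to std::string

    std::istringstream in(argv[3]);
    int times = 0;
    if (!(in >> times) || times <= 0)
        return fallback;              // rejects "abc", "0", "-3", ...
    return times;
}
```

In `main` this would replace the three lines shown by gdb with something like `int TIMES = parseTimes(argc, argv);`, so a missing third argument falls back to 10 instead of aborting.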

5) With the third argument it runs, and the results on the K80 are very good too

yhrun -p TH_GPU -N 1 ./a.out lena.jpg lena_tmpl.jpg 10

--------------------------------------------------------------------------
This program shows how to port your CPU code to CUDA or write that from scratch.
You can see the performance improvement for the similarity check methods (PSNR and SSIM).
Usage:
./gpu-basics-similarity referenceImage comparedImage numberOfTimesToRunTest(like 10).
--------------------------------------------------------------------------

Time of PSNR CPU (averaged for 10 runs): 5.35158 milliseconds. With result of: 16.7378
Time of PSNR CUDA (averaged for 10 runs): 559.219 milliseconds. With result of: 16.7378
Initial call CUDA optimized:              1.8707 milliseconds. With result of: 16.7378
Time of PSNR CUDA OPTIMIZED ( / 10 runs): 1.25884 milliseconds. With result of: 16.7378
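
For reference, PSNR for 8-bit images is 10 * log10(255^2 / MSE), where MSE is the mean squared difference over all pixel values, so the 16.7378 above is in decibels. A minimal plain C++ sketch of that formula, assuming two equally sized 8-bit buffers and a hypothetical function name `psnr8` (the sample computes the same quantity with OpenCV matrix operations, which are not reproduced here):

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: PSNR in dB between two 8-bit buffers of equal length.
double psnr8(const std::vector<std::uint8_t>& a,
             const std::vector<std::uint8_t>& b)
{
    if (a.size() != b.size() || a.empty())
        throw std::invalid_argument("buffers must be non-empty and equally sized");

    // Sum of squared differences over every sample.
    double sse = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sse += d * d;
    }
    if (sse == 0.0)
        return std::numeric_limits<double>::infinity();  // identical inputs

    const double mse = sse / static_cast<double>(a.size());
    return 10.0 * std::log10(255.0 * 255.0 / mse);
}
```

Returning infinity for identical inputs is a choice made for this sketch; a real implementation may prefer a sentinel value instead.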

[ WARN:0] OpenCV/MatExpr: processing of multi-channel arrays might be changed in the future: https://github.com/opencv/opencv/issues/16739
[ WARN:0] OpenCV/MatExpr: processing of multi-channel arrays might be changed in the future: https://github.com/opencv/opencv/issues/16739
Time of MSSIM CPU (averaged for 10 runs): 152.462 milliseconds. With result of B0.904694 G0.905915 R0.909984
Time of MSSIM CUDA (averaged for 10 runs): 194.516 milliseconds. With result of B0.904694 G0.905915 R0.909984
Time of MSSIM CUDA Initial Call            12.9246 milliseconds. With result of B0.904694 G0.905915 R0.909984
Time of MSSIM CUDA OPTIMIZED ( / 10 runs): 11.4991 milliseconds. With result of B0.904694 G0.905915 R0.909984

6)
 gpu-thrust-interop]$ yhrun -p TH_GPU -N 1 ./a.out
40

This one runs too, though I don't know what the output means.



2023-11-15 23:15