ppl.cv CUDA Benchmark Results

1. Compilation

Please refer to "Building commands on Linux" and "Building commands on Windows" in the CUDA Platform Guide.

2. Running benchmark

Please refer to "How to run benchmark" in the CUDA Platform Guide.

3. Testing configuration

Information of machines:

  • x86 desktop computer with GeForce GTX 1060 GPU:
    • CPU: Intel® Core™ i7-7700 CPU (8 cores, 3.60GHz)
    • GPU: GeForce GTX 1060 (1280 CUDA Cores, 1772 MHz)
    • Host memory: 32 GB
    • Device memory: 6 GB
    • OS: Ubuntu 16.04
  • x86 cloud server with Tesla V100 GPU:
    • CPU: Intel(R) Xeon(R) Gold 6146 CPU (12 cores, 3.20 GHz)
    • GPU: Tesla V100 (5,120 cores, 1230 MHz)
    • Host memory: 396 GB
    • Device memory: 64 GB
    • OS: Ubuntu 16.04

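In our benchmark, each function has three implementations: the CUDA implementation in ppl.cv, and the x86 and CUDA implementations in OpenCV. All three are run over a series of parameter combinations that cover common usage, and the elapsed time is recorded. Besides each function's specific parameters, the supported data types (uchar/float), channel counts (1/3/4), and commonly used image sizes are tested for every function. Input images consist of randomly generated pixel values.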
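The parameter sweep described above can be pictured as a Cartesian product of data types, channel counts, and image sizes. The following is a minimal sketch, not the actual ppl.cv benchmark harness: the image sizes, the `[0, 1)` range for float pixels, and the helper names `enumerate_cases` and `random_image` are all assumptions made for illustration.

```python
import itertools
import random

DATA_TYPES = ("uchar", "float")   # data types named in the text
CHANNELS = (1, 3, 4)              # channel counts named in the text
# Assumed image sizes (width, height); the source does not list the sizes used.
IMAGE_SIZES = ((320, 240), (640, 480), (1280, 720), (1920, 1080))


def enumerate_cases():
    """Return every (dtype, channels, (width, height)) combination."""
    return list(itertools.product(DATA_TYPES, CHANNELS, IMAGE_SIZES))


def random_image(width, height, channels, dtype, rng):
    """Build a flat, row-major list of random pixel values (hypothetical helper)."""
    count = width * height * channels
    if dtype == "uchar":
        return [rng.randrange(256) for _ in range(count)]
    # Assumption: float pixels are drawn uniformly from [0, 1).
    return [rng.random() for _ in range(count)]


rng = random.Random(0)  # fixed seed so runs are repeatable
cases = enumerate_cases()
dtype, channels, (width, height) = cases[0]
image = random_image(width, height, channels, dtype, rng)
print(len(cases), dtype, channels, (width, height), len(image))
```

A real harness would additionally loop over each function's own parameters (kernel size, interpolation mode, and so on) and time every implementation on the same input.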




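To characterize performance, we compare ppl.cv against whichever OpenCV implementation, x86 or CUDA, is faster. For each function, we sort the speedups and pick the minimum, median, and maximum to form a compact box-plot-style summary, rather than reporting an average speedup.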
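As an illustration of this summary, the sketch below turns per-case timings into a (minimum, median, maximum) speedup triple. It assumes the faster OpenCV implementation is chosen separately for each parameter combination, and it uses `statistics.median`, which averages the two middle values when the count is even; the source specifies neither detail. The function name and the timings are hypothetical.

```python
from statistics import median


def summarize_speedups(pplcv_ms, opencv_x86_ms, opencv_cuda_ms):
    """Return (min, median, max) speedup of ppl.cv over the faster OpenCV path."""
    speedups = []
    for ppl, x86, cuda in zip(pplcv_ms, opencv_x86_ms, opencv_cuda_ms):
        baseline = min(x86, cuda)        # faster OpenCV implementation for this case
        speedups.append(baseline / ppl)  # > 1 means ppl.cv is faster
    speedups.sort()
    return speedups[0], median(speedups), speedups[-1]


# Hypothetical elapsed times in milliseconds, one entry per parameter combination.
pplcv = [1.0, 2.0, 4.0]
x86 = [3.0, 2.5, 4.0]
cuda = [2.0, 3.0, 8.0]
print(summarize_speedups(pplcv, x86, cuda))
```

A minimum below 1.0 in the statistics therefore means that, for at least one parameter combination, an OpenCV implementation was faster than ppl.cv.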


4. Speedup statistics

| Function | GeForce GTX 1060 (min, median, max speedup) | Tesla V100 (min, median, max speedup) |
|---|---|---|
| Abs | (1.027618, 1.216567, 2.522977)(schar), (1.002612, 1.079896, 1.212746)(float) | (1.920000, 2.250000, 4.000000)(schar), (1.428571, 1.600000, 2.666667)(float) |
| Add | (1.037273, 1.317647, 3.866667)(uchar), (1.012752, 1.135303, 1.318750)(float) | (1.484422, 2.346158, 3.628854)(uchar), (1.169862, 1.400410, 2.473603)(float) |
| AddWeighted | (0.911404, 1.069565, 2.500000)(uchar), (1.032401, 1.068750, 1.312500)(float) | (0.887411, 1.310023, 2.540298)(uchar), (1.162151, 1.358260, 2.297834)(float) |
| Subtract | (1.062500, 1.321429, 3.444444)(uchar), (1.025086, 1.073438, 1.256250)(float) | (1.187220, 1.922802, 3.456477)(uchar), (1.200039, 1.639410, 3.316506)(float) |
| Mul | (1.083189, 1.327586, 3.193548)(uchar), (1.020218, 1.062124, 1.213873)(float) | (1.591886, 2.275094, 3.521160)(uchar), (1.169695, 1.402987, 2.497707)(float) |
| Div | (1.197259, 1.318625, 1.967651)(uchar), (1.020744, 1.091126, 1.343856)(float) | (1.393939, 2.000000, 2.666667)(uchar), (1.465116, 1.800000, 3.666667)(float) |
| BGR2BGRA | (1.076923, 1.188437, 2.466667)(uchar), (1.009084, 1.058051, 1.319444)(float) | (0.996509, 2.210639, 3.574978)(uchar), (1.191974, 1.460745, 3.175911)(float) |
| BGRA2BGR | (1.061947, 1.235294, 2.666667)(uchar), (1.013997, 1.085965, 1.347518)(float) | (1.198190, 1.848153, 2.867657)(uchar), (1.173317, 1.402871, 2.416639)(float) |
| BGR2RGB | (1.052308, 1.251572, 2.700000)(uchar), (1.009910, 1.076321, 1.300000)(float) | (1.209468, 1.870198, 2.894736)(uchar), (1.178372, 1.450653, 2.412296)(float) |
| BGRA2RGBA | (1.052252, 1.294118, 3.550000)(uchar), (1.005998, 1.061538, 1.250000)(float) | (1.008355, 2.209785, 3.238027)(uchar), (1.184873, 1.405101, 3.121947)(float) |
| BGR2GRAY | (1.297546, 1.875000, 3.272727)(uchar), (1.018468, 1.121212, 2.225000)(float) | (1.398824, 2.286852, 3.272300)(uchar), (1.187580, 2.308537, 3.109362)(float) |
| BGRA2GRAY | (1.172324, 1.976190, 3.850000)(uchar), (1.022443, 1.125000, 2.050000)(float) | (1.324805, 2.282110, 3.271641)(uchar), (1.212494, 1.904405, 3.039953)(float) |
| GRAY2BGR | (1.063415, 1.540000, 2.666667)(uchar), (1.019014, 1.107143, 1.454545)(float) | (1.102559, 1.845679, 2.813610)(uchar), (1.187604, 1.552207, 2.345931)(float) |
| GRAY2BGRA | (1.211268, 1.688889, 3.550000)(uchar), (1.016974, 1.095238, 1.960000)(float) | (0.964918, 2.304777, 3.295088)(uchar), (1.212406, 1.816164, 3.188364)(float) |
| BGR2YCrCb | (1.115718, 1.437500, 2.933333)(uchar), (1.009922, 1.078431, 1.357143)(float) | (1.241286, 1.960112, 2.976867)(uchar), (1.197845, 1.590102, 2.509307)(float) |
| YCrCb2BGR | (1.000899, 1.245714, 2.275000)(uchar), (1.023173, 1.066667, 1.275168)(float) | (1.240044, 1.953158, 3.116737)(uchar), (1.199772, 1.588004, 2.517634)(float) |
| BGR2HSV | (1.047619, 1.124324, 1.440000)(uchar), (0.997503, 1.044550, 1.516779)(float) | (1.244666, 1.442114, 2.198618)(uchar), (1.185546, 1.439000, 2.350594)(float) |
| HSV2BGR | (1.087786, 1.182353, 1.625000)(uchar), (1.296982, 1.423002, 1.751773)(float) | (1.205051, 1.746410, 2.417964)(uchar), (1.799301, 1.971470, 2.591022)(float) |
| BGR2LAB | (1.057958, 1.116981, 1.347518)(uchar), (4.817276, 4.958534, 5.178571)(float) | (1.218432, 1.340922, 2.003351)(uchar), (11.744559, 16.295735, 16.833752)(float) |
| LAB2BGR | (3.492866, 3.535597, 3.842932)(uchar), (1.005988, 1.049924, 1.215789)(float) | (7.991927, 10.267584, 11.016621)(uchar), (1.196817, 1.574198, 2.219194)(float) |
| NV122BGR | (11.103975, 14.618688, 16.317706)(uchar) | (22.728164, 76.905314, 80.594786)(uchar) |
| NV122BGRA | (16.251937, 17.611833, 18.670433)(uchar) | (25.602489, 92.221827, 126.598710)(uchar) |
| NV212BGR | (11.066975, 15.031184, 16.289727)(uchar) | (21.985959, 76.702585, 80.624540)(uchar) |
| NV212BGRA | (16.354588, 17.128284, 18.803167)(uchar) | (25.452630, 92.121791, 126.082495)(uchar) |
| BGR2I420 | (10.693974, 14.342727, 15.599600)(uchar) | (18.157460, 65.449673, 96.908186)(uchar) |
| BGRA2I420 | (10.988100, 13.469619, 14.967882)(uchar) | (19.512714, 71.242601, 102.685346)(uchar) |
| I4202BGR | (12.730026, 14.859967, 16.374234)(uchar) | (21.091450, 72.681859, 75.285153)(uchar) |
| I4202BGRA | (16.132000, 16.995249, 18.781433)(uchar) | (24.304845, 89.937122, 126.809467)(uchar) |
| YUV2GRAY | (1.474327, 2.118086, 3.434171)(uchar) | (1.091379, 14.942844, 20.622220)(uchar) |
| UYVY2BGR | (12.124640, 14.798133, 15.474752)(uchar) | (23.692301, 54.666932, 60.882893)(uchar) |
| UYVY2GRAY | (13.245150, 17.255159, 29.511351)(uchar) | (11.636411, 47.939781, 54.628228)(uchar) |
| YUYV2BGR | (10.991902, 14.631356, 15.423861)(uchar) | (23.766830, 55.792111, 62.559629)(uchar) |
| YUYV2GRAY | (15.908143, 18.193933, 28.971842)(uchar) | (11.908574, 45.874270, 54.474318)(uchar) |
| AdaptiveThreshold | (3.573068, 14.689693, 25.344412)(uchar) | (21.294030, 73.243000, 102.870909)(uchar) |
| BilateralFilter | (1.168011, 1.525880, 2.761905)(uchar), (1.030715, 1.982063, 2.684054)(float) | (81.962000, 170.308496, 2558.305913)(uchar), (237.644286, 462.073981, 574.847162)(float) |
| BitwiseAnd | (1.000000, 3.181818, 8.497829)(uchar) | (1.147059, 4.000000, 35.471800)(uchar) |
| BoxFilter | (1.480552, 4.495130, 7.620557)(uchar), (1.448339, 6.352779, 13.262363)(float) | (7.124271, 25.163939, 46.066635)(uchar), (14.106000, 38.753665, 77.675652)(float) |
| CalcHist | (1.150568, 1.770833, 2.394737)(uchar) | (1.857143, 3.000000, 3.533333)(uchar) |
| ConvertTo | (0.993691, 1.073810, 1.603448)(uchar), (0.998486, 1.052786, 1.187879)(float) | (0.670103, 1.344444, 4.500000)(uchar), (0.780702, 1.176471, 2.666667)(float) |
| CopyMakeborder | (1.000000, 1.370000, 2.757983)(uchar), (1.057269, 1.162304, 3.528545)(float) | (1.596447, 1.717750, 14.224575)(uchar), (1.389611, 2.373997, 21.272053)(float) |
| Crop | (5.246094, 10.568061, 17.913457)(uchar), (3.501557, 12.897787, 23.676354)(float) | (8.071733, 48.336800, 62.190875)(uchar), (27.270250, 42.298500, 82.963085)(float) |
| Dilate | (0.700053, 3.018605, 36.491972)(uchar), (1.233902, 4.462496, 30.425474)(float) | (1.521163, 12.164694, 39.407692)(uchar), (4.035320, 33.010542, 91.460005)(float) |
| DistanceTransform | (4.748947, 10.090304, 53.176053)(float) | (5.214643, 15.715885, 175.633388)(float) |
| EqualizeHist | (1.282700, 1.964115, 3.808168)(uchar) | (1.896552, 2.444444, 22.475000)(uchar) |
| Erode | (0.712260, 2.985459, 37.392623)(uchar), (1.166272, 4.408501, 30.251434)(float) | (1.519858, 12.016352, 40.007692)(uchar), (4.043243, 31.962500, 91.159956)(float) |
| Filter2D | (0.857971, 2.707717, 10.080080)(uchar), (1.158228, 2.812923, 11.549172)(float) | (1.064132, 5.109709, 10.239130)(uchar), (1.180978, 3.344444, 10.202247)(float) |
| Flip | (1.166667, 1.250000, 1.885246)(uchar), (1.020772, 1.088785, 1.247059)(float) | (2.266297, 2.692010, 2.764538)(uchar), (1.430543, 1.496240, 2.699739)(float) |
| GaussianBlur | (1.642779, 3.304591, 12.553922)(uchar), (1.660031, 2.951287, 6.443099)(float) | (9.550909, 15.166667, 77.000000)(uchar), (9.790741, 22.002174, 56.855556)(float) |
| GuidedFilter | (1.841109, 4.446838, 11.442694)(uchar), (1.914174, 4.654867, 12.122168)(float) | (6.002427, 33.592295, 85.662757)(uchar), (6.084409, 29.738052, 103.347549)(float) |
| Integral | (0.336724, 0.616571, 1.143994)(uchar), (0.560649, 1.074805, 2.191565)(float) | (0.447493, 1.962876, 2.007879)(uchar), (0.788689, 2.565471, 3.641776)(float) |
| Laplacian | (5.719577, 9.622927, 55.736000)(uchar), (2.665474, 5.248192, 15.066487)(float) | (31.377286, 75.869500, 234.550952)(uchar), (17.290625, 35.339333, 106.552075)(float) |
| Mean | (0.498830, 18.990729, 59.166509)(uchar), (1.752304, 12.701592, 34.480694)(float) | (0.802057, 35.056429, 221.736000)(uchar), (6.057550, 39.561500, 157.173500)(float) |
| MeanStdDev | (0.337072, 11.151297, 40.957974)(uchar), (4.841667, 8.282971, 17.452190)(float) | (0.536569, 23.880556, 145.785484)(uchar), (8.946910, 32.976917, 94.712100)(float) |
| MedianBlur | (0.351890, 1.116563, 3.519904)(uchar), (2.136163, 3.724323, 4.381040)(float) | (3.459940, 7.941560, 23.388209)(uchar), (18.604885, 20.476500, 21.021053)(float) |
| Merge | (2.278780, 2.644595, 6.276601)(uchar), (2.353986, 8.241453, 10.674901)(float) | (3.113387, 17.123593, 20.245900)(uchar), (16.259525, 24.317619, 46.897456)(float) |
| MinMaxLoc | (0.326772, 3.114965, 6.056915)(uchar), (1.952189, 10.600955, 16.410043)(float) | (0.464410, 4.004704, 11.220379)(uchar), (1.564421, 15.698727, 33.511538)(float) |
| Norm | (0.256201, 1.656461, 39.051089)(uchar), (1.036796, 5.491927, 31.023884)(float) | (0.132152, 5.693815, 108.367206)(uchar), (1.217980, 22.487100, 111.887975)(float) |
| Normalize | (1.477920, 6.778663, 30.741967)(uchar), (1.928009, 11.314839, 27.490531)(float) | (4.358423, 15.967336, 77.594030)(uchar), (9.863286, 30.135352, 85.963588)(float) |
| Ones | (34.007361, 102.137812, 110.859572)(uchar), (14.834672, 20.067127, 34.036056)(float) | (42.736150, 170.238000, 361.399606)(uchar), (30.091450, 60.011938, 119.932833)(float) |
| PerspectiveTransform | (24.956857, 30.550903, 53.705714)(float) | (77.732667, 120.337187, 236.752000)(float) |
| PyrDown | (0.855491, 1.760697, 3.200000)(uchar), (0.783599, 0.996094, 1.968254)(float) | (0.888889, 1.840000, 3.000000)(uchar), (0.967742, 1.714286, 2.500000)(float) |
| PyrUp | (0.982715, 1.141379, 3.225000)(uchar), (1.018668, 1.101715, 1.277778)(float) | (1.092308, 1.714286, 2.750000)(uchar), (0.982332, 1.104167, 1.714286)(float) |
| Remap | (1.000000, 1.500000, 3.093750)(uchar), (0.979498, 1.380000, 3.125000)(float) | (1.192308, 2.666667, 3.333333)(uchar), (1.117647, 2.500000, 3.666667)(float) |
| Resize | (1.000943, 1.531532, 2.875000)(uchar), (0.993286, 1.147826, 2.619048)(float) | (1.030841, 2.471922, 3.428373)(uchar), (1.131494, 2.197102, 3.316012)(float) |
| Rotate | (0.574651, 1.043956, 3.076923)(uchar), (0.546294, 0.665658, 1.033333)(float) | (0.805556, 2.016667, 5.550000)(uchar), (0.480392, 0.957143, 4.000000)(float) |
| SepFilter2D | (1.364119, 1.908174, 9.654275)(uchar), (1.341837, 1.863272, 9.666667)(short), (1.340006, 2.333333, 6.084475)(float) | (9.454545, 19.282143, 84.583333)(uchar), (9.413043, 19.357143, 84.583333)(short), (10.227273, 18.721429, 63.787500)(float) |
| SetTo | (1.000000, 1.600000, 5.615385)(uchar), (1.006349, 1.281250, 3.380952)(float) | (0.735294, 2.875000, 6.153846)(uchar), (0.542540, 1.304348, 4.500000)(float) |
| Sobel | (2.329529, 5.088146, 11.560976)(uchar), (2.374818, 4.994074, 11.308411)(short), (2.209220, 4.286020, 6.318538)(float) | (20.446154, 46.628319, 72.314286)(uchar), (19.476015, 42.983740, 90.071429)(short), (21.000000, 39.061538, 70.411111)(float) |
| Split | (1.067019, 1.238372, 3.166667)(uchar), (1.006090, 1.067797, 1.450000)(float) | (1.230769, 3.000000, 4.000000)(uchar), (1.070588, 1.461538, 4.000000)(float) |
| Transpose | (8.403667, 11.634840, 15.109060)(uchar), (5.698850, 12.042905, 15.199568)(float) | (28.730750, 64.849000, 72.014231)(uchar), (28.907429, 72.115600, 126.537647)(float) |
| WarpAffine | (0.730088, 3.073171, 96.212000)(uchar), (0.967412, 2.549020, 118.212581)(float) | |
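
For readers who want to post-process these statistics, one per-GPU entry can be parsed into a mapping from data type to its (minimum, median, maximum) triple. This is only a sketch: `parse_cell` is a hypothetical helper, not part of ppl.cv, and it assumes the input is the text of a single GPU's entry for one function.

```python
import re

# Matches "(a, b, c)(dtype)" groups such as "(1.0, 2.0, 3.0)(uchar)".
CELL_PATTERN = re.compile(r"\(([^()]+)\)\s*\((\w+)\)")


def parse_cell(cell):
    """Map each data type in one entry to its (min, median, max) speedups."""
    result = {}
    for numbers, dtype in CELL_PATTERN.findall(cell):
        result[dtype] = tuple(float(x) for x in numbers.split(","))
    return result


# The GeForce GTX 1060 entry for Add, copied from the statistics above.
cell = "(1.037273, 1.317647, 3.866667)(uchar), (1.012752, 1.135303, 1.318750)(float)"
stats = parse_cell(cell)
print(stats["uchar"], stats["float"])
```

Because the result is keyed by data type, passing the entries of both GPUs in one string would overwrite the first GPU's values, so each entry should be parsed separately.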
