Description
Summary
- On Android, GPU inference produces wrong results on Mali GPUs, while Snapdragon Adreno GPUs produce correct results.
Detail
- ncnn: the official ncnn-20241226-android-vulkan release, or a build compiled from source
- Model: a yolov11-obb model, unmodified
- Model param file (excerpt)
7767517
311 373
Input in0 0 1 in0
Convolution conv_0 1 1 in0 1 0=16 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=432
Swish silu_93 1 1 1 2
Convolution conv_1 1 1 2 3 0=32 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=4608
Swish silu_94 1 1 3 4
Convolution conv_2 1 1 4 5 0=32 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=1024
Swish silu_95 1 1 5 6
Slice split_0 1 2 6 7 8 -23300=2,16,16 1=0
Split splitncnn_0 1 3 8 9 10 11
Convolution conv_3 1 1 11 12 0=8 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=1152
Swish silu_96 1 1 12 13
Convolution conv_4 1 1 13 14 0=16 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=1152
Swish silu_97 1 1 14 15
BinaryOp add_0 2 1 10 15 16 0=0
Concat cat_0 3 1 7 9 16 17 0=0
Convolution conv_5 1 1 17 18 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=3072
Swish silu_98 1 1 18 19
Convolution conv_6 1 1 19 20 0=64 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=36864
Swish silu_99 1 1 20 21
Convolution conv_7 1 1 21 22 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Swish silu_100 1 1 22 23
Slice split_1 1 2 23 24 25 -23300=2,32,32 1=0
Split splitncnn_1 1 3 25 26 27 28
Convolution conv_8 1 1 28 29 0=16 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=4608
Swish silu_101 1 1 29 30
Convolution conv_9 1 1 30 31 0=32 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=4608
Swish silu_102 1 1 31 32
BinaryOp add_1 2 1 27 32 33 0=0
Concat cat_1 3 1 24 26 33 34 0=0
Convolution conv_10 1 1 34 35 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=12288
Swish silu_103 1 1 35 36
Split splitncnn_2 1 2 36 37 38
Convolution conv_11 1 1 38 39 0=128 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=147456
...
Reshape reshape_182 1 1 132 140 0=20 1=20 2=128
ConvolutionDepthWise convdw_199 1 1 140 141 0=128 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=1152 7=128
BinaryOp add_7 2 1 139 141 142 0=0
Convolution conv_35 1 1 142 143 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=16384
BinaryOp add_8 2 1 125 143 144 0=0
Split splitncnn_15 1 2 144 145 146
Reshape view_188 1 1 272 273 0=400 1=1
Concat cat_17 3 1 261 267 273 274 0=1
Sigmoid sigmoid_178 1 1 274 275
BinaryOp sub_15 1 1 275 276 0=1 1=1 2=2.500000e-01
BinaryOp mul_16 1 1 276 277 0=2 1=1 2=3.141593e+00
Split splitncnn_27 1 3 277 278 279 280
Convolution conv_71 1 1 192 281 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish silu_158 1 1 281 282
Convolution conv_72 1 1 282 283 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish silu_159 1 1 283 284
Convolution conv_73 1 1 284 285 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
ConvolutionDepthWise convdw_200 1 1 195 286 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=576 7=64
Swish silu_160 1 1 286 287
Convolution conv_74 1 1 287 288 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Swish silu_161 1 1 288 289
ConvolutionDepthWise convdw_201 1 1 289 290 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=576 7=64
Swish silu_162 1 1 290 291
Convolution conv_75 1 1 291 292 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Swish silu_163 1 1 292 293
Convolution conv_76 1 1 293 294 0=7 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=448
Concat cat_18 2 1 285 294 295 0=0
Convolution conv_77 1 1 214 296 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=73728
Swish silu_164 1 1 296 297
Convolution conv_78 1 1 297 298 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish silu_165 1 1 298 299
Convolution conv_79 1 1 299 300 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
ConvolutionDepthWise convdw_202 1 1 217 301 0=128 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=1152 7=128
Swish silu_166 1 1 301 302
Convolution conv_80 1 1 302 303 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=8192
Swish silu_167 1 1 303 304
ConvolutionDepthWise convdw_203 1 1 304 305 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=576 7=64
Swish silu_168 1 1 305 306
Convolution conv_81 1 1 306 307 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Swish silu_169 1 1 307 308
Convolution conv_82 1 1 308 309 0=7 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=448
Concat cat_19 2 1 300 309 310 0=0
Convolution conv_83 1 1 252 311 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=147456
Swish silu_170 1 1 311 312
Convolution conv_84 1 1 312 313 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish silu_171 1 1 313 314
Convolution conv_85 1 1 314 315 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
ConvolutionDepthWise convdw_204 1 1 254 316 0=256 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=2304 7=256
Swish silu_172 1 1 316 317
Convolution conv_86 1 1 317 318 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=16384
Swish silu_173 1 1 318 319
ConvolutionDepthWise convdw_205 1 1 319 320 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=576 7=64
Swish silu_174 1 1 320 321
Convolution conv_87 1 1 321 322 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Swish silu_175 1 1 322 323
Convolution conv_88 1 1 323 324 0=7 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=448
Concat cat_20 2 1 315 324 325 0=0
Reshape view_189 1 1 295 326 0=6400 1=71
Reshape view_190 1 1 310 327 0=1600 1=71
Reshape view_191 1 1 325 328 0=400 1=71
Concat cat_21 3 1 326 327 328 329 0=1
Slice split_10 1 2 329 330 331 -23300=2,64,7 1=0
Reshape view_192 1 1 330 332 0=8400 1=16 2=4
Permute transpose_198 1 1 332 333 0=2
Softmax softmax_181 1 1 333 334 0=0 1=1
Convolution conv_89 1 1 334 335 0=1 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=0 6=16
Reshape view_193 1 1 335 336 0=8400 1=4
MemoryData pnnx_fold_anchor_points.1 0 1 337 0=8400 1=2
Slice split_11 1 2 336 338 339 -23300=2,2,-233 1=0
Split splitncnn_29 1 2 339 340 341
Split splitncnn_28 1 2 338 342 343
UnaryOp cos_17 1 1 279 344 0=10
Split splitncnn_30 1 2 344 345 346
UnaryOp sin_18 1 1 280 347 0=9
Split splitncnn_31 1 2 347 348 349
BinaryOp sub_19 2 1 340 342 350 0=1
BinaryOp div_20 1 1 350 351 0=3 1=1 2=2.000000e+00
Slice split_12 1 2 351 352 353 -23300=2,1,-233 1=0
Split splitncnn_33 1 2 353 354 355
Split splitncnn_32 1 2 352 356 357
BinaryOp mul_21 2 1 354 348 358 0=2
BinaryOp mul_22 2 1 356 345 359 0=2
BinaryOp sub_23 2 1 359 358 360 0=1
BinaryOp mul_24 2 1 355 346 361 0=2
BinaryOp mul_25 2 1 357 349 362 0=2
BinaryOp add_26 2 1 362 361 363 0=0
Concat cat_22 2 1 360 363 364 0=0
BinaryOp add_27 2 1 364 337 365 0=0
BinaryOp add_28 2 1 343 341 366 0=0
Concat cat_23 2 1 365 366 367 0=0
Reshape reshape_183 1 1 255 368 0=8400 1=1
BinaryOp mul_29 2 1 367 368 369 0=2
Sigmoid sigmoid_179 1 1 331 370
Concat cat_24 2 1 369 370 371 0=0
Concat cat_25 2 1 371 278 out0 0=0
- Code
_hasGPU = ncnn::get_gpu_count() > 0;
_Net->opt.use_fp16_arithmetic = false;
_Net->opt.use_fp16_storage = false;
_Net->opt.use_fp16_packed = false;
_Net->opt.use_vulkan_compute = _hasGPU; // with this set to false, inference results are correct on every device
_Net->opt.use_packing_layout = true;
// in_pad has w = 640 and h = 640
ncnn::Extractor ex = _Net->create_extractor();
ex.input("in0", in_pad);
ncnn::Mat out;
int extract_result = ex.extract("out0", out);
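- Error description
- When out0 is extracted for post-processing, the box x, y, w, h values are abnormal: x and y are far larger than 640, and w and h are far larger than the actual box size. Confidence and angle are normal.
- The fourth-to-last line of the param file (BinaryOp mul_29) produces blob 369. Extracting 369 on its own still gives abnormal x, y, w, h, but extracting 367 and 368 and doing the multiplication by hand gives correct results (369 is the box position, 370 the confidence, 278 the angle).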
// extract the boxes
ncnn::Mat out_box;
int extract_result = ex.extract(367, out_box);
// extract the strides
ncnn::Mat out_stride;
ex.extract(368, out_stride);
// extract the confidences
ncnn::Mat out_conf;
ex.extract(370, out_conf);
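The manual multiplication mentioned above can be sketched as a row-broadcast product on plain float buffers. This is only an illustration under assumptions: it assumes blob 367 holds 4 rows of 8400 values and blob 368 holds 1 row of 8400 values (as the Reshape and Concat parameters in the excerpt suggest), and that both have already been copied into row-major `std::vector<float>` buffers. `mul_broadcast_rows` is a hypothetical helper name, not an ncnn API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ncnn): CPU reference for the product
// that BinaryOp mul_29 is expected to compute.
// box    : row-major [rows x cols] buffer (assumed 4 x 8400 for blob 367)
// stride : [1 x cols] buffer (assumed 1 x 8400 for blob 368)
// Every row of box is multiplied element by element with stride.
std::vector<float> mul_broadcast_rows(const std::vector<float>& box,
                                      const std::vector<float>& stride,
                                      std::size_t rows, std::size_t cols)
{
    assert(box.size() == rows * cols);
    assert(stride.size() == cols);

    std::vector<float> out(rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
    {
        for (std::size_t c = 0; c < cols; ++c)
        {
            out[r * cols + c] = box[r * cols + c] * stride[c];
        }
    }
    return out;
}
```

Comparing the output of this reference against blob 369 extracted from the Vulkan path would show whether the broadcast multiply itself is where the values diverge.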
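- Affected devices
- Huawei Mate 40, GPU: Mali-G78
- vivo X100 and X90 series, GPU: Mali-G715 Immortalis MC11
- Devices with Snapdragon Adreno GPUs produce correct results. With GPU inference disabled,
_Net->opt.use_vulkan_compute = false;
all devices produce correct results.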
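Since the CPU path is correct on every device, one possible way to locate the first diverging layer is to extract the same intermediate blob twice, once with `use_vulkan_compute = false` and once with `true`, and compare the two. The sketch below only shows the comparison step on plain float buffers. `max_abs_diff` is a hypothetical helper name, and copying the blob data out of `ncnn::Mat` is assumed to have been done beforehand.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ncnn): largest absolute element-wise
// difference between a CPU result and a GPU result of the same blob.
// Both buffers are assumed to have the same length and layout.
float max_abs_diff(const std::vector<float>& cpu, const std::vector<float>& gpu)
{
    assert(cpu.size() == gpu.size());

    float worst = 0.0f;
    for (std::size_t i = 0; i < cpu.size(); ++i)
    {
        worst = std::max(worst, std::fabs(cpu[i] - gpu[i]));
    }
    return worst;
}
```

Running this on blobs 367, 368, and 369 in turn would be expected to report a small difference for the first two and a large one for 369, if the fault is confined to the multiplication as the description suggests.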