At GTC Japan 2016, Mr. Morino of NVIDIA gave a presentation on the company's inference engine, TensorRT. Frameworks such as Caffe, Theano, Torch, and TensorFlow are tools for developing neural networks, and they cover the whole sequence from network input through training and inference. TensorRT, previously called the GPU Inference Engine, is a tool that takes the prototxt of a network already trained with Caffe as input and builds a system for high-performance inference.
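TensorRT targets applications that demand both low latency and high throughput, such as streaming classification of video images, real-time image recognition for self-driving cars, and handling large volumes of recognition requests arriving from the web at huge data centers.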
For the performance comparison between Caffe and TensorRT, Mr. Morino used the GoogLeNet model included in the Caffe distribution package, with the mean image of the ILSVRC 12 training data as input.
The Caffe library was the one hosted on NVIDIA's GitHub, and the TensorRT version was RC1.
As for the inference pipeline, the Caffe side performs the standard processing, whereas the TensorRT side uses a custom CUDA implementation for preprocessing and runs the inference itself with TensorRT rather than Caffe. TensorRT can run in standard FP32, but it can also run processing optimized for FP16 or INT8, which lowers precision but raises arithmetic performance.
[Figure: Differences between the processing steps of Caffe and TensorRT. TensorRT uses purpose-built implementations for both preprocessing and inference.]
The following table compares the execution times of the two. Preprocessing, which took 3.0 ms with Caffe, takes 0.8 ms with TensorRT, a 3.7x speedup. Inference takes 2.4 ms with TensorRT against 7.4 ms with Caffe, a 3.1x speedup. The GPU used here was a Quadro M5000.
[Table: With the CUDA implementation and the use of FP16 and INT8, preprocessing is 3.7x faster and inference itself is 3.1x faster than with Caffe.]
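The speedup factors can be checked directly from the quoted timings. The short sketch below recomputes them: preprocessing comes out at 3.75x (reported as 3.7x) and inference at about 3.08x (reported as 3.1x). The combined preprocessing-plus-inference figure is derived here from those four numbers and was not quoted in the talk.

```python
# Timings in milliseconds as quoted for GoogLeNet on a Quadro M5000.
caffe_ms = {"preprocess": 3.0, "inference": 7.4}
tensorrt_ms = {"preprocess": 0.8, "inference": 2.4}


def speedup(baseline_ms, optimized_ms):
    """Return how many times faster the optimized time is than the baseline."""
    return baseline_ms / optimized_ms


# Per-stage speedups, matching the 3.7x and 3.1x figures after rounding.
per_stage = {stage: speedup(caffe_ms[stage], tensorrt_ms[stage]) for stage in caffe_ms}

# Combined speedup over both stages; derived here, not a quoted result.
combined = speedup(sum(caffe_ms.values()), sum(tensorrt_ms.values()))

for stage, factor in per_stage.items():
    print(f"{stage}: {factor:.2f}x")
print(f"combined: {combined:.2f}x")
```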
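This already makes TensorRT about three times faster than standard Caffe. To see whether even more performance could be drawn out of TensorRT, two further approaches were tried: running multiple inferences concurrently, and increasing the batch size.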
[Figure: Aiming for further gains, concurrent execution of multiple inferences and larger batch sizes were tried.]
In the next figure, three inferences run in parallel, but the configuration uses only two CUDA streams. Reportedly, the number of executing threads and the number of streams do not have to match.
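One way to picture threads and streams not matching is a small pool of streams shared by a larger number of worker threads. The sketch below is a pure-Python simulation under that assumption: the stream IDs are stand-ins for real CUDA stream handles, `time.sleep` stands in for the GPU work, and nothing here reflects how the presenter's code was actually structured.

```python
import queue
import threading
import time

NUM_STREAMS = 2      # two streams, as in the example above
NUM_INFERENCES = 3   # three concurrent inferences, as in the example above

# Pool of stream IDs; hypothetical stand-ins for CUDA stream handles.
stream_pool = queue.Queue()
for stream_id in range(NUM_STREAMS):
    stream_pool.put(stream_id)

lock = threading.Lock()
stats = {"in_use": 0, "max_in_use": 0}
assignments = []  # (inference index, stream id) pairs


def run_inference(index):
    """Borrow a stream, pretend to run one inference on it, then return it."""
    stream_id = stream_pool.get()  # blocks until a stream is free
    try:
        with lock:
            stats["in_use"] += 1
            stats["max_in_use"] = max(stats["max_in_use"], stats["in_use"])
            assignments.append((index, stream_id))
        time.sleep(0.01)  # placeholder for the GPU work
    finally:
        with lock:
            stats["in_use"] -= 1
        stream_pool.put(stream_id)


threads = [threading.Thread(target=run_inference, args=(i,)) for i in range(NUM_INFERENCES)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{len(assignments)} inferences ran, at most {stats['max_in_use']} streams busy at once")
```

The third thread simply waits until one of the two streams is returned to the pool, so the thread count bounds the queued work while the stream count bounds what is in flight.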
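The next figure shows the results of measurements with a varying number of concurrently executing threads, with processing time as a line graph and throughput as a bar graph. In this measurement, the number of threads and the number of streams were kept equal.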
Processing time tends to fall as the number of threads increases, while throughput tends to rise, and both saturate at around 8 threads. Even so, running 8 threads concurrently roughly doubles performance on both measures compared with a single thread.
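With a single thread there is little work and GPU cores sit idle; concurrent execution of multiple threads presumably raises throughput by filling those gaps. Note that the processing time shown appears to be derived from the throughput, that is, from the total execution time divided by the number of inferences completed. The graph therefore means that throughput rises without the total execution time growing much; it probably does not mean that the time from the start to the end of each individual inference is getting shorter.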
[Figure: Processing time and throughput plotted for 1 to 32 threads executing in parallel.]
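The distinction between amortized processing time and per-request latency can be made concrete with a few lines of arithmetic. The sketch below assumes the plotted processing time is wall-clock time divided by completed inferences, which is this article's reading rather than something the talk stated. The measurements fed in are hypothetical and chosen only to show the two quantities moving in opposite directions.

```python
def summarize(wall_seconds, latencies_ms):
    """Return (throughput in inferences/s, amortized ms per inference, mean latency in ms)."""
    completed = len(latencies_ms)
    throughput = completed / wall_seconds
    amortized_ms = 1000.0 * wall_seconds / completed  # equals 1000 / throughput
    mean_latency_ms = sum(latencies_ms) / completed
    return throughput, amortized_ms, mean_latency_ms


# Hypothetical measurements, not taken from the talk:
# 100 inferences on 1 thread, then 100 inferences spread over 8 threads.
one_thread = summarize(wall_seconds=0.30, latencies_ms=[3.0] * 100)
eight_threads = summarize(wall_seconds=0.15, latencies_ms=[12.0] * 100)

for label, (tput, amortized, latency) in (("1 thread", one_thread), ("8 threads", eight_threads)):
    print(f"{label}: {tput:.0f}/s, amortized {amortized:.2f} ms, mean latency {latency:.1f} ms")
```

In this made-up case the amortized time halves while the latency of each individual inference grows, which is the situation the paragraph above suspects.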
The next figure is an enlarged view of the profiler output for concurrent execution, focusing on the execution state of each thread. The bottom rows show the execution state of each of the four streams.
[Figure: Profiler output of the inference execution. The bottom rows show the execution state of the four streams.]
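This inference server was then combined with a web server to measure the throughput and latency of inference processing. The server was implemented with Go's net/http package, and boom was used to run the measurements at varying numbers of concurrent requests.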
[Figure: Throughput and latency of inference processing, measured at varying numbers of concurrent requests.]
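The kind of measurement described here, a fixed number of requests issued at a chosen concurrency level, can be sketched in a few lines. The code below is not boom and does not talk to a real server: `fake_inference_request` is a hypothetical stand-in that just sleeps, and `run_load` is an illustrative helper written for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_inference_request():
    """Stand-in for one HTTP request to the inference server (hypothetical)."""
    time.sleep(0.005)


def timed_call(handler):
    """Run handler once and return its elapsed time in seconds."""
    start = time.perf_counter()
    handler()
    return time.perf_counter() - start


def run_load(handler, total_requests, concurrency):
    """Issue total_requests calls with up to `concurrency` in flight; return summary stats."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(handler), range(total_requests)))
    wall = time.perf_counter() - start
    return {
        "concurrency": concurrency,
        "requests": len(latencies),
        "throughput_rps": len(latencies) / wall,
        "mean_latency_ms": 1000.0 * sum(latencies) / len(latencies),
    }


results = [run_load(fake_inference_request, total_requests=32, concurrency=c) for c in (1, 2, 4, 8)]
for r in results:
    print(f"c={r['concurrency']}: {r['throughput_rps']:.0f} req/s, {r['mean_latency_ms']:.1f} ms")
```

Because the stand-in only sleeps, throughput here scales almost linearly with concurrency; a real GPU-backed server would saturate, as the measurements in this article do.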
As the next figure shows, performance was measured with inference requests distributed across multiple inference execution streams and run in parallel.
The next figure shows the throughput measured as the number of concurrent requests varies from 1 to 32, with serial processing on the left and concurrent execution on the right. As expected, with serial execution the throughput stays almost constant however many concurrent requests there are. With concurrent execution, throughput improves as the number of concurrent requests grows, but it saturates at around 8 threads. At that point, going from 1 request to 8 requests roughly doubles the throughput, to 650 requests per second. Naturally, the saturation point is likely to differ on GPUs with a different number of cores and so on.
In addition, at 8 concurrent requests the latency is 22.1 ms with serial processing, whereas with concurrent processing it falls to 14.2 ms.
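With concurrent requests, the requests arrive individually at scattered times. Batch processing instead gathers a fixed number of requests into a bundle and processes them together. In general, the more requests there are in a bundle, the more arithmetic operations are performed relative to the number of memory accesses, and the higher the achievable performance.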
[Figure: Bundling multiple inference requests increases the number of operations per memory access and raises processing performance.]
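The effect of batching on operations per memory access can be illustrated with a single fully connected layer, where the weights are read once per batch but used for every sample in it. The sketch below is a simplified model under stated assumptions: a hypothetical 1024x1024 FP32 layer, each value moved to or from memory exactly once, and no caching effects. GoogLeNet consists mostly of convolutions, which already reuse weights within a single image, so the numbers are illustrative only.

```python
def arithmetic_intensity(batch, in_features=1024, out_features=1024, bytes_per_value=4):
    """Floating-point operations per byte moved for one fully connected layer.

    Simplified model: weights are read once per batch, inputs are read once,
    outputs are written once. Layer sizes are hypothetical.
    """
    flops = 2 * batch * in_features * out_features  # one multiply and one add per weight per sample
    weight_bytes = in_features * out_features * bytes_per_value
    activation_bytes = batch * (in_features + out_features) * bytes_per_value
    return flops / (weight_bytes + activation_bytes)


batch_sizes = [1, 4, 32, 128]
intensities = [arithmetic_intensity(b) for b in batch_sizes]
for b, ai in zip(batch_sizes, intensities):
    print(f"batch {b:>3}: {ai:.2f} FLOP/byte")
```

At batch size 1 the layer performs about half an operation per byte, so it is limited by memory traffic; larger batches spread the cost of reading the weights over more samples.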
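The next figure plots the performance improvement from increasing the batch size, with execution time as a line graph and throughput as a bar graph. This graph shows only the execution time of the TensorRT inference portion. Increasing the batch size from 1 to 128 improves throughput by 2.5x, with processing time improving from 2.6 ms to 1.05 ms.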
The next figure shows the throughput measured at varying numbers of concurrent requests for three configurations: serial, concurrent, and batched (batch size 4). At first the gain from concurrent execution stands out, but its performance saturates at around 8-way parallelism. Batched execution, in contrast, keeps improving, and at 32 or more concurrent requests it delivers the highest throughput, about 2.4 times that of serial execution. Latency does not get longer either.
[Figure: With a batch size of 4, the highest throughput is obtained at 64 or more concurrent requests, 2.4 times the throughput of serial execution.]
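Bundling individually arriving requests into batches of 4 needs some collection step in front of the inference call. The sketch below shows one common pattern, assumed here for illustration and not described in the talk: take requests from a queue until the batch is full or no new request arrives within a short wait. The function name `collect_batch` and its parameters are hypothetical.

```python
import queue


def collect_batch(request_queue, batch_size, max_wait_s):
    """Take up to batch_size requests, waiting at most max_wait_s for each one."""
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(request_queue.get(timeout=max_wait_s))
        except queue.Empty:
            break  # nothing arrived in time; run a partial batch
    return batch


# Ten pending requests, identified by integers for the example.
requests = queue.Queue()
for request_id in range(10):
    requests.put(request_id)

batches = []
while True:
    batch = collect_batch(requests, batch_size=4, max_wait_s=0.01)
    if not batch:
        break
    batches.append(batch)  # a real server would run inference on the batch here

print(batches)
```

The wait limit trades latency for batch fullness: a longer wait fills more batches under light load but delays the first request in each bundle.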
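Compared with the standard Caffe implementation, TensorRT achieves a 3.7x speedup in preprocessing through its custom implementation, and improves inference performance by 3.1x through an implementation optimized for NVIDIA GPUs. Running multiple inferences concurrently or increasing the batch size yields even higher performance.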













