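Running a neural network model through a C++ API can be several times faster than running it through Python.

Frameworks such as TensorRT also offer many model optimizations, for example int8 inference.

In real projects, performance and other considerations usually rule out using Python directly.

C++ or Java is chosen instead.

In a recent project I had no choice but to deploy a model with C++, so I am writing the process down here.

This time I chose the ONNXRuntime framework for CPU inference.

GPU inference with ONNXRuntime and TensorRT will be covered later, when I have time.

I originally wanted to use the TensorFlow C++ API directly, but annoyingly the Windows 10 version of the source was missing header files and would not compile, so I shelved it.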

ONNXRuntime

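ONNX is an excellent cross-platform deep learning tool: models trained in other frameworks can be deployed on it. It is especially useful for deployment on CPU, which, unlike the GPU platform where NVIDIA provides TensorRT, has no vendor tool for accelerated deployment. Here it is installed on Windows 10 with VS2019.

Installing ONNXRuntime

(1) Download the package from https://www.nuget.org/

(2) Copy the package to a disk on the Windows machine, in a directory of your choice

(3) Open VS2019, then Tools -> NuGet Package Manager -> Package Manager Console

This is a command-line console

Then simply install the packages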

PM> Install-Package Microsoft.ML.OnnxRuntime -Source C:\Users\pc\Desktop\Cache
Attempting to gather dependency information for package 'Microsoft.ML.OnnxRuntime.1.6.0' with respect to project 'onnxDemo', targeting 'native,Version=v0.0'
Gathering dependency information took 20 ms
Attempting to resolve dependencies for package 'Microsoft.ML.OnnxRuntime.1.6.0' with DependencyBehavior 'Lowest'
Resolving dependency information took 0 ms
Resolving actions to install package 'Microsoft.ML.OnnxRuntime.1.6.0'
Resolved actions to install package 'Microsoft.ML.OnnxRuntime.1.6.0'
Retrieving package 'Microsoft.ML.OnnxRuntime 1.6.0' from 'C:\Users\Administrator\AppData\Local\NuGet\Cache'
Adding package 'Microsoft.ML.OnnxRuntime.1.6.0' to folder 'C:\Users\Administrator\source\repos\onnxDemo\packages'
Added package 'Microsoft.ML.OnnxRuntime.1.6.0' to folder 'C:\Users\Administrator\source\repos\onnxDemo\packages'
Added package 'Microsoft.ML.OnnxRuntime.1.6.0' to 'packages.config'
Successfully installed 'Microsoft.ML.OnnxRuntime 1.6.0' to onnxDemo
Executing nuget actions took 7.18 sec
Time Elapsed: 00:00:07.4252338
PM> Install-Package Microsoft.ML.OnnxRuntime.mklml -Source C:\Users\Administrator\AppData\Local\NuGet\Cache
Attempting to gather dependency information for package 'Microsoft.ML.OnnxRuntime.mklml.1.6.0' with respect to project 'onnxDemo', targeting 'native,Version=v0.0'
Gathering dependency information took 2 ms
Attempting to resolve dependencies for package 'Microsoft.ML.OnnxRuntime.mklml.1.6.0' with DependencyBehavior 'Lowest'
Resolving dependency information took 0 ms
Resolving actions to install package 'Microsoft.ML.OnnxRuntime.mklml.1.6.0'
Resolved actions to install package 'Microsoft.ML.OnnxRuntime.mklml.1.6.0'
Retrieving package 'Microsoft.ML.OnnxRuntime.MKLML 1.6.0' from 'C:\Users\Administrator\AppData\Local\NuGet\Cache'
Adding package 'Microsoft.ML.OnnxRuntime.MKLML.1.6.0' to folder 'C:\Users\Administrator\source\repos\onnxDemo\packages'
Added package 'Microsoft.ML.OnnxRuntime.MKLML.1.6.0' to folder 'C:\Users\Administrator\source\repos\onnxDemo\packages'
Added package 'Microsoft.ML.OnnxRuntime.MKLML.1.6.0' to 'packages.config'
Successfully installed 'Microsoft.ML.OnnxRuntime.MKLML 1.6.0' to onnxDemo
Executing nuget actions took 7.63 sec
Time Elapsed: 00:00:07.6510274
PM>

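(4) Write the code. ONNXRuntime code is very concise

It falls roughly into three parts

1. Initialize the environment, the session, and so on

2. Load the model into the session and obtain the model's input and output nodes

3. Call the API to get the model's return values

The semantic segmentation model U2Net is used as the example here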

#include <assert.h>
#include <cstdio>
#include <ctime>
#include <vector>
#include <onnxruntime_cxx_api.h>
int main(int argc, char* argv[])
{
    // Record the program's running time
    auto start_time = clock();
    // Initialize the environment; one environment per process.
    // The environment holds the thread pool and other state.
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "test");
    // Initialize the session options
    Ort::SessionOptions session_options;
    session_options.SetIntraOpNumThreads(1);
    // Available levels are
    // ORT_DISABLE_ALL -> To disable all optimizations
    // ORT_ENABLE_BASIC -> To enable basic optimizations (Such as redundant node removals)
    // ORT_ENABLE_EXTENDED -> To enable extended optimizations (Includes level 1 + more complex optimizations like node fusions)
    // ORT_ENABLE_ALL -> To Enable All possible optimizations
    session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);

    //*************************************************************************
    // Create the session and load the model into memory
    const wchar_t* model_path = L"u2net.onnx";

    printf("Using Onnxruntime C++ API\n");
    Ort::Session session(env, model_path, session_options);

    //*************************************************************************
    // Print the model's input layer (node names, types, shape etc.)
    Ort::AllocatorWithDefaultOptions allocator;

    // Number of input and output nodes
    size_t num_input_nodes = session.GetInputCount();
    size_t num_output_nodes = session.GetOutputCount();
    std::vector<const char*> input_node_names(num_input_nodes);
    std::vector<const char*> output_node_names(num_output_nodes);
    std::vector<int64_t> input_node_dims;  // simplify... this model has only 1 input node {1, 3, 320, 320}.
                                           // Otherwise need vector<vector<>>

    printf("Number of inputs = %zu\n", num_input_nodes);
    // Iterate over all input nodes
    for (size_t i = 0; i < num_input_nodes; i++) {
        // Print the input node's name (GetInputName is the API of the 1.6.0 package installed above)
        char* input_name = session.GetInputName(i, allocator);
        printf("Input %zu : name=%s\n", i, input_name);
        input_node_names[i] = input_name;

        // Print the input node's element type
        Ort::TypeInfo type_info = session.GetInputTypeInfo(i);
        auto tensor_info = type_info.GetTensorTypeAndShapeInfo();

        ONNXTensorElementDataType type = tensor_info.GetElementType();
        printf("Input %zu : type=%d\n", i, static_cast<int>(type));

        input_node_dims = tensor_info.GetShape();
        // Print the number of dimensions
        printf("Input %zu : num_dims=%zu\n", i, input_node_dims.size());
        // Print the size of each dimension
        for (size_t j = 0; j < input_node_dims.size(); j++)
            printf("Input %zu : dim %zu=%lld\n", i, j, static_cast<long long>(input_node_dims[j]));
        // batch_size=1
        input_node_dims[0] = 1;
    }
    // Print the output node information in the same way
    for (size_t i = 0; i < num_output_nodes; i++)
    {
        char* output_name = session.GetOutputName(i, allocator);
        printf("Output %zu : name=%s\n", i, output_name);
        output_node_names[i] = output_name;
        Ort::TypeInfo type_info = session.GetOutputTypeInfo(i);
        auto tensor_info = type_info.GetTensorTypeAndShapeInfo();
        ONNXTensorElementDataType type = tensor_info.GetElementType();
        printf("Output %zu : type=%d\n", i, static_cast<int>(type));
        auto output_node_dims = tensor_info.GetShape();
        printf("Output %zu : num_dims=%zu\n", i, output_node_dims.size());
        // Loop over the output dims (the original looped over the input dims by mistake)
        for (size_t j = 0; j < output_node_dims.size(); j++)
            printf("Output %zu : dim %zu=%lld\n", i, j, static_cast<long long>(output_node_dims[j]));
    }

    //*************************************************************************
    // Score the model with sample data and check that the values are valid
    size_t input_tensor_size = 3 * 320 * 320;  // simplify ... using known dim values to calculate size
                                               // use OrtGetTensorShapeElementCount() to get official size!

    std::vector<float> input_tensor_values(input_tensor_size);

    // Fill in dummy data (for demonstration; real code should pass normalized image data)
    for (size_t i = 0; i < input_tensor_size; i++)
        input_tensor_values[i] = (float)i / (input_tensor_size + 1);

    try
    {
        // Create a tensor object that wraps the input data
        auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
        Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_tensor_values.data(), input_tensor_size, input_node_dims.data(), input_node_dims.size());
        assert(input_tensor.IsTensor());

        // Run inference; only the first output is requested
        auto output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_node_names.data(), &input_tensor, 1, output_node_names.data(), 1);
        assert(output_tensors.size() == 1 && output_tensors.front().IsTensor());

        // Get pointer to output tensor float values (valid only while output_tensors is alive)
        float* floatarr = output_tensors.front().GetTensorMutableData<float>();
        printf("Number of outputs = %zu, first value = %f\n", output_tensors.size(), floatarr[0]);
    }
    catch (Ort::Exception& e)
    {
        // Never pass the message as the format string
        printf("%s\n", e.what());
    }
    auto end_time = clock();
    printf("Proceed exit after %.2f seconds\n", static_cast<float>(end_time - start_time) / CLOCKS_PER_SEC);
    printf("Done!\n");
    return 0;
}

Output:

[Screenshot of the console output]
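The code above hard-codes the input size as 3 * 320 * 320. A minimal sketch of deriving the size from the shape instead is shown below. The helper name `element_count` is hypothetical and not part of ONNXRuntime; the sketch assumes that a non-positive entry (for example a dynamic dimension reported as -1) should be rejected rather than multiplied in.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an ONNXRuntime API): number of elements in a
// tensor with the given shape. An empty shape is treated as a scalar (1).
std::size_t element_count(const std::vector<std::int64_t>& dims)
{
    std::size_t count = 1;
    for (std::int64_t d : dims)
    {
        // Assumption: dynamic dimensions show up as non-positive values
        // and must be fixed (e.g. batch = 1) before the size is computed.
        if (d <= 0)
            throw std::invalid_argument("shape contains a dynamic or empty dimension");
        count *= static_cast<std::size_t>(d);
    }
    return count;
}
```

With the shape {1, 3, 320, 320} this yields 307200, the same value as the hard-coded 3 * 320 * 320, so the result could be passed to `CreateTensor` after the batch dimension has been set.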

Next we wrap this in a class to make it easier to reuse later

class U2NetModel
{
public:
    U2NetModel(const wchar_t* onnx_model_path);
    float* predict(std::vector<float> input_data, int batch_size = 1);
private:
    Ort::Env env;
    Ort::Session session;
    Ort::AllocatorWithDefaultOptions allocator;
    std::vector<const char*> input_node_names;
    std::vector<const char*> output_node_names;
    std::vector<int64_t> input_node_dims;
    // Keeps the most recent outputs alive so the pointer returned by predict stays valid
    std::vector<Ort::Value> output_tensors;
};
// Members are listed in declaration order (env before session)
U2NetModel::U2NetModel(const wchar_t* onnx_model_path) :env(nullptr), session(nullptr)
{
    // Initialize the environment; one per process. It holds the thread pool and other state.
    this->env = Ort::Env(ORT_LOGGING_LEVEL_WARNING, "u2net");
    // Initialize the session options
    Ort::SessionOptions session_options;
    // Intra-op threads parallelize work inside an operator, matching the first example
    session_options.SetIntraOpNumThreads(1);
    session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
    // Create the session and load the model into memory
    this->session = Ort::Session(env, onnx_model_path, session_options);
    // Collect the names of the input and output nodes
    size_t num_input_nodes = session.GetInputCount();
    size_t num_output_nodes = session.GetOutputCount();
    for (size_t i = 0; i < num_input_nodes; i++)
    {
        auto input_node_name = session.GetInputName(i, allocator);
        this->input_node_names.push_back(input_node_name);
        Ort::TypeInfo type_info = session.GetInputTypeInfo(i);
        auto tensor_info = type_info.GetTensorTypeAndShapeInfo();
        this->input_node_dims = tensor_info.GetShape();
    }
    for (size_t i = 0; i < num_output_nodes; i++)
    {
        auto output_node_name = session.GetOutputName(i, allocator);
        this->output_node_names.push_back(output_node_name);
    }
}
float* U2NetModel::predict(std::vector<float> input_tensor_values, int batch_size)
{
    this->input_node_dims[0] = batch_size;
    auto input_tensor_size = input_tensor_values.size();
    auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_tensor_values.data(), input_tensor_size, input_node_dims.data(), input_node_dims.size());
    // Store the outputs in a member: a local vector would be destroyed on return,
    // leaving the returned pointer dangling
    this->output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_node_names.data(), &input_tensor, 1, output_node_names.data(), 1);
    assert(output_tensors.size() == 1 && output_tensors.front().IsTensor());
    // Owned by the model; valid until the next call to predict or until the model is destroyed
    float* floatarr = output_tensors.front().GetTensorMutableData<float>();
    return floatarr;
}

Then create an instance and call the interface

int main(int argc, char* argv[])
{
    auto start_time = std::clock();
    U2NetModel u2net(L"u2net.onnx");
    size_t input_tensor_size = 3 * 320 * 320;
    std::vector<float> input_tensor_values(input_tensor_size);

    // Fill in dummy data (for demonstration only)
    for (size_t i = 0; i < input_tensor_size; i++)
    {
        input_tensor_values[i] = (float)i / (input_tensor_size + 1);
    }
    float* results = nullptr;
    try
    {
        // The buffer belongs to the model, so it must not be deleted here
        results = u2net.predict(input_tensor_values);
    }
    catch (Ort::Exception& e)
    {
        printf("%s\n", e.what());
    }
    auto end_time = std::clock();
    printf("Proceed exits after %.2f seconds\n", static_cast<float>(end_time - start_time) / CLOCKS_PER_SEC);
    printf("Done!\n");
    return 0;
}
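The timing above relies on `std::clock`, which the C++ standard describes in terms of processor time rather than elapsed time, so the reported figure can differ between platforms once inference uses several threads. A small sketch of measuring elapsed wall-clock time with `std::chrono::steady_clock` follows; the helper name `elapsed_seconds` is hypothetical and only meant as an illustration.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double elapsed_seconds(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Assuming the `u2net` object and `input_tensor_values` vector from the example above, a call could look like `double s = elapsed_seconds([&] { u2net.predict(input_tensor_values); });`.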

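The model part is now finished, but the problem is that we still cannot tell how well the model is actually working

So we also need to read an image in and display the result, that is

1. Read an image

2. Run model inference

3. Show the image on screen

So far only the second step is done

So next we need to install OpenCV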

OpenCV


Downloading OpenCV

OpenCV download: https://github.com/opencv/opencv/releases

Copying files

Open your VS directory D:\Softs\Microsoft Visual Studio2019Community\VC\Tools\MSVC\14.23.28105\include and copy the corresponding folder from the OpenCV directory straight into it

Open your VS directory D:\Softs\Microsoft Visual Studio2019Community\VC\Tools\MSVC\14.23.28105\lib\x64 and copy these two files into it

Open C:\Windows\System32 and C:\Windows\SysWOW64 and copy all the DLLs into both directories

Configuring VS2019

Note: select x64 in both of these places

For Additional Dependencies, choose the library that matches your own OpenCV version

Setting environment variables

Press Win+R to open the Run dialog, enter SystemPropertiesAdvanced.exe, and set the environment variables

Write some code to test it

#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
int main(int argc, char* argv[])
{
    cv::Mat image = cv::imread("horse.jpg");
    // imread returns an empty Mat when the file cannot be read
    if (image.empty())
    {
        std::cerr << "Could not read horse.jpg" << std::endl;
        return 1;
    }
    cv::resize(image, image, { 320, 320 }, 0.0, 0.0, cv::INTER_CUBIC);
    cv::imshow("test", image);
    cv::waitKey(0); // without this the window closes immediately
    return 0;
}

The console did print some messages about dynamic linking failures, but they did not seem to affect what I was doing, so I am ignoring them for now

Model inference with OpenCV

First, add a new overload of the predict function that supports cv::Mat data

In this new version predict no longer returns a float* directly, but a std::vector<float>

class U2NetModel
{
public:
    ...
    std::vector<float> predict(std::vector<float>& input_data, int batch_size = 1, int index = 0);
    cv::Mat predict(cv::Mat& input_tensor, int batch_size = 1, int index = 0);
    ...
};

The implementation, with added handling for cv::Mat:

std::vector<float> U2NetModel::predict(std::vector<float>& input_tensor_values, int batch_size, int index)
{
    this->input_node_dims[0] = batch_size;
    // Request a single output by index, or every output when index is -1
    std::vector<const char*> requested_names;
    if (index != -1)
    {
        requested_names = { this->output_node_names[index] };
    }
    else
    {
        requested_names = this->output_node_names;
    }
    auto input_tensor_size = input_tensor_values.size();
    auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_tensor_values.data(), input_tensor_size, input_node_dims.data(), input_node_dims.size());
    // Exceptions from Run propagate to the caller; no catch-and-rethrow is needed
    auto output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_node_names.data(), &input_tensor, 1, requested_names.data(), requested_names.size());
    assert(!output_tensors.empty() && output_tensors.front().IsTensor());
    // Copy the first requested output while output_tensors still owns the buffer;
    // reading the pointer after the vector is destroyed would be a use-after-free
    Ort::Value& first_output = output_tensors.front();
    const size_t output_tensor_size = first_output.GetTensorTypeAndShapeInfo().GetElementCount();
    const float* floatarr = first_output.GetTensorMutableData<float>();
    return std::vector<float>(floatarr, floatarr + output_tensor_size);
}
cv::Mat U2NetModel::predict(cv::Mat& input_tensor, int batch_size, int index)
{
    CV_Assert(input_tensor.type() == CV_8UC3); // expects an 8-bit, 3-channel image
    const int rows = input_tensor.rows;
    const int cols = input_tensor.cols;
    // Allocate the std::vector once up front to avoid repeated copies
    std::vector<float> input_data(static_cast<size_t>(rows) * cols * 3);
    std::size_t counter = 0;
    // Convert interleaved pixels (HWC) into planar channels (CHW), scaled to [0, 1]
    for (int k = 0; k < 3; k++)
    {
        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {
                input_data[counter++] = static_cast<float>(input_tensor.at<cv::Vec3b>(i, j)[k]) / 255.0f;
            }
        }
    }
    // Forward batch_size and index instead of silently using the defaults
    std::vector<float> output_data = this->predict(input_data, batch_size, index);
    // Copy the result into a Mat, reshape it to the input height, and scale it to [0, 255]
    cv::Mat output_tensor = cv::Mat(output_data, true).reshape(1, rows) * 255.0;
    std::cout << output_tensor.rows << " " << output_tensor.cols << std::endl;
    return output_tensor;
}

int main(int argc, char* argv[])
{
    U2NetModel model(L"u2net.onnx");
    cv::Mat image = cv::imread("horse.jpg");
    if (image.empty())
    {
        std::cerr << "Could not read horse.jpg" << std::endl;
        return 1;
    }
    cv::resize(image, image, { 320, 320 }, 0.0, 0.0, cv::INTER_CUBIC); // resize to 320*320
    cv::imshow("image", image);                                        // show the original image
    cv::cvtColor(image, image, cv::COLOR_BGR2RGB);                     // convert BGR to RGB
    auto result = model.predict(image);                                // model prediction
    cv::imshow("result", result);                                      // show the result
    cv::waitKey(0);
    return 0;
}
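The triple loop in the cv::Mat overload is a layout conversion: OpenCV stores pixels interleaved (height, width, channel), while the model input is planar (channel, height, width). A minimal sketch of the same index arithmetic without OpenCV is given below. The function name `hwc_to_chw` and the use of a plain byte vector are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: converts an interleaved 8-bit image (HWC) into a
// planar float buffer (CHW) with values scaled to [0, 1].
std::vector<float> hwc_to_chw(const std::vector<std::uint8_t>& pixels,
                              std::size_t rows, std::size_t cols, std::size_t channels)
{
    if (pixels.size() != rows * cols * channels)
        throw std::invalid_argument("pixel buffer does not match the given dimensions");

    std::vector<float> planar(pixels.size());
    for (std::size_t c = 0; c < channels; ++c)
        for (std::size_t y = 0; y < rows; ++y)
            for (std::size_t x = 0; x < cols; ++x)
            {
                const std::size_t src = (y * cols + x) * channels + c; // interleaved index
                const std::size_t dst = (c * rows + y) * cols + x;     // planar index
                planar[dst] = static_cast<float>(pixels[src]) / 255.0f;
            }
    return planar;
}
```

For a 1x2 image with pixels (0, 51, 255) and (255, 102, 0), the first plane holds the two first-channel values, followed by the two second-channel values, then the two third-channel values.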

At this point the model deployment and the display of the result are done

However, turning the model output directly into an image clearly does not give a very good result

So the data still needs post-processing: the image is binarized

to obtain a mask matrix

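    // Tail of the cv::Mat predict overload: invert the scaled output, then binarize it
    cv::Mat output_tensor(output_data);
    output_tensor = 255.0 - output_tensor.reshape(1, { 320, 320 }) * 255.0;
    // THRESH_BINARY_INV: values above 220 become 0, everything else becomes 255
    cv::threshold(output_tensor, output_tensor, 220, 255, cv::THRESH_BINARY_INV);
    return output_tensor;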

Now the result is already very good
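To make the post-processing step easier to follow, here is a small sketch of what the inversion followed by `cv::threshold` with `cv::THRESH_BINARY_INV` computes per pixel, written against a plain float vector instead of a cv::Mat. The function name `binary_mask` is hypothetical; the constants 220 and 255 are the ones used above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the post-processing above on a plain vector.
// `prob` holds model outputs in [0, 1]; the result holds 0 or maxval per pixel.
std::vector<float> binary_mask(const std::vector<float>& prob,
                               float thresh = 220.0f, float maxval = 255.0f)
{
    std::vector<float> mask(prob.size());
    for (std::size_t i = 0; i < prob.size(); ++i)
    {
        const float inverted = 255.0f - prob[i] * 255.0f;  // same inversion as above
        // Inverse binary threshold: above thresh -> 0, otherwise -> maxval
        mask[i] = (inverted > thresh) ? 0.0f : maxval;
    }
    return mask;
}
```

With the default constants, a pixel ends up as 255 when its model output is at least 35/255 (about 0.137) and as 0 otherwise, so the two steps together act as a fixed cut-off on the raw output.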

Complete test code

#include <assert.h>
#include <vector>
#include <iostream>
#include <onnxruntime_cxx_api.h>
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
class U2NetModel
{
public:
    U2NetModel(const wchar_t* onnx_model_path);
    std::vector<float> predict(std::vector<float>& input_data, int batch_size = 1, int index = 0);
    cv::Mat predict(cv::Mat& input_tensor, int batch_size = 1, int index = 0);
private:
    Ort::Env env;
    Ort::Session session;
    Ort::AllocatorWithDefaultOptions allocator;
    std::vector<const char*> input_node_names;
    std::vector<const char*> output_node_names;
    std::vector<int64_t> input_node_dims;
};
// Members are listed in declaration order (env before session)
U2NetModel::U2NetModel(const wchar_t* onnx_model_path) :env(nullptr), session(nullptr)
{
    // Initialize the environment; one per process. It holds the thread pool and other state.
    this->env = Ort::Env(ORT_LOGGING_LEVEL_WARNING, "u2net");
    // Initialize the session options
    Ort::SessionOptions session_options;
    // Intra-op threads parallelize work inside an operator
    session_options.SetIntraOpNumThreads(4);
    session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
    // Create the session and load the model into memory
    this->session = Ort::Session(env, onnx_model_path, session_options);
    // Collect the names of the input and output nodes
    size_t num_input_nodes = session.GetInputCount();
    size_t num_output_nodes = session.GetOutputCount();
    for (size_t i = 0; i < num_input_nodes; i++)
    {
        auto input_node_name = session.GetInputName(i, allocator);
        this->input_node_names.push_back(input_node_name);
        Ort::TypeInfo type_info = session.GetInputTypeInfo(i);
        auto tensor_info = type_info.GetTensorTypeAndShapeInfo();
        this->input_node_dims = tensor_info.GetShape();
    }
    for (size_t i = 0; i < num_output_nodes; i++)
    {
        auto output_node_name = session.GetOutputName(i, allocator);
        this->output_node_names.push_back(output_node_name);
    }
}
std::vector<float> U2NetModel::predict(std::vector<float>& input_tensor_values, int batch_size, int index)
{
    this->input_node_dims[0] = batch_size;
    // Request a single output by index, or every output when index is -1
    std::vector<const char*> requested_names;
    if (index != -1)
    {
        requested_names = { this->output_node_names[index] };
    }
    else
    {
        requested_names = this->output_node_names;
    }
    auto input_tensor_size = input_tensor_values.size();
    auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_tensor_values.data(), input_tensor_size, input_node_dims.data(), input_node_dims.size());
    // Exceptions from Run propagate to the caller; no catch-and-rethrow is needed
    auto output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_node_names.data(), &input_tensor, 1, requested_names.data(), requested_names.size());
    assert(!output_tensors.empty() && output_tensors.front().IsTensor());
    // Copy the first requested output while output_tensors still owns the buffer;
    // reading the pointer after the vector is destroyed would be a use-after-free
    Ort::Value& first_output = output_tensors.front();
    const size_t output_tensor_size = first_output.GetTensorTypeAndShapeInfo().GetElementCount();
    const float* floatarr = first_output.GetTensorMutableData<float>();
    return std::vector<float>(floatarr, floatarr + output_tensor_size);
}
cv::Mat U2NetModel::predict(cv::Mat& input_tensor, int batch_size, int index)
{
    CV_Assert(input_tensor.type() == CV_8UC3); // expects an 8-bit, 3-channel image
    const int rows = input_tensor.rows;
    const int cols = input_tensor.cols;
    // Allocate the std::vector once up front to avoid repeated copies
    std::vector<float> input_data(static_cast<size_t>(rows) * cols * 3);
    std::size_t counter = 0;
    // Convert interleaved pixels (HWC) into planar channels (CHW), scaled to [0, 1]
    for (int k = 0; k < 3; k++)
    {
        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {
                input_data[counter++] = static_cast<float>(input_tensor.at<cv::Vec3b>(i, j)[k]) / 255.0f;
            }
        }
    }
    // Forward batch_size and index instead of silently using the defaults
    std::vector<float> output_data = this->predict(input_data, batch_size, index);
    // Copy the result into a Mat, reshape it to the input height, scale and invert it
    cv::Mat output_tensor = 255.0 - cv::Mat(output_data, true).reshape(1, rows) * 255.0;
    // THRESH_BINARY_INV: values above 220 become 0, everything else becomes 255
    cv::threshold(output_tensor, output_tensor, 220, 255, cv::THRESH_BINARY_INV);

    return output_tensor;
}
int main(int argc, char* argv[])
{
    U2NetModel model(L"u2net.onnx");
    cv::Mat image = cv::imread("horse.jpg");
    if (image.empty())
    {
        std::cerr << "Could not read horse.jpg" << std::endl;
        return 1;
    }
    cv::resize(image, image, { 320, 320 }, 0.0, 0.0, cv::INTER_CUBIC); // resize to 320*320
    cv::imshow("image", image);                                        // show the original image
    cv::cvtColor(image, image, cv::COLOR_BGR2RGB);                     // convert BGR to RGB
    auto result = model.predict(image);                                // model prediction
    cv::imshow("result", result);                                      // show the result
    cv::waitKey(0);
    return 0;
}
