onnx-mlir


Representation and Reference Lowering of ONNX Models in the MLIR Compiler Infrastructure

This project is maintained by onnx.

Using the Python interfaces

Onnx-mlir has runtime utilities to compile and run ONNX models in Python. These utilities are implemented by the OnnxMlirCompiler compiler interface (include/OnnxMlirCompiler.h) and the ExecutionSession class (src/Runtime/ExecutionSession.hpp). Both utilities have an associated Python binding generated by the pybind library.

Configuring the Python interfaces

Using pybind, a C/C++ binary can be directly imported by the Python interpreter. For onnx-mlir, there are five such libraries: one to compile onnx-mlir models, two to run the models, and the other two to compile and run the models.

  1. The shared library to compile onnx-mlir models is generated by PyOMCompileSession (src/Compiler/PyOMCompileSession.hpp) and built as a shared library to build/Debug/lib/PyCompile.cpython-<target>.so.
  2. The shared library to run onnx-mlir models is generated by PyExecutionSession (src/Runtime/PyExecutionSession.hpp) and built as a shared library to build/Debug/lib/PyRuntimeC.cpython-<target>.so.
  3. The Python library to run onnx-mlir models (src/Runtime/python/PyRuntime.py).
  4. The shared library to compile and run onnx-mlir models is generated by PyOMCompileExecutionSessionC (src/Runtime/PyOMCompileExecutionSession.hpp) and built as a shared library to build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so.
  5. The Python library to compile and run onnx-mlir models (src/Runtime/python/PyCompileAndRuntime.py). This library takes a .onnx file and options as inputs; it loads the file, then compiles and runs it.

The Python interpreter can import a module normally as long as the module is in your PYTHONPATH. An alternative is to create a symbolic link to it in your working directory.

cd <working directory>
ln -s <path to the shared library to compile onnx-mlir models> (e.g. `build/Debug/lib/PyCompile.cpython-<target>.so`) .
ln -s <path to the shared library to run onnx-mlir models> (e.g. `build/Debug/lib/PyRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to run onnx-mlir models> (e.g. `src/Runtime/python/PyRuntime.py`) .
ln -s <path to the shared library to compile and run onnx-mlir models> (e.g. `build/Debug/lib/PyCompileAndRuntimeC.cpython-<target>.so`) .
ln -s <path to the Python library to compile and run onnx-mlir models> (e.g. `src/Runtime/python/PyCompileAndRuntime.py`) .
python3

The Python interface to run models: PyRuntime

Running the PyRuntime interface

An ONNX model is a computation graph, and the graph often has a single entry point to trigger the computation. The example below shows how to perform inference on a model with a single entry point.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'model.so' # LeNet from ONNX Zoo compiled with onnx-mlir

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model)
# Input and output signatures of the default entry point.
print("input signature in json", session.input_signature())
print("output signature in json", session.output_signature())
# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)
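Since input_signature() and output_signature() return JSON strings, they can be inspected with the standard json module. A minimal sketch, using a hypothetical signature string for illustration (the exact JSON schema produced by onnx-mlir may differ):

```python
import json

# Hypothetical signature string for illustration; the actual schema
# emitted by input_signature() may differ.
sig = '[{"type": "f32", "dims": [1, 1, 28, 28], "name": "input_0"}]'

# Parse the JSON and list each input tensor's name, element type, and shape.
for tensor in json.loads(sig):
    print(tensor["name"], tensor["type"], tensor["dims"])
```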

If a computation graph has multiple entry points, users have to set a specific entry point to do inference. The example below shows how to do inference with multiple entry points.

import numpy as np
from PyRuntime import OMExecutionSession

model = 'multi-entry-points-model.so'

# Create a session for this model.
session = OMExecutionSession(shared_lib_path=model, use_default_entry_point=False) # False to manually set an entry point.

# Query entry points in the model.
entry_points = session.entry_points()

for entry_point in entry_points:
  # Set the entry point to do inference.
  session.set_entry_point(name=entry_point)
  # Input and output signatures of the current entry point.
  print("input signature in json", session.input_signature())
  print("output signature in json", session.output_signature())
  # Do inference using the current entry point.
  a = np.arange(10).astype('float32')
  b = np.arange(10).astype('float32')
  outputs = session.run(input=[a, b])
  for output in outputs:
    print(output.shape)

Using model tags

If a model was compiled with --tag, the value of --tag must be passed to OMExecutionSession. Using tags is useful when there are multiple sessions for multiple models in the same Python script. The example below shows how to run multiple inferences with tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'encoder/model.so' # Assumes the model was compiled with `--tag=encoder`.
decoder_model = 'decoder/model.so' # Assumes the model was compiled with `--tag=decoder`.

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model, tag="encoder")
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model, tag="decoder")

If the two models were not compiled with --tag, they must be compiled with different .so file names if they are to be used in the same process. Indeed, when no tag is given, we use the file name as its default tag. The example below shows how to run multiple inferences without tags.

import numpy as np
from PyRuntime import OMExecutionSession

encoder_model = 'my_encoder.so'
decoder_model = 'my_decoder.so'

# Create a session for the encoder model.
encoder_sess = OMExecutionSession(shared_lib_path=encoder_model) # tag will be `my_encoder` by default.
# Create a session for the decoder model.
decoder_sess = OMExecutionSession(shared_lib_path=decoder_model) # tag will be `my_decoder` by default.
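The default-tag rule described above (the file name without its extension) can be sketched in pure Python; default_tag here is a hypothetical helper for illustration, not part of the onnx-mlir API:

```python
import os

def default_tag(shared_lib_path):
    # Hypothetical helper mirroring the documented rule: when no tag is
    # given, the default tag is the file name without its extension.
    return os.path.splitext(os.path.basename(shared_lib_path))[0]

print(default_tag('my_encoder.so'))     # my_encoder
print(default_tag('decoder/model.so'))  # model
```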

To use a function without the tag (e.g. run_main_graph), set tag = "NONE".

PyRuntime model API

The complete interface of OMExecutionSession can be seen in the sources mentioned previously. However, using the constructor and the run method is enough to perform inference.

def __init__(self, shared_lib_path: str, tag: str, use_default_entry_point: bool):
    """
    Args:
        shared_lib_path: relative or absolute path to your .so model.
        tag: a string that was passed to `--tag` when compiling the .so model. By default, it is the output file name without its extension, namely, `filename` in `filename.so`
        use_default_entry_point: use the default entry point that is `run_main_graph_{tag}` or not. Set to True by default.
    """

def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """

def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """

def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """

def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """

The Python interface to compile models: PyCompile

Running the PyCompile interface

An ONNX model can be compiled directly from the command line. The resulting library can then be executed using Python as shown in the previous sections. At times, it might be convenient to also compile a model directly in Python. This section explores the Python methods to do so.

The OMCompileSession object takes a file name at construction time. For the compilation, compile() takes a flags string as input, which overrides any default options set from the environment variable.

import numpy as np
from PyCompile import OMCompileSession

# Load the ONNX model and create an OMCompileSession object.
file = './mnist.onnx'
compiler = OMCompileSession(file)
# Generate the library file; rc == 0 means success. Here the option is set to "-O3".
rc = compiler.compile("-O3")
if rc:
    print("Failed to compile with error code", rc)
    exit(1)
# Get the output file name.
model = compiler.get_compiled_file_name()
print("Compiled onnx file", file, "to", model, "with rc", rc)

The PyCompile module exports the OMCompileSession class to drive the compilation of an ONNX model into an executable model. Typically, a compiler object is created for a given model by providing the file name of the ONNX model. Then, all the compiler options can be set as one whole std::string to generate the desired executable. Finally, the compilation itself is performed by calling the compile() command, passing the options string as input to this function.

The compile() command returns a return code reflecting the status of the compilation: a zero value indicates success, and non-zero values indicate error codes. Because different operating systems may have different library suffixes, the name of the output file can be retrieved using the get_compiled_file_name() method.

PyCompile model API

The complete interface of OnnxMlirCompiler can be seen in the sources mentioned previously. However, using the constructor and the following methods is enough to compile models.

def __init__(self, file_name: str):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        file_name: relative or absolute path to your ONNX model.
    """
def __init__(self, input_buffer: void *, buffer_size: int):
    """
    Constructor for an ONNX model contained in an input buffer.
    Args:
        input_buffer: buffer containing the protobuf representation of the model.
        buffer_size: byte size of the input buffer.
    """
def compile(self, flags: str):
    """
    Method to compile a model from a file.
    Args:
        flags: all the options users would like to set.
    Returns:
        Zero on success, error code on failure.
    """
def compile_from_array(self, output_base_name: str, target: OnnxMlirTarget):
    """
    Method to compile a model from an array.
    Args:
        output_base_name: base name (relative or absolute, without suffix)
        where the compiled model should be written into.
        target: target for the compiler's output. Typical values are
        OnnxMlirTarget.emit_lib or emit_jni.
    Returns:
        Zero on success, error code on failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output compiled file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """

The Python interface to compile and run models: PyCompileAndRuntime

Running the PyCompileAndRuntime interface

import numpy as np
from PyCompileAndRuntime import OMCompileExecutionSession

# Load the ONNX model and create an OMCompileExecutionSession object.
inputFileName = './mnist.onnx'
# Set the full name of the compiled model.
sharedLibPath = './mnist.so'
# Set the compile option to "-O3".
session = OMCompileExecutionSession(inputFileName, sharedLibPath, "-O3")

# Print the model's input/output signatures, for information only;
# comment these out if they cause problems.
session.print_input_signature()
session.print_output_signature()

# Do inference using the default entry point.
a = np.full((1, 1, 28, 28), 1, np.dtype(np.float32))
outputs = session.run(input=[a])

for output in outputs:
    print(output.shape)

PyCompileAndRuntime model API

PyCompileAndRuntime is a new class that combines compilation and execution. Its constructor takes a .onnx input file and compiles the model with the options given by the user; the model can then be run with inputs.

def __init__(self, input_model_path: str, compiled_file_path: str, flags: str, use_default_entry_point: bool):
    """
    Constructor for an ONNX model contained in a file.
    Args:
        input_model_path: relative or absolute path to your ONNX model.
        compiled_file_path: relative or absolute path to your compiled file.
        flags: all the options users would like to set.
        use_default_entry_point: use the default entry point that is `run_main_graph` or not. Set to True by default.
    """
def get_compiled_result(self):
    """
    Method to provide the results of the compilation.
    Returns:
        An int containing the result: 0 indicates successful compilation; other values indicate failure.
    """
def get_compiled_file_name(self):
    """
    Method to provide the full (absolute or relative) output file name, including
    its suffix.
    Returns:
        String containing the file name after successful compilation; empty string on failure.
    """
def get_error_message(self):
    """
    Method to provide the compilation error message.
    Returns:
        String containing the error message; empty string on success.
    """
def entry_points(self) -> List[str]:
    """
    Returns:
        A list of entry point names.
    """
def set_entry_point(self, name: str):
    """
    Args:
        name: an entry point name.
    """
def run(self, input: List[ndarray]) -> List[ndarray]:
    """
    Args:
        input: A list of NumPy arrays, the inputs of your model.

    Returns:
        A list of NumPy arrays, the outputs of your model.
    """
def input_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's input signature.
    """

def output_signature(self) -> str:
    """
    Returns:
        A string containing a JSON representation of the model's output signature.
    """