[NNPA] Support matching tensor information in the JSON config file for NNPA #3412

Open
tungld wants to merge 60 commits into onnx:main from tungld:json-config-shape-info-real

tungld commented Mar 6, 2026

This PR adds support for matching tensor information, such as rank, data type, and dimension size, in the JSON config file for NNPA.

For example, the following config matches MatMul operations of the form A_3D x B_2D in which the last dimensions of A and B are both divisible by 64, and places those MatMul operations on NNPA.

{
  "nnpa_ops_config": [
    {
      "pattern": {
        "match": {
          "node_type": "onnx.MatMul",
          "inputs": {
            "0": {
              "rank": "3",
              "type": "f32",
              "dims": {
                "-1": "%64==0"
              }
            },
            "1": {
              "rank": "2",
              "type": "f32",
              "dims": {
                "-1": "%64==0"
              }
            }
          }
        },
        "rewrite": {
          "device": "nnpa"
        }
      }
    }
  ]
}
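
To make the nesting of the config easier to follow, here is a minimal Python sketch that parses the example above and flattens it into one row per constraint. It is only an illustration of how the file is laid out, not the code in this PR; the function name `list_input_constraints` and the row layout are assumptions made for this example.

```python
import json

# The example config from this PR description, embedded as text.
CONFIG_TEXT = """
{
  "nnpa_ops_config": [
    {
      "pattern": {
        "match": {
          "node_type": "onnx.MatMul",
          "inputs": {
            "0": {"rank": "3", "type": "f32", "dims": {"-1": "%64==0"}},
            "1": {"rank": "2", "type": "f32", "dims": {"-1": "%64==0"}}
          }
        },
        "rewrite": {"device": "nnpa"}
      }
    }
  ]
}
"""


def list_input_constraints(config: dict) -> list:
    """Hypothetical helper: flatten each rule into
    (node_type, input_index, field, pattern, device) rows."""
    rows = []
    for entry in config.get("nnpa_ops_config", []):
        pattern = entry.get("pattern", {})
        match = pattern.get("match", {})
        device = pattern.get("rewrite", {}).get("device")
        node_type = match.get("node_type")
        for input_index, spec in match.get("inputs", {}).items():
            # Scalar constraints on the tensor itself.
            for field in ("rank", "type"):
                if field in spec:
                    rows.append((node_type, input_index, field, spec[field], device))
            # Per-dimension constraints, keyed by (possibly negative) dim index.
            for dim_index, dim_pattern in spec.get("dims", {}).items():
                rows.append(
                    (node_type, input_index, f"dims[{dim_index}]", dim_pattern, device)
                )
    return rows


if __name__ == "__main__":
    for row in list_input_constraints(json.loads(CONFIG_TEXT)):
        print(row)
```

Each input of the MatMul contributes three rows here (rank, type, and one dimension constraint), all of which have to hold for the rule to select the operation.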

Supported patterns for checking integers (rank and dimension size):

Comparison Operators:

  • "3" - Exact match (implicit equality): value must equal 3
  • ">3" - Greater than: value must be > 3
  • ">=3" - Greater than or equal: value must be >= 3
  • "<3" - Less than: value must be < 3
  • "<=3" - Less than or equal: value must be <= 3
  • "==3" - Explicit equality: value must equal 3
  • "!=3" - Not equal: value must not equal 3

Modulo Operations (for divisibility/alignment checks):

  • "%32==0" - Modulo constraint: (value % 32) must equal 0
  • "%64==0" - Divisibility by 64: (value % 64) must equal 0
  • "%N==R" - General form: (value % N) must equal R
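
The integer patterns listed above can be sketched as a small evaluator. This is an illustrative Python version, not the C++ implementation in this PR; the helper name `matches_int_pattern` and the choice to raise `ValueError` on an unrecognized pattern are assumptions made for this example.

```python
import re

# "%N==R": modulo constraint, (value % N) must equal R.
_MODULO = re.compile(r"^%(\d+)==(\d+)$")
# Optional comparison operator followed by an integer; no operator means equality.
_COMPARE = re.compile(r"^(>=|<=|==|!=|>|<)?(-?\d+)$")


def matches_int_pattern(value: int, pattern: str) -> bool:
    """Hypothetical helper: check an integer (rank or dim size) against a pattern."""
    pattern = pattern.strip()

    m = _MODULO.match(pattern)
    if m:
        divisor, remainder = int(m.group(1)), int(m.group(2))
        if divisor == 0:
            raise ValueError("modulo by zero in pattern: " + repr(pattern))
        return value % divisor == remainder

    m = _COMPARE.match(pattern)
    if not m:
        raise ValueError("unsupported pattern: " + repr(pattern))
    op = m.group(1) or "=="  # A bare "3" is an implicit equality.
    operand = int(m.group(2))
    if op == ">":
        return value > operand
    if op == ">=":
        return value >= operand
    if op == "<":
        return value < operand
    if op == "<=":
        return value <= operand
    if op == "==":
        return value == operand
    return value != operand  # op == "!="


if __name__ == "__main__":
    print(matches_int_pattern(128, "%64==0"))  # True: 128 is a multiple of 64
    print(matches_int_pattern(100, "%64==0"))  # False: 100 % 64 is 36
    print(matches_int_pattern(3, ">=3"))       # True
```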

Input/output/dim index:

  • Negative indices are supported: for example, -1 is the last input/output/dim, -2 is the second to last, and so on.
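
Negative index handling can be sketched as follows. This is a minimal Python illustration, not the code in this PR; the helper name `resolve_index` is hypothetical, and returning `None` for an out-of-range index is an assumption about behavior that this description does not specify.

```python
from typing import Optional


def resolve_index(key: str, count: int) -> Optional[int]:
    """Hypothetical helper: map a config key such as "0" or "-1" to a
    non-negative position among `count` inputs, outputs, or dims."""
    index = int(key)
    if index < 0:
        index += count  # -1 becomes count - 1, -2 becomes count - 2, ...
    # Assumption: an out-of-range index simply does not resolve.
    return index if 0 <= index < count else None


if __name__ == "__main__":
    shape = (8, 128, 64)
    last = resolve_index("-1", len(shape))
    print(last, shape[last])  # 2 64
```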

See the document JsonConfigFile-NNPA.md for more examples.

- Created JsonConfigObject class for JSON configuration management
  * Implemented loadFromFile, saveToFile, getArray, getObject, getString
  * Added set, remove, clear, dump methods
  * Added applyConfigToOps() for reusable configuration application

- Integrated global NNPA configuration object
  * Added globalNNPAConfig in NNPACompilerUtils.cpp
  * Added getGlobalNNPAConfig() accessor function
  * Automatic loading from nnpaLoadConfigFile option

- Refactored DevicePlacement and QuantOpSelection passes
  * Unified configObject pointer approach (local vs global)
  * Single code path using applyConfigToOps()
  * Backward compatible with loadConfigFile parameter
  * Added single-argument overload for createDevicePlacementPass()

- Created GenerateConfigFile pass
  * Generates JSON config from IR operations
  * Saves both device placement and quantization configs
  * Integrated into NNPA compilation pipeline

- Code style compliance
  * All comments end with periods per project convention

All 421 tests pass successfully.

Signed-off-by: Tung D. Le <tung@jp.ibm.com>
- Moved nnpaLoadConfigFile loading logic to start of function
- Moved nnpaSaveConfigFile initialization to start of function
- This ensures config is loaded before any pass configuration
- Improves code organization and clarity

All 421 tests pass successfully.

Signed-off-by: Tung D. Le <tung@jp.ibm.com>
- Moved global NNPA config instance from NNPACompilerUtils.cpp to JsonConfigObject.cpp
- Moved getGlobalNNPAConfig() accessor function to JsonConfigObject.cpp
- Added accessor declaration to JsonConfigObject.hpp
- Removed forward declaration and accessor from NNPACompilerUtils.hpp
- Updated NNPACompilerUtils.cpp to use accessor function

Benefits:
- Follows Single Responsibility Principle
- Better discoverability - global instance lives with its class
- Reduces coupling between modules
- Standard C++ practice for global instances

All 421 tests pass successfully.

Signed-off-by: Tung D. Le <tung@jp.ibm.com>
…d json file options

Signed-off-by: Tung D. Le <tung@jp.ibm.com>
…from_cli

Signed-off-by: Tung D. Le <tung@jp.ibm.com>