# C++ Tracker Model Testing Framework
This directory contains the testing framework for comparing C++ implementations of PyTorch models against their original Python counterparts.
## Overview
The primary goal is to ensure that the C++ models (from the `cimp` project) produce results that are numerically close to the Python models from the `pytracking` toolkit, given the same inputs and model weights.
The framework consists of:
- **C++ Test Program (`test_models.cpp`):**
  - Responsible for loading pre-trained model weights (from `exported_weights/`).
  - Takes randomly generated input tensors (pre-generated by `generate_test_samples.cpp` and saved in `test/input_samples/`).
  - Runs the C++ `Classifier` and `BBRegressor` models.
  - Saves the C++ model output tensors to `test/output/`.
- **C++ Sample Generator (`generate_test_samples.cpp`):**
  - Generates a specified number of random input tensor sets for both the classifier and the bounding box regressor.
  - Saves these input tensors into `test/input_samples/{classifier|bb_regressor}/sample_N/` and `test/input_samples/{classifier|bb_regressor}/test_N/` (for classifier test features).
  - This step is separated so that the Python comparison script can run even if the C++ models have issues during their execution phase.
- **Python Comparison Script (`compare_models.py`):**
  - Loads the original Python models (using `DiMPTorchScriptWrapper`, which loads weights from `exported_weights/`).
  - Loads the input tensors generated by `generate_test_samples.cpp` from `test/input_samples/`.
  - Runs the Python models on these input tensors to get reference Python outputs.
  - Loads the C++ model output tensors from `test/output/`.
  - Performs a detailed, element-wise comparison between the Python and C++ outputs.
  - Calculates various error metrics (MAE, max error, L2 norms, cosine similarity, Pearson correlation, mean relative error); see the sketch after this list.
  - Generates an HTML report (`test/comparison/report.html`) summarizing the comparisons, including per-sample statistics and error distribution plots (saved in `test/comparison/plots/`).
- **Automation Script (`run_full_comparison.sh`):**
  - Orchestrates the entire testing process:
    - Builds the C++ project (including `test_models` and `generate_test_samples`).
    - Runs `generate_test_samples` to create/update input data.
    - Runs `test_models` to generate the C++ outputs.
    - Runs `compare_models.py` to perform the comparison and generate the report.
  - Accepts the number of samples as an argument.
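For reference, the kinds of metrics listed for the comparison script can be sketched as follows. This is an illustrative helper (the name `compute_error_metrics` is hypothetical), not the actual code in `compare_models.py`, which additionally handles missing tensors and shape mismatches.

```python
import torch
import torch.nn.functional as F

def compute_error_metrics(py_out: torch.Tensor, cpp_out: torch.Tensor) -> dict:
    """Illustrative versions of the per-tensor metrics reported by the comparison."""
    a = py_out.detach().float().flatten()
    b = cpp_out.detach().float().flatten()
    diff = a - b
    eps = 1e-12  # avoid division by zero in the relative error
    return {
        "mae": diff.abs().mean().item(),
        "max_error": diff.abs().max().item(),
        "l2_python": a.norm().item(),
        "l2_cpp": b.norm().item(),
        "cosine_similarity": F.cosine_similarity(a, b, dim=0).item(),
        "pearson_correlation": torch.corrcoef(torch.stack([a, b]))[0, 1].item(),
        "mean_relative_error": (diff.abs() / (a.abs() + eps)).mean().item(),
    }
```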
## Directory Structure
```
test/
├── input_samples/              # Stores input tensors generated by C++
│   ├── classifier/
│   │   ├── sample_0/
│   │   │   ├── backbone_feat.pt
│   │   │   └── ... (other classifier train inputs)
│   │   └── test_0/
│   │       ├── test_feat.pt
│   │       └── ... (other classifier test inputs)
│   └── bb_regressor/
│       └── sample_0/
│           ├── feat_layer2.pt
│           ├── feat_layer3.pt
│           └── ... (other bb_regressor inputs)
├── output/                     # Stores output tensors generated by C++ models
│   ├── classifier/
│   │   ├── sample_0/
│   │   │   └── clf_features.pt
│   │   └── test_0/
│   │       └── clf_feat_test.pt
│   └── bb_regressor/
│       └── sample_0/
│           ├── iou_pred.pt
│           └── ... (other bb_regressor outputs)
├── comparison/                 # Stores comparison results
│   ├── report.html             # Main HTML report
│   └── plots/                  # Error distribution histograms
├── test_models.cpp             # C++ program to run models and save outputs
├── generate_test_samples.cpp   # C++ program to generate input samples
├── compare_models.py           # Python script for comparison and report generation
├── run_full_comparison.sh      # Main test execution script
└── README.md                   # This file
```
## How to Add a New Model for Comparison
Let's say you want to add a new model called `MyNewModel` with both C++ and Python implementations.
**1. Export Python Model Weights:**

- Ensure your Python `MyNewModel` can have its weights saved in a format loadable by both Python (e.g., a `state_dict` or individual tensors) and C++ (LibTorch `torch::load`); see the sketch after this list.
- Create a subdirectory `exported_weights/mynewmodel/` and save the weights there.
- Document the tensor names and their corresponding model parameters in a `mynewmodel_weights_doc.txt` file within that directory (see the existing `classifier_weights_doc.txt` or `bb_regressor_weights_doc.txt` for examples). This is crucial for `DiMPTorchScriptWrapper` if loading from individual tensors.
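For illustration, a minimal export sketch is shown below. The one-file-per-tensor layout, the helper name `export_mynewmodel_weights`, and the doc-file format are assumptions; the actual format must match what the existing C++ weight loader and `DiMPTorchScriptWrapper` expect (compare against the existing `exported_weights/` subdirectories).

```python
import torch
from pathlib import Path

def export_mynewmodel_weights(model: torch.nn.Module,
                              out_dir: str = "exported_weights/mynewmodel") -> None:
    """Hypothetical exporter: one .pt file per parameter/buffer plus a doc file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    doc_lines = []
    for name, tensor in model.state_dict().items():
        file_name = name.replace(".", "_") + ".pt"
        torch.save(tensor.detach().cpu(), out / file_name)   # one tensor per file
        doc_lines.append(f"{file_name}: {name} {tuple(tensor.shape)}")
    # Record the tensor-name -> parameter mapping for the wrapper/loader.
    (out / "mynewmodel_weights_doc.txt").write_text("\n".join(doc_lines) + "\n")
```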
**2. Update C++ Code:**

- `generate_test_samples.cpp`:
  - Add functions to generate realistic random input tensors for `MyNewModel`.
  - Define the expected input tensor names and shapes.
  - Modify the `main` function to:
    - Create a directory `test/input_samples/mynewmodel/sample_N/`.
    - Call your new input generation functions.
    - Save these input tensors (e.g., `my_input1.pt`, `my_input2.pt`) into the created directory using the `save_tensor` utility.
- `test_models.cpp`:
  - Include the header for your C++ `MyNewModel` (e.g., `cimp/mynewmodel/mynewmodel.h`).
  - In the `main` function:
    - Add a section for `MyNewModel`.
    - Determine the absolute path to `exported_weights/mynewmodel/`.
    - Instantiate your C++ `MyNewModel`, passing the weights directory.
    - Loop through the number of samples:
      - Construct paths to the input tensors in `test/input_samples/mynewmodel/sample_N/`.
      - Load these input tensors using `load_tensor`. Ensure they are on the correct device (CPU/CUDA).
      - Call the relevant methods of your C++ `MyNewModel` (e.g., `myNewModel.predict(...)`).
      - Create an output directory `test/output/mynewmodel/sample_N/`.
      - Save the output tensors from your C++ model (e.g., `my_output.pt`) to this directory using `save_tensor`. Remember to move outputs to CPU before saving if they are on CUDA.
- `CMakeLists.txt`:
  - If `MyNewModel` is a new static library (like `classifier` or `bb_regressor`), define its sources and add it as a library.
  - Link `test_models` and `generate_test_samples` (if it needs new specific libraries) with the `MyNewModel` library and any other dependencies (like LibTorch).
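Before moving on to the Python comparison script, it can help to confirm that the rebuilt C++ programs actually wrote files to the expected locations. A small ad-hoc check along these lines (the tensor filenames are placeholders for your actual names):

```python
from pathlib import Path

num_samples = 3                                      # match the count passed to the C++ tools
expected_inputs = ["my_input1.pt", "my_input2.pt"]   # placeholder input tensor names
expected_outputs = ["my_output.pt"]                  # placeholder output tensor names

for i in range(num_samples):
    in_dir = Path("test") / "input_samples" / "mynewmodel" / f"sample_{i}"
    out_dir = Path("test") / "output" / "mynewmodel" / f"sample_{i}"
    expected = [in_dir / f for f in expected_inputs] + [out_dir / f for f in expected_outputs]
    missing = [str(p) for p in expected if not p.exists()]
    status = "OK" if not missing else "missing: " + ", ".join(missing)
    print(f"sample_{i}: {status}")
```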
**3. Update Python Comparison Script (`compare_models.py`):**

- `ModelComparison.__init__` & `_init_models`:
  - If your Python `MyNewModel` needs to be loaded via `DiMPTorchScriptWrapper`, update the wrapper or add logic to load your model. You might need to add a new parameter like `mynewmodel_sd='mynewmodel'` to `DiMPTorchScriptWrapper` and handle its loading.
  - Store the loaded Python `MyNewModel` instance (e.g., `self.models.mynewmodel`).
- Create a `compare_mynewmodel` method (a sketch of this method follows this list):
  - Create a new method, e.g., `def compare_mynewmodel(self):`.
  - Print a starting message.
  - Define the input and C++ output directory paths: `Path('test') / 'input_samples' / 'mynewmodel'` and `Path('test') / 'output' / 'mynewmodel'`.
  - Loop through `self.num_samples`:
    - Initialize `current_errors = {}` for the current sample.
    - Construct paths to the input tensors for `MyNewModel` from `test/input_samples/mynewmodel/sample_N/`.
    - Load these tensors using `self.load_cpp_tensor()`.
    - Run the Python `MyNewModel` with these inputs to get `py_output_tensor`. Handle potential errors.
    - Construct paths to the C++ output tensors from `test/output/mynewmodel/sample_N/`.
    - Load the C++ output tensor (`cpp_output_tensor`) using `self.load_cpp_tensor()`.
    - Call `self._compare_tensor_data(py_output_tensor, cpp_output_tensor, "MyNewModel Output Comparison Name", i, current_errors)`. Use a descriptive name.
    - If there are multiple distinct outputs from `MyNewModel` to compare, repeat the load and `_compare_tensor_data` calls for each.
    - Store the results: `if current_errors: self.all_errors_stats[f"MyNewModel_Sample_{i}"] = current_errors`.
- `ModelComparison.run_all_tests`:
  - Call your new `self.compare_mynewmodel()` method.
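A minimal sketch of such a method is shown below, assuming the attributes and helpers named above (`self.num_samples`, `self.load_cpp_tensor`, `self._compare_tensor_data`, `self.all_errors_stats`, `self.models.mynewmodel`). The tensor filenames and the single-input model call are placeholders for your actual model.

```python
from pathlib import Path

def compare_mynewmodel(self):
    """Sketch of a new ModelComparison method; names and filenames are illustrative."""
    print("Comparing MyNewModel outputs...")
    input_dir = Path('test') / 'input_samples' / 'mynewmodel'
    output_dir = Path('test') / 'output' / 'mynewmodel'

    for i in range(self.num_samples):
        current_errors = {}
        sample_in = input_dir / f'sample_{i}'
        sample_out = output_dir / f'sample_{i}'

        # Load the C++-generated input and run the Python reference model.
        my_input = self.load_cpp_tensor(sample_in / 'my_input1.pt')
        try:
            py_output_tensor = self.models.mynewmodel(my_input)
        except Exception as exc:
            print(f"  sample_{i}: Python MyNewModel failed: {exc}")
            continue

        # Load the corresponding C++ output and compare element-wise.
        cpp_output_tensor = self.load_cpp_tensor(sample_out / 'my_output.pt')
        self._compare_tensor_data(py_output_tensor, cpp_output_tensor,
                                  "MyNewModel Output", i, current_errors)

        if current_errors:
            self.all_errors_stats[f"MyNewModel_Sample_{i}"] = current_errors
```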
**4. Run the Tests:**

- Execute `./test/run_full_comparison.sh <num_samples>`.
- Check the console output and `test/comparison/report.html` for the results of `MyNewModel`.
**Key Considerations:**

- **Tensor Naming and Paths:** Be consistent with tensor filenames and directory structures. The Python script relies on these conventions to find the correct files.
- **Data Types and Devices:** Ensure tensors are of compatible data types (usually `float32`) and are on the correct device (CPU/CUDA) before model inference and before saving/loading. C++ outputs are saved from the CPU.
- **Error Handling:** Implement robust error handling in both C++ (e.g., for file loading and model errors) and Python (e.g., for tensor loading and Python model execution). The comparison script is designed to report "N/A" for metrics if tensors are missing or shapes mismatch, allowing other comparisons to proceed.
- **`DiMPTorchScriptWrapper`:** If your Python model structure is different from DiMP's Classifier/BBRegressor, you might need to adapt `DiMPTorchScriptWrapper` or write a custom loader for your Python model if it is not already a `torch.jit.ScriptModule`. The current wrapper supports loading from a directory of named tensor files based on a documentation text file.
- **`load_cpp_tensor` in Python:** This utility in `compare_models.py` attempts to robustly load tensors saved by LibTorch (which sometimes get wrapped as `RecursiveScriptModule`). If you encounter issues loading your C++-saved tensors, you might need to inspect their structure and potentially adapt this function (a sketch of the general approach follows this list). The C++ `save_tensor` function aims to save plain tensors.
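The general approach can be sketched as follows. This is an approximation of the idea, not the actual `load_cpp_tensor` implementation; adapt it to the structure you actually observe in the saved files.

```python
import torch

def load_cpp_tensor(path, device="cpu"):
    """Illustrative loader for tensors written by the C++ side (sketch only)."""
    try:
        obj = torch.load(path, map_location=device)            # plain serialized tensor
    except Exception:
        obj = torch.jit.load(str(path), map_location=device)   # LibTorch archives load as modules

    if isinstance(obj, torch.Tensor):
        return obj
    if isinstance(obj, torch.jit.ScriptModule):
        # Tensors wrapped in a script module usually appear as its only parameter/buffer.
        tensors = list(obj.parameters()) + list(obj.buffers())
        if tensors:
            return tensors[0]
    raise ValueError(f"Could not extract a tensor from {path}")
```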
By following these steps, you can integrate new models into this testing framework to validate their C++ implementations.