
C++ Tracker Tests

This directory contains test applications and scripts for the C++ tracker components.

test_models

An executable for testing various models like ResNet, BBRegressor, and Classifier.

Building

Refer to the main project README.md for build instructions. The test_models executable will be placed in the bin/ directory.

Running

Currently, test_models can be run with dummy arguments, as its primary function is to check basic model instantiation and forward passes:

./bin/test_models ./test/input_samples ./test/output 10

Further development will integrate more comprehensive testing and comparison with Python outputs.

C++ Tracker Model Testing Framework

This directory contains the testing framework for comparing C++ implementations of PyTorch models against their original Python counterparts.

Overview

The primary goal is to ensure that the C++ models (cimp project) produce results that are numerically close to the Python models from the pytracking toolkit, given the same inputs and model weights.

The framework consists of:

  1. C++ Test Program (test_models.cpp):

    • Responsible for loading pre-trained model weights (from exported_weights/).
    • Takes randomly generated input tensors (pre-generated by generate_test_samples.cpp and saved in test/input_samples/).
    • Runs the C++ Classifier and BBRegressor models.
    • Saves the C++ model output tensors to test/output/.
  2. C++ Sample Generator (generate_test_samples.cpp):

    • Generates a specified number of random input tensor sets for both the classifier and bounding box regressor.
    • Saves these input tensors into test/input_samples/{classifier|bb_regressor}/sample_N/, plus test/input_samples/classifier/test_N/ for the classifier test features.
    • This step is separated to allow the Python comparison script to run even if the C++ models have issues during their execution phase.
  3. Python Comparison Script (compare_models.py):

    • Loads the original Python models (using DiMPTorchScriptWrapper which loads weights from exported_weights/).
    • Loads the input tensors generated by generate_test_samples.cpp from test/input_samples/.
    • Runs the Python models on these input tensors to get reference Python outputs.
    • Loads the C++ model output tensors from test/output/.
    • Performs a detailed, element-wise comparison between Python and C++ outputs.
    • Calculates various error metrics (MAE, Max Error, L2 norms, Cosine Similarity, Pearson Correlation, Mean Relative Error); a sketch of how these metrics can be computed follows this list.
    • Generates an HTML report (test/comparison/report.html) summarizing the comparisons, including per-sample statistics and error distribution plots (saved in test/comparison/plots/).
  4. Automation Script (run_full_comparison.sh):

    • Orchestrates the entire testing process:
      1. Builds the C++ project (including test_models and generate_test_samples).
      2. Runs generate_test_samples to create/update input data.
      3. Runs test_models to generate C++ outputs.
      4. Runs compare_models.py to perform the comparison and generate the report.
    • Accepts the number of samples as an argument.
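
As referenced above, here is a rough sketch of how the per-tensor metrics could be computed. The function name and dictionary keys are illustrative assumptions; the actual _compare_tensor_data in compare_models.py may differ in naming and edge-case handling.

# Illustrative sketch only -- not the actual _compare_tensor_data implementation.
import torch

def compare_tensors(py_t: torch.Tensor, cpp_t: torch.Tensor, eps: float = 1e-8) -> dict:
    # Shape mismatches are reported rather than raised, so other comparisons can proceed.
    if py_t.shape != cpp_t.shape:
        return {'error': f'shape mismatch: {tuple(py_t.shape)} vs {tuple(cpp_t.shape)}'}
    a = py_t.detach().float().flatten()
    b = cpp_t.detach().float().flatten()
    diff = a - b
    return {
        'MAE': diff.abs().mean().item(),
        'Max Error': diff.abs().max().item(),
        'L2 (Python)': a.norm().item(),
        'L2 (C++)': b.norm().item(),
        'Cosine Similarity': torch.nn.functional.cosine_similarity(a, b, dim=0).item(),
        'Pearson Correlation': torch.corrcoef(torch.stack([a, b]))[0, 1].item(),
        'Mean Relative Error': (diff.abs() / (a.abs() + eps)).mean().item(),
    }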

Directory Structure

test/
├── input_samples/        # Stores input tensors generated by C++
│   ├── classifier/
│   │   ├── sample_0/
│   │   │   ├── backbone_feat.pt
│   │   │   └── ... (other classifier train inputs)
│   │   └── test_0/
│   │       ├── test_feat.pt
│   │       └── ... (other classifier test inputs)
│   └── bb_regressor/
│       └── sample_0/
│           ├── feat_layer2.pt
│           ├── feat_layer3.pt
│           └── ... (other bb_regressor inputs)
├── output/               # Stores output tensors generated by C++ models
│   ├── classifier/
│   │   ├── sample_0/
│   │   │   └── clf_features.pt
│   │   └── test_0/
│   │       └── clf_feat_test.pt
│   └── bb_regressor/
│       └── sample_0/
│           ├── iou_pred.pt
│           └── ... (other bb_regressor outputs)
├── comparison/           # Stores comparison results
│   ├── report.html       # Main HTML report
│   └── plots/            # Error distribution histograms
├── test_models.cpp       # C++ program to run models and save outputs
├── generate_test_samples.cpp # C++ program to generate input samples
├── compare_models.py     # Python script for comparison and report generation
├── run_full_comparison.sh # Main test execution script
└── README.md             # This file

How to Add a New Model for Comparison

Let's say you want to add a new model called MyNewModel with both C++ and Python implementations.

1. Export Python Model Weights:

  • Ensure your Python MyNewModel can have its weights saved in a format loadable by both Python (e.g., state_dict or individual tensors) and C++ (LibTorch torch::load).
  • Create a subdirectory exported_weights/mynewmodel/ and save the weights there.
  • Document the tensor names and their corresponding model parameters in a mynewmodel_weights_doc.txt file within that directory (see existing classifier_weights_doc.txt or bb_regressor_weights_doc.txt for examples). This is crucial for the DiMPTorchScriptWrapper if loading from individual tensors.
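
For the weight-export step, a rough sketch is shown below. It assumes MyNewModel is an ordinary torch.nn.Module and uses plain torch.save for each tensor; the helper name and per-file naming scheme are illustrative, and the actual on-disk format must match whatever the C++ loader and DiMPTorchScriptWrapper expect, so treat the serialization call as a placeholder.

# Illustrative export sketch; names and serialization format are assumptions.
from pathlib import Path
import torch

def export_weights(model: torch.nn.Module, out_dir: str = 'exported_weights/mynewmodel') -> None:
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    doc_lines = []
    for name, tensor in model.state_dict().items():
        fname = name.replace('.', '_') + '.pt'       # e.g. conv1.weight -> conv1_weight.pt
        torch.save(tensor.cpu(), out / fname)        # one plain tensor per file
        doc_lines.append(f'{fname}: {name} {tuple(tensor.shape)}')
    # Document the tensor-file -> parameter mapping for the Python-side loader.
    (out / 'mynewmodel_weights_doc.txt').write_text('\n'.join(doc_lines) + '\n')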

2. Update C++ Code:

  • generate_test_samples.cpp:
    • Add functions to generate realistic random input tensors for MyNewModel.
    • Define the expected input tensor names and shapes.
    • Modify the main function to:
      • Create a directory test/input_samples/mynewmodel/sample_N/.
      • Call your new input generation functions.
      • Save these input tensors (e.g., my_input1.pt, my_input2.pt) into the created directory using the save_tensor utility.
  • test_models.cpp:
    • Include the header for your C++ MyNewModel (e.g., cimp/mynewmodel/mynewmodel.h).
    • In the main function:
      • Add a section for MyNewModel.
      • Determine the absolute path to exported_weights/mynewmodel/.
      • Instantiate your C++ MyNewModel, passing the weights directory.
      • Loop through the number of samples:
        • Construct paths to the input tensors in test/input_samples/mynewmodel/sample_N/.
        • Load these input tensors using load_tensor. Ensure they are on the correct device (CPU/CUDA).
        • Call the relevant methods of your C++ MyNewModel (e.g., myNewModel.predict(...)).
        • Create an output directory test/output/mynewmodel/sample_N/.
        • Save the output tensors from your C++ model (e.g., my_output.pt) to this directory using save_tensor. Remember to move outputs to CPU before saving if they are on CUDA.
  • CMakeLists.txt:
    • If MyNewModel is a new static library (like classifier or bb_regressor), define its sources and add it as a library.
    • Link test_models (and generate_test_samples, if it needs the new library) against the MyNewModel library and any other dependencies (such as LibTorch).

3. Update Python Comparison Script (compare_models.py):

  • ModelComparison.__init__ & _init_models:
    • If your Python MyNewModel needs to be loaded via DiMPTorchScriptWrapper, update the wrapper or add logic to load your model. You might need to add a new parameter like mynewmodel_sd='mynewmodel' to DiMPTorchScriptWrapper and handle its loading.
    • Store the loaded Python MyNewModel instance (e.g., self.models.mynewmodel).
  • Create a compare_mynewmodel method (a sketch follows this list):
    • Create a new method, e.g., def compare_mynewmodel(self):.
    • Print a starting message.
    • Define the input and C++ output directory paths: Path('test') / 'input_samples' / 'mynewmodel' and Path('test') / 'output' / 'mynewmodel'.
    • Loop through self.num_samples:
      • Initialize current_errors = {} for the current sample.
      • Construct paths to input tensors for MyNewModel from test/input_samples/mynewmodel/sample_N/.
      • Load these tensors using self.load_cpp_tensor().
      • Run the Python MyNewModel with these inputs to get py_output_tensor. Handle potential errors.
      • Construct paths to C++ output tensors from test/output/mynewmodel/sample_N/.
      • Load the C++ output tensor (cpp_output_tensor) using self.load_cpp_tensor().
      • Call self._compare_tensor_data(py_output_tensor, cpp_output_tensor, "MyNewModel Output Comparison Name", i, current_errors). Use a descriptive name.
      • If there are multiple distinct outputs from MyNewModel to compare, repeat the load and _compare_tensor_data calls for each.
      • Store the results: if current_errors: self.all_errors_stats[f"MyNewModel_Sample_{i}"] = current_errors.
  • ModelComparison.run_all_tests:
    • Call your new self.compare_mynewmodel() method.
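
Below is a minimal sketch of such a compare_mynewmodel method, assuming a single input file my_input1.pt and a single output file my_output.pt per sample. The attribute and helper names follow the description above, but the real compare_models.py may differ in signatures and error handling.

# Illustrative sketch of a ModelComparison method; assumes self.models.mynewmodel,
# self.load_cpp_tensor, self._compare_tensor_data, self.num_samples and
# self.all_errors_stats exist as described above.
from pathlib import Path

def compare_mynewmodel(self):
    print("Comparing MyNewModel outputs...")
    input_dir = Path('test') / 'input_samples' / 'mynewmodel'
    output_dir = Path('test') / 'output' / 'mynewmodel'
    for i in range(self.num_samples):
        current_errors = {}
        # Load the C++-generated input and run the Python reference model.
        my_input = self.load_cpp_tensor(input_dir / f'sample_{i}' / 'my_input1.pt')
        if my_input is None:
            continue
        try:
            py_output_tensor = self.models.mynewmodel(my_input)
        except Exception as exc:
            print(f"  Python MyNewModel failed on sample {i}: {exc}")
            continue
        # Load the corresponding C++ output and compare element-wise.
        cpp_output_tensor = self.load_cpp_tensor(output_dir / f'sample_{i}' / 'my_output.pt')
        self._compare_tensor_data(py_output_tensor, cpp_output_tensor,
                                  "MyNewModel Output", i, current_errors)
        if current_errors:
            self.all_errors_stats[f"MyNewModel_Sample_{i}"] = current_errors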

4. Run the Tests:

  • Execute ./test/run_full_comparison.sh <num_samples>.
  • Check the console output and test/comparison/report.html for the results of MyNewModel.

Key Considerations:

  • Tensor Naming and Paths: Be consistent with tensor filenames and directory structures. The Python script relies on these conventions to find the correct files.
  • Data Types and Devices: Ensure tensors are of compatible data types (usually float32) and are on the correct device (CPU/CUDA) before model inference and before saving/loading. C++ outputs are saved from CPU.
  • Error Handling: Implement robust error handling in both C++ (e.g., for file loading, model errors) and Python (e.g., for tensor loading, Python model execution). The comparison script is designed to report "N/A" for metrics if tensors are missing or shapes mismatch, allowing other comparisons to proceed.
  • DiMPTorchScriptWrapper: If your Python model structure is different from DiMP's Classifier/BBRegressor, you might need to adapt DiMPTorchScriptWrapper or write a custom loader for your Python model if it's not already a torch.jit.ScriptModule. The current wrapper supports loading from a directory of named tensor files based on a documentation text file.
  • load_cpp_tensor in Python: This utility in compare_models.py attempts to robustly load tensors saved by LibTorch (which sometimes get wrapped as RecursiveScriptModule). If you encounter issues loading your C++ saved tensors, you might need to inspect their structure and potentially adapt this function. The C++ save_tensor function aims to save plain tensors.
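
A simplified loader along those lines might look like the sketch below; it is an assumption about how load_cpp_tensor behaves, not a copy of the actual function.

# Illustrative fallback loader for tensors written by LibTorch; the real
# load_cpp_tensor in compare_models.py may behave differently.
import torch

def load_cpp_tensor(path, device='cpu'):
    try:
        obj = torch.load(path, map_location=device)            # plain tensor archive
    except Exception:
        obj = torch.jit.load(str(path), map_location=device)   # tensor wrapped in a script module
    if isinstance(obj, torch.Tensor):
        return obj
    if isinstance(obj, torch.jit.ScriptModule):
        # LibTorch-saved tensors sometimes load as a RecursiveScriptModule;
        # pull out the first parameter/buffer/state entry if so.
        tensors = list(obj.parameters()) + list(obj.buffers())
        if not tensors:
            tensors = list(obj.state_dict().values())
        if tensors:
            return tensors[0]
    return None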

By following these steps, you can integrate new models into this testing framework to validate their C++ implementations.