Update README files: document automated C++/Python model alignment, debug export, and trustworthy comparison pipeline (no fallbacks, all real outputs)

4 months ago · 175a60e7f6
2 changed files with 41 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -93,6 +93,26 @@ python demo.py
   - `bb_regressor_stats.txt`
   - `classifier_stats.txt`

+## Automated C++/Python Model Alignment and Comparison (2024)
+
+This project now includes a fully automated pipeline that checks whether the C++ and Python implementations of ResNet, Classifier, and BBRegressor are numerically aligned:
+
+- **Identical Preprocessing:** The exact same preprocessed input tensor is exported from Python and loaded in C++ for all tests, ensuring both models receive identical data.
+- **No Fallbacks or Defaults:** All models are loaded with real, user-supplied weights. If any weights or outputs are missing, the test aborts or marks the result as missing; no dummy or default values are substituted.
+- **Intermediate Debug Export:** Both C++ and Python save all intermediate outputs (after each ResNet stage, etc.) in matching shapes and formats for direct comparison.
+- **Automated Comparison:** The `test/compare_models.py` script compares all outputs, reporting cosine similarity, allclose status, max error, and more. An HTML report is generated in `test/comparison/report.html`.
+- **Trustworthy Results:** If any file is missing or invalid, the report marks it as such. Agreement metrics are computed only from outputs that were actually produced, so they reflect the real alignment between the two implementations.
+
+**To run the full comparison:**
+
+```bash
+# Build and run the C++ test pipeline
+test/run_full_comparison.sh 1
+# Or run steps manually (see test/README.md)
+```
+
+See `test/README.md` for details on the test pipeline and how to add new models for comparison.
+
 ## License

 This project is licensed under the MIT License - see the LICENSE file for details.
--- a/test/README.md
+++ b/test/README.md
@@ -162,4 +162,24 @@ Let's say you want to add a new model called `MyNewModel` with both C++ and Pyth
 *   **`DiMPTorchScriptWrapper`:** If your Python model structure is different from DiMP's Classifier/BBRegressor, you might need to adapt `DiMPTorchScriptWrapper` or write a custom loader for your Python model if it's not already a `torch.jit.ScriptModule`. The current wrapper supports loading from a directory of named tensor files based on a documentation text file.
 *   **`load_cpp_tensor` in Python:** This utility in `compare_models.py` attempts to robustly load tensors saved by LibTorch (which sometimes get wrapped as `RecursiveScriptModule`). If you encounter issues loading your C++ saved tensors, you might need to inspect their structure and potentially adapt this function. The C++ `save_tensor` function aims to save plain tensors.

-By following these steps, you can integrate new models into this testing framework to validate their C++ implementations. 
+By following these steps, you can integrate new models into this testing framework to validate their C++ implementations. 
+
+## Automated C++/Python Model Alignment and Comparison (2024)
+
+The test pipeline now guarantees:
+- **Identical Inputs:** The same preprocessed input tensor is exported from Python and loaded in C++ for all tests.
+- **No Fallbacks or Defaults:** All models are loaded with real, user-supplied weights. If any weights or outputs are missing, the test aborts or marks the result as missing; no dummy or default values are substituted.
+- **Intermediate Debug Export:** Both C++ and Python save all intermediate outputs (after each ResNet stage, etc.) in matching shapes and formats for direct comparison.
+- **Automated Comparison:** The `compare_models.py` script compares all outputs, reporting cosine similarity, allclose status, max error, and more. An HTML report is generated in `test/comparison/report.html`.
+- **Trustworthy Results:** If any file is missing or invalid, the report marks it as such. Agreement metrics are computed only from outputs that were actually produced, so they reflect the real alignment between the two implementations.
+
+### Typical Workflow
+
+1. **Build and run the C++ test pipeline:**
+   ```bash
+   ./test/run_full_comparison.sh 1
+   ```
+2. **Review the HTML report:**
+   - Open `test/comparison/report.html` in your browser for detailed agreement metrics.
+
+See the sections above for details on adding new models and customizing the test pipeline.
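
The metrics named in the new README sections (cosine similarity, allclose status, max error) and the "mark as missing, never substitute" rule can be illustrated with a minimal sketch. This is not the code in `compare_models.py`, which is not shown in this diff: the function name `compare_outputs`, the result dictionary layout, and the tolerance defaults are assumptions made for illustration, and plain Python lists stand in for flattened tensors.

```python
import math
from typing import Optional, Sequence


def compare_outputs(
    py_out: Optional[Sequence[float]],
    cpp_out: Optional[Sequence[float]],
    rtol: float = 1e-5,  # assumed tolerance, not taken from the repository
    atol: float = 1e-8,  # assumed tolerance, not taken from the repository
) -> dict:
    """Compare two flattened outputs (hypothetical helper).

    A missing output is reported as such; no placeholder values are invented.
    """
    if py_out is None or cpp_out is None:
        return {"status": "missing"}
    if len(py_out) != len(cpp_out) or len(py_out) == 0:
        return {"status": "invalid"}

    dot = sum(a * b for a, b in zip(py_out, cpp_out))
    norm_py = math.sqrt(sum(a * a for a in py_out))
    norm_cpp = math.sqrt(sum(b * b for b in cpp_out))
    # Cosine similarity is undefined for an all-zero vector; report None then.
    cosine = dot / (norm_py * norm_cpp) if norm_py > 0 and norm_cpp > 0 else None

    abs_err = [abs(a - b) for a, b in zip(py_out, cpp_out)]
    # Elementwise allclose criterion: |a - b| <= atol + rtol * |b|.
    allclose = all(e <= atol + rtol * abs(b) for e, b in zip(abs_err, cpp_out))

    return {
        "status": "ok",
        "cosine_similarity": cosine,
        "allclose": allclose,
        "max_abs_error": max(abs_err),
    }


result = compare_outputs([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
missing = compare_outputs([1.0, 2.0, 3.0], None)
```

Returning an explicit `status` rather than a default number keeps a missing or malformed file from being mistaken for agreement, which is the behavior the README text describes for the report.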