Is symbolic regression the answer to turning opaque deep-learning models into closed-form, interpretable mathematical equations? You have already trained your model. You can see it working. But do you know exactly what the machine has learned? A team of University of Cambridge researchers proposes 'SymTorch', a library designed to integrate symbolic regression into deep-learning workflows. It lets researchers approximate neural network components with closed-form mathematics, enabling both functional interpretation and inference speedups.
Wrap-Distill-Switch: The Core Mechanism
SymTorch streamlines the engineering required to extract symbolic equations from trained models by automating data movement and hook management.
- Wrap: Users wrap any `nn.Module` or callable function in a `SymbolicModel` wrapper.
- Distill: The library uses forward hooks to record input and output activations during a forward pass. The cached data is transferred to the CPU and handed to PySR for symbolic regression.
- Switch: Once fitted, the neural weights can be substituted in the forward pass with the discovered equation via `switch_to_symbolic`.
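The wrap-distill-switch pattern can be sketched in plain Python. The class and method names below mirror those mentioned in the article (`SymbolicModel`, `switch_to_symbolic`), but the internals are illustrative stand-ins, not SymTorch's actual implementation; the quadratic "module" and the naive fitting routine are made up for the demo.

```python
# Minimal sketch of the wrap-distill-switch pattern (illustrative, not SymTorch's code).
class SymbolicModel:
    def __init__(self, module):
        self.module = module                 # wrapped callable (stands in for an nn.Module)
        self.inputs, self.outputs = [], []   # activation cache, filled hook-style
        self.symbolic_fn = None              # discovered equation, once distilled
        self.use_symbolic = False

    def __call__(self, x):
        if self.use_symbolic and self.symbolic_fn is not None:
            return self.symbolic_fn(x)       # symbolic forward pass
        y = self.module(x)
        self.inputs.append(x)                # record activations on the neural pass
        self.outputs.append(y)
        return y

    def distill(self, fit):
        # In SymTorch the cached tensors go to the CPU and into PySR;
        # here `fit` is any regression routine returning a closed-form callable.
        self.symbolic_fn = fit(self.inputs, self.outputs)

    def switch_to_symbolic(self):
        self.use_symbolic = True


# Example: "distill" a quadratic module with a toy one-parameter fit of y = a*x^2.
wrapped = SymbolicModel(lambda x: 3.0 * x * x)
for x in [1.0, 2.0, 3.0]:
    wrapped(x)                               # forward passes populate the cache

wrapped.distill(lambda xs, ys: (lambda x, a=ys[0] / xs[0] ** 2: a * x * x))
wrapped.switch_to_symbolic()
print(wrapped(4.0))                          # symbolic path answers: 3 * 16 = 48.0
```

After the switch, calls no longer touch the wrapped module at all, which is where the inference speedup comes from.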
Under the hood, the library interfaces with PySR, which uses a multi-population genetic algorithm to evolve equations that trade off accuracy against complexity along a Pareto front. The 'best' equation is chosen by maximizing the fractional drop in log mean absolute error relative to the increase in complexity.
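The selection rule described above can be written out in a few lines. The candidate Pareto front here is invented for illustration; the scoring formula (negative slope of log-loss versus complexity along the front) follows the article's description.

```python
import math

# Hypothetical Pareto front: equations of increasing complexity, decreasing MAE.
pareto = [
    {"eq": "x",            "complexity": 1, "mae": 0.90},
    {"eq": "x + sin(x)",   "complexity": 4, "mae": 0.20},
    {"eq": "x + x*sin(x)", "complexity": 6, "mae": 0.18},
]

def best_equation(front):
    """Pick the equation with the largest drop in log(MAE) per unit of added complexity."""
    scored = []
    for prev, cur in zip(front, front[1:]):
        score = -(math.log(cur["mae"]) - math.log(prev["mae"])) / (
            cur["complexity"] - prev["complexity"]
        )
        scored.append((score, cur))
    return max(scored, key=lambda s: s[0])[1]

print(best_equation(pareto)["eq"])  # "x + sin(x)": steep error drop, modest complexity
```

Note how the most accurate equation is not chosen: going from complexity 4 to 6 barely reduces the error, so the simpler equation wins.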
Case Study: Accelerating LLM Inference
The paper's flagship application is replacing the Multi-Layer Perceptron (MLP) layers of a Transformer model with symbolic substitutes to improve efficiency.
Implementation Details
Because LLM activations are high-dimensional, the team used Principal Component Analysis (PCA) to compress inputs and outputs before running symbolic regression. For the Qwen2.5-1.5B model, they kept 32 principal components for the inputs and 8 for the outputs across three target layers.
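The compression step looks roughly like the following. The component count (32) mirrors the article; the activation matrix, hidden size, and SVD-based PCA routine are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Synthetic stand-in for cached MLP activations: (samples, hidden_dim).
rng = np.random.default_rng(0)
acts = rng.normal(size=(1024, 1536))

def pca_project(X, k):
    """Project rows of X onto their top-k principal components."""
    Xc = X - X.mean(axis=0)                          # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = principal directions
    return Xc @ Vt[:k].T                             # coordinates in the top-k subspace

low = pca_project(acts, 32)
print(low.shape)  # (1024, 32): SR now fits 32-D inputs instead of 1536-D
```

Symbolic regression scales poorly with input dimensionality, so fitting in the 32-D PCA subspace is what makes the distillation tractable; the cost, as the results below show, is lost information.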
Performance Tradeoffs
This intervention yielded an 8.3% increase in token throughput. The gain came with a significant increase in perplexity, attributed mainly to the PCA dimensionality reduction rather than to the symbolic approximation itself.
| Metric | Baseline (Qwen2.5-1.5B) | Symbolic Substitute |
| --- | --- | --- |
| Perplexity | 10.62 | 13.76 |
| Throughput (tokens/s) | 4878.82 | 5281.42 |
| Avg. Latency (ms) | 209.89 | 193.89 |
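The 8.3% figure follows directly from the throughput row of the table; a quick arithmetic check also recovers the corresponding latency reduction:

```python
# Relative changes implied by the reported benchmark numbers.
baseline_tps, symbolic_tps = 4878.82, 5281.42
baseline_ms, symbolic_ms = 209.89, 193.89

throughput_gain = (symbolic_tps - baseline_tps) / baseline_tps  # ~0.083
latency_drop = (baseline_ms - symbolic_ms) / baseline_ms        # ~0.076

print(f"throughput: +{throughput_gain:.1%}, latency: -{latency_drop:.1%}")
```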
GNNs and PINNs
SymTorch was also tested on its ability to recover known physical laws from the latent representations of scientific models.
- Graph Neural Networks (GNNs): The team trained a GNN on particle dynamics and used SymTorch to extract empirical force laws, such as gravity's 1/r² dependence and spring forces, directly from the edge messages.
- Physics-Informed Neural Networks (PINNs): The library extracted the analytic solution of a 1-D heat equation from a PINN. The inductive bias of the PINN enabled a mean squared error (MSE) of 7.40 × 10⁻⁶.
- LLM Arithmetic Analysis: Models like Llama-3.2-1B were probed via symbolic distillation on 3-digit addition and multiplication. The distilled equations showed that, while the models are correct in many cases, they rely on internal heuristics that can include numerical errors.
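To make the PINN result concrete: a standard analytic solution of the 1-D heat equation u_t = u_xx is u(x, t) = sin(πx)·e^(−π²t) on [0, 1] with zero boundary values. (The article does not state which initial condition the paper used, so this particular solution is an assumption; it is the kind of closed form a distillation would need to recover.) A finite-difference residual check confirms it satisfies the PDE:

```python
import math

# Candidate closed-form solution of u_t = u_xx (assumed initial condition sin(pi*x)).
def u(x, t):
    return math.sin(math.pi * x) * math.exp(-math.pi**2 * t)

x, t, h = 0.3, 0.1, 1e-3
u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)              # central difference in time
u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2  # central difference in space

print(abs(u_t - u_xx))  # residual near zero: the PDE holds up to discretization error
```

Recovering such a formula from a trained PINN, rather than writing it down by hand, is exactly what the distillation pipeline is for.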
Key Takeaways
- Automatic Symbolic Distillation: SymTorch is a library that automates replacing complex neural network components with interpretable closed-form equations, by wrapping modules and collecting their input-output behavior.
- Engineering Barrier Removal: The library solves critical engineering issues that have hindered symbolic regression adoption, such as GPU-to-CPU data transfer, input-output caching, and seamless switching between neural and symbolic forward passes.
- LLM Inference Acceleration: As a proof of concept, replacing MLP layers in a Transformer with symbolic surrogates increased throughput by 8.3%, though the improvement came at a cost in perplexity.
- Scientific Law Discovery: SymTorch successfully recovered physical laws from GNNs and the solution of the 1-D heat equation from a Physics-Informed Neural Network.
- Functional Interpretability of LLMs: Researchers can examine the implicit mathematical heuristics that LLMs use for tasks such as arithmetic, revealing where the internal logic diverges from the exact operation.
Take a look at the Paper, Repo, and Project Page.

