cuplv/optimal-abstractions

KAN vs MLP Performance Comparison

This repository contains notebooks comparing Kolmogorov-Arnold Networks (KANs) and Multi-Layer Perceptrons (MLPs) on function approximation tasks taken from the KAN paper and on a prosthetic dataset.

Results Summary

The following table summarizes the Test MSE and R2 scores for both models across different experiments.

| Experiment | KAN Test MSE | KAN Test R² | MLP Test MSE | MLP Test R² |
|---|---|---|---|---|
| Bessel Function (funcbessel) | 0.000531 | 0.9936 | 0.000577 | 0.9931 |
| Exponential Function (funcexp) | 0.002797 | 0.9986 | 0.002493 | 0.9988 |
| Exponential Function 4 (funcexp4) | 0.000760 | 0.9977 | 0.001484 | 0.9955 |
| XY Function (funcxy) | 0.000111 | 0.9990 | 0.000089 | 0.9992 |
| Exponential Function 100D (funcexp100) | 0.000613 | 0.8236 | 0.000670 | 0.8072 |
| Function w/ Noise (funcnoise) | 0.024705 | 0.9081 | 0.028812 | 0.8928 |
| Associated Legendre (lpmv, m=1, v=3.0) (funclegendre) | 0.000156 | 0.9999 | 0.001701 | 0.9989 |
| Spherical Harmonic (sph_harm, m=1, n=1) (funcsphharm) | 1.83e-07 | 0.999994 | 2.66e-07 | 0.999991 |
| Incomplete Elliptic Integral (ellipeinc) (funcellipticint) | 1.02e-05 | 0.999932 | 2.85e-07 | 0.999998 |
| Incomplete Elliptic Integral (ellipkinc) (funcellipkinc) | 0.000240 | 0.999263 | 0.000215 | 0.999339 |
| Heat Equation PINN (pinnlearn) | 0.000000 | 1.0000 | 0.000002 | 1.0000 |
| Prosthetic Data (prosthetic) | 0.162120 | 0.7523 | 0.161195 | 0.7537 |
| Weather (weather) | 0.049989 | 0.9414 | 0.050451 | 0.9409 |
| ACOPF (IEEE 14-bus) (acopf_14_ieee) | 7.29e-11 | 1.0000 | 1.24e-09 | 1.0000 |
| ACOPF (IEEE 300-bus) (acopf_300_ieee) | 1.617020 | 0.999997 | 1.749968 | 0.999997 |
| PM2.5 Forecasting (pm25) | 0.230902 | 0.7657 | 0.185867 | 0.8114 |
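
For readers who want to reproduce the two metric columns, the following is a minimal sketch of test MSE and R² using only the Python standard library. The helper names `mse` and `r2_score` and the sample values are illustrative assumptions; the notebooks may well compute these with a library routine and may aggregate multi-output targets differently.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error over paired samples."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions, for illustration only.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.1, 0.9, 2.1, 2.9]
print(f"MSE = {mse(y_true, y_pred):.4f}, R2 = {r2_score(y_true, y_pred):.4f}")
```

Because R² normalizes the residual error by the variance of the targets, a comparatively large MSE can coexist with an R² near 1 when the targets have a large scale, as in the ACOPF (IEEE 300-bus) row.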

Classification accuracies

| Experiment | KAN Test Accuracy | MLP Test Accuracy |
|---|---|---|
| MNIST (mnist) | 93.33% | 92.53% |
| CIFAR-10 (cifar) | 45.81% | 51.34% |

Model Architectures

The following table details the network architectures used for each experiment, including the full layer structure (Input -> Hidden -> Output) and the total number of units (nodes) in the network.

| Experiment | KAN Architecture | MLP Architecture | KAN Total Units | MLP Total Units | % Units (KAN/MLP) |
|---|---|---|---|---|---|
| Bessel Function | [1, 1] | [1, 64, 32, 1] | 2 | 98 | 2.04% |
| Exponential Function | [2, 1, 1] | [2, 32, 32, 1] | 4 | 67 | 5.97% |
| Exponential Function 4 | [4, 4, 2, 1] | [4, 64, 64, 64, 64, 1] | 11 | 261 | 4.21% |
| XY Function | [2, 2, 1] | [2, 32, 1] | 5 | 35 | 14.29% |
| Function w/ Noise | [5, 12, 1] | [5, 64, 1] | 18 | 70 | 25.71% |
| Exponential Function 100D | [100, 1, 1] | [100, 1024, 1024, 1024, 1] | 102 | 3173 | 3.21% |
| Associated Legendre (lpmv) | [1, 5, 1] | [1, 64, 1] | 7 | 66 | 10.61% |
| Spherical Harmonic (sph_harm) | [2, 3, 2, 1] | [2, 500, 1] | 8 | 503 | 1.59% |
| Incomplete Elliptic Integral (ellipeinc) | [2, 2, 1, 1] | [2, 64, 1] | 6 | 67 | 8.96% |
| Incomplete Elliptic Integral (ellipkinc) | [2, 2, 1, 1] | [2, 64, 1] | 6 | 67 | 8.96% |
| Heat Equation PINN | [2, 5, 1] | [2, 64, 64, 1] | 8 | 131 | 6.11% |
| Prosthetic Data | [50, 8, 4, 5] | [50, 32, 5] | 67 | 87 | 77.01% |
| Weather | [168, 16, 6] | [168, 32, 6] | 190 | 206 | 92.23% |
| ACOPF (IEEE 14-bus) | [22, 32, 186] | [22, 64, 186] | 240 | 272 | 88.24% |
| ACOPF (IEEE 300-bus) | [402, 32, 3804] | [402, 64, 3804] | 4238 | 4270 | 99.25% |
| PM2.5 Forecasting | [612, 8, 8, 8, 3] | [612, 8, 3] | 639 | 623 | 102.57% |
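
The unit counts and percentages in the table can be reproduced from the layer lists alone. The sketch below assumes, as the table's numbers suggest, that "total units" is simply the sum of all layer widths, input and output layers included; the function names are hypothetical and not taken from the notebooks.

```python
def total_units(layers):
    """Total node count: the sum of all layer widths, input and output included."""
    return sum(layers)


def unit_ratio_percent(kan_layers, mlp_layers):
    """KAN node count as a percentage of the MLP node count."""
    return 100.0 * total_units(kan_layers) / total_units(mlp_layers)


# Spherical harmonic row of the table.
kan = [2, 3, 2, 1]
mlp = [2, 500, 1]
print(total_units(kan), total_units(mlp), f"{unit_ratio_percent(kan, mlp):.2f}%")
# prints: 8 503 1.59%
```

Note that this counts nodes, not trainable parameters, so the ratio should not be read as a parameter-count comparison.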

Notebooks

  • funcbessel_kan_mlp.ipynb: Bessel function regression ($y = J_0(20x_1)$, SciPy j0).
  • funcexp_kan_mlp.ipynb: Exponential function regression ($y = \exp(\sin(\pi x_1) + x_2^2)$).
  • funcexp4_kan_mlp.ipynb: Exponential function (4D) regression ($y = \exp(0.5(\sin(\pi(x_1^2 - x_2^2)) + \sin(\pi(x_3^2 + x_4^2))))$).
  • funcxy_kan_mlp.ipynb: Multiplicative interaction regression ($y = x_1 x_2$).
  • funcexp100_kan_mlp.ipynb: High-dimensional (100D) regression ($y = \exp(\frac{1}{100}\sum_{i=1}^{100}\sin^2(\frac{\pi x_i}{2}))$).
  • funcnoise_kan_mlp.ipynb: Noisy function regression ($y = 0.1x_1x_2 + 0.5\sin(x_3x_4) + \sin(x_5) + \mu$, with $\mu \sim \mathcal{N}(0,\sigma^2)$ and $\sigma=0.05$).
  • funclegendre_kan_mlp.ipynb: Associated Legendre regression ($y = P_v^m(x_1)$) with fixed $m=1$ and fixed $v=3.0$.
  • funcsphharm_kan_mlp.ipynb: Spherical harmonic regression ($y = \Re\{Y_n^m(x_1,x_2)\}$) with fixed $m=1$ and $n=1$.
  • funcellipticint_kan_mlp.ipynb: Incomplete elliptic integral of the second kind ($y = E(x_1 \mid x_2)$, SciPy ellipeinc(x_1, x_2)).
  • funcellipkinc_kan_mlp.ipynb: Incomplete elliptic integral of the first kind ($y = K(x_1 \mid x_2)$, SciPy ellipkinc(x_1, x_2)).
  • pinnlearn_kan_mlp.ipynb: Heat equation PINN benchmark (physics-informed loss; KAN vs MLP).
  • prosthetic_kan_mlp.ipynb: Prediction task using prosthetic knee abduction data.
  • weather_kan_mlp.ipynb: Weather prediction task.
  • acopf_kan_mlp.ipynb: Optimal power flow (OPF) prediction from ML4ACOPF benchmark (IEEE 14-bus system).
  • pm25_kan_mlp.ipynb: PM2.5 air quality forecasting with temporal history.
  • acopf_300_ieee_kan_mlp.ipynb: Optimal power flow (OPF) prediction from ML4ACOPF benchmark (IEEE 300-bus system).
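
Several of the closed-form targets listed above need nothing beyond the standard library, so they can be sketched directly. The code below is an illustrative sketch, not the notebooks' data pipeline: the function names, the uniform sampling on $[-1,1]^d$, and the seed are assumptions, and the special-function targets (Bessel, Legendre, spherical harmonic, elliptic integrals) are omitted because they rely on SciPy.

```python
import math
import random


def func_exp(x1, x2):
    """y = exp(sin(pi * x1) + x2^2)."""
    return math.exp(math.sin(math.pi * x1) + x2 ** 2)


def func_exp4(x1, x2, x3, x4):
    """y = exp(0.5 * (sin(pi (x1^2 - x2^2)) + sin(pi (x3^2 + x4^2))))."""
    inner = math.sin(math.pi * (x1 ** 2 - x2 ** 2)) + math.sin(math.pi * (x3 ** 2 + x4 ** 2))
    return math.exp(0.5 * inner)


def func_exp100(x):
    """y = exp(mean_i sin^2(pi * x_i / 2)); the README uses len(x) == 100."""
    return math.exp(sum(math.sin(math.pi * xi / 2) ** 2 for xi in x) / len(x))


def func_noise(x1, x2, x3, x4, x5, rng, sigma=0.05):
    """y = 0.1 x1 x2 + 0.5 sin(x3 x4) + sin(x5) + Gaussian noise N(0, sigma^2)."""
    clean = 0.1 * x1 * x2 + 0.5 * math.sin(x3 * x4) + math.sin(x5)
    return clean + rng.gauss(0.0, sigma)


def sample_inputs(n, dim, rng):
    """Assumed sampling scheme: n points drawn uniformly from [-1, 1]^dim."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)  # arbitrary seed, for reproducibility of the sketch
X = sample_inputs(5, 2, rng)
y = [func_exp(*x) for x in X]
print(len(X), len(y))
```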

Target Functions (Mathematical Formulations)

Below are the explicit regression targets used in the function-approximation notebooks. We use $x \in \mathbb{R}^d$ with coordinates $x_1,\dots,x_d$.

| Notebook | Input | Target $y = f(x)$ |
|---|---|---|
| funcbessel_kan_mlp.ipynb | $x_1 \in [-1,1]$ | $y = J_0(20x_1)$, where $J_0$ is the Bessel function of the first kind (order 0). |
| funcexp_kan_mlp.ipynb | $(x_1,x_2) \in [-1,1]^2$ | $y = \exp(\sin(\pi x_1) + x_2^2)$. |
| funcexp4_kan_mlp.ipynb | $(x_1,x_2,x_3,x_4) \in [-1,1]^4$ | $y = \exp\Big(\tfrac{1}{2}\big(\sin(\pi(x_1^2-x_2^2)) + \sin(\pi(x_3^2+x_4^2))\big)\Big)$. |
| funcxy_kan_mlp.ipynb | $(x_1,x_2) \in [-1,1]^2$ | $y = x_1x_2$. |
| funcexp100_kan_mlp.ipynb | $x \in [-1,1]^{100}$ | $y = \exp\Big(\frac{1}{100}\sum_{i=1}^{100}\sin^2(\tfrac{\pi x_i}{2})\Big)$. |
| funcnoise_kan_mlp.ipynb | $(x_1,\dots,x_5) \in [-1,1]^5$ | $y = 0.1x_1x_2 + 0.5\sin(x_3x_4) + \sin(x_5) + \mu$, where $\mu \sim \mathcal{N}(0,\sigma^2)$ and $\sigma=0.05$. |
| funclegendre_kan_mlp.ipynb | $x_1 \in [-1,1]$ | $y = P_{\nu}^{m}(x_1)$ with fixed $m=1$ and $\nu=3.0$ (SciPy lpmv(m, v, x)). For integer $m\ge 0$, one standard definition is $P_{\nu}^{m}(x) = (1-x^2)^{m/2}\,\frac{d^m}{dx^m}P_{\nu}(x)$, where $P_{\nu}$ is the (degree-$\nu$) Legendre function of the first kind. SciPy's lpmv additionally includes the Condon-Shortley phase factor $(-1)^m$. |
| funcsphharm_kan_mlp.ipynb | $(x_1,x_2) \in [0,2\pi)\times[0,\pi]$ | $y = \Re\{Y_{n}^{m}(x_1,x_2)\}$ with fixed $m=1$ and $n=1$ (SciPy sph_harm(m, n, theta, phi)). Using $\theta=x_1$ (azimuth) and $\phi=x_2$ (polar), $Y_{n}^{m}(\theta,\phi)=\sqrt{\frac{2n+1}{4\pi}\frac{(n-m)!}{(n+m)!}}\,P_{n}^{m}(\cos\phi)\,e^{im\theta}$. |
| funcellipticint_kan_mlp.ipynb | $(x_1,x_2) \in [0,\phi_{\max}]\times[0,m_{\max}]$ | $y = E(x_1 \mid x_2)$, the incomplete elliptic integral of the second kind (SciPy ellipeinc(x_1, x_2)). |
| funcellipkinc_kan_mlp.ipynb | $(x_1,x_2) \in [0,\phi_{\max}]\times[0,m_{\max}]$ | $y = K(x_1 \mid x_2)$, the incomplete elliptic integral of the first kind (SciPy ellipkinc(x_1, x_2)). |
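
As a worked instance of the fixed-index special-function targets, the closed forms below follow from the Legendre and spherical harmonic definitions in the table for $m=1$, $\nu=3$ and for $m=n=1$. They assume the Condon-Shortley phase $(-1)^m$, which is our reading of SciPy's convention rather than something the notebooks state; without that phase, the signs of $P_3^1$ and $\Re\{Y_1^1\}$ flip. The integral forms for the two elliptic targets are likewise SciPy's documented parameterization, with amplitude $\phi = x_1$ and parameter $m = x_2$.

$$
\begin{aligned}
P_3(x) &= \tfrac{1}{2}\left(5x^3 - 3x\right), \\
P_3^1(x) &= -(1-x^2)^{1/2}\,\frac{d}{dx}P_3(x) = -\tfrac{3}{2}\left(5x^2 - 1\right)\sqrt{1-x^2}, \\
P_1^1(\cos\phi) &= -\sin\phi \quad (0 \le \phi \le \pi), \\
\Re\{Y_1^1(\theta,\phi)\} &= -\sqrt{\tfrac{3}{8\pi}}\,\sin\phi\,\cos\theta, \\
E(\phi \mid m) &= \int_0^{\phi} \sqrt{1 - m\sin^2 t}\;dt, \\
K(\phi \mid m) &= \int_0^{\phi} \frac{dt}{\sqrt{1 - m\sin^2 t}}.
\end{aligned}
$$

The normalization constant in the fourth line comes from $\frac{2n+1}{4\pi}\frac{(n-m)!}{(n+m)!} = \frac{3}{4\pi}\cdot\frac{0!}{2!} = \frac{3}{8\pi}$.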

Image Classification Notebooks

| Experiment | KAN Architecture | MLP Architecture | KAN Total Units | MLP Total Units | % Units (KAN/MLP) |
|---|---|---|---|---|---|
| MNIST | [784, 64, 10] | [784, 16, 10] | 858 | 810 | 105.93% |
| CIFAR-10 | [3072, 256, 10] | [3072, 256, 256, 10] | 3338 | 3594 | 92.88% |

  • mnist_kan_mlp.ipynb: MNIST digit classification.
  • cifar_kan_mlp.ipynb: CIFAR-10 image classification.

Verification results

The notebook in the verification folder runs the KAN verification tasks; the results are shown below.

[Four figures showing the verification results]
