Polynomial Curve Fit

Fit a polynomial to (x, y) data via least squares. Reports coefficients, R² and residuals.

Overview

The Polynomial Curve Fit tool finds the polynomial of degree d that best matches a list of (x, y) data in the least-squares sense. It returns the coefficients, R-squared and the per-point residuals so you can see how well the fit captures your data.

It is useful for scientists fitting calibration curves, engineers approximating sensor non-linearity, students doing regression homework and analysts smoothing noisy time series. Degree 1 reduces to ordinary linear regression; higher degrees capture curvature.

How it works

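For data (x_i, y_i) and degree d, the tool builds the Vandermonde matrix X, whose columns are 1, x, x^2, ..., x^d evaluated at each x_i. The normal equations X^T X β = X^T y give the coefficient vector β. The tool solves this (d+1)-by-(d+1) system with a small Gaussian elimination. A unique solution needs at least d+1 distinct x values.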
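The steps above can be sketched in plain Python. This is a minimal illustration, not the tool's actual code: the function name fit_polynomial, the use of partial pivoting and the 1e-12 singularity threshold are all assumptions made for the example.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns [b0, b1, ..., bd] for the model b0 + b1*x + ... + bd*x^d.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = degree + 1
    if len(set(xs)) < n:
        raise ValueError("need at least degree + 1 distinct x values")

    # Vandermonde matrix X: one row per point, columns 1, x, x^2, ..., x^d.
    X = [[x ** j for j in range(n)] for x in xs]

    # Augmented system [X^T X | X^T y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n)]
        + [sum(row[i] * y for row, y in zip(X, ys))]
        for i in range(n)
    ]

    # Forward elimination with partial pivoting (an assumption; the
    # tool's solver may pivot differently or not at all).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:  # arbitrary illustrative threshold
            raise ValueError("normal equations are singular or nearly so")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]

    # Back substitution.
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (A[i][n] - tail) / A[i][i]
    return beta


# Points on y = x^2: the result is approximately [0, 0, 1].
coeffs = fit_polynomial([1, 2, 3, 4], [1, 4, 9, 16], 2)
print([round(c, 6) for c in coeffs])
```

Coefficients come back lowest power first, so the last entry is the x^d term.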

R-squared is computed as 1 - SS_res / SS_tot with SS_res = Σ(y_i - ŷ_i)^2 and SS_tot = Σ(y_i - ȳ)^2, where ŷ_i is the fitted value at x_i and ȳ is the mean of the y values. Raising the degree never lowers R-squared on the fitted data, but it risks overfitting: extra coefficients hug noise instead of signal.
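As a rough sketch of the R-squared formula, the snippet below evaluates a straight-line fit on a small made-up dataset. The data, the helper names evaluate and r_squared, and the hard-coded coefficients (the least-squares line for these four points) are illustrative assumptions, not output from the tool.

```python
def evaluate(coeffs, x):
    """Evaluate b0 + b1*x + ... + bd*x^d using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def r_squared(ys, predicted):
    """Return 1 - SS_res / SS_tot for observed ys and fitted values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all y values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical data; [1.3, 0.8] is its least-squares line y = 1.3 + 0.8 x.
xs = [0, 1, 2, 3]
ys = [1, 3, 2, 4]
coeffs = [1.3, 0.8]

predicted = [evaluate(coeffs, x) for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]  # per-point residuals
r2 = r_squared(ys, predicted)
print(round(r2, 3))  # 0.64
```

Here SS_res = 1.8 and SS_tot = 5, so R-squared = 1 - 1.8 / 5 = 0.64.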

Examples

Data: (1,1), (2,4), (3,9), (4,16), degree 2
   →  y ≈ 1*x^2 + 0*x + 0, R² = 1

Data: (0,1), (1,2.7), (2,7.4), (3,20.1), degree 2
   →  y ≈ 2.75 x^2 - 2.05 x + 1.25, R² ≈ 0.994

Linear data, degree 3
   →  quadratic and cubic terms ≈ 0, model collapses to linear

FAQ

How do I pick the degree?

Start low and only raise it if residuals show clear curvature. Beyond degree 4 or 5, fits overfit quickly on small datasets.

Why is high R² not always good?

A degree-n polynomial fits any n+1 points with distinct x values exactly, so R² = 1 whether or not the model is meaningful. Check residual patterns and consider cross-validation.
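A small hold-out check makes this concrete. The sketch below uses Lagrange interpolation as a stand-in for the tool's solver, which is valid here only because a degree-n least-squares fit through n+1 points with distinct x values coincides with the interpolating polynomial. The data (roughly y = x with noise) and the helper name lagrange_eval are made up for illustration.

```python
from fractions import Fraction


def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique polynomial of degree len(xs) - 1
    passing through all the points (xs[i], ys[i])."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


# Four noisy points near the line y = x (hypothetical data).
xs = [0, 1, 2, 3]
ys = [Fraction("0.1"), Fraction("0.9"), Fraction("2.2"), Fraction("2.8")]

# The cubic reproduces every training point, so SS_res = 0 and R² = 1.
fitted = [lagrange_eval(xs, ys, x) for x in xs]
print(fitted == ys)  # True

# A held-out fifth point near the same line exposes the problem.
held_out_x, held_out_y = 4, Fraction("4.0")
prediction = lagrange_eval(xs, ys, held_out_x)
print(float(prediction))  # 1.5
```

The cubic scores a perfect R² on its four training points yet predicts 1.5 where the held-out value is 4.0.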

Are higher-degree fits more accurate for prediction?

Not necessarily. They extrapolate badly outside the training range and amplify noise. Quadratic and cubic fits are usually safer.

Can I fix the intercept?

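Not in this tool: the constant term is always fitted along with the other coefficients. If you know the offset, you can subtract it from every y value before fitting, but the fitted intercept will then only be close to zero, not exactly zero.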

Does it warn about ill-conditioned data?

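For poorly conditioned Vandermonde matrices (very large x ranges combined with high degree) the solver can lose precision. Forming X^T X squares the condition number of X, which makes the problem worse. Scaling x to a smaller range, such as [-1, 1], mitigates this.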
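One way to rescale, sketched below, is to map x linearly onto [-1, 1] before fitting. The helper names and sample x values are hypothetical. The sketch compares the bottom-right entry of X^T X, which is Σ x^(2d), before and after scaling. The top-left entry is always the number of points.

```python
def scale_to_unit_interval(xs):
    """Map xs linearly onto [-1, 1].

    Returns the scaled values plus (center, half_range), which are
    needed to transform any new x the same way before prediction.
    """
    lo, hi = min(xs), max(xs)
    if lo == hi:
        raise ValueError("all x values are identical")
    center = (lo + hi) / 2
    half_range = (hi - lo) / 2
    return [(x - center) / half_range for x in xs], center, half_range


def highest_power_sum(xs, degree):
    """Bottom-right entry of X^T X: the sum of x^(2 * degree)."""
    return sum(x ** (2 * degree) for x in xs)


xs = [1000, 1002, 1004, 1006, 1008, 1010]  # hypothetical raw x values
scaled, center, half_range = scale_to_unit_interval(xs)

raw_entry = highest_power_sum(xs, 3)         # greater than 1e18
scaled_entry = highest_power_sum(scaled, 3)  # no larger than len(xs)
print(raw_entry, scaled_entry)
```

With the raw values, the entries of X^T X for a cubic range from 6 to more than 1e18, a wider spread than double-precision arithmetic can resolve. After scaling, every entry lies between -6 and 6. Coefficients fitted on scaled data apply to the scaled variable, so transform any new x with (x - center) / half_range before evaluating the polynomial.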
