Polynomial Curve Fit
Fit a polynomial to (x, y) data via least squares. Reports coefficients, R² and residuals.
Overview
The Polynomial Curve Fit tool finds the degree-d polynomial that best fits a list of (x, y) points in the least-squares sense. It returns the coefficients, R-squared, and the per-point residuals so you can judge how well the fit captures your data.
It is useful for scientists fitting calibration curves, engineers approximating sensor non-linearity, students doing regression homework and analysts smoothing noisy time series. Degree 1 reduces to ordinary linear regression; higher degrees catch curvature.
How it works
For data (x_i, y_i) and degree d, build the Vandermonde matrix X with columns 1, x, x^2, ..., x^d. The least-squares coefficient vector β solves the normal equations X^T X β = X^T y, a (d+1)-by-(d+1) linear system that the tool solves by Gaussian elimination. A unique solution requires at least d+1 distinct x values.
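The steps above can be sketched in plain Python. This is a minimal illustration, not the tool's actual source: the function name fit_polynomial, the partial pivoting, and the 1e-12 singularity threshold are assumptions made for the example. Coefficients are returned in ascending order (constant term first).

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns [b0, b1, ..., bd] for y = b0 + b1*x + ... + bd*x^d.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    m = degree + 1
    if len(xs) < m:
        raise ValueError("need at least degree + 1 points")

    # Entry (j, k) of X^T X is sum(x^(j+k)), so power sums are enough.
    power_sums = [sum(x ** p for x in xs) for p in range(2 * degree + 1)]
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [power_sums[j + k] for k in range(m)]
        + [sum((x ** j) * y for x, y in zip(xs, ys))]
        for j in range(m)
    ]

    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # illustrative threshold
            raise ValueError("singular system; too few distinct x values?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]

    # Back substitution.
    beta = [0.0] * m
    for row in range(m - 1, -1, -1):
        tail = sum(aug[row][k] * beta[k] for k in range(row + 1, m))
        beta[row] = (aug[row][m] - tail) / aug[row][row]
    return beta


# Points on y = x^2: the result should be close to [0, 0, 1].
coeffs = fit_polynomial([1, 2, 3, 4], [1, 4, 9, 16], 2)
print(coeffs)
```

Building X^T X from power sums avoids storing X itself, which is convenient for small degrees but does nothing to improve conditioning.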
R-squared is computed as 1 - SS_res / SS_tot with SS_res = Σ(y_i - ŷ_i)^2 and SS_tot = Σ(y_i - ȳ)^2, where ŷ_i is the fitted value at x_i and ȳ is the mean of the y values. Raising the degree never lowers R-squared on the fitted data, but it risks overfitting: extra coefficients hug noise instead of signal.
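As a rough sketch of that formula, the snippet below computes R-squared and the residuals for a given coefficient list. The names evaluate and r_squared are hypothetical, and coefficients are assumed to be in ascending order (constant term first). It compares y = x^2 with the least-squares line y = 5x - 5 on the same four points.

```python
def evaluate(coeffs, x):
    """Evaluate b0 + b1*x + ... + bd*x^d using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def r_squared(xs, ys, coeffs):
    """Return (R^2, residuals) for a polynomial with ascending coefficients."""
    residuals = [y - evaluate(coeffs, x) for x, y in zip(xs, ys)]
    mean_y = sum(ys) / len(ys)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all y values are equal")
    return 1 - ss_res / ss_tot, residuals


xs, ys = [1, 2, 3, 4], [1, 4, 9, 16]
r2_quadratic, _ = r_squared(xs, ys, [0, 0, 1])         # y = x^2
r2_line, line_residuals = r_squared(xs, ys, [-5, 5])   # y = 5x - 5
print(r2_quadratic, r2_line, line_residuals)
```

The line leaves residuals of 1, -1, -1, 1 (SS_res = 4 against SS_tot = 129), so its R-squared is about 0.969. The residuals bow upward at both ends, which is the kind of pattern that suggests a missing quadratic term.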
Examples
Data: (1,1), (2,4), (3,9), (4,16), degree 2
→ y ≈ 1*x^2 + 0*x + 0, R² = 1
Data: (0,1), (1,2.7), (2,7.4), (3,20.1), degree 2
→ y ≈ 2.75 x^2 - 2.05 x + 1.25, R² ≈ 0.994
Linear data, degree 3
→ quadratic and cubic terms ≈ 0, model collapses to linear
FAQ
How do I pick the degree?
Start low and raise the degree only if the residuals show clear curvature. Beyond degree 4 or 5, the fit tends to overfit small datasets.
Why is high R² not always good?
A degree-n polynomial fits any n+1 points with distinct x values exactly, so R² = 1 regardless of whether the model is meaningful. Check residual patterns and consider cross-validation.
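To illustrate the point, the sketch below passes a cubic through four noisy samples of a straight-line trend. It uses Lagrange interpolation rather than the tool's least-squares solver; with n+1 points and degree n the two give the same polynomial. The data and the function name lagrange_interpolate are made up for the example.

```python
def lagrange_interpolate(points, x):
    """Evaluate the unique degree-n polynomial through n+1 points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical noisy samples of the trend y = 2x.
points = [(0, 0.1), (1, 1.9), (2, 4.2), (3, 5.8)]

# The cubic reproduces every training point, so SS_res = 0 and R^2 = 1.
training_errors = [abs(lagrange_interpolate(points, x) - y) for x, y in points]

# Outside the data range it departs sharply from the trend value 2 * 6 = 12.
extrapolated = lagrange_interpolate(points, 6)
print(max(training_errors), extrapolated)
```

Here the cubic evaluates to about -5.6 at x = 6, far from the trend value of 12, even though it matches all four training points exactly.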
Are higher-degree fits more accurate for prediction?
Not necessarily. They extrapolate badly outside the training range and amplify noise. Quadratic and cubic fits are usually safer.
Can I fix the intercept?
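Not in this tool; the constant term is always estimated from the data. If you know the intercept should equal some value c, you can subtract c from every y before fitting and check that the fitted constant term comes out close to 0, but this does not constrain it to be exactly 0.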
Does it warn about ill-conditioned data?
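For poorly conditioned Vandermonde matrices (very large x ranges combined with high degree) the solver can lose precision, and forming X^T X squares the condition number of X, which makes the problem worse. To mitigate, shift and scale x to a smaller range such as [-1, 1] before fitting; the resulting coefficients then apply to the scaled variable.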
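A minimal sketch of such rescaling follows. The helper names scale_x and gram_diagonal_spread are hypothetical, and the "spread" (ratio of the largest to the smallest diagonal entry of X^T X) is only a crude proxy for conditioning, not the true condition number.

```python
def scale_x(xs):
    """Map xs linearly onto [-1, 1]; also return the centre and half-range."""
    lo, hi = min(xs), max(xs)
    center = (hi + lo) / 2
    half_range = (hi - lo) / 2
    if half_range == 0:
        raise ValueError("all x values are identical")
    return [(x - center) / half_range for x in xs], center, half_range


def gram_diagonal_spread(xs, degree):
    """Ratio of largest to smallest diagonal entry of X^T X (rough proxy)."""
    diagonal = [sum(x ** (2 * k) for x in xs) for k in range(degree + 1)]
    return max(diagonal) / min(diagonal)


raw = [1000, 1250, 1500, 1750, 2000]
scaled, center, half_range = scale_x(raw)

print(scaled)                           # values now lie in [-1, 1]
print(gram_diagonal_spread(raw, 4))     # huge: entries range up to sum(x^8)
print(gram_diagonal_spread(scaled, 4))  # small: all entries are of order 1
```

To predict at a new raw x after fitting on the scaled values, apply the same transform first: evaluate the polynomial at (x - center) / half_range.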