Using Mathematica to Fit Data to a Straight Line

Copyright © 1995, 1997, 2000, 2003 by James F. Hurley, Department of Mathematics, University of Connecticut, Storrs, CT 06269-3009. All rights reserved


Exercise 51 in Section 15.7 of Stewart's Multivariable Calculus, 4th Edition, derives the formula for the line y = mx + b that best fits a set of data points {(x_i, y_i), i = 1, 2, ..., n}, in the sense of minimizing the sum of the squares of the errors. As experience with graphing calculators would lead you to expect, Mathematica has a built-in command for calculating this line:
                    
     
Fit[ { {x_1, y_1}, ..., {x_n, y_n} }, {1, x}, x ]
                                   
It carries out the calculations in the formulas in Exercise 51. Once that information is available, it is easy to plot the data points and the calculated line of best fit. To illustrate, consider the following collection of data points.
   
                    
x     0      1      2      3      4      5

y     3    4.1    4.4    4.3    5.1    5.2
                    
The Mathematica command ListPlot makes a simple scatter plot of the data points. Try it by executing the following command.

In[1]:=

scatter = ListPlot[ {{0, 3}, {1, 4.1}, {2, 4.4},
                     {3, 4.3}, {4, 5.1}, {5, 5.2}},
                    AxesLabel -> {x, y},
                    PlotStyle -> {PointSize[0.02],
                                  RGBColor[1,0,0]} ]

[Graphics:HTMLFiles/LineFit_1.gif]

     The Fit command calculates the straight line that best fits the data points. Execute the following routine to see the result of the calculations from Exercise 51. (Note: Mathematica is rather brief in reporting the result of its calculation of the line of best fit. It gives only mx + b, where m and b are calculated from the formulas in Exercise 51. The Print command below produces a more readable report of the output from that calculation. The semicolon at the end of the last line of code suppresses the separate printing of the bare expression for y in terms of x.)

In[2]:=

Print[ "The line of best fit is: y = ",
       Fit[{{0, 3}, {1, 4.1}, {2, 4.4}, {3, 4.3},
            {4, 5.1}, {5, 5.2}}, {1, x}, x] ]
Fit[{{0, 3}, {1, 4.1}, {2, 4.4}, {3, 4.3},
     {4, 5.1}, {5, 5.2}}, {1, x}, x];

The line of best fit is: y = 3.35714 + 0.397143 x
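     As a cross-check on the coefficients that Fit reports, the slope and intercept can also be computed directly from the standard closed-form solution of the least-squares normal equations, m = (n Σ x_i y_i - Σ x_i Σ y_i) / (n Σ x_i^2 - (Σ x_i)^2) and b = (Σ y_i - m Σ x_i) / n. The sketch below assumes these agree with the formulas derived in the exercise; the names data, xs, ys, n, m, and b are chosen here for illustration and are not part of the original notebook.

```mathematica
(* Data points used in the Fit command above *)
data = {{0, 3}, {1, 4.1}, {2, 4.4}, {3, 4.3}, {4, 5.1}, {5, 5.2}};

(* Illustrative helper names (assumptions, not from the notebook) *)
xs = data[[All, 1]];   (* list of x-coordinates *)
ys = data[[All, 2]];   (* list of y-coordinates *)
n  = Length[data];     (* number of data points *)

(* Closed-form least-squares slope and intercept *)
m = (n Total[xs ys] - Total[xs] Total[ys]) / (n Total[xs^2] - Total[xs]^2);
b = (Total[ys] - m Total[xs]) / n;

(* Last result is the line itself, so % still refers to the line of best fit *)
b + m x
```

     The final expression should agree, up to rounding, with the line that Fit reported. It is deliberately left as the last evaluated result, so that a later command that plots % still receives the line of best fit.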

     To plot that line, simply tell Mathematica to plot its last calculated object (written %), which in this case is the line of best fit from the preceding routine. The PlotStyle setting RGBColor[0,0,1] specifies blue as the line's color.

In[4]:=

fitted = Plot[ %, {x, 0, 6},
               PlotStyle -> {RGBColor[0,0,1]},
               AxesLabel -> {x, y} ]

[Graphics:HTMLFiles/LineFit_2.gif]

     Finally, to plot both the points and the line of best fit on the same set of axes, ask Mathematica to show the last two plots together. Try it!

In[5]:=

Show[scatter, fitted]

[Graphics:HTMLFiles/LineFit_3.gif]

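     The steps above retype the list of data points several times and rely on % to pass the fitted line to Plot. A minimal sketch of the same workflow with named intermediate results follows. The names data and line are hypothetical choices made here, not part of the original notebook, and the plot options simply repeat those used above.

```mathematica
(* Enter the data once and reuse it (data and line are illustrative names) *)
data = {{0, 3}, {1, 4.1}, {2, 4.4}, {3, 4.3}, {4, 5.1}, {5, 5.2}};

(* Keep the fitted line under a name instead of relying on % *)
line = Fit[data, {1, x}, x];

(* Red points and blue line on one set of axes *)
Show[
  ListPlot[data, PlotStyle -> {PointSize[0.02], RGBColor[1, 0, 0]}],
  Plot[line, {x, 0, 6}, PlotStyle -> {RGBColor[0, 0, 1]}],
  AxesLabel -> {x, y}
]
```

     Naming the data means each point is typed only once, and naming the line means the plot no longer depends on the order in which the commands were executed.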


Converted by Mathematica  (June 11, 2003)