Final Report
2004 Assessment Grant
Dean De Cock
LIST OF TABLES
1.1 Placement Scores
1.2 ACT Score
1.3 High School GPA
CHAPTER 2 INTRODUCTORY COURSES
2.1 Math 156 (College Algebra)
2.2 Math 157 (Trigonometry)
2.3 Math 186 (Elementary Functions)
2.4 Math 194 (LAS Calculus)
2.5 Math 198 (Calculus I)
2.6 Introductory Class Summary
CHAPTER 3 UPPER LEVEL COURSES
3.1 Math 264 (Calculus II)
3.2 Advanced Placement (AP)
3.3 Data Collection
3.4 Policy Change
APPENDICES
Table 1 – Correlation matrix of grade and explanatory variables for Math 156
Table 2 – Predicted versus actual grade in Math 156 utilizing the regression model
Table 3 – Predicted grade distribution in M156 using ordinal logistic regression
Table 4 – Correlation matrix of grade and explanatory variables for Math 157
Table 5 – Predicted versus actual grade in Math 157 utilizing the regression model
Table 6 – Predicted grade distribution in M157 using ordinal logistic regression
Table 7 – Correlation matrix of grade and explanatory variables for Math 186
Table 8 – Predicted versus actual grade in Math 186 utilizing the regression model
Table 9 – Predicted grade distribution in M186 using ordinal logistic regression
Table 10 – Correlation matrix of grade and explanatory variables for Math 194
Table 11
Table 12 – Predicted grade distribution in M194 using ordinal logistic regression
Table 13 – Correlation matrix of grade and explanatory variables for Math 198
Table 14 – Predicted versus actual grade in Math 198 utilizing the regression model
Table 15 – Predicted grade distribution in M198 using ordinal logistic regression
Table 16 – Number of students receiving various grades for Calc II based on AP score
Table 17 – Percentage of students receiving various grades in Calc II by AP exam score

Figure 1 – Predicted grades for students in Math 156 classified by grade received
Figure 2 – Predicted grades for students in Math 157 classified by grade received
Figure 3 – Predicted grades for students in Math 186 classified by grade received
Figure 4 – Predicted grades for students in Math 194 classified by grade received
Figure 5 – Predicted grades for students in Math 198 classified by grade received
Figure 6 – Predicted grades in the introductory courses for a high ability student
Figure 7 – Predicted grades in the introductory courses for a low ability student
Figure 8 – Percentage of students receiving various grades in Calc II by AP exam score
The purpose of this project was to analyze and improve the system used for placing students into their first mathematics class at Truman. The first part of this report discusses placement into the most common introductory classes (MATH156, MATH186, and MATH198), while the second part focuses on the special issues of MATH263.
The original goal of this study was to create a regression equation using ACT score, math placement test scores, and high school math GPA that would predict a student’s success in various math courses. This equation could then be used by the placement officer to assist in placing students into the correct math class at Truman. As data collection for this project began, it became evident that the various sources of information had different levels of availability.
1.1 Placement Scores
The mathematics placement exam is the primary method used by the department in placing students into their first math class. The test consists of two separate exams, one focusing on basic algebra and the other on trigonometry. The self-administered exam is mailed to all incoming freshmen and is taken during the spring semester of their senior year in high school. As this is the primary method of placement, these data are available for almost all students who have attended Truman.
1.2 ACT Score
The ACT scores were easily obtained for almost all individuals who have attended Truman over the last ten years, as these scores are currently used for math placement and are part of a student’s university record. Rather than using the general ACT composite score, this analysis utilizes the math composite score (ACTM), which is composed of three sub-scores covering Algebra, Geometry and Trigonometry. Although Truman does not currently use the specific math sub-scores for placement, the data were readily available for most students, so these variables were added to the pool of potential explanatory variables.
1.3 High School GPA
Obtaining the GPA for high school mathematics classes was much more problematic than obtaining the other variables. The current placement system does incorporate the classes taken and grades received in high school, but only through a review of the transcript at placement time. No record of the student’s math GPA or coursework is added to the university record. This information could be obtained by manually reviewing all student application folders, but this method would be prohibitively time-consuming. Fortuitously, a second method of evaluating high school performance was found. On the ACT application, students are asked to report their high school math GPA. Although this self-reported data is not as reliable as transcripts, its use reduces the time of data collection dramatically.
Initial inquiries with IT indicated that the high school GPA field did exist on our ACT tapes but that the information was being ignored during downloads. Though many historical tapes had been eliminated, IT was able to obtain information on three years of incoming freshmen (2000-2002). As a result, approximately 2100 students had a full record of the explanatory variables listed below available for this analysis.
ACTM – Composite math score based on three sub-scores
Sub 1 – Pre-Algebra/Elementary Algebra
Sub 2 – Algebra/Coordinate Geometry
Sub 3 – Plane Geometry/Trigonometry
PT1 – Truman-administered exam evaluating algebra skills
PT2 – Truman-administered exam evaluating trigonometric skills
SMG – Self-reported high school math GPA
NGRADE – Grade received in the first math class taken at Truman
CHAPTER 2 INTRODUCTORY COURSES
2.1 Math 156 (College Algebra)
The investigation began with a simple correlation analysis to see whether any of the possible explanatory variables are linearly related to the grade received in the class (or to each other). The strongest linear association with the M156 grade is the self-reported high school math GPA (0.334). This is not unusual, as it would be expected that students who excelled in high school would also excel in college. It is also evident from Table 1 that PT1 and ACTM have some association with grade (0.321 and 0.261), while PT2 shows a minimal relationship to M156 performance (0.150).
| | Grade | ACTM | Sub 1 | Sub 2 | Sub 3 | PT1 | PT2 |
|---|---|---|---|---|---|---|---|
| ACTM | 0.261 | | | | | | |
| Sub 1 | 0.207 | 0.708 | | | | | |
| Sub 2 | 0.220 | 0.687 | 0.525 | | | | |
| Sub 3 | 0.208 | 0.721 | 0.499 | 0.501 | | | |
| PT1 | 0.321 | 0.372 | 0.307 | 0.390 | 0.294 | | |
| PT2 | 0.150 | 0.140 | 0.059 | 0.123 | 0.148 | 0.411 | |
| SMG | 0.334 | 0.195 | 0.154 | 0.123 | 0.157 | 0.209 | 0.103 |
Table 1 – Correlation matrix of grade and explanatory variables for Math 156
Comparing the correlations among the explanatory variables, we see a relationship between ACTM and PT1 (0.372), which would be expected as they measure similar characteristics via different tests. The scores on the two placement exams are also correlated (0.411).
As many of the explanatory variables measure similar qualities and are at least partially correlated, a stepwise regression was conducted to determine which were the strongest predictors of performance in M156. The variables chosen, in order of selection, were SMG, PT1, ACTM and PT2. As the p-value for PT2 was substantially greater than those of the other three, it was eliminated from the model. A regression on the remaining variables yields the following equation.
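[Equation image missing from the source: regression equation predicting the M156 grade from SMG, PT1, and ACTM]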
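The report does not say which software or entry criterion was used for the stepwise procedure. As a rough illustration only, the sketch below implements plain forward selection (the entry step of a stepwise search, with no removal step) using an F-to-enter threshold. The data are synthetic, the predictor names `x0` to `x3` are placeholders, and the threshold of 10 is an arbitrary assumption; none of these come from the study.

```python
import numpy as np

def residual_ss(design, y):
    """Residual sum of squares from a least-squares fit of y on design."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return float(residuals @ residuals)

def forward_select(X, y, names, f_to_enter=10.0):
    """Add one predictor at a time while the best candidate's partial F exceeds f_to_enter."""
    n = len(y)
    design = np.ones((n, 1))              # start from the intercept-only model
    rss_current = residual_ss(design, y)
    remaining = list(range(X.shape[1]))
    selected = []
    while remaining:
        best = None
        for j in remaining:
            candidate = np.column_stack([design, X[:, j]])
            rss = residual_ss(candidate, y)
            # Partial F statistic for adding predictor j to the current model.
            f_stat = (rss_current - rss) / (rss / (n - candidate.shape[1]))
            if best is None or f_stat > best[0]:
                best = (f_stat, j, rss, candidate)
        if best[0] < f_to_enter:
            break                          # no remaining predictor is worth adding
        _, j, rss_current, design = best
        selected.append(names[j])
        remaining.remove(j)
    return selected

# SYNTHETIC data (not the Truman data): y depends strongly on x0, weakly on x1,
# and not at all on x2 or x3.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 4))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
order = forward_select(X, y, ["x0", "x1", "x2", "x3"])
```

The order in which predictors enter plays the same role as the "order of selection" quoted above: the variable that explains the most remaining variation enters first.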
To illustrate the use of the equation, two students were selected from the data set (one of high and one of low mathematical ability); their predicted and actual grades are listed in the table below. The predicted grade for the high-ability student (3.40) can be thought of as the average grade for a large number of students at that ability level. The particular student from the data set achieved an A in M156. The predicted grade for the low-ability student (1.21) indicates that the average student with these abilities will tend to receive a D, while the actual student dropped the class.
| Ability | ACTM | PT1 | SMG | Predicted | Actual |
|---|---|---|---|---|---|
| High | 29 | 23 | 4.0 | 3.40 | A |
| Low | 17 | 7 | 3.3 | 1.21 | W |
Table 2 – Predicted versus actual grade in Math 156 utilizing the regression model
Although all of the coefficients in the model are highly significant, the model as a whole explains only 20% of the variation in the grade received by the student. Attempts to improve the model via transformations and higher-order terms did not improve the fit.
A plot of the grade received in the class against the predicted performance allows a better evaluation of the predictive power of the model. There is a definite trend in the data, with the distribution of predicted grades shifting lower as the actual grade decreases from A to D. It is also apparent that the predicted grades for students who withdrew or received F’s are more spread out than those for the other grades. The smaller number of W’s and F’s could explain the lack of mounding in these distributions, but there still appears to be excessive variation in predicted grades for these groups. Intuitively this might be expected for the students who choose to withdraw from M156, as some withdraw because they are struggling with the material (low predicted grade) and others because the material is too easy for them (high predicted grade). More unsettling is the distribution of predicted grades for students who receive F’s in M156. From the graph we can see predictions ranging from 0 to slightly above 3.
Figure 1 – Predicted grades for students in Math 156 classified by grade received
This extreme variation is indicative of the difficulties in trying to predict a student’s success in a single math class based on their math abilities. While math ability (as measured by ACT, high school GPA, and placement tests) is related to success in college mathematics, a large number of intangible factors also lead to success or failure. This simple model does not incorporate such things as study habits, student attitude, emotional adjustments, extra-curricular activities, and various other factors, which are much more difficult to quantify and would likely need to be collected after arrival at Truman.
The relatively poor R-squared of the model seems to contradict the graph, which indicates that there is valuable information in the explanatory variables, with higher predicted grades tending to fall into the higher grade categories. The poor model fit results from the fact that students within the data set who have almost identical mathematical skills (as measured by our explanatory variables) manage to achieve all the possible grades. This can be seen easily by reviewing students predicted to receive a B (grade = 3). The majority of these students receive an A or B, but some manage to receive a C, D, or even an F. The lower grades are most likely due to the extraneous variables mentioned above.
The phenomenon of achieving all possible grades for the same ability leads to another possible statistical analysis of the data. If grades are viewed as categorical classes (rather than numeric values), with A being better than B, B being better than C, etc., then an Ordinal Logistic Regression can be performed. This type of analysis yields a set of recursive equations that can be used to find the probability of falling into each of the ordinal classes. Using the same explanatory variables that were found significant in the stepwise regression results in the following probability equations.
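[Equation images missing from the source: ordinal logistic regression equations giving the probability of each M156 grade class (WF, D, C, B, A) from SMG, PT1, and ACTM]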
These equations are unwieldy for hand calculations but can easily be set up in an Excel spreadsheet.
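The same kind of calculation can also be scripted. The sketch below assumes the common proportional-odds (cumulative logit) form of ordinal logistic regression, in which each class probability is the difference between two consecutive cumulative probabilities; the report's fitted coefficients are not reproduced here, so the cutpoints and linear scores are hypothetical values chosen only to show the mechanics.

```python
import math

GRADES = ["WF", "D", "C", "B", "A"]  # ordered from worst to best

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def grade_probabilities(score, cutpoints):
    """Return P(grade) for each class, given a linear score and increasing cutpoints.

    Cumulative probabilities are P(grade <= k) = logistic(cutpoints[k] - score);
    each class probability is the difference of consecutive cumulative values.
    """
    cumulative = [logistic(c - score) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return dict(zip(GRADES, probs))

# HYPOTHETICAL cutpoints and scores (not the fitted M156 model).
# In a fitted model, the score would be a weighted sum of SMG, PT1, and ACTM.
cutpoints = [-1.0, 0.0, 1.0, 2.5]
weak = grade_probabilities(-0.5, cutpoints)
strong = grade_probabilities(3.0, cutpoints)
```

A higher linear score shifts probability toward the better grades while still leaving some probability on every class, which matches the behavior described for the two example students.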
To illustrate the use of the equations, the results for the two students selected earlier for the regression analysis are shown below. The student with an ACTM of 29, an SMG of 4.0, and a PT1 score of 23 (all indicating high mathematical ability) would have the grade probabilities in the High row. This individual would have approximately an 87% chance of receiving an A or B in M156, but there would still be a possibility of receiving a C, D, F, or W.
| Ability | WF | D | C | B | A |
|---|---|---|---|---|---|
| High | 0.01734 | 0.02543 | 0.08872 | 0.30856 | 0.55992 |
| Low | 0.42569 | 0.22670 | 0.21171 | 0.10648 | 0.02940 |
Table 3 – Predicted grade distribution in M156 using ordinal logistic regression
For comparison, a student with an ACTM of 17, an SMG of 3.3, and a PT1 score of 7 (all indicating low mathematical ability) would have the grade probabilities in the Low row of Table 3 and would likely withdraw from the class or receive a D or F in M156.
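It is also useful to use these equations to calculate the student’s “expected” grade. From basic probability theory we know that
$$E(\text{Grade}) = \sum_{g} g \, P(\text{Grade} = g) = 4\,P(A) + 3\,P(B) + 2\,P(C) + 1\,P(D) + 0\,P(WF).$$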
Calculating the expected grade for a student with an ACTM of 29, an SMG of 4.0, and a PT1 score of 23 yields an expected grade in M156 of 3.37, while calculating the expected grade for a student with an ACTM of 17, an SMG of 3.3, and a PT1 score of 7 yields an expected grade in M156 of 1.08.
It is comforting to note that the expected grade from the logistic model is very similar to the predicted grade from the regression model. The logistic model, with its probability for each of the various grades, could almost be interpreted as explaining the prediction from the regression model.
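As a check on the arithmetic, the expected grades can be recomputed from the probabilities printed in Table 3. The sketch below assumes the usual grade points (A = 4, B = 3, C = 2, D = 1) and assigns 0 to the combined WF class; that scoring is an assumption, although it reproduces the 3.37 reported for the high-ability student.

```python
# Assumed grade points for each class; WF (withdraw or fail) is scored as 0.
GRADE_POINTS = {"WF": 0, "D": 1, "C": 2, "B": 3, "A": 4}

# Probabilities copied from Table 3.
HIGH = {"WF": 0.01734, "D": 0.02543, "C": 0.08872, "B": 0.30856, "A": 0.55992}
LOW = {"WF": 0.42569, "D": 0.22670, "C": 0.21171, "B": 0.10648, "A": 0.02940}

def expected_grade(probs):
    """Probability-weighted average of the grade points."""
    return sum(GRADE_POINTS[grade] * p for grade, p in probs.items())

expected_high = expected_grade(HIGH)  # about 3.368
expected_low = expected_grade(LOW)    # about 1.087
```

With the rounded table values the low-ability student comes out at about 1.09 rather than the 1.08 quoted in the text; the small gap is presumably due to rounding in the printed probabilities.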
2.2 Math 157 (Trigonometry)
The correlation matrix for M157 indicates a much stronger relationship between grade and the various explanatory variables than was seen for students enrolled in M156. All three of the measures of mathematical ability (ACTM, SMG, and PT1) have a correlation with grade of approximately 0.45, which is larger than the maximum correlation of 0.33 seen in M156. Additionally, the correlations among the three explanatory variables are also stronger, indicating that the three measures tend to agree in their evaluation of the student’s mathematical ability.
| | Grade | ACTM | Sub 1 | Sub 2 | Sub 3 | PT1 | PT2 |
|---|---|---|---|---|---|---|---|
| ACTM | 0.457 | | | | | | |
| Sub 1 | 0.378 | 0.753 | | | | | |
| Sub 2 | 0.353 | 0.676 | 0.566 | | | | |
| Sub 3 | 0.400 | 0.760 | 0.524 | 0.490 | | | |
| PT1 | 0.447 | 0.535 | 0.564 | 0.483 | 0.431 | | |
| PT2 | 0.332 | 0.300 | 0.279 | 0.243 | 0.237 | 0.591 | |
| SMG | 0.453 | 0.297 | 0.306 | 0.196 | 0.339 | 0.301 | 0.112 |

Table 4 – Correlation matrix of grade and explanatory variables for Math 157
A stepwise regression was conducted to determine which were the strongest predictors of performance in M157. The variables chosen, in order of selection, were SMG, ACTM and then PT1. A regression on these variables yields the following equation.
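[Equation image missing from the source: regression equation predicting the M157 grade from SMG, ACTM, and PT1]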
The predicted grades for all students were calculated with the equation, and the highest and lowest predictions are listed below. The predicted grade for the high-ability student (4.19) illustrates a slight problem with the model: the response is not constrained to fall between 0 and 4. It is best interpreted as predicting that the student has a strong chance of receiving an A, which they did. The low-ability student is predicted to receive a 0.57 and did withdraw from the class.
| Ability | ACTM | PT1 | SMG | Predicted | Actual |
|---|---|---|---|---|---|
| High | 31 | 30 | 4.0 | 4.19 | A |
| Low | 24 | 13 | 2.0 | 0.57 | W |
Table 5 – Predicted versus actual grade in Math 157 utilizing the regression model
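Because a linear regression can produce values outside the 0 to 4 grade scale, one simple way to report such predictions is to clamp them to the scale. This is not something the report itself does; the helper name `clamp_grade` is hypothetical, and the sketch only illustrates the interpretation given above for the 4.19 prediction.

```python
def clamp_grade(predicted, low=0.0, high=4.0):
    """Restrict a predicted grade to the grade-point scale [low, high]."""
    return max(low, min(high, predicted))

clamped_high = clamp_grade(4.19)  # prediction for the high-ability student in Table 5
clamped_low = clamp_grade(0.57)   # already inside the scale, so left unchanged
```

Clamping changes only how the prediction is reported. It does not change the fitted model, and a bounded method such as the ordinal logistic regression avoids the issue altogether.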
All of the model coefficients are highly significant and the model explains 35.6% of the variation in the grade received by the student. Although this leaves a large amount of variation unexplained, the fit for M157 is much better than that found for M156.
The review o