Z-score: Need for preventing erosion of trust in systems


By Sankha Muthu Poruthotage

Ph.D (Statistics)

We know that we have short memories, a fact that most of us Sri Lankans readily acknowledge. Perhaps it became a part of the nation’s psyche during the war. Perhaps it allowed us to move on, rather than keep dwelling on the past. However, as we prepare to usher in an era of sustained peace and economic prosperity, we can’t afford to be forgetful. It is time to thoroughly investigate the root causes of the problems we have had in the past and have now. Then we have to formulate long-lasting solutions to fix them.

The use of the Z-Score for university admission is one such issue that we all chose to forget, despite its being contentious enough to cause numerous protest campaigns and eventually to require a verdict from the country’s highest court. We may recall that it made headlines in all national newspapers for months. Unfortunately, since we never cared to address the root causes of the problem, it may only be a matter of time before it flares up once again.

I intend to discuss some of the shortcomings of the Z-Score method and propose a rough outline for a potential long-term, sustainable solution. But first, a word of caution to the reader: the Z-Score is a statistical technique with its own strengths and weaknesses, and hence it attracts both praise and criticism. This is true of all scientific methodology. I personally do not know of a superior statistical approach and have yet to see a convincing argument for an alternative method. The Z-Score was proposed when no other viable alternative was available, as a solution to a peculiar situation that arose in our AL examination process. In fact, the proposal deserves to be applauded rather than criticized, since it may have performed better than the next best method. My intention is to illustrate that the Z-Score is a method that may work satisfactorily only under some strong assumptions, and to emphasize the need to eliminate the use of statistical methods in the critical process of university admission.

Some of us may already know how to calculate the Z-Scores of a given set of data. We first calculate the mean (average) and the standard deviation (S.D.) of the data set (the S.D. is a widely used statistical measure that indicates the dispersion among data points). Then from each data point, we subtract the mean and divide the answer by the S.D. Note that for each original data point, we get a corresponding Z-Score.
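For readers who prefer to see the arithmetic, the calculation described above can be sketched in a few lines of Python. The function name `z_scores` and the marks are made up for this sketch, and the choice of the population form of the S.D. is an assumption, since either form could be intended.

```python
from statistics import mean, pstdev

def z_scores(marks):
    """Return the Z-Score of each mark: (mark - mean) / S.D."""
    mu = mean(marks)
    sd = pstdev(marks)  # population S.D.; which form to use is an assumption
    if sd == 0:
        raise ValueError("all marks are equal, so Z-Scores are undefined")
    return [(m - mu) / sd for m in marks]

marks = [40, 50, 60, 70, 80]  # made-up marks, for illustration only
print([round(z, 2) for z in z_scores(marks)])
# prints [-1.41, -0.71, 0.0, 0.71, 1.41]
```

Each mark maps to exactly one Z-Score, and a mark equal to the mean always gets a Z-Score of 0.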

Now let us consider the following situation. The data I present are completely imaginary and serve as an illustration only.

Suppose we select the 10 students who obtained the highest marks for mathematics at the past GCE OL exam and the 10 students who obtained the lowest. Then we give two different, but equally difficult, new mathematics exams to the two groups. Let the group of good students be called group 1 and be given exam 1, and let the other group be called group 2 and be given exam 2. Suppose the following are the exam marks in the two groups. They appear the way we expect them to, with group 1 scoring much higher than group 2.

(Please see the table)

Thereafter, let us calculate the Z-Score for each student within his or her own group. Now, if we rank all the students by their Z-Scores, several students who are in group 2 will be ranked higher than students in group 1. In fact, Marvan, who is in group 2, will be the highest ranked overall!

This illustration, despite its extreme nature, exposes a critical assumption one needs to make when using Z-Scores to compare two groups: that the two groups are identical in every sense other than the treatment of interest. If they are not, a ranking based on Z-Scores can produce outrageous results, as this extreme example shows.
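The effect is easy to reproduce. The sketch below uses marks invented for this purpose; they are not the figures from the table, and the group labels and the population form of the S.D. are likewise assumptions. Each group is standardized separately and then everyone is pooled into a single ranking.

```python
from statistics import mean, pstdev

# Invented marks for illustration only; these are NOT the article's table.
group1 = [95, 94, 93, 92, 91, 90, 89, 88, 87, 86]  # strong students, exam 1
group2 = [45, 22, 21, 20, 20, 19, 19, 18, 18, 18]  # weak students, exam 2

def z_scores(marks):
    mu, sd = mean(marks), pstdev(marks)  # population S.D. is an assumption
    return [(m - mu) / sd for m in marks]

# Standardize each group on its own, then pool everyone into one ranking.
pooled = [("group 1", m, z) for m, z in zip(group1, z_scores(group1))]
pooled += [("group 2", m, z) for m, z in zip(group2, z_scores(group2))]
ranking = sorted(pooled, key=lambda row: row[2], reverse=True)

for group, mark, z in ranking[:3]:
    print(f"{group}: mark {mark}, Z-Score {z:.2f}")
```

With these made-up numbers, the group 2 student with 45 marks comes out on top of the pooled ranking, ahead of every student in group 1, because that student stands far above a weak group while the best group 1 student is only modestly above a strong one.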

In our AL exam situation, willingly or not, we make several such assumptions when we rank students based on their Z-Scores, such as:

* The new syllabus students (those who take it for the first time) and the old syllabus students are identical in every sense other than the exam papers they were given. In other words, we assume the two groups are identical in terms of intelligence, motivation, exam preparedness and everything else that one may think of as having a potential impact on their exam performance.

* The two groups of students who take two alternative subjects, such as Economics and French language, are once again identical in all aspects except for the two different subject streams that they have chosen to study.

These are strong assumptions and should be avoided if possible.

Even if we make such strong assumptions, there are further obstacles down the road. A fundamental question one can ask is "Can it be justified that a Z-Score of 2 is better than a Z-Score of 1.5?" If these scores are for two candidates who sat for two different exams but belong to two groups that are identical in every other sense, the answer is "Yes". The justification is:

* The candidate with a Z-Score of 2 belongs to about the top 2.3% of his group, while the candidate with a Z-Score of 1.5 has about 6.7% of students above him. Since the two groups are assumed to be identical, the candidate with a Z-Score of 2 is better than the one with a Z-Score of 1.5.

For theoretical completeness, I will add the following. This justification is also based on an assumption called "normality". However, it is a more realistic assumption to make in a situation such as exam scores. Even without the normality assumption, this can be justified using a well-known probability result called "Tchebyshev’s Inequality".
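The percentages in this argument can be checked with a short sketch. The function names are made up for illustration. The first function gives the share of a normally distributed group lying above a given Z-Score; the second gives the distribution-free bound from Tchebyshev’s Inequality, which covers both tails and therefore also caps the upper tail.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def chebyshev_two_sided(k):
    """Chebyshev bound: P(|X - mean| >= k * S.D.) <= 1 / k**2."""
    return min(1.0, 1.0 / k**2)

for z in (1.5, 2.0):
    print(f"Z = {z}: normal tail {normal_upper_tail(z):.1%}, "
          f"Chebyshev bound {chebyshev_two_sided(z):.1%}")
# prints:
# Z = 1.5: normal tail 6.7%, Chebyshev bound 44.4%
# Z = 2.0: normal tail 2.3%, Chebyshev bound 25.0%
```

The normal figures come out close to those quoted above. The bounds that hold without normality are far looser, so they limit how many students can lie above a given Z-Score rather than pinning down an exact share.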

Nevertheless, when we add Z-Scores, things get a lot more complicated and harder to justify. (An aggregate Z-Score is calculated in the AL ranking process.) It is a lot harder to justify that candidate "A" with an aggregate Z-Score of 4.0 should be prioritized over candidate "B" with an aggregate Z-Score of 3.9!
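One way to see the difficulty is that a sum of Z-Scores is no longer on the Z-Score scale. The sketch below rests on stated assumptions: the aggregate is taken to be the plain sum of three subject Z-Scores, and every pair of subjects is given the same correlation `rho`. Neither detail comes from the actual AL procedure, and the function name is made up. Under these assumptions the variance of the sum is n + n(n - 1) * rho.

```python
import math

def aggregate_sd(n_subjects, rho):
    """S.D. of a sum of n Z-Scores, each with variance 1 and the same
    pairwise correlation rho (an assumed, simplified structure)."""
    return math.sqrt(n_subjects + n_subjects * (n_subjects - 1) * rho)

for rho in (0.0, 0.5, 1.0):
    sd = aggregate_sd(3, rho)  # three subjects is an assumption
    print(f"rho = {rho}: S.D. of aggregate = {sd:.2f}, "
          f"a gap of 0.1 is {0.1 / sd:.3f} S.D.")
```

The spread of the aggregate depends on how strongly the subjects move together, so the same aggregate of 4.0 corresponds to different standings under different correlations, and a gap of 0.1 is only a few hundredths of an S.D. in every case shown.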

At the very beginning of this article, I mentioned that long-lasting solutions can be devised only by investigating the root causes of a problem. The need for Z-Scores (standardization) arose for two reasons.

* Sudden existence of new and old syllabuses.

It can be argued that education curriculums should evolve, not change. Simply from an educational standpoint, comprehensive curriculum changes once every decade or so should be avoided. They deprive one group of students of any advancement in their respective fields for a prolonged period of time, while burdening the other group of students and educators with an extensive amount of new material. If curriculum revision is made a continuous process, done annually or bi-annually, the changes from year to year will be nominal, which will eliminate the need for two different curriculums. This will in effect eradicate the ambiguity around the use of a statistical method.

* Ability to compete for the same university degree while studying different subject combinations.

This is a complex situation that needs to be addressed from several perspectives. It is not uncommon to have students from slightly different prior educational backgrounds studying for the same degree in a university. The global practice is to evaluate students for university admissions based on a standardized exam, such as the SAT, which generally covers core subject disciplines. However, we need to recognize the fact that there is no need to have a standardized exam for degrees such as medicine and engineering, where a specific subject combination is required for admission.

An alternative solution might be for higher education institutes and regulatory bodies to decide on a quota for each subject combination. I personally like an approach that requires both educators and students to evaluate their programs and choices based on resource availability and demand for employment. However, I acknowledge that there are several ground realities that need to be factored in when devising such a quota system.

Finally, I would like to point out that none of these solutions are easy fixes. They require a great deal of careful planning, hard work, dedication and determination. However, we owe it to our future generations. As we have experienced repeatedly in the past, breakdown of trust in our systems, especially in the education system, can lead to catastrophic consequences.
