Reply to Paul Maxim on the Norming of the Mega Test


Kevin Langdon

Originally published in Gift of Fire #81, January 1997
[Additions and revisions (9/00) are noted in brackets.]

Paul Maxim's article, "Renorming Ron Hoeflin's Mega Test" <http://www.eskimo.com/~miyaguch/renorm.html>, is, of course, not a norming of the test in the proper sense of the term. Mr. Maxim has not even attempted a norming, but sought only to estimate the four-sigma and mega (4.75-sigma) levels on the test.

Why Mr. Maxim has sought to emphasize these levels is anyone's guess. It couldn't have anything to do with Mr. Maxim's fixation on gaining admittance to the Prometheus Society and the Mega Society--coincidentally four- and 4.75-sigma-cutoff organizations, respectively--on the basis of his score on the California Test of Mental Maturity, which Prometheus and Mega don't accept, and it is surely mere happenstance that Prometheus and Mega do accept the Mega Test. [Mr. Maxim was admitted to the Prometheus Society after it changed its membership criteria in 1999. See the Prometheus Membership Committee Report <http://www.eskimo.com/~miyaguch/MCReport/mcreport.html> and my reply objecting to its conclusions <http://www.polymath-systems.com/intel/essayrev/mcrptrpl.html>.]

[Here I stated that the four-sigma level is "well above the ceiling of the Scholastic Aptitude Test." This is incorrect; it's slightly below the test ceiling.] While a relatively small number of testees actually achieve perfect 1600's on the test, or scores very close to 1600, the rarity of these scores is due to the well-known psychometric effect called "ceiling-bumping," which may be thought of as a corollary of Murphy's Law, and not to superlative ability on the part of those earning such scores. To put it another way, the SAT fails to discriminate among approximately the top 0.02 percent of the general population. [A case could be made that it's 0.01 percent, but that's still only 3.75 sigma.]
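The relation between a rarity and a sigma level can be checked with a few lines of Python. This is only a sketch: it assumes that scores are normally distributed in the general population, and the function names are illustrative rather than taken from any published norming procedure.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sigma_for_top_fraction(fraction: float) -> float:
    """z-score above which the given fraction of a normal population lies."""
    return STANDARD_NORMAL.inv_cdf(1.0 - fraction)


def one_in_n_for_sigma(z: float) -> float:
    """Rarity (one person in N) of scoring at or above z standard deviations."""
    return 1.0 / (1.0 - STANDARD_NORMAL.cdf(z))


if __name__ == "__main__":
    # Percentages discussed in the text, converted to fractions.
    for pct in (0.02, 0.01):
        print(f"top {pct}% -> {sigma_for_top_fraction(pct / 100):.2f} sigma")
    # The Prometheus (four-sigma) and Mega (4.75-sigma) cutoffs.
    for z in (4.0, 4.75):
        print(f"{z} sigma -> about 1 in {one_in_n_for_sigma(z):,.0f}")
```

Under the normality assumption, the top 0.02 percent and 0.01 percent fall at roughly 3.5 and 3.7 sigma, both short of four sigma, which corresponds to about one person in 30,000.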

Furthermore, Mr. Maxim is drawing conclusions based on only ten Mega data points, too small a sample to be statistically meaningful. The discrepancy between the score Mr. Maxim believes should represent the four-sigma level and the score used by Dr. Hoeflin is also trivial, amounting to only 1.5% of the test range.

Mr. Maxim concluded that there were too many four-sigma scores among those reporting LAIT scores to Dr. Hoeflin, without taking into account the fact that a significant fraction of Dr. Hoeflin's sample consisted of members of the Four Sigma Society and the Prometheus Society--who were selected because they had qualified at this level on the LAIT.

Mr. Maxim failed in this instance, as he has in his other writings, to consider the self-selection factor involved when people submit answer sheets for very difficult I.Q. tests. Someone with an I.Q. of 100 is very unlikely to submit his answers to questions he can't answer. The likelihood of submitting answers for scoring continues to increase with I.Q.

On page 3 of his article, Mr. Maxim wrote, "Dr. Hoeflin's data indicates that 17 LAIT scores at the 4-sigma level and above were, on average, eight IQ points higher than Mega test scores attained by the same testees . . ." Five testees in this same sample with Mega scores above four sigma, averaging 38.6 raw score, earned a mean LAIT score of 163.4, approximately 3.5 I.Q. points lower than their Mega Test scores.

If the highest scores in the sample are selected on one test, the corresponding scores on the other test will naturally tend to be lower, assuming that the correlation between the two tests is less than 1.0.
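The size of this effect can be sketched numerically. The code below assumes that scores on the two tests follow a bivariate normal distribution in standard (z-score) units; the correlation of 0.8 is a hypothetical value chosen for illustration, not a measured LAIT-Mega correlation.

```python
from statistics import NormalDist


def expected_means_above_cutoff(cutoff_z: float, correlation: float) -> tuple[float, float]:
    """Expected z-scores on both tests for people selected at or above
    cutoff_z on the first test, assuming bivariate normality."""
    std = NormalDist()
    tail = 1.0 - std.cdf(cutoff_z)             # fraction of people selected
    mean_selected = std.pdf(cutoff_z) / tail   # mean of a truncated normal
    mean_other = correlation * mean_selected   # regression toward the mean
    return mean_selected, mean_other


if __name__ == "__main__":
    # Hypothetical correlation of 0.8 between the two tests.
    on_selected, on_other = expected_means_above_cutoff(4.0, 0.8)
    print(f"mean on the selecting test: {on_selected:.2f} sigma")
    print(f"mean on the other test:     {on_other:.2f} sigma")
```

Whichever test is used to pick the top scorers, the expected mean on the other test is lower by the factor of the correlation, so the discrepancy reverses direction when the selection is reversed. Only at a correlation of 1.0 do the two means coincide.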

In considering the CTMM sample, Mr. Maxim again failed to take into account ceiling limitations. The ceiling of the CTMM for adults is 158, far below the four-sigma level.

The Stanford-Binet is known to yield too many high scores by a large factor, reaching an order of magnitude at the four-sigma level. [See "Statistical Distribution of Childhood IQ Scores," by John Scoville <http://sac.uky.edu/~jcscov0/ratioiq.htm>.] In this case, Mr. Maxim's ignorance worked in the opposite direction and he placed the four-sigma level too low.

After all of this, Mr. Maxim placed the four-sigma level on the Mega Test three raw score points higher than Dr. Hoeflin did. This is not a huge difference; the errors I have enumerated above could equally account for this discrepancy.

It should be noted that Mr. Maxim did not calculate means, standard deviations, or average deviations of Mega raw scores and previous scores, did not calculate a measure of reliability for the test, did not estimate the standard error of test scores, and did not provide a formula for converting raw scores to I.Q.'s, despite his objection to my failure to explicitly state the conversion formula for calculating LAIT I.Q.'s from scaled scores in the statistical report on the LAIT.
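For readers unfamiliar with these statistics, the following sketch shows how they might be computed. It is not the procedure used by Dr. Hoeflin, by Mr. Maxim, or in the LAIT report: the function names are illustrative, the scores are made-up numbers, and linear equating against previously reported scores is only one of several possible conversion methods.

```python
from math import sqrt
from statistics import fmean, stdev


def mean_absolute_deviation(values):
    """Average deviation: mean absolute distance from the mean."""
    m = fmean(values)
    return fmean([abs(v - m) for v in values])


def spearman_brown(half_test_correlation):
    """Full-test reliability estimated from a split-half correlation."""
    return 2 * half_test_correlation / (1 + half_test_correlation)


def standard_error_of_measurement(sd, reliability):
    """Standard error of a test score, given the score SD and reliability."""
    return sd * sqrt(1 - reliability)


def linear_equating(raw_scores, prior_iqs):
    """Return a raw-score-to-I.Q. converter that matches the mean and SD of
    the raw scores to those of the same testees' previously reported I.Q.'s."""
    raw_mean, raw_sd = fmean(raw_scores), stdev(raw_scores)
    iq_mean, iq_sd = fmean(prior_iqs), stdev(prior_iqs)

    def convert(raw):
        return iq_mean + iq_sd * (raw - raw_mean) / raw_sd

    return convert


if __name__ == "__main__":
    # Made-up illustrative data, NOT actual Mega Test or LAIT results.
    raw = [10, 14, 18, 22, 26]
    prior = [130, 136, 142, 148, 154]
    print("mean raw score:", fmean(raw))
    print("SD of raw scores:", stdev(raw))
    print("average deviation:", mean_absolute_deviation(raw))
    print("I.Q. for raw score 22:", linear_equating(raw, prior)(22))
    print("standard error:", standard_error_of_measurement(16, spearman_brown(0.8)))
```

The value 16 passed to `standard_error_of_measurement` is an assumed I.Q. standard deviation, and 0.8 is an assumed split-half correlation; both are placeholders.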

Mr. Maxim has demonstrated no command of the basic principles of psychometric statistics. His argument in this article is entirely self-serving and without scientific merit.