Cuesta v. State of NY Office of Court Admin., 657 F. Supp. 1084 (S.D.N.Y. 1987)

657 F. Supp. 1084 (1987)

Hermino CUESTA, et al., Plaintiffs,
v.
The STATE OF NEW YORK OFFICE OF COURT ADMINISTRATION, et al., Defendants.

No. 83 Civ. 3714(PNL).

United States District Court, S.D. New York.

April 7, 1987.

*1085 Gordon & Gordon, New York City (Murray A. Gordon, Kenneth E. Gordon, Richard Imbrogno, of counsel), for plaintiffs.

Michael Colodner, Office of Court Admin., New York City (John Eisman, Ann Pfau, of counsel), for Court defendants.

Robert Abrams, Atty. Gen., State of N.Y., New York City (Barbara B. Butler, of counsel), for Civil Service defendants.

Dretzin & Kauff, P.C., New York City (Raymond G. McGuire, of counsel), for defendant-intervenors.

LEVAL, District Judge.

This is an employment discrimination action pursuant to Title VII, 42 U.S.C. § 2000e et seq., and New York law challenging 1982 examinations administered to court officer candidates. Plaintiffs are a class of black and Hispanic individuals who were provisional court officers at the time of the 1982 tests and who lost their jobs because they either failed or scored too low for permanent appointment. They seek to have the 1982 written examinations set aside, alleging that they had disparate racial impacts and were insufficiently job-related. Defendants are the State of New York Office of Court Administration *1086 ("OCA"), the Chief Administrative Judge, the New York State Civil Service Commission, and its President. Defendant-intervenors were provisional appointees who received permanent appointments as a result of their performances on the challenged examinations.

Plaintiffs' motion for preliminary injunction was denied in an opinion dated September 16, 1983.[1] In later opinions, defendants' motion to dismiss was denied, except as to the white plaintiffs, and the class was certified as described above. Four court officers were permitted to intervene as defendants. With the consent of the parties, the case was tried on submitted papers.[2]

 
Background

Defendants are responsible for staffing the New York State Unified Court System. The positions of Court Officer (formerly Uniformed Court Officer, "UCO") and Senior Court Officer ("SCO") are competitive positions in the classified civil service of the Unified Court System. See 22 N.Y.C.R.R. § 25.11. Court Officer is an entry-level position. Senior Court Officer is a promotional title from Court Officer, requiring one year of permanent service, as well as an open competitive title for anyone with one year's peace officer experience. UCOs and SCOs are peace officers under New York law, are required to wear uniforms, and may be authorized to carry weapons while on duty. UCOs are charged with maintaining order and decorum, providing security, and performing clerical duties in the Civil, Criminal and Family Courts of New York City and in the Family and District Courts in Nassau and Suffolk Counties. SCOs perform similar tasks in the Supreme and Surrogate's Courts in New York City and in the Supreme, County, and Surrogate's Courts in Nassau and Suffolk.

A competitive examination was given on December 17, 1977 for the UCO position. In September 1978 a class action was brought in this court, under the caption Underwood v. Office of Court Administration, 78 Civ. 4382 (CSH), charging that the 1977 exam discriminated illegally against minorities. Shortly after Underwood was commenced, OCA agreed to make no permanent appointments under the challenged 1977 exam; provisional appointments were made from the exam's eligible list.

In 1980, the Underwood action was settled with court approval. The Consent Judgment called for the development and administration of a new competitive examination under the supervision of a Special Master, selected jointly by the parties. The Special Master was directed to report periodically to the court, including a final report setting forth his opinion on the validity of the examinations. OCA was not to administer the exams for at least 30 days from the time of such final report, during which time objections could be made. Preference in appointment was granted to all provisionals hired prior to the effective date of the Consent Judgment, May 21, 1980, provided that they passed the new examinations. This limited preference was upheld by the Court of Appeals. Underwood v. State of New York Office of Court Administration, 641 F.2d 60 (2d Cir. 1981). OCA was enjoined from making permanent appointments under the 1977 list, but could continue to make provisional appointments.

In July 1980, the parties agreed to plaintiffs' choice for Special Master, Robert M. Guion, Ph.D. This choice was approved by the Court on July 25, 1980. In an early letter to the Court, Dr. Guion stated that he understood that his role was "to monitor and perhaps to influence the course of research *1087 so that maximum validity and minimal adverse impact will result from construction of the test." (October 3, 1980 letter)

In February 1980 OCA submitted Requests for Bids to several leading test development organizations, and ultimately selected Management Scientists, Inc. ("MSI") to develop and validate the written examinations.[3] Although it had not submitted the low bid, MSI was selected because of its extensive experience, particularly in police officer examinations. Throughout the test development, MSI provided monthly reports to the Special Master and the court defendants. Eventually, MSI issued a three-volume report, entitled Development/Validation of Written Examinations for Uniformed Court Officer, Senior Court Officer.

 
Job Analysis

Test development began in May 1980, when MSI's Project Director, David Wagner, and its Senior Job Analyst spent two days touring various court locations and conducting informal interviews with Chief Clerks and other supervisory personnel. They also toured the OCA training academy. OCA furnished MSI with the results of three previous job analyses and with incident reports for the first four months of 1980, as well as summaries of the 1,855 incident reports filed in 1979. From these documents MSI constructed a preliminary task list representing the various tasks required to perform the court officer jobs.

In July and August 1980, after the MSI Project Director conducted four days of training in in-depth job analysis interviewing techniques for the OCA staff, MSI and OCA personnel conducted 44 two-hour private interviews with incumbent UCOs. Project Director Wagner and MSI's Senior Analyst also spent thirty hours making full work-cycle observations of court officers at sampled court locations. The results of these interviews and observations were added to the preliminary task list. By September, final UCO and SCO task lists had been created, listing 121 tasks in 19 job component areas.

The next step in the job analysis was the distribution of task rating survey forms to 331 UCOs and 450 SCOs randomly selected to represent all court locations and types. These forms asked the selected court officers to rate each listed task in terms of its a) frequency, b) amount of time, c) complexity/difficulty, d) importance, and e) overall criticality, and to indicate f) whether the task was performed in the first year on the job. Only 97 UCOs and 38 SCOs responded to the initial mailing, but later classroom administration added 106 responses in each category. The completed surveys were reviewed and analyzed by MSI staff, who produced a task index and a component-adjusted index for each of the 19 task families, reflecting the average ratings of the tasks involved in each component, adjusted by the number of respondents indicating that the task was critical. By this process, the relative contribution of each component to the court officer jobs was determined.

At approximately the same time, MSI began to generate lists of the skills, knowledges, abilities and personal characteristics (SKAPs) necessary to perform each task. Working from a preliminary list culled from OCA documents and the previous interviews, MSI selected 35 experienced UCOs and SCOs from sampled locations for half-day sessions to itemize the SKAPs required by the various tasks in each of the 19 component areas. These itemized lists were then reviewed and refined by a seven-member expert measurement panel (consisting of the MSI Project Director, Senior Analyst, and President, together with four independent psychologists). The expert panel produced 19 standardized SKAP lists which, according to the MSI Report, "were agreed upon by the seven measurement experts as reflecting a high degree of job-relatedness to the UCO and SCO tasks as described in the nineteen task-family lists." (Vol. I, p. 23)

*1088 From these lists the MSI Project Director constructed a SKAP Rating Inventory Form, designed to assess the degree to which possession of each particular SKAP was required for successful completion of UCO and SCO tasks, as well as to determine which SKAPs were learned in training or on the job. In July and September 1981, these forms were completed by 40 UCOs and 40 SCOs from sampled locations in half-day sessions conducted by the MSI Project Director or Senior Analyst. As in previous samplings, there was an effort to include significant numbers of minorities and women. MSI analyzed the completed forms and generated frequency distributions and summary statistics for each SKAP entry.

These task and SKAP analyses were then combined into a "SKAP X Component Matrix." The components (task families) were weighted to reflect their proportion of the total job, and each SKAP rating was then multiplied by the weight of the component to which it was attached. The resulting products were summed for SKAPs attached to more than one component. As summarized in the MSI Report, "The result was a SKAP value for each identified SKAP. Converting these to percentages therefore presented a precise picture of the relative job-relatedness and importance of each skill, ability, knowledge and personal characteristic to the court officer job." (Vol. I:33)
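To make the matrix arithmetic concrete, the following is a minimal sketch of the weighting scheme the MSI Report describes; the component names, weights, and ratings are hypothetical and are not drawn from the record.

```python
# Hypothetical sketch of the SKAP x Component weighting described above.
# The component weights (each component's share of the total job) and the
# SKAP-to-component ratings below are invented for illustration only.
component_weights = {"security": 0.45, "clerical": 0.20, "public_contact": 0.35}

skap_ratings = {
    "judgment": {"security": 8, "public_contact": 9},
    "reading_comprehension": {"clerical": 7},
    "memory": {"security": 5, "clerical": 4},
}

# Multiply each rating by the weight of the component it is attached to,
# summing the products for SKAPs attached to more than one component.
skap_values = {
    skap: sum(component_weights[comp] * rating for comp, rating in links.items())
    for skap, links in skap_ratings.items()
}

# Converting the values to percentages gives each SKAP's relative importance.
total = sum(skap_values.values())
skap_percentages = {skap: round(100 * value / total, 1) for skap, value in skap_values.items()}
print(skap_percentages)
```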

A panel composed of the MSI Project Director and President and four independent psychologists then segregated the SKAPs into five content areas, or SKAP clusters. The results of this clustering are set forth in the following tables, Table 1 representing the SKAP clusters identified for the UCO exam and Table 2 those for the SCO exam.

 
                    Table 1: UCO SKAP Clusters
     SKAP CLUSTERS                JOB ANALYSIS POINTS          WEIGHTED %
     1. Knowledges                      4586                      41.4%
     2. Job Skills                       390                       3.5%
     3. Physical/Auditory/Visual         664                       6.0%
     4. Motivational/Personal           3696                      33.4%
        Characteristics
     5. Cognitive                       1745                      15.7%
                                       _____                      _____
                 TOTAL                 11081                       100%

On the basis of its surveys MSI determined that all of the identified UCO knowledges (cluster # 1) were learned either at the training academy or on the job. The same was found to be the case for the various job skills (# 2). Consequently, MSI concluded that it would be inappropriate to test for these SKAPs on an entry-level examination. The SKAPs in the third cluster were considered the province of ARRO, which was developing the physical examinations (see footnote 3, supra). Similarly, the expert panel found that the SKAPs in the "Motivational/Personal Characteristics" Cluster could not reliably be tested on a paper-and-pencil examination. The written UCO examination was therefore constructed out of the SKAPs in the Cognitive Cluster.

 
                    Table 2: SCO SKAP Clusters
     SKAP CLUSTERS                JOB ANALYSIS POINTS          WEIGHTED %
     1a. Knowledges Specific to
         Situations/Locations             1308                    12.9%
 
*1089
     1b. Knowledges Generic to
         New York Courts                   900                     8.9%
     1c. Knowledges Taught in
         Training Courses                  156                     1.5%
     1d. Knowledges Generic to
         Peace Officers                   1375                    13.6%
      2. Job Skills                        356                     3.5%
      3. Physical/Auditory/Visual          381                     3.8%
      4. Motivational/Personal
         Characteristics                  3598                    35.6%
      5. Cognitive                        2038                    20.2%
                                         _____                    _____
                 TOTAL                   10112                     100%

As with the UCO SKAP Clusters, it was determined that the written SCO examination could not appropriately test for Job Skills, or for the Physical or Motivational Clusters. Cluster # 1a was overly dependent upon individual experiences and Cluster # 1c was taught in training. Accordingly, they were also excluded. Because an applicant for an SCO position could have served his or her one year of peace officer service outside of the New York court system, Cluster # 1b was also deemed inappropriate for testing. Cluster # 1d, "Knowledges Generic to Peace Officer," was found to be capable of valid written testing. Finally, as with the UCOs, the Cognitive Cluster was considered appropriate for testing by MSI.

For both the UCO and SCO exams, the Cognitive Clusters were refined further, since not all cognitive SKAPs could be validly and reliably measured in a written examination. Table 3 presents the breakdown of the UCO Cognitive Cluster.

 
                    Table 3: UCO Cognitive SKAP Clusters
     COGNITIVE CLUSTER           JOB ANALYSIS POINTS          WEIGHTED %
     Judgment                            810                     46.4%
     Ability to size-up/Evaluate         351                     20.1%
     Written Communication Skills        217                     12.4%
     Reading Comprehension               191                     10.9%
     Memory                              108                      6.2%
     Numerical Computation                42                      2.4%
     Ability to read maps/floor plans     26                      1.5%
                                        ____                     _____
                 TOTAL                  1745                     99.9%

The expert panel evaluated these categories, operationally defined judgment as "the ability to apply information ... to specific situations (practical and social) appropriately," and concluded that several of the others were ill-suited for measurement on a written examination. Both "Written Communication Skills" and "Numerical Computation" were excluded because the level of ability found acceptable for the job was "quite low." The final category was excluded because it was trivial. Also, the "Ability to Size-up/Evaluate" category was determined to be "not reliably measurable on a paper-and-pencil test since the majority of sizing-up performed by UCOs relies on the visual and auditory modes." (Vol. I:56) Thus, the measurable Cognitive *1090 Cluster was 63.5% of its initial total, or 10% of the UCO job. A similar procedure was applied to the SCO Clusters, resulting in a measurable total of 27.6% of the SCO job (14% cognitive; 13.6% peace officer knowledge).
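A brief arithmetic check of the figures just stated, using the Table 3 points; the SCO figure is taken from the text rather than recomputed from its cluster breakdown, which the opinion does not reproduce.

```python
# Table 3 job-analysis points for the UCO Cognitive Cluster.
judgment, size_up, writing, reading, memory, numeric, maps = 810, 351, 217, 191, 108, 42, 26
cluster_total = judgment + size_up + writing + reading + memory + numeric + maps   # 1745

# SKAPs retained for written testing: judgment, reading comprehension, memory.
measurable = judgment + reading + memory                 # 1109
print(round(measurable / cluster_total, 3))              # 0.636 -> the 63.5% figure, allowing for rounding
print(round(0.635 * 0.157, 3))                           # 0.1 -> about 10% of the UCO job

# SCO exam: the text reports 14% cognitive plus 13.6% peace officer knowledge.
print(round(0.14 + 0.136, 3))                            # 0.276 -> 27.6% of the SCO job
```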

 
Test Construction

Based on the task and SKAP analyses, the MSI panel determined that the composition of the UCO exam should be: 73% Ability to Apply Information; 17% Ability to Read and Understand Written Information; 10% Memory. The suggested breakdown for the SCO exam was: 36.8% Ability to Apply Information; 9.5% Ability to Read ...; 4.5% Memory; 49.2% Peace Officer Knowledges. For both tests it was recommended that the test items (questions) be constructed out of written materials actually used in training or on the job so as to maximize job-relatedness. Ultimately, it was decided to construct the tests so that the UCO exam would be a portion of the SCO exam, with additional questions added to the latter concerning peace officer knowledges.
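The stated UCO composition appears to follow from renormalizing the three retained cognitive SKAPs over the measurable total; a short check follows, with the SCO peace officer share computed from the 13.6% and 14% job-analysis figures reported earlier (an inference, since the opinion does not show that step).

```python
# Renormalize the retained UCO cognitive SKAPs (Table 3 points) over the measurable total.
judgment, reading, memory = 810, 191, 108
measurable = judgment + reading + memory                       # 1109
for name, points in [("apply", judgment), ("read", reading), ("memory", memory)]:
    print(name, round(100 * points / measurable, 1))
# apply 73.0, read 17.2, memory 9.7 -> roughly the 73% / 17% / 10% composition above

# SCO split between peace officer knowledges and cognitive SKAPs, using the
# 13.6% and 14% job-analysis shares reported earlier in the opinion.
print(round(100 * 0.136 / (0.136 + 0.14), 1))                  # 49.3 -> close to the 49.2% stated
```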

OCA forwarded to MSI all documents used in training and samples of papers commonly used by court officers. The MSI Project Director then wrote a test-item outline to be used by item (question) writers. The outline described the various content areas, listed the most critical components and tasks in each area, and specified the number of questions needed in each category. MSI selected experienced item-writers, and Mr. Wagner (Project Director) conducted a full-day training session and evaluated their development of sample questions. Each item-writer was then assigned a number of questions to prepare out of the materials. The number of questions originally planned was reduced because of limitations of the source material.

At the suggestion of the Special Master, the next step in the construction of the examinations was to pilot-test the various proposed items to check on their validity and degree of adverse impact.[4] For reasons of test security it was decided that the pre-testing should occur away from New York and that no more than a portion of the exam would be pre-tested at any one site. MSI sought to obtain testing populations which, in aggregate, would replicate the expected demographics of the actual candidate population. The pre-tests took place in Miami, Boston, Fort Worth and Detroit and involved a total of 458 individuals. In Detroit and Miami those participating were applicants to the city police forces. In Boston and Fort Worth the testing populations were drawn from individuals who had applied within the previous year to the local employment office and expressed an interest in peace/police officer positions. The demographics of the pilot-test population and the actual UCO/SCO candidates are strikingly similar. The pilot-test sample population was 34.7% white, 47.2% black, and 13.5% Hispanic. The actual candidate pool was 39.2% white, 43.2% black, and 11.8% Hispanic. Women made up 30.35% of the sample and 33.7% of the applicants. The mean age of the sample was 27.9 years, and of the applicants, 29.3 years. In each city, either the MSI Project Director, the OCA Project Director, or both were present at the administration of the tests.

In order to test the validity of the exam questions (that is, to test whether they actually tested for what they purported to test for), those taking the pre-tests were also administered a number of other exams, well-established standardized measures of reading comprehension, memory and cognitive abilities.[5] Correlations with these *1091 measures would be confirming evidence of item validity. Also administered was a disconfirming measure, the Minnesota Clerical Test, with which, it was hoped, there would be limited or no correlation. Finally, a sample of questions from the 1977 court officer test was included in order to calibrate the difficulty predictions.

In all, 55 reading comprehension items, 64 ability to apply items, and four memory stories with 15 questions each were pretested. Each non-memory item was rated in terms of difficulty, adverse impact, and its validity profile.[6] The memory items were considered as elements of one of the four story groups. Based upon the generated profiles, 5 reading comprehension questions were reclassified as ability to apply items and one item was reclassified in the other direction. Each item in the final pools was rated according to the three parameters (difficulty, impact, profile match). Overlapping items were flagged so that both would not be finally selected. On the basis of these characteristics the two project directors (MSI & OCA) selected the items for the final version of the UCO/SCO exam.

Twenty reading comprehension items were selected. Their aggregate ratings were as follows. Difficulty: 19 moderate, 1 difficult. Impact: 10 no impact, 10 minimum impact (of which 7 had impact ratios higher than 70). Profile match: 3 perfect, 12 good, 2 marginal, 2 poor, 1 not computable. (See footnote 6, supra) The 20 items possessed the following combinations of characteristics:

2 Perfect/Moderate/No Impact
1 Perfect/Moderate/Minimum Impact
6 Good/Moderate/No Impact
5 Good/Moderate/Minimum Impact
1 Good/Difficult/Minimum Impact
1 Marginal/Moderate/No Impact
1 Marginal/Moderate/Minimum Impact
1 Poor/Moderate/No Impact
1 Poor/Moderate/Minimum Impact (> 70)
1 N.C./Moderate/Minimum Impact (> 70)

In comparison, of the thirty items not selected, twenty-five had either poor, flat or non-computable profiles or revealed impact, or both. Only one of the rejected items showed a good profile and no impact. Seen another way, while 85% of the selected reading comprehension items had profiles which were marginal or better with minimum or no impact, only 13.3% of those rejected fell within the same category.

Forty-five of the pilot-tested questions were selected as ability-to-apply items. Thirty of these items demonstrated perfect or good profile matches; only one showed impact, and thirty-three were in the minimum impact range (17 > 70). Of the ten with poor or flat profiles, none showed impact. By contrast, twenty-one of twenty-five rejected items possessed poor, flat or non-computable profiles or impact, or both.

The pilot testing of the memory portion of the examinations involved four memory stories, each with fifteen related multiple choice questions. The second memory story was selected because it correlated most strongly with the alternative measures on a majority of the comparisons. The highest correlating ten-item set was chosen for inclusion in the exam. It appears from the *1092 raw data that the memory items selected were not those with the least adverse impact (MSI Report Vol. III: Table 14).

These seventy-five items constituted the entirety of the 1982 UCO written examination, and all but the twenty-five question peace officer knowledge portion of the SCO written test.

Dr. Robert Guion issued his "Report of the Special Master" on April 22, 1982. He expressed his appraisal that the mandate of the Underwood Consent Judgment to develop a valid examination "in conformity with the requirements of Title VII" had been "largely satisfied." He concluded that "The work that is complete, the work still in progress for final item selection, and the work planned for post-administration analysis represent technical advances beyond conventional professional standards and provide reasonable assurance of satisfying a combination of content validity and construct validity requirements." Dr. Guion also noted his approval of MSI's procedure for setting a cutting (passing) score, described below. No objections to this report were received.

 
Test Administration

The written examinations (# 45-547; # 45-548) were administered on May 22, 1982. The applicants were given four hours to complete the tests.

In test locations throughout New York State, 32,333 candidates sat for the UCO exam and 9,454 for the SCO exam. The UCO testing population was 40.1% white, 42.7% black and 11.5% Hispanic. For the SCO exam, 43.6% were white, 38.2% black, and 10.3% Hispanic.[7]

 
Determination of Cutoff Score

The MSI procedure for setting the passing score relied upon the judgment of subject matter experts (SMEs): permanent Senior Court Officers and supervisors knowledgeable about the UCO and SCO jobs. OCA provided MSI with 28 SMEs from sampled courts to participate in setting the passing scores for the two exams. Ultimately, 21 SMEs were involved, including 7 from minority groups. In sessions conducted by Mr. Wagner (MSI Project Director) roughly contemporaneously with the examinations, the SMEs were given descriptions of "acceptable," "above average," and "unacceptable" UCOs and SCOs and asked to estimate the percentage in each category who would answer each item on the 1982 examinations correctly. The SMEs were also asked to estimate the percentage of all candidates answering correctly for each of the items as well as for several questions on the 1977 test for which results were known. Prior to making either set of estimations, the SMEs received 1½ hours of training.

All but one of the SME panel completed the full set of estimations. Each SME's set of ratings was adjusted by the degree to which their estimations as to the 1977 questions diverged from the actual results.[8] The panel's combined estimate was that an "acceptable" UCO would answer 68.63% of the memory items and 75.66% of the reading comprehension and ability-to-apply items correctly. Multiplying these percentages by the number of questions in each category yielded a passing score of 56 (out of 75).[9] The same procedure applied to the SCO examination generated a cutting score of 72 (out of 98, since two questions were deleted)[10].
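A worked check of the UCO cutting-score arithmetic, assuming (per the item counts described above) 10 memory items and 65 reading comprehension and ability-to-apply items; the SCO score is not reproduced because the SME estimates for the peace officer knowledge items are not given in the opinion.

```python
# UCO exam: 10 memory items plus 65 reading comprehension / ability-to-apply items.
memory_items, other_items = 10, 65

# Combined SME estimates for an "acceptable" UCO, taken from the text above.
memory_rate, other_rate = 0.6863, 0.7566

raw_score = memory_rate * memory_items + other_rate * other_items
print(round(raw_score, 2))   # 56.04 -> consistent with the passing score of 56 out of 75
```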

 
Results of the Examinations

Applying the cutting score generated by the SMEs, 13,147 candidates passed the UCO examination for a passing rate of 40.7%. The passing rates varied considerably across racial groups. The white passing *1093 rate was 57.4%, the black rate 28.4%, and the Hispanic rate 28.7%. The results of the SCO test were similar. The overall passing rate was 44.4%. White candidates passed the SCO test at a 59.5% rate, blacks at 29.1%, and Hispanics at 33.9%.

On the other hand, among provisional court officers taking the examinations, there was far less disparity. The group passing rate for the two tests was 81.8%. Whites passed at an 88.6% rate, and non-whites at a 71.5% rate. Preference candidates (those provisionals appointed prior to the effective date of the Underwood Consent Judgment) achieved a passing rate of 83.5%: 90.6% for white candidates and 72.9% for non-whites. (Def. Vol. A; Doc. 4; Tables 1 & 2).

Those preference provisionals who received a passing score were appointed as permanent court officers. Eligible lists of all others passing the examinations were established on November 24, 1982 (CO) and January 17, 1983 (SCO). In accordance with New York law, SCO appointments were made from promotion eligible lists (including court officers with one year of permanent status)[11], and the open competitive list was not reached. Appointments from each of the lists are made in rank order on the basis of written examination score (plus any veterans points). The last provisional court officers (those not receiving permanent appointments) were displaced in November 1983.

 
Discussion

Title VII of the Civil Rights Act of 1964, 42 U.S.C. § 2000e-2, outlaws racial discrimination in hiring. As explained by the Supreme Court, "[t]he Act proscribes not only overt discrimination, but also practices that are fair in form, but discriminatory in operation." Griggs v. Duke Power Co., 401 U.S. 424, 431, 91 S. Ct. 849, 853, 28 L. Ed. 2d 158 (1971). The Act explicitly holds harmless, however, an employer acting "upon the results of any professionally developed ability test provided that such test, its administration or action upon the results is not designed, intended or used to discriminate because of race, color, religion, sex or national origin." 42 U.S.C. § 2000e-2(h).

When challenging a facially neutral employment practice, such as the examinations in question here, plaintiffs must make a prima facie showing that the practice "had a significantly discriminatory impact." Connecticut v. Teal, 457 U.S. 440, 446, 102 S. Ct. 2525, 2530, 73 L. Ed. 2d 130 (1982). Where a prima facie case is established, "the employer must then demonstrate that `any given requirement [has] a manifest relationship to the employment in question'...." Id., (quoting Griggs, 401 U.S. at 432, 91 S.Ct. at 854). Even where the employer demonstrates job-relatedness, however, "the plaintiff may prevail, if he shows that the employer was using the practice as a mere pretext for discrimination." Id., 457 U.S. at 447, 102 S. Ct. at 2530.

 
Disparate Impact

Plaintiffs must first satisfy the threshold requirement of showing disparate racial impact. A widely accepted benchmark for assessing disparate impact is the "four-fifths rule" of the 1978 Uniform Guidelines on Employee Selection Procedures, 29 C.F.R. Part 1607 ("Guidelines"). The Guidelines explain that "A selection rate for any race, sex, or ethnic group which is less than four-fifths (4/5) (or eighty percent) of the rate for the group with the highest rate will generally be regarded by the Federal enforcement agencies as evidence of adverse impact...." 29 C.F.R. § 1607.4(D). See also Guardians Ass'n of the New York City Police Dep't v. Civil Service Comm'n, 630 F.2d 79, 88 (2d Cir. 1980) ("Guardians IV").

For the candidate population as a whole, the disparate racial impact of the 1982 court officer examinations is far in excess *1094 of the 80% standard. The impact ratios (% minority passing/% white passing) at the cutoff score on the UCO examination were 49.5% for blacks and 50% for Hispanics. For the SCO examination, the pass rate for blacks was 48.9% of the white pass rate, while Hispanics passed at 57% of the white rate. (This may not be a relevant statistic for the SCO selections, since appointments were made from the promotional lists, for which statistics were not provided.) On both examinations, the disparity between white and minority selection rates was worse at higher scores, where the selection cutoffs were actually set. For example, at the score of 60 on the UCO exam, the black impact ratio was 37.7% and the Hispanic ratio 38.1%.
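A minimal sketch of the impact-ratio arithmetic under the four-fifths rule, using the pass rates reported above; the function name is an illustrative choice, not a term from the Guidelines.

```python
def impact_ratio(group_rate, highest_rate):
    """Selection-rate ratio, compared against the Guidelines' 80% benchmark."""
    return group_rate / highest_rate

# UCO exam pass rates at the cutoff score, as reported above.
white, black, hispanic = 0.574, 0.284, 0.287
print(round(impact_ratio(black, white), 3))      # 0.495 -> well below the 0.80 benchmark
print(round(impact_ratio(hispanic, white), 3))   # 0.5

# SCO exam pass rates.
white_sco, black_sco, hispanic_sco = 0.595, 0.291, 0.339
print(round(impact_ratio(black_sco, white_sco), 3))      # 0.489
print(round(impact_ratio(hispanic_sco, white_sco), 3))   # 0.57
```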

It is an altogether different picture, however, regarding the plaintiffs in this action, a class limited to blacks and Hispanics who served as provisional court officers. Not only was the overall pass rate considerably higher for provisionals (81.8% vs. 40.7% or 44.4%), but the racial disparity of results was strikingly diminished. The impact ratio for minorities as a group at the passing score was 80.5% for preference provisionals and 81.4% for non-preference provisionals [with an overall impact ratio of 80.7%]. Moreover, for preference provisionals the exam was strictly pass-fail, thus there was no escalation of disparate impact because of the rank-ordered selection.

Arguably, plaintiffs have not satisfied their initial burden of showing a "significantly discriminatory impact" affecting provisional court officers. The four-fifths rule should not, however, be seen as establishing a bright-line rule applicable to all situations. The Guidelines acknowledge that "[s]maller differences in selection rate may nevertheless constitute adverse impact, where they are significant in both statistical and practical terms...." 29 C.F.R. § 1607.4(D). Particularly in light of the evidence of gross racial disparities as to candidates who were not provisionals, I find that plaintiffs have established a prima facie case of discrimination under Title VII.

 
Job Relatedness

Once disparate racial impact is shown, the issue becomes whether the exam was properly job related, thus rebutting the inference of unlawful discrimination arising from plaintiffs' prima facie showing. As expressed by the Court of Appeals, "[t]he crucial question under Title VII is job relatedness: whether or not the abilities being tested for are those that can be determined by direct, verifiable observation to be required or desirable for the job." Guardians IV, 630 F.2d 79, 93. In the words of the Supreme Court, "[w]hat Congress has commanded is that any tests used measure the person for the job and not the person in the abstract." Griggs, 401 U.S. 424, 436, 91 S. Ct. 849, 856.

The Court of Appeals in Guardians IV distinguished between the relatively straightforward early challenges to employment tests which "were so artlessly constructed that they could be judged invalid without extensive inquiry, fine distinctions or a precise notion of where the line between validity and invalidity was located," and those "second generation" tests, such as the Guardians IV exam, which were constructed with an eye toward the requirements of Title VII and the need to demonstrate validity. 630 F.2d at 88-89. As a result, the Court of Appeals was required to specify with great care the standards for judging test validity. The tests now at issue could be seen as of a "third generation," constructed under the watchful eye of a court appointed Special Master with care taken to meet the specifications set out in the Guardians IV opinion. There is considerable dispute between the parties as to whether that effort was successful.

Before addressing the contentions of the parties, one caveat is in order. As noted by the Court of Appeals, "[t]he study of employment testing, although it has necessarily been adopted by Title VII and related statutes, is not primarily a legal subject." 630 F.2d 79, 89. It is, rather, the province of specialized psychologists. Two consequences flow from this fact. First, the evolution of the subject matter occurs, *1095 to a great extent, independent of the law. Insofar as the law defers to professional standards in this area, it must allow for change, both as to standards and technologies, and not remain frozen at the stage of development associated with some previous case. At the same time, the deference to professional standards is far from total. It is Title VII which governs judicial decisions, not the American Psychological Association. Courts must therefore avoid, as the Court of Appeals cautioned, "too rigid an application of technical testing principles...." Id. at 90.

Both points are relevant to the status of the Uniform Guidelines. In recognition that they reflect expert opinion, they "should always be considered, but they should not be regarded as conclusive unless reason and statutory interpretation support their conclusions." Guardians IV, 630 F.2d at 91. In recognition that the expert opinion embedded in them reflects the state of the art as it existed in 1978, they should be examined with an eye toward developments in professional standards which may have rendered some of their judgments obsolete.

 
Measures of Validity

There are three strategies for assessing the validity[12] of a written examination: content validation, construct validation, and criterion-related validation. In rough outline, a criterion-related validation seeks to determine the validity of a selection procedure by measuring the results of the procedure against ratings of actual job performance. Content validity is demonstrated by showing that the exam measures knowledges, skills, or abilities which are utilized in the job. When the exam seeks to test for more abstract characteristics of applicants, such as intelligence or leadership, the appropriate testing strategy is construct validation. According to the Guidelines, a showing of construct validity requires "empirical evidence from one or more criterion-related studies," to demonstrate that the constructs are "validly related to the performance of critical or important job behavior(s)." 29 C.F.R. § 1607.14(D)(3). The Guidelines concede, however, that at the time of their writing, construct validation was "a relatively new and developing procedure in the employment field, and there is at present a lack of substantial literature extending the concept to employment practices." § 1607.14(D)(1).

Plaintiffs contend that the skills and abilities tested for on the court officer examinations are sufficiently abstract to merit construct validation. Absent a criterion-related validity study, they contend, the examinations simply cannot be considered valid. Plaintiffs' expert, Dr. Richard Barrett, testified that "[n]either MSI nor Dr. Guion [Special Master] has provided any data to show that the constructs tested for on the 1982 examinations are related to performance on the UCO and SCO jobs." He stated further that "the fundamental flaw in MSI's research is that there is no reason that their constructs, however well they are measured, will help the Office of Court Administration to pick better officers." As to the possibility of relying on content validation, Dr. Barrett declared that with few non-pertinent exceptions, "tests of the job content are not suitable for selection at the entry level...."[13]

Defense witnesses disagreed on all counts. Defendants take the position that *1096 the examinations can be sufficiently validated using content validation techniques and that, in any event, the pilot testing involving confirming and disconfirming measures more than adequately satisfied construct validation requirements as they are presently understood. Dismissing plaintiffs' expert as holding "a radical view of validation," David Wagner (MSI's Project Director) testified that the abilities tested for on the examinations are not so abstract and ill-defined as to "require a construct validation strategy based upon a series of criterion-related studies." From the perspective of the Special Master, "the job relevance of the examination had been very well established by an unusually sound job analysis, [and] the procedures for checking on the construct validity at the item level were far superior than those that I have encountered in any other testing situation...." Dr. Guion considered the tests validated through a combination of content and construct validity strategies. Moreover, defendants argue, it was not possible to conduct any criterion-related study because of the non-cooperation of employee unions, the absence of alternative raters, the inability to develop a reasonable measure of performance, concerns about test security, and (as to later validation) concerns that the ratings would be contaminated by knowledge of employees' test scores.

To a considerable extent, the dispute between the parties over approaches to validation was resolved, at least for legal purposes, by the Court of Appeals in Guardians IV. The Court of Appeals noted that the Guidelines "adopt too rigid an approach" and make "too sharp a distinction between `content' and `construct,'" 630 F.2d 79, 92, 93. Instead, the Guardians IV opinion explains, "abilities, at least those that require any thinking, and constructs are simply different segments along a continuum reflecting a person's capacity to perform various categories of tasks." Id. at 93. Content validation should not be abandoned "simply because [the] abilities could be categorized as constructs." Id. The Court of Appeals concluded that "as long as the abilities that the test attempts to measure are no more abstract than necessary, that is, as long as they are the most observable abilities of significance to the particular job in question, content validation should be available." Id. The more abstract the ability, however, the more convincing the validation required. The Court recommended a functional approach, beginning with an assessment in terms of content validity and then, if necessary, proceeding to other analyses.[14] While it appeared to agree that construct validation requires a criterion-related study (630 F.2d 79, 92), the Court of Appeals expanded the category appropriate for content validation.

The police officer examination challenged in Guardians tested for three abilities: the ability to remember details, the ability to fill out forms, and the ability to apply general principles to specific situations. The Court of Appeals stated that those "three basic abilities are not so abstract, on their face, as to preclude content validation" provided that more concrete abilities were not "needlessly omitted from those considered for measurement." 630 F.2d 79, 94. The abilities at issue in the court officer examinations (memory, reading comprehension, and ability to apply information to specific situations) are strikingly similar to those for which the Court of Appeals approved content validation in Guardians IV. Plaintiffs protest that defendants have simply renamed constructs previously referred to as judgment and common sense. Defendants, in turn, contend that it is the proper role of testing experts to refine and operationally define terms for examinations. *1097 This dispute goes more to whether the abilities tested for were actually a significant part of the job than to a facial assessment of the abilities measured. I find that the court officer examinations escape any threshold disqualification of content validation.

 
Application of the Guardians Standards

The Court of Appeals in Guardians IV established five requirements for "an exam with sufficient content validity to be used notwithstanding its disparate racial impact." The examination must (1) be predicated upon "a suitable job analysis," and (2) have been constructed with "reasonable competence." "The basic requirement, really the essence of content validation, is (3) that the content of the test must be related to the content of the job. In addition, (4) the content of the test must be representative of the content of the job. Finally, the test must be used with (5) a scoring system that usefully selects from among the applicants those who can better perform the job." 630 F.2d 79, 95. I consider each requirement in turn.

 
(1) Job analysis

The Court of Appeals and the Guidelines concur that an acceptable job analysis must include "an analysis of the important work behavior(s) required for successful performance and their relative importance...." 29 C.F.R. § 1607.14(C) (2). As set out in detail above (pp. 1087-1090), MSI performed an exhaustive analysis of the court officer jobs, involving observations, surveys and interviews. An extensive task list was constructed and then reviewed by incumbent court officers. After further interviews and surveys, the relative importance of nineteen task families (components) was determined. Similarly, the skills, knowledges, and abilities required for each task were specified and weighed.

The Special Master was very impressed with the quality of the job analysis, praising both its "care and sophistication" in a contemporaneous assessment. Plaintiffs' expert substantially agrees. He testified that the MSI Reports "provide an acceptable analysis of the UCO job. By reading it, one who has had no contact with the job can form a clear picture of what incumbents do on the job. The Appendix also presents data on the frequency, complexity and importance of each of the job functions." On the basis of this rare concurrence in professional judgment, I find that the job analysis was unquestionably "suitable."

 
(2) Test construction

The second requirement is that "reasonable competence" be used in constructing the examination. The construction of the examination in Guardians was criticized by the Court of Appeals for its "haphazard" process of writing questions. The test was constructed by "amateurs in the art of test construction" who "did not have access to the job analysis material during much of the process." 630 F.2d 79, 96. The Court warned that "an employer dispenses with expert assistance at his peril." Id.

The warning was duly heeded by the Office of Court Administration. Not only was the entire process supervised by a highly competent and experienced Special Master, but the OCA selected a professional test development firm on the basis of its extensive experience, despite the fact that it had not submitted the low bid for the project. Again, the details of the test construction are set out above (pp. 1090-1092).

Building on the strong foundation of a solid job analysis, MSI carefully selected those SKAPs which could be validly measured.[15] Test items were written by experienced professionals under the supervision of the Project Director and with full access to the results of the job analysis. Each item was derived from actual materials used by court officers in training or on the job. Once a sufficient number of items were developed, they were pilot-tested on *1098 selected sample populations to measure their difficulty, impact, and validity. The demographics of these sample populations closely paralleled those of the actual candidate pool. To the extent possible, the items reflecting the greatest validity and least impact were chosen for the final examination.

The court officer examinations rank quite high in terms of the technical aspects of test development considered so far.

 
(3) Relatedness of test content

The crux of Title VII's requirement is that "the content of the test must be related to the content of the job." 630 F.2d 79, 95. Plaintiffs' expert, Dr. Barrett, testified that the court officer examinations "use language and test for concepts which are far beyond what the job requires." He argued, further, that the process of refining and operationally defining the SKAPs reported by the subject matter experts effectively severed any connection between the examinations and the job. Memory involves more than remembering a written statement, and there is more to judgment than applying rules to briefly described situations. Thus, he concludes, the examinations "did not measure the SKAPs identified as important for the UCO and SCO jobs by the Subject Matter Experts."

These claims are unconvincing. The careful and exhaustive job analysis performed by MSI and the professional translation of job analysis results into test items provide great assurance that the abilities tested for are actually required for the job. The process of operationally defining identified abilities was in the service of valid measurement, even if it did, to some extent, reduce the scope of the abilities measured. Without the intermediate translation, either the expert panel would have had to identify the SKAPs in the first instance or the court officers would have had to construct the examination themselves. Neither alternative has much to recommend it.

As to the difficulty of the language used in the test items, MSI conducted a readability analysis and found that "94.7% of the sample 404 sentences had a reading level at the high school graduate level or below...." It is worth noting, in this regard, that the items were constructed out of actual materials used by court officers. Dr. Barrett's other contention, that one cannot completely capture an ability such as memory on a written test, is of course true. This is an imperfection tolerated by Title VII, however, which endorses "professionally developed ability test[s]."[16] If no test were permissible unless it completely captured and perfectly measured an attribute, jobs would need to be tested on a minuscule portion of their content, rendering the tests defective for inadequate representation.

The Special Master concluded, in his post-test report, "that the content of the examinations was developed according to procedures, with such care, that the job relevance of the final examinations was assured." I note further that the job relatedness of the examinations is also suggested by the markedly higher pass rate among provisional court officers. Within the parameters of legal tolerance, the court officer examinations must be seen to test adequately for the content (broadly defined) of the jobs in question.

 
(4) Representativeness of test content

The greatest weakness of the court officer written examinations is their limitation to measuring only a small portion of the skills, knowledges or abilities required to perform the jobs. The fact of this limitation is undisputed. Its significance is highly contested. The legal requirement, as stated by the Court of Appeals, is that "the test measure important aspects of the job, at least those for which appropriate measurement is feasible, but not that it measure *1099 all aspects, regardless of significance, in their exact proportions." 630 F.2d 79, 99.

Plaintiffs make much of the fact that the UCO examination tested for only 10% of the job content. "By no stretch of the imagination," Dr. Barrett testified, "can a test that fails even to attempt to measure almost 90% of the job be said to `constitute most of the important parts of the job.'" (quoting Guidelines, 29 C.F.R. § 1607.14(C) (8).) Even if the measurement is perfect, Dr. Barrett argued, it simply focuses on "too little to justify the use of the test on the basis of its content validity...."

This position has some force, but it is overstated. Logically, the test need measure only for those attributes necessary upon entry for successful performance of the job. It makes no sense, as the Guidelines recognize, to test for skills or knowledges to be learned on the job or in training. Thus, while the UCO examination only measured for 10% of the job-related SKAPs, it measured for a far greater proportion of those SKAPs required for entry. David Wagner, MSI Project Director, testified that "Fully 61.7% of the identified UCO SKAPs are knowledges and skills which are either learned in the Court Officer Training Academy or on the job, or both, and thus cannot be tested for on an entrance exam."

Moreover, the court officer written examinations were only part of the battery of tests court officer candidates were required to take. Passing scores were also required on physical ability, psychological and medical tests, in addition to a background check. Once the 61.7% of skills and knowledges acquired after hiring is excluded, the combined representativeness of the various UCO tests is quite high. Of the 38.3% left to be appropriately measured, David Wagner testified that "approximately 10.01% were measured on the written examination and approximately 25.5% were measured on the medical, physical ability and psychological tests and background investigations." Only a small proportion of the SKAPs required at entry were deleted as too trivial or impossible to measure validly.

Even focusing exclusively on the written examinations, it is not the case that OCA and MSI were "seizing on minor aspects of the ... job as the basis for the selection of candidates." Guardians IV, 630 F.2d 79, 99. All aspects of the job for which measurement was appropriate (not learned on the job) and feasible (neither trivial nor incapable of valid measurement on a written exam) were included. No suggestion has been made that any aspects of the jobs were needlessly excluded. Exclusions were made either because the attribute could not be tested fairly or because it would have been unfair to test a skill that was to be learned in training. Inevitably, the representativeness of such an exam will be weaker than its relatedness to job content. With any job, a very significant part of an individual's predictable performance depends on motivational and other intangible factors that are not susceptible to such testing. Given plaintiffs' minimal showing of disparate racial impact (as to the plaintiff class of provisional court officers), and the obvious other strengths of the tests, the restricted focus of the written examinations is not enough, by itself, to invalidate the entire examinations.

Still, the force of plaintiffs' critique cannot be wholly deflected. Even if the written exams were but a part of the total selection procedure, they were, in many ways, the most important part. All other requirements were pass-fail; the rank orderings of the written examination were the basis for selection. The narrow focus of the written examinations does have implications for the appropriateness of the rank ordering system. It overemphasized the importance of a portion of the necessary SKAPs. This is discussed below.

 
(5) Selection system

The selection system for the court officer examinations involved both a cutting score and rank-ordered scoring. Both aspects need be critically examined. The Court of Appeals has endorsed the position of the Guidelines, that where "cutoff scores are *1100 used, they should normally be set so as to be reasonable and consistent with normal expectations of acceptable proficiency within the work force." 29 C.F.R. § 1607.5(H). The Guardians IV court explained that "a criterion-related study is not necessarily required" in order to establish a basis for the cutting score; the employer could rely on "a professional estimate of the requisite ability levels" in determining the score. 630 F.2d 79, 105.

As detailed above (p. 1092), MSI used subject matter experts to establish the cutting score by estimating the expected examination performance of "unacceptable," "acceptable," and "above average" court officers. The adjusted estimates of the 20 SMEs were then combined to set the score. In the view of the Special Master, this was "an empirically sound procedure for establishing passing scores.... [F]ar superior to any other procedure [he had] encountered." Empirical support for the accuracy of the SME predictions can be found in the fact that their combined projection of the mean score on the examination was within 1.15 points of the actual mean.

Justifying a particular cutting score is especially difficult, the Court of Appeals noted, where the scores are closely bunched around the cutting point. On the police officer examination involved in Guardians IV, where the cutting score was set at 94, two thirds of all passing applicants were bunched between 94 and 97. The scores on the court officer examinations were far more evenly distributed. Only one third of passing applicants scored in the analogous range between 56 and 58. Moreover, only 16% of all scores fall within the five-point range surrounding the cutting score on the UCO examination (from 54 to 58), compared to 29% on the Guardians exam, and no single score was obtained by more than 3.4% of the candidates. In addition, the reliability of the court officer examinations (the degree to which they could be expected to produce consistent results in repeated administrations) was quite high, yielding reliability coefficients of .89 or higher on both exams (and for each racial group). Finally, unlike the Guardians IV exam, where the passing score was fixed at the point designed to obtain the precise number of candidates needed to fill the vacancies, and consequently numerous candidates failed whose scores differed insignificantly from those who were hired, here the passing score was fixed by reference to an acceptable level of performance and passed many more candidates than the number needed.

Because I find that the cutting score was established in a professionally valid manner consistent with Title VII, and that, as discussed above, the examinations were otherwise valid, I conclude that the defendants have successfully rebutted the prima facie showing of discrimination as to those plaintiffs who failed the examinations. The validity of the rank-ordering remains an issue, however, as to those provisional candidates who passed the examinations but were not hired because of their relatively low scores.

The distribution of scores and the test reliability are relevant to any evaluation of rank-ordered scoring as well as to the cutting score. In order to use such a procedure in the face of disparate racial impact, an employer must show that "a higher score on a content valid selection procedure is likely to result in better job performance...." Guidelines, 29 C.F.R. § 1607.14(C) (9). The Court of Appeals stated that the necessary relationship between higher scores and better performance "may permissibly rest on an inference," but that where racial disparity increases with the scores, as it does here[17], "the appropriateness of inferring that higher scores closely correlate with better job performance must be closely scrutinized." 630 F.2d 79, 100.

Test validity is a matter of degree. More is required where acceptance or rejection turns on a single point differential than where gross high-low distinctions are used.[18] Evidence of substantial reliability *1101 is necessary to support rank ordering (630 F.2d 79, 101), and it is sufficiently present in this case. Also present is the use of pre-testing to isolate and discard inferior test items. (Defendants also point to the fact that the subject matter experts predicted that an above-average candidate would score 63.14 compared with 55.69 for an acceptable candidate. This adds little, however, as the SMEs could be expected to predict higher scores for persons defined as better performers regardless of the validity of that association.)

The limited representativeness of the exam counsels against the use of strict rank ordering to determine which of the passing candidates will be hired and in what order, particularly on the UCO test. The use of an exam that measured only 10% of the job (or 27.6% for SCOs) was proper by reason of the test developers' recognition that other components of the job were not amenable to graded testing, the importance for the job of the attributes tested, and the fact that other important aspects were not needlessly excluded. Nevertheless, to use single-point differentials on an examination of such limited scope to wholly determine hiring priority over-inflates the importance of those abilities measured. It makes little sense for someone far superior in other aspects of the job that were graded only pass-fail to be hired much later, if at all, because of an insignificantly inferior score on a small portion of the job-related abilities. On the other hand, more substantial differences in test scores might well be of value in predicting job performance and an acceptable basis of distinction.

The Court of Appeals explained that "content validity sufficient to support rank-ordering does not require literal compliance with every aspect of the Guidelines." 630 F.2d 79, 104. But what is required is "a substantial demonstration of job-relatedness and representativeness...." Id.[19] The court officer examinations are unassailable on the first requirement, but inevitably much weaker on the second. I cannot conclude that strict rank ordering is justified on these examinations at the expense of increasingly disparate racial impact.

 
Conclusion

Plaintiffs present no evidence whatsoever that the court officer examinations were used merely as a pretext for discrimination, and it is my view that no such showing could be made. Consequently, to the extent that defendants have successfully shown job-relatedness, they have rebutted plaintiffs' prima facie case. I find that defendants have sufficiently validated their examinations as to all aspects (including the pass-fail break point) other than use of strict rank-ordering among the passing candidates to determine priority. Although some of the content involved in the examinations is located toward the abstract end of the spectrum, the demonstration of validity was sufficiently strong to satisfy the demands of Title VII. This finding disposes of the case as to those black and Hispanic provisionals who failed the examinations (this includes the entire category of preference provisionals, who were appointed regardless of ranking if they achieved a passing score). They are not entitled to any relief.

Plaintiffs who passed the examinations but were discharged because they failed to score high enough may have a valid claim concerning the use of strict rank ordering. As the evidence did not particularly focus on this point, it is not possible to determine how many of the plaintiff class may have been aggrieved. In addition, some of the information supplied on the use of the eligibility lists to fill vacancies is now dated. Further *1102 evidence is necessary to make a final determination of the validity of the rank ordering system used. If it is found invalid, there are obviously various options for relief, including, though not limited to, use of "grosser" rank ordering, retroactive seniority, and/or present appointment.

A conference will be scheduled promptly to discuss the submission of further evidence and possible remedial plans.

NOTES

[1] Plaintiffs first moved to intervene in the Underwood action, described below. In an opinion dated April 28, 1983, Judge Charles Haight of this court denied that motion as untimely.

[2] Plaintiffs did not press their state law claims at trial. I note, however, that the requirement of the New York State Constitution, Art. 5, § 6, that civil service appointments "shall be made according to merit and fitness to be ascertained, as far as practicable, by examination which, as far as practicable, shall be competitive" is less strict, where racial disparities exist, than the dictates of Title VII. Plaintiffs have also failed to present any evidence of intentional racial discrimination to support Fourteenth Amendment claims alleged in the complaint. These allegations are considered withdrawn.

[3] Advanced Research Resources Organization ("ARRO") of Washington, D.C. was selected to develop and validate the physical ability examination and medical standards for the new tests. These aspects of the 1982 selection procedures are not at issue in this case.

[4] The SCO peace officer knowledge questions were not pretested.

[5] A major and a secondary measure were used for each category. For the reading comprehension items, the confirming tests used were the SRA Reading Index and the SRA Writing Skills test. Subtests from the Watson-Glaser Critical Thinking Appraisal and selected portions of the Wechsler Adult Intelligence Scale (Revised) were used to validate the Ability to Apply Information items. MSI constructed a video memory test specifically for this project and used it, together with the Brown-Carlson Listening Comprehension Test, to test the various memory items.

[6] An item was considered difficult if fewer than 20% of candidates answered it correctly, and easy if more than 80% did so; intermediate results were classified as moderate difficulty. Impact ratios were calculated as the percentage of non-whites answering correctly divided by the percentage of whites answering correctly (% non-white/% white). Impact was defined as a ratio of less than 50; ratios between 50 and 79 were treated as intermediate.

To calculate validity, the MSI Project Director and OCA Project Director constructed an ideal profile for each type of question, reflecting the proper degree of correlation between the item and its confirming and disconfirming measures. For example, the ideal profile for a reading comprehension question would show strong correlation with the SRA Reading Index, low correlations with the Minnesota Clerical Test, and moderate correlations with the Watson-Glaser examination. Each item profile was then matched against the ideal profile and rated as either a perfect, good, marginal or poor match, or as flat (non-differentiating) or not computable.
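The thresholds described in this footnote translate directly into a simple classification. The sketch below is illustrative only: the function names and example figures are invented, and only the stated thresholds (20%/80% for difficulty, 50 and 79 for the impact ratio) are taken from the footnote:

```python
# Illustrative item-analysis sketch; example figures are hypothetical.

def difficulty_class(pct_correct):
    """Classify an item by the percentage of candidates answering it correctly."""
    if pct_correct < 20:
        return "difficult"
    if pct_correct > 80:
        return "easy"
    return "moderate"

def impact_ratio(pct_nonwhite_correct, pct_white_correct):
    """(% non-white correct) / (% white correct), expressed on a 0-100 scale."""
    return 100.0 * pct_nonwhite_correct / pct_white_correct

# Example item: 60% of white and 45% of non-white candidates answered correctly.
ratio = impact_ratio(45, 60)          # 75.0 -> falls in the intermediate 50-79 range
print(difficulty_class(52), ratio)    # prints: moderate 75.0
```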

[7] The percentages do not total to 100%. For the UCO test, 0.8% identified themselves as other minorities, and 4.8% did not indicate race. The analogous numbers for the SCO exam were 0.9% and 6.9%.

[8] The adjustments ranged from -14.84 (estimates reduced by 14.84) to +4.76.

[9] The results did not vary significantly between the white and non-white SMEs. Considered independently, the white SMEs yielded a cutting score of 55.72 and the non-white SMEs yielded 56.78.

[10] Item # 76 was deleted because it was judged to require knowledge of "a little known fact...." Item # 94 was deleted as confusing. See MSI Report, Vol. III: p. 35.

[11] Pursuant to a stipulation entered in the Underwood action, applicants who had previously taken the 1977 examination and received permanent appointment under the 1982 examination would be credited with seniority "retroactive to the date defendant OCA would have appointed a person with the same relative standing on [the 1977 examination] but for the restraining order previously issued by this court." (Pltf. Vol. II: Doc. 17-G) Over time, many of these applicants were shifted to the SCO promotional list.

[12] There is an important, though often ignored, distinction between the validity and the job relatedness of a selection procedure. In order to have validity, a test must accurately and reliably measure the attributes (skills, knowledges, abilities) it sets out to measure. Validity is a necessary, but not a sufficient, component of job relatedness. Not only must the questions on a job-related exam actually test for that which they purport to test, but the attributes being tested must also be shown to be significantly related to the job, in that their possession is required for successful performance on the job and they constitute (or are required for) more than a trivial portion of the job.

Although the language used is not always precise, the procedure set out in Guardians IV, discussed below, tests for both validity and job relatedness.

[13] The Guidelines also draw a sharp distinction between content and constructs, declaring that any "selection procedure based upon inferences about mental processes cannot be supported solely or primarily on the basis of content validity." 29 C.F.R. § 1607.14(C) (1).

[14] The Court of Appeals noted that content validation "remains inappropriate for tests that measure knowledge of factual information if that knowledge will be fully acquired in a training program." 630 F.2d 79, 94. This conclusion parallels the position of the Guidelines, that "[c]ontent validity is also not an appropriate strategy when the selection procedure involves knowledges, skills, or abilities which an employee will be expected to learn on the job." 29 C.F.R. § 1607.14(C) (1). The issue does not arise in this case, because of the care taken by MSI to exclude those portions (SKAP clusters # 1 & 2) from the examinations. The inclusion of generic peace officer knowledge on the SCO test does not fall within the prohibition, since one year's service as a peace officer is prerequisite to the SCO job.

[15] Plaintiffs have raised some significant concerns about the representativeness of the exam content. This issue is discussed independently below.

[16] The Court of Appeals has expressly rejected Dr. Barrett's view, which leads inexorably to the conclusion (which he also apparently holds) "that there is no test that can be considered completely valid for any but the most rudimentary tasks." 630 F.2d 79, 89. Even if professionally supportable, it is a conclusion which is legally unacceptable. In the practical world where perfect testing is unattainable, reasonable substitutes must be accepted.

[17] See discussion pages 1093-1094, supra.

[18] The fact that the use of a passing score inevitably makes a distinction at some point based on a single point difference is an unavoidable consequence of graded examinations. As explained by the Court of Appeals, "A cutoff score, properly selected, is not impermissible simply because there will always be some error of measurement associated with it." Guardians IV, 630 F.2d 79, 106.

[19] The EEOC has expressed the similar view that the inference from higher score to better job performance is "easier" the "more closely and completely the selection procedure approximates the important work behaviors...." EEOC, et al., Adoption of Questions and Answers to Clarify and Provide a Common Interpretation of the Uniform Guidelines on Employee Selection Procedures, 44 Fed.Reg. 11,996, 12,005 (Q & A # 62) (March 2, 1979). In this case, the approximation is close, but far from complete.
