People v. Soto (1999)
THE PEOPLE, Plaintiff and Respondent, v. FRANK LEE SOTO, Defendant and Appellant.
(Superior Court of Orange County, No. C-89008, Jean H. Rheinheimer, Judge.)
(Opinion by Baxter, J., expressing the unanimous view of the court.)
Richard Schwartzberg, under appointment by the Supreme Court, for Defendant and Appellant.
Linda Robertson; and Alan Crivaro for California Public Defenders' Association as Amicus Curiae on behalf of Defendant and Appellant.
Daniel E. Lungren and Bill Lockyer, Attorneys General, George Williamson, Chief Assistant Attorney General, Gary W. Schons, Assistant Attorney General, Rhonda Cartwright-Ladendorf, Keith I. Motley, Holly D. Wilkens and Frederick R. Millar, Jr., Deputy Attorneys General, for Plaintiff and Respondent.
We consider again the admissibility of deoxyribonucleic acid (DNA) evidence to prove identity in criminal prosecutions.  In People v. Venegas (1998) 18 Cal. 4th 47 [74 Cal. Rptr. 2d 262, 954 P.2d 525] (Venegas), we recognized the general scientific acceptance of restriction fragment length polymorphism (RFLP) analysis as a means of comparing the DNA in a known sample (e.g., blood from a suspect) with the DNA in a questioned [21 Cal. 4th 515] sample (e.g., blood or semen taken from a crime scene). Venegas further found general scientific acceptance of the modified ceiling principle, recommended for use by the National Research Council (NRC) in 1992, fn. 1 as a forensically reliable method of calculating the statistical probabilities of a match between the evidentiary samples and the DNA of an unrelated person chosen at random from the general population. We determined that calculations made under the modified ceiling approach-which modifies the product rule fn. 2 in such a way as to select random match probability figures most favorable to the accused from the scientifically based range of probabilities-qualify for admission under the Kelly test. fn. 3 (Venegas, supra, 18 Cal.4th at pp. 84-90.)
Venegas left open the question presented in this case: whether evidence of statistical probabilities calculated using the unmodified product rule is admissible at trial in a criminal case to assist the trier of fact in assessing the probative significance of a DNA match. fn. 4 We conclude the trial court and Court of Appeal below correctly determined that the unmodified product rule, as applied in DNA forensic analysis, is generally accepted in the relevant scientific community of population geneticists, and that statistical calculations made utilizing that rule meet the Kelly standard for admissibility. Accordingly, we shall affirm the judgment of the Court of Appeal.
As will be explained, although the court in People v. Barney (1992) 8 Cal. App. 4th 798 [10 Cal. Rptr. 2d 731] (Barney) held that general scientific acceptance of the unmodified product rule as a means of calculating DNA [21 Cal. 4th 516] match probabilities was precluded by a then ongoing dispute among population geneticists over whether "population substructuring" fn. 5 fatally undermines the reliability of those calculations, that dispute has been eclipsed by subsequent important scientific developments, most notably the publication in 1996 of a completely new report by the NRC, entitled The Evaluation of Forensic DNA Evidence (hereafter 1996 NRC Report), which generally recommends use of the unmodified product rule. At present it is clear from the evidence presented in this case, the published scientific commentary, and the clear weight of nationwide judicial authority that use of the unmodified product rule in DNA forensic analysis has gained general acceptance in the relevant scientific community.
II. Factual and Procedural Background
Defendant was charged with committing forcible rape (Pen. Code, § 261) and with having used a knife in the commission of the offense (Pen. Code, § 12022.3).
On November 17, 1989, the victim, a 78-year-old widow, treated her neighbors, Leroy and Alma B., to lunch in celebration of their anniversary. They returned to their respective mobilehomes in the afternoon, where Leroy remained until a few minutes after 5:00 p.m. At that time he heard a scream coming from the direction of the victim's trailer and ran over to see what was the matter. He found the front door closed and could not see any lights on. Walking around the place, he saw only the victim's car parked in the driveway and heard no noise or sounds coming from inside the trailer. Leroy looked down another driveway to Soto's trailer, the next closest mobilehome, saw no one, and returned home because "everything looked normal."
Leroy walked into his kitchen, got a drink of water and saw the victim in her kitchen through his kitchen window. At the same time his phone rang; it was the victim calling from her kitchen phone. All she said was, "I've been raped." The neighbors dropped everything and went to her aid. The victim was very upset, nervous and frightened. She told them she had washed and cleaned herself before they arrived. She explained she had been raped by a man who knocked on her back door. Thinking she recognized the voice (the victim told Alma she had been talking to their neighbor, Frank Soto, about hiring him to do her lawn work that afternoon), she opened the door to a man who thrust a knife at her throat. She screamed; he threatened to kill her if she screamed again. A stocking mask covered his face so she could not see his [21 Cal. 4th 517] features. He told her not to touch her "medic-alert" button, a service the victim had recently acquired. He then pushed her into the bedroom and raped her.
Officer Dennis Gabrielli arrived a short time later. The victim was still quite upset, although not crying. He asked her what had happened. She told him she had been raped. She said she answered a knock on her door, thinking she recognized the voice, but was unable to hear what the voice was saying. She thought it was her neighbor, Soto, because she had talked to him earlier about doing her lawn work. They had discussed what she wanted done on her lawn, and he left around 4:30 p.m. When she heard the knock on the door, followed by a muffled voice, she assumed he had returned. But when she opened the door, a man masked with pantyhose pushed his way inside and waved a knife at her face. He took her into the bedroom where he had intercourse with her. She was afraid he would kill her, and kept her eyes closed after he opened his pants, exposing his penis. She begged him not to hurt her, but he slightly penetrated her and ejaculated a few moments later. He ordered her to lie on the floor. After about five minutes, not hearing anything, she got up, washed herself and went into the kitchen. The back door was closed, which she then locked. She pushed her medic-alert button and followed that with the telephone call to her neighbors.
The victim told Officer Gabrielli she could not identify the rapist because of the mask. She described her assailant as a White male, about five feet nine inches tall and weighing one hundred and seventy pounds, with light or blond hair and an olive complexion, wearing a mask of beige pantyhose. Soto, who is Latino, is 5 feet 10 inches tall and weighs 183 pounds, with a dark complexion and black hair.
The victim was taken to the hospital and examined in the emergency room that evening. She told the doctor her vagina had been penetrated by the man's penis, but that after the assault, she urinated and wiped herself. The doctor analyzed vaginal swabs taken from the victim and found no sperm.
The following day the police seized a bedspread (also described as a comforter) from the victim's bedroom after they exposed it to a black light and found fluorescent areas indicating the presence of semen. A blood sample was obtained from Soto; its DNA and that from the semen stains on the bedspread were submitted to the Orange County Sheriff's Department (OCSD) crime laboratory and found to match. Robert Keister, a criminalist in the OCSD crime laboratory, testified at trial that there was a probability of only 1 in 189 million of finding the same DNA pattern in individuals selected at random from the population represented by the OCSD's Hispanic database. [21 Cal. 4th 518]
A hearing was held concerning the victim's competency to testify at trial; she had suffered a severe stroke in October 1990 which left her barely able to talk. The trial court found the victim incompetent to testify as a witness and told the jury only that she was unable to testify because of an unrelated medical problem. However, her statements about the attack made to the neighbors, the doctor and Officer Gabrielli on the day of the offense were admitted as spontaneous statements or utterances. (Evid. Code, § 1240.)
Defendant testified on his own behalf at trial, denying guilt. He claimed the only time he was in the victim's home prior to the crime was on that same day, about 4:30 p.m., for five or ten minutes. She had invited him in to explain the gardening work she needed in her front yard, offering him compensation, and he had agreed to do the work for $20.
A jury acquitted defendant of both the forcible rape charge and knife use allegation, but found him guilty of attempted rape, a lesser included offense, for which he was sentenced to the middle term of three years in prison (Pen. Code, §§ 264, 664).
[2a] Defendant argued in the trial court and the Court of Appeal that the forensic evidence and expert opinion testimony matching the DNA profile from his blood sample with that extracted from the semen stains found on the bedspread recovered from the victim's bedroom were erroneously admitted in violation of the Kelly standard. (Kelly, supra, 17 Cal. 3d 24.) He focused his challenge, not on the threshold RFLP analysis or on the comparison of his DNA with that found on the bedspread, but on the second stage of the analysis: the population frequency determination. As explained in greater detail below, once analysis and comparison result in the declaration of a "match," the DNA profile of the matched samples is compared to the DNA profiles of other available DNA samples in a relevant population database or databases in order to determine the statistical probability of finding the matched DNA profile in a person selected at random from the population or populations to which the perpetrator of the crime might have belonged. Soto contends that the propriety of using the unmodified product rule to calculate these statistical probabilities is disputed within the scientific community, and therefore such statistical DNA evidence must be excluded from the courtroom under Kelly, supra, 17 Cal. 3d 24.
 Under the Kelly standard, evidence based upon application of a new scientific technique such as DNA profiling may be admitted only after the reliability of the method has been foundationally established, usually by the [21 Cal. 4th 519] testimony of an expert witness who first has been properly qualified. The proponent of the evidence must also demonstrate that correct scientific procedures were used. (Kelly, supra, 17 Cal.3d at p. 30; see also Venegas, supra, 18 Cal.4th at p. 81; People v. Leahy, supra, 8 Cal.4th at p. 594.)
The scientific technique on which evidence is being offered must have gained general acceptance in the particular field to which it belongs. (People v. Brown (1985) 40 Cal. 3d 512, 529 [230 Cal. Rptr. 834, 726 P.2d 516], revd. on other grounds sub nom. California v. Brown (1987) 479 U.S. 538 [107 S. Ct. 837, 93 L. Ed. 2d 934].) However, Kelly "does not demand that the court decide whether the procedure is reliable as a matter of scientific fact: the court merely determines from the professional literature and expert testimony whether or not the new scientific technique is accepted as reliable in the relevant scientific community and whether ' "scientists significant either in number or expertise publicly oppose [a technique] as unreliable." ' [Citations.]" (People v. Axell (1991) 235 Cal. App. 3d 836, 854 [1 Cal. Rptr. 2d 411] (Axell).) " 'General acceptance' under Kelly means a consensus drawn from a typical cross-section of the relevant, qualified scientific community." (People v. Leahy, supra, 8 Cal.4th at p. 612.) In Venegas, supra, 18 Cal.4th at page 84, we held that the statistical calculation phase of RFLP analysis requires Kelly screening to assure both that the statistical methodology used is generally accepted in the scientific community, and that the calculations in the particular case followed correct scientific procedures.
Before turning to the merits of defendant's contention and examining the evidence established at the Kelly hearing below, a brief review of the scientific principles and procedures involved in DNA RFLP analysis will prove helpful.
1. Overview fn. 6
A. DNA Theory and RFLP Analysis
Virtually each one of the trillions of cells in the human body, with the exception of red blood cells, has a nucleus containing the DNA that underlies the person's entire genetic makeup. fn. 7 The DNA is organized into 23 pairs [21 Cal. 4th 520] of homologous chromosomes, 1 chromosome in each pair being inherited from the mother and the other from the father. (1996 NRC Rep., supra, pp. 60-61.) A chromosome is a long DNA molecule in the shape of a spiral staircase. (1992 NRC Rep., supra, p. 33.) "It consists of two parallel spiral sides (i.e., a double helix) composed of repeated sequences of phosphate and sugar. The two sides are connected by a series of rungs, which constitute the steps in the staircase. Each rung consists of a pair of chemical components called bases. There are four types of bases-adenine (A), cytosine (C), guanine (G), and thymine (T). A will pair only with T, and C will pair only with G." (Barney, supra, 8 Cal.App.4th at p. 805.) There are over 3 billion base pairs in the 46 chromosomes of a single human cell. fn. 8 When a cell reproduces, the parallel sides, or strands, of its DNA separate, and the bases of each strand pair off with the complementary bases of a new strand. (1996 NRC Rep., supra, p. 63.)
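As an editorial illustration only (not part of the opinion), the base-pairing rule quoted above, under which A pairs only with T and C only with G, can be sketched in a few lines of Python. The function name and the sample strand are hypothetical, and the sketch pairs bases position by position without modeling strand direction.

```python
# Pairing rule from the quoted passage: A<->T, C<->G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand):
    """Return the bases that would pair, position by position, with `strand`."""
    return "".join(PAIRS[base] for base in strand)

# Hypothetical four-base strand used purely for illustration.
print(complementary_strand("ATCG"))
```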
"A person's individual genetic traits are determined by the sequence of base pairs in his or her DNA molecules. That sequence is the same in each molecule regardless of its source (e.g., hair, skin, blood, or semen) and is unique to the individual. Except for identical twins, no two human beings have identical sequences of all base pairs. [¶] In most portions of DNA, the sequence of base pairs is the same for everyone. Those portions are responsible for shared traits such as arms and legs. In certain regions, however, the sequence of base pairs varies from person to person, resulting in individual traits. A region-or locus-that is variable is said to be polymorphic." (Barney, supra, 8 Cal.App.4th at pp. 805-806.)
The DNA sequences that determine a person's genetic traits are contained in the 50,000 to 100,000 genes making up his or her genetic code. (See 1992 NRC Rep., supra, p. 33.) Human DNA also includes other sequences that are noncoding, i.e., they serve no known genetic function. Compared to the genes, the noncoding sequences are more likely to be polymorphic since their individual variation is less constrained by forces of selection. (Id. at p. 34; 1996 NRC Rep., supra, p. 63.)
Because there is no practical way to sequence all three billion base pairs in a person's DNA, forensic scientists seek to identify individuals through variations in their base-pair sequences at polymorphic DNA locations (loci). [21 Cal. 4th 521] Each variation in sequence is called an "allele." fn. 9 The greatest variations are found at noncoding loci containing "variable number tandem repeats" (VNTR's) in which the same sequence of base pairs is repeated successively for numbers of times that differ from person to person. fn. 10 "This variance is what makes DNA analysis possible. In effect, the lengths of sets of multiple (usually eight) polymorphic fragments (or VNTR alleles) obtained from a suspect's DNA and from crime scene samples are compared to see if any sets match ...." (Barney, supra, 8 Cal.App.4th at p. 806.) In the absence of a nonmatch that conclusively eliminates the suspect as the source of the crime scene sample, each match between alleles from the suspect and from the crime scene may be accorded statistical significance.
"There are three discrete steps in [RFLP] analysis as performed by the FBI ... and by [private laboratories like Cellmark and also the OCSD crime laboratory in this case] ...: (1) processing of DNA from the suspect and the crime scene to produce X-ray films [autorads] which indicate the lengths of the polymorphic fragments; (2) examination of the [autorads] to determine whether any sets of fragments match; and (3) if there is a match, determination of the match's statistical significance." (Barney, supra, 8 Cal.App.4th at p. 806, original italics.)
B. Processing DNA Samples and Generating Autorads
[ ] [For a detailed description of the seven-step process by which RFLP methodology is applied to a DNA sample to produce an autorad, see Venegas, supra, 18 Cal.4th at pages 60 through 62.]
"The location of a band on the X-ray film [autorad] indicates the distance a fragment traveled as a result of electrophoresis, and hence the length of the fragment. The size-marker fragments also appear on the films, enabling measurement of the base-pair lengths of the sample fragments. [¶] The end result of the processing substeps is a picture of a person's DNA pattern ... consist[ing] of a series of bands (usually eight) representative of a few selected bits of DNA ...." (Barney, supra, 8 Cal.App.4th at p. 808.)
C. Interpreting Autorads Through Application of "Match Criteria"
"The second step of DNA analysis is to compare the DNA patterns produced by the processing step in order to determine whether the suspect's [21 Cal. 4th 522] DNA pattern matches the DNA pattern of bodily material found at the crime scene. [¶] First, the patterns are visually evaluated ... to determine whether there is a likely match. Most exclusions will be obvious, since the patterns will be noticeably different." (Barney, supra, 8 Cal.App.4th at p. 808.)
If an exclusion is obvious on any of the autorads, there is a conclusive nonmatch of the samples. Otherwise, "the bands in the patterns are subjected to computer-assisted analysis to determine the length of the represented DNA fragments as measured in base-pair units. The measurements are taken by comparing the bands for the sample fragments with the bands for the size-marker fragments of known base-pair lengths." (Barney, supra, 8 Cal.App.4th at p. 808.)
Because of inherent limitations in the DNA processing system, it is not possible to obtain exact base-pair measurements of the sample DNA fragments. For that reason, forensic laboratories have developed DNA match criteria based on the variations they have experienced in repeated measurements of DNA from the same source. Those criteria determine the "match window"-or range of sizes-constructed around each band for purposes of declaring a "match." For example, under the FBI's match criterion of plus or minus 2.5 percent, the window around a band that measures 1,000 base pairs is from 975 to 1,025 base pairs. If the window of either band, or a single band, on one sample fails to overlap the window of the corresponding band on another sample, there is an exclusion of any match between the samples. If the windows of both bands, or of the single bands, of each sample overlap, there is a match at the locus disclosed by that probe. [The OCSD crime laboratory arrived at its own match criterion by recording variations in its repeated measurements of the same thing. The OCSD's match criterion of 3.4 percent (plus or minus 1.7 percent) was narrower than the FBI's match criterion of 5 percent (plus or minus 2.5 percent).]
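The match-window arithmetic described above can be sketched as follows. This is an editorial illustration, not the laboratories' software: the function names are hypothetical, and the sketch assumes a symmetric window expressed as a plus-or-minus percentage of the measured band size, with a match declared when the two windows overlap.

```python
def match_window(size_bp, plus_minus_pct):
    """Return (low, high) base-pair bounds of the window around a measured band."""
    delta = size_bp * plus_minus_pct / 100
    return (size_bp - delta, size_bp + delta)

def bands_match(size_a, size_b, plus_minus_pct):
    """True if the windows around the two measured bands overlap."""
    low_a, high_a = match_window(size_a, plus_minus_pct)
    low_b, high_b = match_window(size_b, plus_minus_pct)
    return low_a <= high_b and low_b <= high_a

# The opinion's example: plus or minus 2.5 percent around 1,000 base pairs.
print(match_window(1000, 2.5))

# Hypothetical band sizes: the same pair can overlap under the wider
# 2.5 percent criterion yet fail to overlap under the narrower 1.7 percent one.
print(bands_match(1000, 1040, 2.5), bands_match(1000, 1040, 1.7))
```

The second comparison shows why a narrower criterion is the more demanding one: fewer pairs of bands qualify as a match.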
Some conditions adverse to reliability of measurement may call for a determination that a match at that locus is inconclusive. fn. 11 That determination, however, does not invalidate matches at the other loci. There can be a match at multiple loci only if (1) the match criteria are met for all the bands at those loci, and (2) there is no locus at which a match of any band was excluded.
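The decision logic in the preceding paragraph can be summarized in a short editorial sketch. The status labels ("match", "exclusion", "inconclusive"), the locus names, and the function name are all hypothetical; the sketch assumes each locus has already been evaluated against the match criteria.

```python
def evaluate_profile(locus_results):
    """Combine per-locus outcomes into an overall result.

    locus_results maps a locus name to "match", "exclusion", or "inconclusive".
    An exclusion at any locus defeats the match; an inconclusive locus is
    set aside without invalidating matches at the other loci.
    """
    if any(result == "exclusion" for result in locus_results.values()):
        return ("exclusion", [])
    matched = [locus for locus, result in locus_results.items() if result == "match"]
    return ("match" if matched else "inconclusive", matched)

# Hypothetical four-probe comparison with one inconclusive locus.
print(evaluate_profile({
    "locus_1": "match",
    "locus_2": "match",
    "locus_3": "inconclusive",
    "locus_4": "match",
}))
```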
D. Use of Population Databases to Assess Significance of Match
Once a match at multiple loci has been declared, the next step is to determine its statistical significance. (Barney, supra, 8 Cal.App.4th at p. [21 Cal. 4th 523] 809.) Unless a nonmatch between any band of the suspect's DNA and the corresponding band of the questioned sample conclusively eliminates the suspect as the source of that sample, a match of one or more of the suspect's bands with those of the sample places the suspect within a class of persons from whom the sample could have originated. The fact finder's determination of guilt may then turn on the degree of probability that the suspect was indeed the source of the sample. That probability, however, will usually depend, not on the DNA findings alone, but on a combination of those findings together with other, non-DNA incriminating evidence. (See State v. Bloom (Minn. 1994) 516 N.W.2d 159, 162-163.)
The question properly addressed by the DNA analysis is therefore this: Given that the suspect's known sample has satisfied the "match criteria," what is the probability that a person chosen at random from the relevant population would likewise have a DNA profile matching that of the evidentiary sample? fn. 12 That probability is usually expressed as a fraction-i.e., the probability that one out of a stated number of persons in the population (e.g., 1 out of 100,000) would match the DNA profile of the evidentiary sample in question. A greater probability, that is to say, a fraction with a smaller denominator (e.g., 1 out of 10,000), would tend to favor the suspect by increasing the probability that one or more other persons has a DNA profile matching the evidentiary sample.
To assess the probability in question, "the FBI and Cellmark [and the OCSD crime laboratory in this case] calculate how frequently each pair of bands produced by one probe is found in a target population." (Barney, supra, 8 Cal.App.4th at p. 809.) For this purpose, those and other forensic laboratories use one or more population databases containing measurements of the DNA fragments of several hundred persons at each of the loci reached by the probes. fn. 13 The samples from which those measurements are derived come from such varied sources as blood banks, hospitals, clinics, genetics laboratories, and law enforcement personnel. (See 1996 NRC Rep., supra, p. 126.) [21 Cal. 4th 524]
E. Comparing Individual Band Size with Population Database Bands-"Binning"
Like the base-pair measurements of the evidentiary bands, the measurements of bands in a comparative database are, by their nature, inexact. For comparison purposes, therefore, the database bands are sorted into ranges of size called "bins." There are two kinds: "floating bins" and "fixed bins."
A floating bin, constructed for each forensic comparison, is a range of sizes at least as large as the match window, centered on the measured size of the evidentiary band in question. The evidentiary band's frequency, i.e., the probability of its appearing in the DNA profile of a randomly selected member of the population underlying the database, is calculated from the ratio of the number of bands in the bin to the total number of bands in the database for that locus.
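A floating-bin frequency of the kind just described might be computed as in the following editorial sketch. The function name and the ten-band database are hypothetical, and the sketch assumes the bin is exactly as wide as a plus-or-minus percentage window, whereas the text requires only that the bin be at least as large as the match window.

```python
def floating_bin_frequency(evidence_size, database_sizes, plus_minus_pct):
    """Fraction of database bands falling in a bin centered on the evidentiary band."""
    delta = evidence_size * plus_minus_pct / 100
    low, high = evidence_size - delta, evidence_size + delta
    in_bin = sum(1 for size in database_sizes if low <= size <= high)
    return in_bin / len(database_sizes)

# Hypothetical database of ten band sizes (base pairs) at one locus.
database = [900, 980, 1000, 1020, 1100, 1500, 2000, 2500, 3000, 3500]
print(floating_bin_frequency(1000, database, 2.5))
```

Here three of the ten bands fall between 975 and 1,025 base pairs, so the band's estimated frequency is three in ten.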
Fixed bins, on the other hand, compartmentalize the entire spectrum of VNTR base-pair sizes likely to appear as bands on an autorad. The spacing of the fixed-bin boundaries is somewhat uneven because, like the bands in the autorad's sizing-ladder lanes, they are derived from viral DNA that has been exactly measured. A separate fixed-bin table is compiled for each locus in each database. Each database band is entered within the bin that encompasses its base-pair size. To protect a suspect against unduly small frequencies, any bin with four or fewer bands is combined with its neighbor until each bin contains a minimum of five bands. The fixed-bin table shows not only each bin's range of sizes and number of bands, but also each bin's frequency, which is calculated from the ratio of the number of bands in the bin to the total number of bands in the table. (See 1996 NRC Rep., supra, pp. 97, 143; Budowle et al., Fixed-Bin Analysis for Statistical Evaluation of Continuous Distributions of Allelic Data from VNTR Loci, for Use in Forensic Comparisons (1991) 48 Am. J. Hum. Genetics 841, 846 [citing an example in which a table of 31 bins, ranging from 0 to over 12,000 base pairs, was collapsed into a table of 23 bins].)
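The collapsing of sparse fixed bins can be illustrated with a brief editorial sketch. The opinion does not say which neighbor a sparse bin is merged with, so the sketch assumes a sparse bin absorbs the next larger bin (and a sparse final bin is folded into the one before it). The function names and the five-bin table are hypothetical.

```python
def collapse_bins(bins, min_count=5):
    """Merge adjacent bins until every bin holds at least min_count bands.

    bins is a list of (low_bp, high_bp, band_count) tuples sorted by size.
    """
    merged = []
    for low, high, count in bins:
        if merged and merged[-1][2] < min_count:
            # Previous bin is still sparse: extend it to absorb this bin.
            prev_low, _, prev_count = merged[-1]
            merged[-1] = (prev_low, high, prev_count + count)
        else:
            merged.append((low, high, count))
    if len(merged) > 1 and merged[-1][2] < min_count:
        # Final bin is sparse: fold it into its only neighbor.
        _, high, count = merged.pop()
        prev_low, _, prev_count = merged[-1]
        merged[-1] = (prev_low, high, prev_count + count)
    return merged

def bin_frequencies(bins):
    """Attach to each bin its share of all bands in the table."""
    total = sum(count for _, _, count in bins)
    return [(low, high, count / total) for low, high, count in bins]

# Hypothetical table of five bins holding twenty bands in all.
table = [(0, 500, 2), (500, 1000, 4), (1000, 1500, 10), (1500, 2000, 3), (2000, 2500, 1)]
print(bin_frequencies(collapse_bins(table)))
```

Because merging only ever enlarges a bin's band count, the collapsed table never assigns a band a frequency lower than its original bin would have, which is the protective effect the text describes.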
In fixed-bin analysis, the frequency of an evidentiary band is determined by assigning it the frequency of the fixed bin into which its base-pair size falls. Special rules may apply when a window around the evidentiary band overlaps multiple bins. fn. 14
F. Probability of a Random Profile Match-The "Product Rule"
The final task is to calculate the statistical probability that the DNA profile of any one person, selected at random from the relevant population, [21 Cal. 4th 525] would contain all the alleles represented by the measured bands of the evidentiary sample. The most straightforward means of making this calculation is through application of the "product rule." [ ]
The essence of the product rule is the multiplication of individual band probabilities to arrive at an overall probability statistic expressed as a simple fraction, such as 1 in 100,000. The rule is applied in two stages: first, for determining the allelic frequency at each locus, and then, for determining the alleles' combined frequency at all loci. fn. 15 When the evidentiary sample has two bands at a locus, making the donor heterozygous, the frequencies of those bands are multiplied by each other and the result multiplied by two to reflect the fact that each band could have originated from either parent. If there is only one band at the locus, either the donor is homozygous or there is a second allele that for some reason did not appear on the autorad. fn. 16 In order to take both those possibilities into account while avoiding prejudice to the suspect, the frequency of the first band is simply multiplied by two. (See 1992 NRC Rep., supra, p. 78; 1996 NRC Rep., supra, pp. 105-106.)
Finally, under the product rule, the frequencies found at each locus are multiplied together to generate a probability statistic reflecting the overall frequency of the complete multi-locus profile. The resulting statistic will oftentimes be very small. fn. 17
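The two stages of the product rule described above can be sketched as follows. This is an editorial illustration with hypothetical function names and made-up band frequencies; it assumes the per-band frequencies have already been obtained from a population database.

```python
from math import prod

def locus_frequency(band_freqs):
    """Frequency of the band pattern at one locus.

    Two bands (heterozygous donor): 2 * p * q.
    One band: the conservative 2 * p described in the text.
    """
    if len(band_freqs) == 2:
        p, q = band_freqs
        return 2 * p * q
    if len(band_freqs) == 1:
        return 2 * band_freqs[0]
    raise ValueError("expected one or two bands per locus")

def profile_frequency(loci):
    """Multiply the per-locus frequencies to get the multi-locus profile frequency."""
    return prod(locus_frequency(bands) for bands in loci)

# Hypothetical three-locus profile: two heterozygous loci and one single-band locus.
profile = [(0.1, 0.2), (0.05,), (0.1, 0.1)]
frequency = profile_frequency(profile)
print(f"about 1 in {round(1 / frequency):,}")
```

With these invented inputs the per-locus figures are roughly 0.04, 0.1, and 0.02, and their product is on the order of 1 in 12,500. Each additional locus multiplies in another small fraction, which is why multi-locus statistics shrink so quickly.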
G. Effects of Population Substructure-the "Ceiling Principles"
The foregoing application of the product rule to calculate the frequency of a multi-locus profile will produce an accurate result only to the extent that each multiplied frequency is statistically independent from all the others. (See People v. Collins (1968) 68 Cal. 2d 319, 328-329 [66 Cal. Rptr. 497, 438 P.2d 33, 36 A.L.R.3d 1176].)
Population genetics theory teaches that pairs of alleles at the same locus are statistically independent from each other if they are in "Hardy-Weinberg equilibrium." Hardy-Weinberg equilibrium has been defined as "the condition, for a particular genetic locus and a particular population, with the [21 Cal. 4th 526] following properties: allele frequencies at the locus are constant in the population over time and there is no statistical correlation between the two alleles possessed by individuals in the population; such a condition is approached in large randomly mating populations in the absence of selection, migration, and mutation." (1992 NRC Rep., supra, p. 169 [Glossary], italics added; see id. at p. 78; 1996 NRC Rep., supra, pp. 90-92.)
Alleles at different loci are said to be independent if they are in "linkage equilibrium." Alleles are not in linkage equilibrium if "a specific allele at one locus is non-randomly associated with an allele at another locus." (1992 NRC Rep., supra, p. 170 [Glossary]; see id. at p. 78; 1996 NRC Rep., supra, p. 106.)
Generally, the presence of both kinds of equilibrium in a given population depends on the extent to which mating within that population has been at random. If both kinds of equilibrium are not present, application of the product rule in theory may prejudice the suspect by understating the frequency of a profile within particular segments of the population.
Major laboratories that do RFLP analysis, including the FBI and Cellmark [and the OCSD crime laboratory in this case], have developed their own separate population databases for each of several broad racial or ethnic categories such as Caucasian, Black, and Hispanic (see Barney, supra, 8 Cal.App.4th at p. 809), the assumption being that mating among members of any one of those categories of the United States population is sufficiently random to justify using them in conjunction with the product rule to calculate the frequency of a DNA profile. (See 1996 NRC Rep., supra, p. 156.) fn. 18
2. Kelly Hearing in the Municipal Court
[2b] At the preliminary examination, Robert Keister, the criminalist in charge of RFLP analysis at the OCSD crime laboratory, testified that RFLP analysis of DNA samples from defendant's blood and from a semen stain on the seized bedspread showed matches at four DNA loci. As previously indicated, he calculated there was a probability of 1 in 189 million of finding that same DNA pattern in individuals selected at random from the population represented by the OCSD's Hispanic database. fn. 19
The principal issue at the preliminary examination was the admissibility of the DNA evidence under Kelly, supra, 17 Cal. 3d 24. Defendant stipulated [21 Cal. 4th 527] to general scientific acceptance of the OCSD crime laboratory's RFLP procedures up through production of the autorads from which the sizes of the DNA samples (alleles) were measured. During the 11 days of Kelly hearing testimony, held from July 30 to November 15, 1991, the parties elicited extensive DNA testimony from 3 prosecution experts: the criminalist, Robert Keister, and two human population geneticists, Ranajit Chakraborty and Bruce Kovacs. Each testified to the reliability of the product rule in frequency determinations involving human populations, and to the validity of the work, both analytical and mathematical, performed by the OCSD laboratory. Both sides also introduced voluminous exhibits. The defense called no witnesses at the Kelly hearing in the municipal court.
Dr. Chakraborty, a recognized authority in human population genetics involving RFLP, fn. 20 was routinely asked by DNA laboratories around the country to review their work and validate their databases used for generating probability statistics in RFLP analysis of DNA profiles. In his opinion the OCSD crime laboratory was the most careful in terms of validating its databases. He testified that application of the unmodified product rule was proper in the OCSD crime laboratory's analyses of probability estimates-all of the assumptions justifying use of the rule were supported by the laboratory's databases, including both Hardy-Weinberg equilibrium (independence of alleles at a locus) and linkage equilibrium (independence of probes across multiple loci).
Dr. Chakraborty used a series of tests to review and examine the DNA databases, calculations and extrapolations, taking particular note of published criticisms by Drs. Lewontin and Hartl, who, along with Dr. Laurence Mueller and Dr. Eric Lander, had questioned using the product rule because of fears that population substructuring might negatively affect the accuracy of such applications. The OCSD crime laboratory had sent a computer [21 Cal. 4th 528] diskette of its entire database, involving DNA profiles from over 1,200 persons, to Dr. Chakraborty for analysis and verification of its validity. Using computer programs, he concluded the unmodified product rule could be "reliably applied" in calculating probability estimates from such data. fn. 21
Dr. Chakraborty also testified that the OCSD crime laboratory's fixed-bin method of calculating a probability estimate (similar to that used by the FBI laboratory) was "conservative." Several characteristics of the method guard against underestimation, and often lead to overestimation favorable to the defendant, of the actual allele frequencies. The fixed bins are generally much wider than the actual ability of the laboratory to distinguish between alleles. The bins are "collapsed" into larger bins until each bin has a minimum of five occurrences. For apparent single-banded patterns, a conservative 2p frequency calculation is used ("p" being the probability of the occurrence of a single band at a given locus) instead of the p-squared standard Hardy-Weinberg calculation. And if an allele falls close to a bin boundary, the higher probability of the two adjacent bins is used in calculating a probability estimate.
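Two of the conservative features described in this testimony can be illustrated with a minimal sketch. The function names and the sample frequencies below are hypothetical and are not drawn from the record; the sketch only shows why a 2p figure is never smaller than the p-squared figure, and how the higher of two adjacent bin frequencies would be selected.

```python
def single_band_frequency(p: float, conservative: bool = True) -> float:
    """Frequency assigned to an apparent single-band pattern at one locus.

    conservative=True  -> 2p (the figure described as conservative)
    conservative=False -> p squared (standard Hardy-Weinberg figure)
    """
    return 2 * p if conservative else p ** 2


def boundary_frequency(freq_lower_bin: float, freq_upper_bin: float) -> float:
    """For a band close to a bin boundary, use the higher adjacent bin frequency."""
    return max(freq_lower_bin, freq_upper_bin)


p = 0.05  # hypothetical bin frequency, for illustration only
print(single_band_frequency(p))         # 2p
print(single_band_frequency(p, False))  # p squared
print(boundary_frequency(0.03, 0.07))   # higher of the two adjacent bins
```

Because p is at most 1, 2p is always at least as large as p squared, so the reported pattern appears more common, which favors the defendant.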
Dr. Bruce Kovacs was a professor at the University of Southern California School of Medicine. He was involved in genetic research relevant to specific human DNA mutations both at the University of Southern California and in the Department of Medical Genetics at the City of Hope Medical Center. fn. 22 He was also an investigator for the human genome mapping project and had been a peer reviewer for the American Journal of Human Genetics and several other scientific journals. Kovacs testified he examined the OCSD [21 Cal. 4th 529] crime laboratory's work in this case and concluded the laboratory's RFLP analysis, use of the product rule, and calculation of probability frequencies would all be generally accepted in the human population genetics community. Specifically, that scientific community would accept the initial 1-in-189-million probability estimate (see ante, fn. 19 at pp. 526-527) for Soto's 4-probe banding profile, based on the Orange County Hispanic database, as reliable.
At the conclusion of the preliminary examination, the magistrate ruled that RFLP methodology for forensic DNA analysis is generally accepted by the relevant scientific community, that the methodology was correctly followed in this case by the OCSD crime laboratory, and that therefore Keister's testimony of the very small probability of a random match with the DNA found in defendant's blood and in the semen stain would be admissible in evidence. Accordingly, defendant was ordered held to answer in superior court.
3. Further Kelly Hearing in the Superior Court
By stipulation and order, all testimony and exhibits introduced at the preliminary examination were made part of the trial record for purposes of considering the admissibility of DNA evidence at trial. Meanwhile, in Axell, supra, 235 Cal. App. 3d 836, a published California appellate decision for the first time upheld trial court rulings admitting RFLP analysis of DNA samples as evidence of the identity of the perpetrator of a crime. (See post, at pp. 535-536.) After the Axell decision became final, the present trial court ruled that Axell had settled the general scientific acceptance of such analysis, but that the prosecution would still have to establish that correct scientific procedures were used in the present case. Moreover, the defense would be permitted to show changes in the views of the scientific community after the Axell findings. fn. 23
To show that correct scientific procedures had been used, the prosecution introduced testimony of criminalist Keister and Dr. Kovacs, both of whom had also testified at the preliminary examination. To show post-Axell changes in scientific views, the defense called two population geneticists, Drs. Laurence Mueller and William Shields.
Dr. Mueller, an ecologist and population geneticist, was an associate professor in the Department of Ecology and Evolutionary Biology at the [21 Cal. 4th 530] University of California, Irvine. He worked primarily with populations of fruit flies. In his opinion there was no consensus in the scientific community in support of the forensic RFLP techniques used in this case. He testified that the variations in probability estimates obtained using different Hispanic databases (i.e., those of the OCSD and FBI) (see post, at p. 533) reflected population substructuring that made it "incorrect" to apply the unmodified product rule to the OCSD database.
In Mueller's opinion there was no existing database that could properly be used in conjunction with the product rule to determine a correct probability estimate in defendant's case. He criticized the OCSD database as failing to reflect ethnic subgroups that he believed account for substructuring. In his opinion, selecting a database from the geographic region where the crime occurred and the potential perpetrators reside (e.g., using the OCSD crime laboratory's Hispanic database in this case) was less important than developing separate databases from the separate "ancestral populations" that live in places like Cuba, Mexico, Spain and Central America, in order to reflect the genetic background of the particular defendant. To his mind, the relevant comparison was not between the DNA profiles of the matched samples and those of persons residing in the area where the crime occurred (i.e., the OCSD crime laboratory's databases), but between "the ethnicity of the person involved" and persons in "ethnicity-defined" databases. (But cf. fn. 27, post, at p. 532.)
Mueller testified that, as an alternative to the unmodified product rule, he would use the "1/database method" of calculating a probability estimate, together with a correction for "the rate at which the laboratory makes false positives," because he preferred a rule that did not make the assumptions of Hardy-Weinberg and linkage equilibrium. Under that method (also known as the "counting method"), Mueller would use a fraction equalling 1/the number of samples in the database as the probability estimate for the four-probe DNA profile. In other words, Mueller would simply count how many people in the database matched the four-banded pattern. Assuming no match in the database, the probability would be reported as a fraction equalling 1/the number of samples in the database (modified by a confidence interval). In the present case, this would result in a probability estimate of approximately 1/250, since there were approximately 250 samples in the OCSD crime laboratory's Hispanic database.
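The arithmetic of the counting method as described can be sketched as follows. This is an illustration only: the function name and the placeholder profiles are hypothetical, and the confidence-interval and false-positive adjustments mentioned in the testimony are omitted because the record as summarized here does not specify them.

```python
def counting_method(database_profiles, target_profile):
    """Counting (1/database) estimate: matches found divided by database size.

    If no profile in the database matches, report 1/N, as the testimony describes.
    """
    n = len(database_profiles)
    matches = sum(1 for profile in database_profiles if profile == target_profile)
    return max(matches, 1) / n


# Hypothetical database of 250 profiles, none matching the target.
database = [("profile", i) for i in range(250)]
estimate = counting_method(database, ("profile", "target"))
print(f"1 in {1 / estimate:.0f}")
```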
Dr. William Shields, a professor at the State University of New York College of Environmental Science and Forestry, who had done work in population and conservation genetics, also testified for the defense. Shields's work mostly involved the study of small mammals; he had never worked [21 Cal. 4th 531] directly with RFLP analysis of human DNA samples. Shields believed that "representative samples," not random samples, were necessary for a valid database. He suggested that obtaining representative samples would involve sampling not only the frequency variance among subpopulations, but also the frequency with which the subpopulations occur in the total population. In his opinion, use of the 1/database or counting method was the "most reliable if you're going to put a number in at all."
On February 10, 1992, the trial court concluded that correct scientific procedures had been used and that there had been no material change in scientific acceptance of statistical evidence generated by the unmodified product rule since the filing of the Axell decision. Accordingly, the court ruled that the incriminating DNA evidence produced by the OCSD's crime laboratory met the requirements for admissibility set forth in Kelly, supra, 17 Cal. 3d 24.
4. DNA Evidence Introduced at Trial
Criminalist Keister testified he prepared four autorads in this case, each depicting VNTR fragments at a different DNA locus. The loci examined by the OCSD laboratory were D1S7, D2S44, D4S139, and D10S28, situated on chromosomes 1, 2, 4, and 10 respectively. (See Venegas, supra, 18 Cal.4th at p. 68, fn. 24 [explaining locus numbering system].) fn. 24 Keister found that the size differences between the eight bands of the known sample and those of the questioned sample on each of the four loci were 1.01 percent or less, well within the OCSD crime laboratory's match criterion of 3.4 percent.
Keister then determined the statistical probability that alleles numerically indistinguishable from the matched bands would appear in the databases of DNA collected from representative populations. The OCSD had created those databases by obtaining blood samples that the Red Cross collected from its Orange County blood bank and identified as Hispanic, Caucasian, Black, or Asian. fn. 25 The OCSD then performed RFLP analysis on each sample, calculating the base-pair sizes of the bands at each of the four DNA loci at which comparisons were to be made. For each locus, there was constructed a fixed-bin table, initially consisting of 31 bins, or ranges of [21 Cal. 4th 532] sizes, based on the 31 fragments of known length in the marker lanes customarily included in the autorads. The database bands were sorted into the bins by base-pair size. Any bin that contained less than five bands was "rebinned" by combining it with its neighbors until each bin contained a minimum of five bands. All bands in each bin were deemed to have a common frequency, calculated from the ratio of the number of bands in the bin to the number of bands in the entire table. (See Venegas, supra, 18 Cal.4th at pp. 64-65.)
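The rebinning step described above can be sketched as follows. The opinion does not say in which direction sparse bins were merged, so the left-to-right merge order here is an assumption, as are the function names and the sample counts.

```python
def rebin(edges, counts, minimum=5):
    """Merge adjacent bins (left to right) until each holds at least `minimum` bands.

    edges  : sorted bin boundaries in base pairs, len(counts) + 1 entries
    counts : number of database bands falling in each original bin
    Returns (new_edges, new_counts).
    """
    new_edges = [edges[0]]
    new_counts = []
    running = 0
    for upper, count in zip(edges[1:], counts):
        running += count
        if running >= minimum:
            new_edges.append(upper)
            new_counts.append(running)
            running = 0
    if new_edges[-1] != edges[-1]:
        # Leftover bins at the top end: fold them into the last merged bin.
        if new_counts:
            new_counts[-1] += running
            new_edges[-1] = edges[-1]
        else:
            new_counts.append(running)
            new_edges.append(edges[-1])
    return new_edges, new_counts


def bin_frequencies(counts):
    """Common frequency for every band in a bin: bin count over total bands."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical five-bin table, for illustration only.
edges, counts = rebin([0, 1000, 2000, 3000, 4000, 5000], [2, 4, 7, 1, 6])
print(edges, counts, bin_frequencies(counts))
```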
Keister testified his initial comparison of the DNA profile of defendant's blood sample was with the OCSD Hispanic database of about 250 individuals, because the frequency in the database from the suspect's ethnic group "will generally be the most frequent number." As noted, defendant's sample had produced eight bands, two at each locus. Keister determined the frequency of each individual band in the Hispanic database from the fixed-bin tables, then used the unmodified product rule to ascertain the statistical probability of finding all combined eight bands in any one person in the population represented by that database. fn. 26 In that manner, he calculated the probabilities of finding defendant's DNA profile in the populations underlying four Orange County databases as follows: (1) Hispanic: 1 in 189 million; (2) Caucasian: 1 in 38 million; (3) Black: 1 in 807 million; (4) Vietnamese: 1 in 177 million. Keister also calculated the probabilities of finding defendant's 8 bands in individuals from 4 populations underlying databases published by the FBI, as follows: (1) Southwest Hispanic (Texas): 1 in 55 million; (2) Southeast Hispanic (Florida): 1 in 2.3 billion; (3) U.S. Black: 1 in 2.4 billion; and (4) U.S. Caucasian: 1 in 3 billion. fn. 27
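The product rule calculation described above can be sketched as follows, assuming the standard Hardy-Weinberg figure of 2pq for a two-band pattern at a single locus and multiplication of the per-locus figures across the four loci. The band frequencies are hypothetical and are not the figures from the OCSD or FBI databases.

```python
from math import prod


def locus_frequency(p: float, q: float) -> float:
    """Hardy-Weinberg frequency of a two-band pattern at one locus: 2pq."""
    return 2 * p * q


def profile_frequency(band_pairs) -> float:
    """Unmodified product rule: multiply the per-locus frequencies together."""
    return prod(locus_frequency(p, q) for p, q in band_pairs)


# Hypothetical bin frequencies for the two bands at each of four loci.
bands = [(0.05, 0.08), (0.10, 0.04), (0.06, 0.07), (0.03, 0.09)]
freq = profile_frequency(bands)
print(f"about 1 in {1 / freq:,.0f}")
```

Even moderately common individual bands yield a very small combined figure once eight frequencies are multiplied, which is why the independence assumptions underlying the rule were the focus of the dispute.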
Keister also made a calculation using floating bins, instead of fixed bins, to measure the probability of finding defendant's profile in the Orange [21 Cal. 4th 533] County Hispanic database. fn. 28 He sorted all the database fragments, in the order of their numerical size, into a table for each of the four DNA loci. At the two points on each table where the sizes of defendant's bands would place them, Keister superimposed a floating bin, ranging from 3.4 percent more to 3.4 percent less than the size of defendant's band, and counted the number of fragments within that bin. The frequency of the band was calculated by dividing the number of fragments in the bin by the total number of fragments in the table. Applying the product rule to those results, Keister calculated the frequency of defendant's eight-band pattern in the database at 1 in 6.7 billion. He explained that the size of the floating bin-plus or minus 3.4 percent-was based on the OCSD laboratory's match criteria. He said the fixed bins are generally wider than the floating bins, use of which results in a "less frequent number."
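The floating-bin count described above can be sketched as follows. The function name and the fragment sizes are hypothetical, and treating fragments exactly at the window edges as inside the bin is an assumption the opinion does not address.

```python
def floating_bin_frequency(database_sizes, band_size, window=0.034):
    """Fraction of database fragments within +/- `window` of the band's size.

    window=0.034 mirrors the plus-or-minus 3.4 percent match criterion.
    """
    low = band_size * (1 - window)
    high = band_size * (1 + window)
    hits = sum(1 for size in database_sizes if low <= size <= high)
    return hits / len(database_sizes)


# Hypothetical fragment sizes (base pairs) for one locus.
sizes = [1000, 1020, 1050, 1500, 2000, 2040, 2100, 3000, 3100, 4000]
print(floating_bin_frequency(sizes, 1030))
```

The per-band frequencies obtained this way would then be combined with the product rule in the same manner as the fixed-bin frequencies.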
Dr. Bruce Kovacs, who, like Keister, had testified in the preliminary examination, testified as a DNA witness at trial. As noted, Kovacs was a professor of medicine at the University of Southern California Medical School and did research in medical genetics, using RFLP analysis to study human gene mutations. (Ante, at p. 528.) When asked his opinion on the significance of the variations between the population frequencies calculated from the various OCSD and FBI databases, he testified, "They [the denominators] are astronomically large numbers. The significance of whether something is 1 in 55 million or 1 in 110 million versus 1 in 4 billion is something that I can't really get my hands on in a real concrete way to distinguish that difference. It's a very, very, very rare event." Dr. Kovacs also expressed his view that Richard Lewontin and Daniel Hartl's published criticism of the unmodified product rule-see Lewontin and Hartl, Population Genetics in Forensic DNA Typing (Dec. 20, 1991) 254 Science 1745-1750 (Lewontin and Hartl article)-was based on a theoretical model that was not demonstrated by actual data.
The DNA witnesses called by the defense at trial were Laurence Mueller and William Shields, each of whom had testified at the further Kelly hearing in the trial court (ante, at pp. 530-531), and Seymour Geisser, director of the School of Statistics at the University of Minnesota.
Mueller, the University of California, Irvine biology professor who also testified in Venegas (see 18 Cal.4th at pp. 90-91), repeated the criticisms to which he had testified at the reopened Kelly hearing, i.e., that the unmodified product rule failed to reflect the distorting effects of population substructuring. (Ante, at p. 530.) He testified he would have used the more conservative [21 Cal. 4th 534] 1/database or counting method for generating a probability frequency in this case. Shields, the State University of New York College of Environmental Science and Forestry professor who has done work in population and conservation genetics with small mammals, likewise repeated much of his testimony given at the reopened Kelly hearing. (Ante, at p. 531.)
Seymour Geisser, a professor at the School of Statistics at the University of Minnesota, is a biostatistician, not a population geneticist. In his opinion, the unmodified product rule should not have been used to calculate a probability estimate from the OCSD crime laboratory's database because statistical independence (i.e., Hardy-Weinberg equilibrium and linkage equilibrium) was "suspect" in the database. He thought at least two and possibly three out of the four probes used in this case were not in Hardy-Weinberg equilibrium, and that use of the 1/database or counting method of calculating a probability frequency would have been more appropriate.
The prosecution then presented two rebuttal DNA witnesses: Drs. David Goldman and Kenneth Kidd.
Dr. Goldman testified his laboratory at the National Institutes of Health employed the unmodified product rule in its genetic research into neuropsychiatric disorders, particularly alcoholism. He was aware of the debate among some scientists over whether the rule should be used; it centered on whether Hardy-Weinberg equilibrium and linkage equilibrium are met before the rule is applied in frequency determinations. He was aware of the research indicating there is independence of alleles, i.e., that the necessary assumptions regarding equilibrium support use of the product rule to calculate frequency determinations. He was also aware of the OCSD crime laboratory's database and believed it was sufficiently large to properly apply the product rule in its frequency determinations.
Dr. Kidd was a professor of human genetics, psychiatry and biology at Yale University, a director of the human genome mapping project, and director of the Yale University DNA research laboratory. He acknowledged the debate regarding probability estimates. His opinion was that any greater precision was impossible; the product rule could be applied because the frequency determination derived from it was still as accurate as any determination could be. Although the greater the database the greater the certainty of the estimate, any difference in estimates over one in a million was pragmatically meaningless. Finally, empirical data proved there was no significant substructuring; the larger the database, the more uniform the distribution of gene size became. Like Kovacs, he was of the view that [21 Cal. 4th 535] Lewontin and Hartl's published article was primarily a hypothesis with no empirical data to support it. fn. 29
The case was submitted to the jury on March 9, 1992, and the verdict finding defendant guilty of the lesser included offense of attempted rape was returned on March 11, 1992.
On April 29, 1992, defendant filed a notice of motion for new trial. Attached to his points and authorities was an advance copy of the 1992 NRC Report, which had been released on April 14, 1992. A supporting affidavit by Laurence Mueller asserted that the report represented a significant shift in the academic community's views on the product rule, and that there was no longer a consensus on the subject within that community. At the hearing on the motion on May 22, 1992, the defense called Mueller to testify on the significance of the 1992 NRC Report as newly discovered evidence, and the prosecution called Keister in opposition.
Mueller testified that, applying the modified ceiling method described in the 1992 NRC Report, he calculated the Soto frequency at 1 in 182,000. He used fixed bins and a combination of the following databases: OCSD Hispanic, Caucasian, and Asian; and FBI Texas Hispanic, Florida Hispanic, and Sioux Indian.
Keister testified he had recalculated the Soto frequencies using the 1992 NRC Report's modified ceiling method and the OCSD databases and got the following frequencies: using floating bins: 1 in 5 million; using fixed bins: 1 in 500,000.
Upon conclusion of the testimony, the motion for a new trial was denied, the trial court commenting that the 1992 NRC Report did not undercut the court's obligation to follow Axell, which had endorsed use of the unmodified product rule under the Kelly standard.
5. Axell and Barney
Axell, supra, 235 Cal. App. 3d 836, decided in October 1991, was the first California appellate decision to confirm the general scientific acceptance of [21 Cal. 4th 536] RFLP DNA analysis within the meaning of Kelly, supra, 17 Cal. 3d 24. After a detailed Kelly hearing during which numerous experts (including Drs. Kenneth Kidd and Laurence Mueller) testified over a six-month period in 1989, the Axell court determined that all three phases of RFLP DNA analysis performed by the private laboratory Cellmark-processing, matching, and statistical calculation through utilization of the product rule-had achieved scientific acceptance under the Kelly standard. (Axell, supra, 235 Cal.App.3d at p. 868.) On the issue of population substructuring, the court concluded the testimony of the prosecution experts had overcome defense fears about the lack of statistical independence essential to the proper application of the unmodified product rule. (Id. at pp. 856-868.) "Any question or criticism of the size of the database or the ratio pertains to weight of the evidence and not to its admissibility." (Id. at p. 868.)
Barney, supra, 8 Cal. App. 4th 798, was decided in August 1992. In the months between the filing of the Axell and Barney decisions, two significant events occurred.
The first was the publication of a pair of articles contained in the December 20, 1991, issue of the journal Science. The article by Harvard University Professor Richard C. Lewontin and Washington University Professor Daniel L. Hartl (Lewontin & Hartl article, supra, 254 Science 1745; see ante, at pp. 527, 533) attacked the failure of DNA statistical calculation analysis to account for population substructuring. In a rebuttal article appearing in the same issue, Drs. Chakraborty and Kidd, both of whom testified in this case, defended the practice of performing statistical calculations of probability estimates without regard to substructuring. (Chakraborty & Kidd article, supra, 254 Science 1735; see ante, at p. 535, fn. 29.)
The second was the NRC's fn. 30 publication of its first report on DNA profiling, which appeared in April 1992. (1992 NRC Rep., supra; see ante, at p. 515, fn. 1.) Although the report did not constitute an outright rejection of the unmodified product rule, it acknowledged that the effect of population substructuring was controversial. Rather than attempt to resolve the controversy, the 1992 NRC Report assumed "for the sake of discussion" that substructuring might have a significant impact, and suggested the ceiling principles as a means of modifying the product rule to ensure that DNA probability estimates would be sufficiently conservative to account for substructuring. (1992 NRC Rep., supra, at pp. 12-15, 79-85.) [21 Cal. 4th 537]
These intervening publications, reasoned the Barney court, undermined Axell's conclusion that sufficient scientific consensus had been reached regarding the insignificance of the effects of population substructuring on calculations made with the unmodified product rule. The Barney court noted that although the briefing before it predated publication of the Science articles and the 1992 NRC Report, the briefing raised the same concerns regarding substructuring as had been discussed in the recent literature. (Barney, supra, 8 Cal.App.4th at p. 816.) Discussing Lewontin and Hartl's claim that "contrary to the assumption of random mating, ethnic subgroups within each database tend to mate endogamously (i.e., within a specific subgroup) with persons of like religion or ethnicity or who live within close geographical distance," the court noted its concern that the resulting substructuring, if not taken into account, could skew statistical calculations based on the available databases utilizing the product rule. (Barney, supra, 8 Cal.App.4th at p. 815.) The court then noted that Chakraborty and Kidd strongly disagreed with Lewontin and Hartl, contending the latter exaggerated both the extent of endogamy and the effect of substructuring. (Ibid.)
The Barney court also referred to an article in the same issue of Science introducing the Lewontin-Hartl and Chakraborty-Kidd articles, characterizing the debate between them as "bitter" and "raging," and citing another population geneticist as agreeing that the then current statistical methods " 'should not be used without more empirical data.' " (Barney, supra, 8 Cal.App.4th at pp. 815-816.) The court further noted that the 1992 NRC Report acknowledged the existence of a " '[s]ubstantial controversy' " concerning those methods of statistical analysis. (8 Cal.App.4th at p. 819; see 1992 NRC Rep., supra, at pp. 74-75.) Based on these observations, the Barney court concluded, "Whatever the merits of the prior decisions on the statistical calculation process-including Axell-the debate that erupted in Science in December 1991 changes the scientific landscape considerably, and demonstrates indisputably that there is no general acceptance of the current process.... Simply put, Axell has been eclipsed on this point by subsequent scientific developments. In reaching a conclusion different from that in Axell, we do not express disagreement with Axell's reasoning at the time, but rather have progressed to a point on the continuum of scientific debate which neither the Axell court nor the two trial courts in the present cases could have anticipated." (Barney, supra, 8 Cal.App.4th at pp. 820-821.)
6. Significant Developments Subsequent to Barney
Although the Barney court was correct in its observation that substantial debate was ensuing at that time over the effects, if any, of population [21 Cal. 4th 538] substructuring on probability calculations made using the unmodified product rule, it also appears that no empirical data then existed either supporting or disproving theories that postulated a substantial impact of substructuring upon DNA forensic analysis. (See Chakraborty & Kidd article, supra, 254 Science at p. 1735, cited in Armstead v. State (1996) 342 Md. 38 [673 A.2d 221, 237]; see also Kaye, DNA Evidence, supra, 7 Harv. J.L. & Tech. at p. 168 [stating that "[t]here is very little evidence, and certainly no scientific consensus, that the impact [of substructuring] is substantial in any known population"].)
Several developments since the filing of Barney indicate that the controversy over population substructuring and use of the unmodified product rule has dissipated.
First, in 1993, the FBI conducted an extensive, worldwide study of VNTR frequency data. (See IA Federal Bur. Investigation, U.S. Dept. Justice, VNTR Population Data: A Worldwide Study (1993) (FBI Report).) The FBI study concluded that population frequency calculation using the unmodified product rule was reliable, valid and meaningful, and free of any forensically significant consequences resulting from population substructure as had been postulated by some scientists. (Ibid.; see also Lindsey v. People (Colo. 1995) 892 P.2d 281, 294 [citing FBI Rep.]; State v. Copeland (1996) 130 Wn.2d 244 [922 P.2d 1304, 1319].)
Second, in 1994, Dr. Eric Lander, a former leading opponent of the unmodified product rule, co-authored an article in which he declared that the "DNA fingerprinting wars are over." (See Lander & Budowle, DNA Fingerprinting Dispute Laid to Rest (Oct. 27, 1994) 371 Nature 735, 735 (Lander and Budowle article).) In the article, the authors state that the 1992 NRC Report "failed to state clearly enough that the ceiling principle was intended as an ultra-conservative calculation, which did not bar experts from providing their own 'best estimates' based on the product rule." (Lander and Budowle article, supra, at p. 737; see also State v. Copeland, supra, 922 P.2d at p. 1319.) Lander and Budowle further stated that the FBI's laboratory maintained a "remarkable" database, and opined that "observed variation is modest for the loci used in forensic analysis and random matches are quite rare, supporting the notion that the FBI's implementation of the product rule is a reasonable best estimate." (Lander & Budowle article, supra, at p. 738; see also Lindsey v. People, supra, 892 P.2d at p. 293.) Finally, the authors emphasized the convergence of scientific opinion concerning human population genetics statistics, noting that Budowle was one of the principal creators of the FBI's DNA program, and Lander was an [21 Cal. 4th 539] early critic asserting the lack of scientific standards in DNA-typing and was on the NRC Committee. They suggest in the article, "it is fair to say that we represent the range of scientific debate." (Lander & Budowle article, supra, at p. 735.)
Third, and of greatest significance, in 1996 the NRC reexamined the methodology issue and concluded that use of the ceiling principle for forensic purposes is unnecessary, not only because the principle from a scientific standpoint overstates the actual effect of population substructuring, but also because of the current abundance of data regarding different ethnic groups within the major races. (1996 NRC Rep., supra, at pp. 156-159.) The 1996 NRC Report reaffirms the conclusion of the 1992 NRC Report-that properly conducted DNA tests produce highly reliable results, and that DNA analysis, including the application of statistical probabilities, is generally accepted in relevant scientific communities. (1996 NRC Rep., supra, at pp. 2-4.) The NRC Report explicitly approves use of the product rule in calculating match frequencies. (Id. at p. 122 ["Recommendation 4.1: In general, the calculation of a profile frequency should be made with the product rule"].)
Subsequently, Kaye, DNA, NAS, NRC, RFLP, DAB, PCR, and More: An Introduction to the Symposium on the 1996 NRC Report on Forensic DNA Evidence, an introduction to the Symposium on the 1996 NRC Report comprised of six articles, was published in 37 Jurimetrics J. 395 (1997). As pointed out in Professor Kaye's introduction, the authors are "prolific commentators on DNA evidence" (id. at p. 403), most of whom "have served as expert witnesses for defendants" (id. at p. 403, fn. 52). Moreover, the articles were selected by a process that tended to give preference to criticism, rather than approval, of the 1996 NRC Report. (37 Jurimetrics J. at p. 400, fn. 38.) The authors "emphasize different points, and they do not all reach the same conclusions" (id. at pp. 403-404), but it is noteworthy that none of the articles expresses disagreement with the 1996 NRC Report's general approval of the product rule, and two of the critical symposium contributors expressly concede that the product rule/population substructure issue has indeed been laid to rest. (See Thompson, Accepting Lower Standards: The National Research Council's Second Report on Forensic DNA Evidence (Summer 1997) 37 Jurimetrics J. 405, 423 [population structure studies have "tipped the balance of scientific opinion in favor of the product rule (or something close to it)"]; Lempert, After the DNA Wars: Skirmishing With NRC II (Summer 1997) 37 Jurimetrics J. 439, 455 [Lack of Hardy-Weinberg and linkage equilibria "seems hardly to matter. Empirical studies suggest that conservatism in estimating allele frequencies in the first instance can more than make up for any prejudice an accused suffers from the untenability of the assumptions."].) [21 Cal. 4th 540]
A majority of jurisdictions have acknowledged these developments-which include the 1993 FBI Report (a worldwide population study), the Lander and Budowle article, and most significant, the 1996 NRC Report-and have concluded that the controversy over population substructuring and use of the unmodified product rule has been sufficiently resolved. (See, e.g., Com. v. Blasioli (1998) 552 Pa. 149 [713 A.2d 1117] [Pennsylvania; Frye test]; State v. Freeman (1997) 253 Neb. 385 [571 N.W.2d 276] [Nebraska; Frye test]; Armstead v. State, supra, 673 A.2d 221 [Maryland; Frye test and statute authorizing admissibility of RFLP statistical evidence]; State v. Copeland, supra, 922 P.2d 1304 [Washington; Frye test]; People v. Miller (1996) 173 Ill. 2d 167 [219 Ill.Dec 43, 670 N.E.2d 721] [Illinois; Frye test]; State v. Marcus (1996) 294 N.J.Super. 267 [683 A.2d 221] [New Jersey; Frye test]; State v. Morel (R.I. 1996) 676 A.2d 1347 [Rhode Island; Daubert test]; Lindsey v. People, supra, 892 P.2d 281 [Colorado; Frye test]; State v. Dinkins (1995) 319 S.C. 415 [462 S.E.2d 59] [South Carolina; Daubert test]; State v. Weeks (1995) 270 Mont. 63 [891 P.2d 477] [Montana; Daubert test]; State v. Anderson (1994) 118 N.M. 284 [881 P.2d 29] [New Mexico; Daubert test]; State v. Futrell (1993) 112 N.C.App. 651 [436 S.E.2d 884] [North Carolina; Daubert test]; People v. Chandler (1995) 211 Mich.App. 604 [536 N.W.2d 799] [Michigan; Frye test]; Taylor v. State (Okla.Crim.App. 1995) 889 P.2d 319 [Oklahoma; Daubert test].)
Moreover, extensive literature in peer-reviewed journals has accumulated in support of the conclusion that population substructuring does not impact significantly upon DNA population frequency estimates and that use of the unmodified product rule is appropriate to estimate probabilities of a random match. fn. 31 (See, e.g., Kaye, DNA Evidence, supra, 7 Harv. J.L. & Tech. at pp. 126, fn. 113, 129-130, 161 [citing scientific journals espousing the view that [21 Cal. 4th 541] statistical tests demonstrate the independence of VNTR alleles and arguing that "suitably computed and presented match-binning frequencies and probabilities pass muster under conventional rules of evidence"]; Com. v. Blasioli, supra, 713 A.2d at p. 1126 [citing scientific journal articles]; State v. Copeland, supra, 922 P.2d at p. 1319 [same]; Armstead v. State, supra, 673 A.2d at pp. 238-239 [same].)
It is important to note that the relevant question in this case is not whether some population substructuring exists, but whether the deviations it induces have an appreciable effect upon the relative frequency of the particular highly variable alleles selected for DNA profiling. (See Com. v. Blasioli, supra, 713 A.2d at p. 1125, fn. 21; Kaye, DNA Evidence, supra, 7 Harv. J.L. & Tech. at p. 169.) None of the experts, so far as we are aware, believes that population substructuring has absolutely no effect on frequencies obtained under the unmodified product rule. They conclude instead that when, as in the present case, the probabilities of a random match are very rare-one in the multimillions or billions-substantial variations in such frequencies have no practical significance. (See, e.g., 1996 NRC Rep., supra, at pp. 34, 112, 150-151, 156.) We have no occasion in this case to consider whether substructuring could be a cause of material variation in much higher frequencies, e.g., one in several hundred.
It is clear from the evidence in the record, the clear weight of judicial authority, and the published scientific commentary, that the unmodified product rule, as used in the DNA forensic analysis in this case, has gained general acceptance in the relevant scientific community and therefore meets the Kelly standard for admissibility. [21 Cal. 4th 542]
The judgment of the Court of Appeal is affirmed.
George, C. J., Mosk, J., Kennard, J., Werdegar, J., Chin, J., and Brown, J., concurred.
FN 1. See National Research Council (1992) DNA Technology in Forensic Science (hereafter 1992 NRC Report) at pages 91-93.
FN 2. The product rule states that the probability of two events occurring together is equal to the probability that the first event will occur multiplied by the probability that the second event will occur. (See Kaye, DNA Evidence: Probability, Population Genetics, and the Courts (1993) 7 Harv. J.L. & Tech. 101, 127-128 (hereafter Kaye, DNA Evidence); Freund & Wilson, Statistical Methods (1993) p. 62.) Coin-tossing is illustrative-the probability of two successive coin tosses resulting in "heads" is equal to the probability of the first toss yielding heads (50 percent) times the probability of the second toss yielding heads (50 percent), or 25 percent. (See Johnson, Elementary Statistics (4th ed. 1984) p. 143.)
FN 3. See People v. Kelly (1976) 17 Cal. 3d 24, 30 [130 Cal. Rptr. 144, 549 P.2d 1240] (Kelly) and Frye v. United States (D.C. Cir. 1923) 293 F. 1013 [54 App.D.C. 46, 34 A.L.R. 145] (Frye). In Daubert v. Merrell Dow (1993) 509 U.S. 579 [113 S. Ct. 2786, 125 L. Ed. 2d 469], the high court held, as a matter of federal jurisprudence, that Frye had been superseded by the Federal Rules of Evidence. The foundational requirement for admission of new scientific evidence in California is now referred to as the Kelly test or rule. (People v. Leahy (1994) 8 Cal. 4th 587, 612 [34 Cal. Rptr. 2d 663, 882 P.2d 321].)
FN 4. In Venegas, the issue was not reached because the trial court had rejected admissibility of statistical evidence calculated under the unmodified product rule. (18 Cal.4th at p. 94.)
FN 5. Population substructuring is thought to occur when a certain subsection of the population, for example, a racial or ethnic group, mates predominantly within that same subsection, resulting in the interrelation of certain genetic traits.
FN 6. The material in this part (pp. 519-526) has been adopted from our opinion in Venegas, supra, 18 Cal.4th at pages 58 through 67, with appropriate deletions and additions. Subheadings have been retained. Brackets together in this manner [ ], without enclosed material, are used to denote our deletions from the opinion in Venegas; brackets enclosing material are used to denote our additions. Footnotes that have been retained are sequentially renumbered.
FN 7. Though missing from red blood cells, DNA abounds in white blood cells. (See 1996 NRC Rep., supra, at p. 66, fn. 4.)
FN 8. Individual sperm and egg cells have only 23 chromosomes that upon conception are paired with 23 chromosomes from the mate. A man's semen sample, however, normally contains a large quantity of sperm cells that collectively include all 46 chromosomes.
FN 9. In genetics, "allele" usually means an alternate form of a gene on one of a pair of chromosomes at a particular locus. In forensic analysis, the term is expanded to include an alternate form of any base-pair sequence. (1996 NRC Rep., supra, p. 214.)
FN 10. "VNTR regions are not genes, and our interest in them is solely related to their use for identifying individuals." (1996 NRC Rep., supra, p. 65.) "[S]ome VNTRs might have disease associations [citation], [but] [t]hese are not used in forensics ...." (Id. at p. 71.)
FN 11. [In Venegas,] the FBI conservatively reported an inconclusive match at one of the four loci under its policy of refraining from measuring bands of more than ten thousand base pairs because it has concluded the reliability of measurements above that size is open to question.
FN 12. The evidentiary sample targeted for comparison with the population can be a suspect's sample, a questioned sample, or a combination of both. [ ] Barney appears to equate the "statistical significance of a match" with "how unlikely it is that the crime scene samples came from a third party who had the same DNA pattern as the suspect." (8 Cal.App.4th at p. 809, italics added.) On the other hand, the 1996 NRC Report (supra, pp. 142-144) recommends use of the DNA profile of the questioned sample to calculate the probability of a random match in the pertinent population. It should be kept in mind, however, that at that point in the analysis, the questioned and known suspect samples will themselves already have been declared a "match," and therefore the fragment sizes used in the calculation cannot vary beyond the bounds of the match criterion.
FN 13. Thus, the measurements are necessarily made on autorads prepared with the same restriction enzyme and the same probes that are used for testing the evidentiary samples.
FN 14. The rules typically relate to (1) how to determine what bin or bins are overlapped by the appropriate match window, and (2) the consequences of an overlap of multiple bins. [ ] The 1992 NRC Report recommended that the bands in the overlapped bins be added together (supra, p. 86), but the 1996 NRC Report rejects that recommendation as "excessively cautious" (supra, p. 144).
FN 15. The "frequency" of one or more alleles is the statistical probability that it or they will be found in the DNA of a randomly selected member of the population from which the database is derived.
FN 16. Reasons for this nonappearance may be, for example, that the size of the second allele is unusually small or large or is so close to that of the first allele that their two bands are indistinguishable. (See 1996 NRC Rep., supra, p. 69.)
FN 17. For example, if the evidentiary sample were to have two alleles at each of four loci, each allele having a frequency of 1 out of 10 (0.1), application of the product rule would produce a frequency of 1 out of 6.25 million (.00000016).
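The arithmetic in this footnote can be checked with a short illustrative sketch. It assumes that a two-banded pattern at a locus with allele frequencies p and q has an expected frequency of 2pq, and that the four loci are statistically independent, so that the per-locus figures may simply be multiplied. The function name and the use of exact fractions are choices made here for illustration and are not drawn from the record.

```python
from fractions import Fraction


def profile_frequency(loci):
    """Multiply per-locus genotype frequencies (unmodified product rule).

    `loci` is a list of (p, q) allele-frequency pairs, one pair per locus.
    Each two-banded locus is assumed to contribute 2*p*q, and the loci are
    assumed to be independent; both are assumptions of this sketch.
    """
    total = Fraction(1)
    for p, q in loci:
        total *= 2 * p * q
    return total


# Footnote example: two alleles at each of four loci, each allele at 1 in 10.
tenth = Fraction(1, 10)
freq = profile_frequency([(tenth, tenth)] * 4)

print(freq)         # 1/6250000
print(float(freq))  # 1.6e-07
```

Each locus contributes 2 x 0.1 x 0.1 = 0.02, and 0.02 multiplied across four loci gives .00000016, or 1 in 6.25 million, which agrees with the figure stated in the footnote.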
FN 18. Conversely, the laboratories do not use a single interracial United States database, presumably because the incidence of random mating between members of the different racial categories is deemed low enough to preclude use of the product rule to calculate an overall frequency statistic for the United States population as a whole.
FN 19. This is the frequency to which Keister testified at trial. His initial calculation of 1 in 214 million, introduced at the preliminary examination, was revised to 1 in 189 million after the OCSD crime laboratory added some more samples to its database and he ran further tests on the augmented database.
FN 20. Dr. Chakraborty is a well-known expert in human population genetics. Dr. Kovacs described Dr. Chakraborty as "one of the major league players in human population genetics." Defense expert Dr. Mueller agreed that Dr. Chakraborty was a "renowned" and well-respected human population geneticist. Chakraborty worked at the Center for Demographic and Population Genetics, University of Texas Graduate School of Biomedical Sciences, in Houston, Texas. Throughout his career he had published 219 peer-reviewed journal articles, 38 book chapters and invited articles, and 89 scientific abstracts. Over 100 of his publications dealt with issues of both human and nonhuman population genetics. At the time of trial he was serving on the editorial boards of more than a dozen scientific journals, in which capacity he peer-reviewed articles submitted by other scientists for publication. His research was funded by agencies such as the National Institutes of Health, the National Science Foundation, and the National Institute of Justice. He had performed data analysis for many DNA laboratories, including the FBI laboratory at Quantico, Virginia.
FN 21. Chakraborty performed several different statistical analyses before reaching his conclusion that the OCSD crime laboratory's database permitted use of the Hardy-Weinberg calculation. When the number of double-banded patterns seen in a database is lower than what would be expected under the assumptions of Hardy-Weinberg equilibrium, there is said to be a "heterozygous deficiency" in the database. Only one of the loci in one of the databases showed any heterozygous deficiency (i.e., less than the Hardy-Weinberg predicted number of two-banded patterns). That heterozygous deficiency involved one particular locus in the Orange County Caucasian database, not the Hispanic database. Even as to that one locus, Chakraborty was "firmly convinced" this "[did] not necessarily imply population substructure" as the cause of the observed deficiency, since there are "other factors" that can account for an apparent heterozygous deficiency "other than substructuring." After additional analysis, he concluded a hypothesis that substructuring had caused the deficiency "cannot be validated."
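The comparison described in this footnote, between the observed number of two-banded patterns and the number predicted under Hardy-Weinberg assumptions, can be sketched as follows. This is a simplified illustration only: it treats alleles as discrete categories with known frequencies, the locus data are hypothetical, and the function names are invented here. It does not reproduce the statistical analyses Chakraborty actually performed, which the record does not describe in this detail, and a shortfall computed this way would not by itself show substructuring.

```python
def expected_heterozygotes(allele_freqs, n_individuals):
    """Expected count of two-banded patterns under Hardy-Weinberg proportions.

    The expected homozygous fraction is the sum of squared allele
    frequencies; everything else is expected to be two-banded.
    """
    homozygous_fraction = sum(p * p for p in allele_freqs)
    return n_individuals * (1 - homozygous_fraction)


def heterozygote_shortfall(observed, allele_freqs, n_individuals):
    """Positive result: fewer two-banded patterns observed than expected."""
    return expected_heterozygotes(allele_freqs, n_individuals) - observed


# Hypothetical locus: four equally frequent alleles, 200 individuals sampled,
# of whom 140 show a two-banded pattern.
freqs = [0.25, 0.25, 0.25, 0.25]
expected = expected_heterozygotes(freqs, 200)
shortfall = heterozygote_shortfall(140, freqs, 200)

print(expected)   # 150.0
print(shortfall)  # 10.0
```

In practice a shortfall of this kind would be assessed with a formal significance test, and, as the footnote notes, factors other than substructuring can account for an apparent deficiency.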
FN 22. The use of RFLP DNA analysis is widespread in the medical academic community; it has applications in a variety of medical, biomedical and biological fields. Dr. Kovacs's published research included studies on the use of RFLP technology, including multiple-locus probes, to examine genetic changes within human tumors. He was very familiar with the product rule since his work in human population genetics utilized various statistical techniques, including the product rule, to analyze data to determine mutation rates, disease frequencies, and the frequencies of specific genes within human populations.
FN 23. The trial court presumably was relying on the following holding of Kelly, supra, 17 Cal.3d at page 32: "[O]nce a trial court has admitted evidence based upon a new scientific technique, and that decision is affirmed on appeal by a published appellate decision, the precedent so established may control subsequent trials, at least until new evidence is presented reflecting a change in the attitude of the scientific community."
FN 24. In Venegas, the FBI laboratory examined the same first three loci, using D17S79 instead of D10S28 as the fourth. (Venegas, supra, 18 Cal.4th at p. 68, fn. 24.) Both laboratories used restriction enzyme Hae III to cut the DNA into fragments. (Id. at p. 60.)
FN 25. The Asian samples were collected from a variety of sources. Samples identified specifically as Japanese, Chinese, Korean, or Vietnamese were obtained from Southern California medical clinics serving a high proportion of those ethnic groups.
FN 26. Keister pointed out in his testimony that the information sought from these calculations is not the likelihood of finding the DNA pattern in the particular individuals who happen to comprise the database, but rather the frequency with which the pattern would appear if one "looked at ... thousands and thousands of samples" in the population from which the database was derived. The calculations "do not estimate the frequencies of profiles in the current population," but "refer to infinite populations and need not be limited to the reciprocal of the population size." (See Weir, Invited Editorial: The Second National Research Council Report on Forensic DNA Evidence (1996) 59 Am. J. Hum. Genetics 497.)
FN 27. Keister's use of all these databases in his calculations reflected an objective of finding the probabilities of a random match in databases representing all possible perpetrators. Even though defendant is Hispanic, a possible perpetrator other than defendant could have belonged to some other ethnic group. (See 1996 NRC Rep., supra, at p. 122 ["Recommendation 4.1: ... If the race of the person who left the evidence-sample DNA is known, the database for the person's race should be used; if the race is not known, calculations for all racial groups to which possible suspects belong should be made...."].)
FN 28. "A floating bin, constructed for each forensic comparison, is a range of sizes at least as large as the match window, centered on the measured size of the evidentiary band in question." (Venegas, supra, 18 Cal.4th at p. 64.)
FN 29. As explained below, Kidd, along with Chakraborty, had co-authored an article that appeared in the same issue of Science as did the Lewontin and Hartl article and defended the practice of performing statistical calculations of probability estimates without regard to substructuring. (Chakraborty & Kidd, The Utility of DNA Typing in Forensic Work (Dec. 20, 1991) 254 Science 1735 (Chakraborty and Kidd article).)
FN 30. The NRC is a private, nonprofit society of distinguished scholars that is administered by the National Academy of Sciences, the National Academy of Engineering and the Institute of Medicine. The NRC formed the Committee on DNA Technology in Forensic Science to study the use of DNA analysis for forensic purposes, resulting in the issuance of the 1992 report.
FN 31. In a supplemental brief, defendant contends that the Kelly principle does not allow courts to refer to scientific literature to show that a new scientific technique is generally accepted, and that Kelly permits such reference only to establish the absence of general acceptance. Though disagreement in scientific writings has repeatedly played an important role in judicial conclusions that general acceptance was lacking (e.g., People v. Shirley (1982) 31 Cal. 3d 18, 55 [181 Cal. Rptr. 243, 723 P.2d 1354]; Kelly, supra, 17 Cal.3d at pp. 35-36; Huntingdon v. Crowley (1966) 64 Cal. 2d 647, 656 [51 Cal. Rptr. 254, 414 P.2d 382]; Barney, supra, 8 Cal. App. 4th 798; People v. Law (1974) 40 Cal. App. 3d 69, 75, 84-85 [114 Cal.Rptr. 708]), nothing in our case law precludes judicial reliance on writings to support a conclusion of general acceptance (People v. Shirley, supra, 31 Cal.3d at p. 56 [court considered writings as "evidence" of general scientific "acceptance vel non"]; People v. Palmer (1978) 80 Cal. App. 3d 239, 252-254 [145 Cal. Rptr. 466, 1 A.L.R.4th 1056] [general acceptance established by apparent unanimity of opinion in scientific journals].) Moreover, allowing the courts to consider scholarly literature solely to reject, and not to uphold, scientific acceptance would undermine the salutary effects of Kelly's holding that published appellate affirmation of general scientific acceptance controls subsequent trials (17 Cal.3d at p. 32). In a context of rapidly changing technology, every effort should be made to base that controlling effect on the very latest scientific opinions, including those published during the appellate phase of the case.
It remains true that "[o]ften ... the technical complexity of the subject matter will prevent lay judges from determining the existence, degree, or nature of a scientific consensus or dispute without the interpretive assistance of qualified live witnesses subject to a focused examination in the courtroom. It is for this reason that Kelly ... properly emphasizes the record made in the trial court." (People v. Brown, supra, 40 Cal.3d at p. 533.) With that in mind, we declined in Venegas, supra, 18 Cal.4th at page 94, to undertake our own study of the scientific literature as a basis for reconsidering a claim of scientific acceptance that had been rejected by the trial court and Court of Appeal. Here, however, the rationale for general acceptance was amply explained by extensive scientific testimony, accepted by the trial court and Court of Appeal. It then became our task to ascertain the extent of agreement with that rationale. While there may have been merit in defendant's warning against our uncritical acceptance of the 1996 NRC Report before we could gauge response from the scientific community, that response is now available and has properly been an important factor in our conclusion.