United States v. Raymond, 337 F. Supp. 641 (D.D.C. 1972)
February 2, 1972
UNITED STATES v. Albert RAYMOND and Roland Addison.
United States District Court, District of Columbia.
John F. Evans, Asst. U. S. Atty., Washington, D. C., for United States.
Jean Dwyer, Washington, D. C., for defendant Raymond.
Leonard I. Rosenberg, Washington, D. C., for defendant Addison.
GASCH, District Judge.
This matter came on for consideration on the government's motion to introduce into evidence voice spectrograms, the opposition noted and the hearing conducted pursuant thereto. As indicated below, this Court concludes that such evidence is admissible. Since this is a case of first impression in this jurisdiction, the Court believes that it may be helpful to set forth the basic facts in this case as well as the circumstances which, in the Court's view, justify the admission of this evidence.
Defendants were charged with shooting Sergeant Ronald Wilkins, a member of the Metropolitan Police Department, as he responded to the radio dispatch of a telephone call made to police headquarters falsely reporting a Signal Thirteen, or policeman in trouble. The Metropolitan Police Department maintains a twenty-four hour tape of all incoming calls, and for the purposes of this case, re-recorded the phone call which brought Wilkins to the scene of the alleged ambush. After the defendants were arrested based upon Wilkins' identification, with counsel present, each defendant read the statements made by the caller into a tape recorder. The recorded samples were then forwarded, along with the tape of the April 9 telephone call, to Lt. Ernest Nash, a voice identification technician with the Michigan State Police Department, Lansing, Michigan. Lt. Nash then made spectrograms from each of the tapes supplied by defendants and compared them with *642 the spectrogram he made of the Signal Thirteen. On the basis of this comparison, Lt. Nash concluded that the phone call made to police headquarters which led to the shooting of Sergeant Wilkins was made by defendant Albert Raymond. On December 15, 1971, expert testimony was elicited to determine whether or not the spectrograms or "voiceprints" were admissible as evidence, and after considering that testimony, this Court concludes that the spectrograms may be admitted.
The voice spectrogram, which is produced by a spectrograph machine, is a visual record of human speech. In substance, the spectrograph machine consists of (1) a magnetic recording device, (2) a variable electronic filter, (3) a paper-carrying drum that is coupled to the magnetic recording device, and (4) an electronic stylus that marks the paper as the drum rotates. Spectrograms thus produced can be compared point for point to determine if any significant similarities exist. This is precisely what was done by Lt. Nash in the case at bar when he compared the unknown voice of the caller to the known voices of the defendants.
Spectrography first gained prominence in the scientific community some ten years ago through the pioneer study of Lawrence G. Kersta, a scientist at the Bell Laboratories. However, despite the strong claims made by Kersta for the reliability of voice identification through spectrogram analysis, few acoustical scientists shared his confidence. E. g., Ladefoged and Vanderslice, The "Voiceprint" Mystique, reprinted from Working Papers in Phonetics, Dep. Linguistics, U.C.L.A., November, 1967. Significantly, however, the reservations of the scientific community were not based upon a belief that spectrography could not produce the results claimed by Kersta, but that his experiment had not demonstrated with sufficient certainty the reliability of spectrogram analysis. E. g., Bolt, et al., Identification of a Speaker by Speech Spectrograms, Science, Vol. 166, October 17, 1969. This view was most prominently expressed by the Committee on Speech Communication of the Acoustical Society of America in Speaker Identification by Speech Spectrograms: A Scientist's View of its Reliability for Legal Purposes, 47 Journal of the Acoustical Society of America, 597 (1970).
Not surprisingly, the skepticism of the scientific community was reflected in the attitude expressed by the courts concerning the use of spectrogram analysis as evidence. While the United States Military Court of Appeals upheld the admissibility of expert testimony based upon sound spectrograph identification, United States v. Wright, 17 U.S.C.M.A. 183, 37 C.M.R. 447 (1967), other courts refused to permit its use. See, People v. King, 266 Cal. App. 2d 437, 72 *643 Cal.Rptr. 478 (1968); State v. Cary, 53 N.J. 256, 250 A.2d 15 (1969). It was not until the most recent reported case on the subject, State ex rel. Trimble v. Hedman, Minn., 192 N.W.2d 432, 11/26/71, which is factually quite close to the case at bar, that an appellate court, the Supreme Court of Minnesota, upheld the admission of spectrogram evidence to corroborate an identification made by the human ear alone. Trimble, supra, at 441. Significantly, that ruling was based upon the latest scientific information available, supplied by Dr. Oscar Tosi, Professor of Audiology and Speech Sciences at Michigan State University. It is on the basis of the extensive Tosi study, his testimony in open Court, and the opinions expressed by other experts, that this Court concludes spectrogram analysis is admissible evidence.
The real import of the Tosi study is that it remedies the two major defects of the Kersta study. First, Kersta was criticized for using a heterogeneous sampling of unknown voices, i.e., the spectrograms used represented speakers with different accents and of different ages and backgrounds, a fact that made it easier to differentiate between speakers. Tosi, on the other hand, used a homogeneous sampling of 250 students at Michigan State University, each of whom was carefully screened by Tosi's associates from a group of over 25,000 students. Thus, the 250 selected each spoke what is referred to as non-accented, or General-American, English, had no noticeable speech defects, were all male undergraduate students, and ranged in age from 19 to 34. The second major criticism of the Kersta experiment was that it was conducted using only closed testing groups, i.e., the spectrogram of the unknown voice was always included in the group of known voices being used. Thus, in the Kersta test, all the examiner had to do was find the sample in the known group of spectrograms that most closely matched the spectrogram of the unknown voice in order to make an "identification." Dr. Tosi, mindful of this defect, set up both open and closed experiments, i.e., in the open tests, the examiners were told that the spectrogram of the unknown voice may or may not be among the spectrograms of the known speakers. Thus, if the spectrogram of the unknown speaker did not match one of the known speakers' spectrograms, no identification would be made. This "open" situation more closely parallels the actual situation confronted by law enforcement officials when making voice identifications, in that the voice spectrograms of various suspects may or may not include that of the unknown caller.
After two years of experimentation and nearly 35,000 separate voice identification trials, Dr. Tosi concluded that voice identification through spectrogram analysis has "a definite usefulness in the investigation of crime." This conclusion is more than supported by the actual results reported. While the total reported percentage of false identifications of an unknown speaker as a known speaker was approximately six percent, and the total percentage of failure to identify an unknown speaker as a known speaker was between ten and twelve percent, these figures do not reflect the full degree of reliability established. The examiners were compelled to draw a conclusion in each case, whether they felt that conclusion to be accurate or not. Each examiner had to rate his or her degree of confidence in each determination on a scale containing four degrees of certainty. Thus, when the *644 cases in which an examiner expressed uncertainty as to his or her conclusion are deducted from the number of misidentifications, the margin of error is only about two percent. In an actual forensic situation, an experienced examiner like Lt. Nash will only make an identification when he feels a high degree of certainty. For example, out of some 1,250 examinations performed by Nash in which spectrograms of an unknown speaker were compared to those of a known speaker, Nash made only about 180 positive identifications, eliminated positively about 450 and would not make a definite decision in the remaining 620 some odd comparisons. This is one of the significant factors which led Dr. Tosi to state that the possibility of Nash making a mistaken identification is "negligible."
Moreover, as Dr. Tosi's testimony indicates, there exist a number of other factors which lead to the conclusion that the Tosi study had a lesser degree of success than should be expected in the type of analysis utilized by Lt. Nash. First, Tosi's examiners were all students who were only given a one-month training period before they performed their assigned tasks. Nash, who studied under Mr. Kersta, and was called by Dr. Tosi one of the five or six most expert examiners in the country, has been with the Michigan State Police Department for over fifteen years, has inspected over 50,000 voice spectrograms, and has been qualified as an expert in sound spectrography on numerous occasions in various courts. Second, Dr. Tosi's student examiners were allowed to spend a maximum of fifteen minutes per identification, while Nash testified that it took him a few hours to make the voice identification of Raymond. Third, Tosi's student examiners, as stated above, were required to make a match, and in the open tests, either make a match or state that the unknown spectrogram was not among the known. In an actual forensic situation, a professional examiner can reach any of at least five conclusions: positive identification; positive elimination; possible identification; possible elimination; and inability to reach a conclusion. Fourth, Dr. Tosi's examiners were given spectrograms of no more than nine clue words; a professional examiner uses as many samples as he needs either to make a match or eliminate a suspect. For example, in the case of Nash's identification of Raymond, Nash had the complete verbatim statements of the caller and the suspects in actual context. Finally, Dr. Tosi's examiners made their identifications only through a visual examination of the spectrograms; professional examiners like Lt. Nash also hear the actual tape recorded voices to aid in their comparisons.
Dr. Tosi's study has substantially changed the opinion expressed by the scientific community as to the reliability of voice spectrograms as a means of identifying an unknown voice. A striking *645 example of this can be seen in the case of Dr. Peter Ladefoged, Professor of Phonetics at U.C.L.A. Dr. Ladefoged was co-author of a leading article which criticized the Kersta study and conclusions, Ladefoged and Vanderslice, The "Voiceprint" Mystique, supra, and even testified as an expert against the admission of spectrograms into evidence in the Trimble case, supra. After examining the Tosi study, however, Dr. Ladefoged stated he now believes that spectrograms have been established as a reliable method of voice identification, and testified in favor of the admission of spectrograms in the case at bar.
In ruling that the spectrographic identification proffered in the case at bar is admissible, this Court does not imply that such evidence is mistake-proof or that any voice identification should be admitted. Our holding, based upon the complete record before the Court, relying especially on the latest scientific evidence and the expertise of the individual making the identification, is that the spectrographic identification of Albert Raymond was clearly reliable enough to be admitted into evidence. The jury, having the benefit of the available expert testimony on the subject at trial and fully aware of the facts of the case, may give the evidence as little or as much credence as it sees fit.
The government's motion is, accordingly, granted.
NOTES
 While no objection was made to the admission of the voice spectrograms on the grounds that the taking of voice exemplars, necessary to the preparation of voice spectrograms, violates the Fifth Amendment, the Court notes that United States v. Wade, 388 U.S. 218, 87 S. Ct. 1926, 18 L. Ed. 2d 1149 (1967), appears dispositive. In Wade, the Supreme Court held that compelling the defendant to utter words allegedly spoken by the robber within hearing distance of witnesses did not violate the Fifth Amendment. Wade, supra, at 222-223, 87 S. Ct. 1926.
 The instant case presents a situation wherein, due to the scientific nature of the evidence proffered, expert testimony is necessarily admissible. Richardson, Evidence § 387 (9th ed. 1964); See, United States v. Sheiner, 410 F.2d 337 (2d Cir. 1969); Jenkins v. United States, 113 U.S.App.D.C. 300, 307 F.2d 637 (1962).
 Voice Identification Research, A Report to the Law Enforcement Assistance Administration, United States Department of Justice, Department of Michigan State Police, East Lansing, Michigan, Grant No. NI 70-004, February, 1971, page 9 (hereinafter cited as Research); See, Presti, High-Speed Sound Spectrograph, 40 Journal of the Acoustical Society of America, 628 (1966).
 See, Kersta, 196 Nature, 1253 (1962).
 "We conclude that the available results are inadequate to establish the reliability of voice identification by spectrograms. ... Procedures exist, as we have suggested, by which the reliability of voice identification methods can be evaluated. We believe that such validation is urgently needed." 47 Journal of the Acoustical Society of America, at 603.
 One commentator, however, mindful of the then-ongoing Tosi study, believed the spectrogram technique to be potentially reliable. Kamine, The Voiceprint Technique, 6 San Diego L.Rev. 213 (1969). See, Cederbaums, Voiceprint Identification: A Scientific and Legal Dilemma, 5 Crim.L.Bull. 323 (1969).
 Transcript of evidentiary hearing on the admissibility of spectrogram analysis, December 15, 1971, pp. 16, 39-44, hereinafter cited as Tr. at ____.
 Tr. at 15.
 Research at 39.
 Tr. at 18; Research at 32.
 Research at 32.
 Tr. at 17.
 Tr. at 17.
 Tr. at 18.
 Tr. at 80-81.
 Tr. at 32.
 Tr. at 27.
 Dr. Tosi testified that the voiceprint identification method relies heavily on the skill of the examiner. Tr. at 35. It is for this reason that Tosi stated that the possibility of an expert of Nash's calibre making a false identification is "negligible." Tr. at 32.
 Tr. at 22.
 Dr. Tosi testified that the 2.4 percent error reported would decrease if the examiner had all the time he or she felt necessary to make a determination. Tr. at 22.
 Dr. Tosi stated that the expected rate of error should be reduced if more words were provided in the sample. Tr. at 23.
 According to Dr. Tosi, "listening in addition to visual inspection will greatly enhance the success of this identification or elimination task." Tr. at 21.
 On May 24, 1971, Dr. Peter Ladefoged, after reading the Tosi study, wrote a letter to Dr. Edward David, the President's Science Advisor, stating his newfound respect for "voiceprint" identification. Dr. Ladefoged also sent copies of this letter to fifty of the leading experts in the field. Based upon his discussions with a number of experts at professional meetings and comments he received on his letter to Dr. David, Dr. Ladefoged testified that the general views of the scientific community regarding spectrograph identification "are substantially as I have presented in the letter as my opinion." Tr. at 99.
 Tr. at 93.
 In light of the latest scientific information on the subject, "voiceprint" identification compares quite favorably with other recognized and admissible forms of scientific identification, such as handwriting analysis, Lewis v. United States, 127 U.S.App.D.C. 269, 382 F.2d 817 (1967), ballistics tests, Goodall v. United States, 86 U.S.App.D.C. 148, 180 F.2d 397, cert. denied, 339 U.S. 987, 70 S. Ct. 1009, 94 L. Ed. 1389 (1950), analysis of boot tracks left at the scene of the crime, McClard v. United States, 386 F.2d 495 (8th Cir. 1967), and the recent use of neutron activation analysis to identify the source of bomb fragments, United States v. Stifel, 433 F.2d 431 (6th Cir.), cert. denied, 401 U.S. 994, 91 S. Ct. 1232, 28 L. Ed. 2d 531 (1970).