A Java Applet to Calculate Virtual Pitch

Jeff Jensen           October 2008

  1. Introduction: What is virtual pitch?
  2. How to use the applet
  3. Applet and source code
  4. Examples of output and some quirky behavior
  5. Technical discussion of how the algorithm works
  6. Ideas for further work
  7. A Crash Course in Acoustics (work in progress)
  8. References
  9. Contact me

Introduction: What is virtual pitch?

Virtual pitch is a concept originated by Prof. Ernst Terhardt of the Technische Universität München in 1969-1970. Basically it is an extension of the fundamental bass of Rameau and the residue pitch of Schouten. In very basic terms, it deals with how a collection of distinct pitches (called a complex tone) is perceived by the human auditory system to fuse together into a "single" tone. The simplest example would be plucking a single guitar string, say the top string tuned to e, with frequency ν. It has harmonics 2ν, 3ν, 4ν, ..., and these are all distinct and audible pitches, but we perceive the sound as a single pitch e. Now if the string is not perfectly flexible (think of a steel piano string), then when it vibrates in segments, the frequencies are not exactly integer multiples of the fundamental ν; instead they may be 2.0001ν, 3.00006ν, 4.00002ν, ... (and now they would better be termed partials rather than harmonics). But we still tend to perceive a single pitch e, provided the partials are not too far off their ideal values.

Now consider several different guitar strings played at the same time. There is still the tendency for the human auditory system to try to fuse these sounds together, if possible (this is a well established experimental fact). Pluck the chord c4-e4-g4 (the subscripts are the standard numbering for what octave the notes are in; the 4th octave is from middle c up to b). People will "hear" notes that are not actually present, such as c3 and c2. These tones are virtual pitches in Terhardt's terminology; by contrast a tone that actually is present as a vibration in the air, like c4, is called a spectral pitch.

There is a little more to the definition than that. Following the discussion on Terhardt's web site ([Terhardt *] in the references), a spectral pitch is directly the effect of sound waves in the air exciting a place in the cochlea of the ear. Virtual pitch is a phenomenon of processing by the brain. Thus in the example above with the note e and waves in the air of frequencies ν, 2ν, 3ν, 4ν, ..., the ear picks up a spectral pitch of ν, and at the same time the brain processes the upper harmonic sequence 3ν, 4ν, ... and detects a virtual pitch of ν!

The perception of virtual pitch is a bit delicate, however. There is some variability from listener to listener. It depends on the loudness (sound pressure level, SPL) of the sounds, and the mixture of the partials. A complex of sounds almost never fuses into a single pitch; instead we get a spectrum of pitches, and which pitch from the spectrum is most prominent can depend on the musical context. This is still an active area of research in the psychological acoustics community.

How to use the applet

This is a conversion of Terhardt's original C code into Java, with a graphical user interface, and an additional level of processing to allow the user to input note names or just intonation fractions, rather than raw frequencies. The algorithm itself is completely independent of any tuning or temperament; it works internally only with frequencies and loudness values. For musical convenience, the interface of this applet lets you input note names in 12-equal temperament and get the results back in terms of 12-equal temperament note names, if you like.
  1. [Frequencies box]
    The text box functions like a command line. There are several formats that you can use to input the base frequencies:
  2. [SPL box]
    Here we input the loudness (in decibels) of the frequencies listed above. For convenience, you don't have to list them all. For example, if you have 5 frequencies in the upper box, you would normally list 5 loudness values here, in the same order. But if you enter only 3 (or 1), all the remaining ones are set to the last value you gave.
  3. [masking]
    As an experimental feature, you can turn masking off. It is on in Terhardt's original algorithm.
  4. [upper harmonics]
    This is purely a convenience feature. To save you having to type in upper harmonics in the Freq box, this will automatically add them on. Real musical instruments produce tones with upper harmonics.
  5. [pitch shifts]
    Probably you will want to leave this turned off, for music theoretic purposes. It is a psychoacoustical fact that people perceive a slight shift in pitch, dependent upon the loudness of the tone, the shape of the amplitude spectrum, and the presence of other tones which may produce masking. Turning this on tends to produce results that diverge from traditional music theory, and thus may not really be applicable in the context of a concert, for example.
  6. [weight]
    The weight values can be expressed as their raw numbers, or expressed as a percentage of the sum of all such values.
  7. [identify to pitch classes]
    By default, we calculate the weights of all the distinct pitches in all octaves. However, it may be desirable to group all the notes "E" together, for example, without regard to octave. This feature only works for named notes, like E or C#; a pitch that is in between the standard ones is still expressed as something like "??(3)".
  8. [Temperament for output]
    This refers to assigning note letter names to the output frequencies.
  9. [Calculate button]
    When you have made all your input settings, click on this. The applet automatically resets itself, so you can keep on changing the inputs and keep on calculating.
Sometimes the applet image gets distorted after scrolling; if this happens, just click on your browser's refresh button.
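
The fill-in rule described for the SPL box can be sketched in a few lines of Java. This is only an illustration of the behavior described above, not the applet's actual code; the class name SplPadding and the method padSpl are hypothetical.

```java
import java.util.Arrays;

class SplPadding {
    // Hypothetical helper: extend a short SPL list to `count` entries
    // by repeating the last value that was given.
    static double[] padSpl(double[] spl, int count) {
        if (spl.length == 0) {
            throw new IllegalArgumentException("need at least one SPL value");
        }
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            // Past the end of the input, keep reusing the final entry.
            out[i] = spl[Math.min(i, spl.length - 1)];
        }
        return out;
    }

    public static void main(String[] args) {
        // 5 frequencies but only 3 SPL values entered
        System.out.println(Arrays.toString(padSpl(new double[] {70, 65, 60}, 5)));
        // prints [70.0, 65.0, 60.0, 60.0, 60.0]
    }
}
```

With a single value, such as 70, every frequency gets 70 dB, which is the case used in the default input.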


[Applet: requires a browser with Java support]
[Link: source code]

Examples and explanation of output

Let's run with the default input and see what happens:

Input settings

Box name          | Value                | Remarks
------------------|----------------------|--------
Freqs (Hz):       | n 440 550 660        |
SPLs (dB):        | 70                   | equivalent to 70 70 70
masking:          | on                   | active if box is checked
upper harmonics:  | No upper harmonics   |
pitch shifts:     | no pitch shifts      |
weight:           | %                    | The weight values can be expressed as a percentage of the sum of all of them
pitch classes:    | Not pitch classes    | This feature adds together the weights of all the notes that have names.
output reference: | 12-equal temperament | the standard

Output of algorithm

Note name | Weight | Freq     | Pitch type | Remarks
----------|--------|----------|------------|--------
A4        | 0.395  | 440.0 Hz |            | The pitch with the highest weight value is A in the 4th octave with frequency 440 Hertz. The psychoacoustical meaning of a raw weight value is unclear, as Terhardt points out in his paper (referenced below).
A2        | 0.242  | 110.0 Hz | Virtual    | This pitch is not present in the original sounds
A3        | 0.223  | 220.0 Hz | Virtual    |
??(5)     | 0.216  | 550.0 Hz |            | The ?? is because the frequency 550 Hz is not a close enough match to the equal temperament value of C#5 = 554.37 Hz
E5        | 0.188  | 660.0 Hz |            |
A1        | 0.121  | 55.0 Hz  | Virtual    |
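
To see why 550 Hz comes out as ??(5) while 660 Hz is still named E5, here is a rough Java sketch of naming a frequency in 12-equal temperament. It is not taken from the applet. The class EqualTemperament, its methods, the reference A4 = 440 Hz, and especially the 5 cent tolerance are all assumptions made for illustration; the applet's real matching threshold is not stated here.

```java
class EqualTemperament {
    static final String[] NAMES =
        {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};

    // Nearest note number, counting semitones so that A4 = 69 (assumes A4 = 440 Hz).
    static int nearestNote(double freqHz) {
        return (int) Math.round(69 + 12 * Math.log(freqHz / 440.0) / Math.log(2.0));
    }

    // Equal-tempered frequency of a note number.
    static double noteFrequency(int note) {
        return 440.0 * Math.pow(2.0, (note - 69) / 12.0);
    }

    // Signed distance in cents from the nearest equal-tempered note.
    static double centsOff(double freqHz) {
        double nearest = noteFrequency(nearestNote(freqHz));
        return 1200 * Math.log(freqHz / nearest) / Math.log(2.0);
    }

    // Note name, or "??(octave)" if the frequency is further than
    // toleranceCents (an assumed parameter) from every named note.
    static String name(double freqHz, double toleranceCents) {
        int note = nearestNote(freqHz);
        int octave = Math.floorDiv(note, 12) - 1;
        if (Math.abs(centsOff(freqHz)) > toleranceCents) {
            return "??(" + octave + ")";
        }
        return NAMES[Math.floorMod(note, 12)] + octave;
    }

    public static void main(String[] args) {
        for (double f : new double[] {440.0, 550.0, 660.0}) {
            System.out.println(f + " Hz -> " + name(f, 5.0));
        }
    }
}
```

Under these assumptions 550 Hz lies about 14 cents below C#5 and is rejected, while 660 Hz lies about 2 cents above E5 and is accepted.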

What I call the Rameau root is the greatest common factor of all the input frequencies (with a small margin for error built in).

Things to Puzzle over:

  1. n 300 350 400 450 500 produces only one output: 300
  2. l C#4 D4 D#4 gives no output.
  3. l C4 C4 C4 should give root C4! [Perhaps it is just that the algorithm intends all the same pitches to be collected together at the outset.]
  4. Adding upper harmonics sometimes turns spectral pitches into virtual ones. But this is not surprising because, as was discussed in the introduction, both types of pitch perception are active at the same time, and one will produce a stronger signal than the other.
The solution to these anomalies may be to be able to turn off masking and to adjust the weighting of the various parameters in Terhardt's algorithm (described below).

Note also that we do not get exactly the same results as Terhardt published in 1982 for this same triad A4 - C#5 - E5:

Output of algorithm from [Terhardt, Stoll, Seewann 1982a] (Table 1 p.677)

Note name | Weight | Freq     | Pitch type
----------|--------|----------|-----------
A4        | 1.41   | 440 Hz   | Virtual
A3        | 1.09   | 220 Hz   | Virtual
A2        | 0.59   | 110 Hz   | Virtual
D3        | 0.52   | 293.3 Hz | Virtual
E6        | 0.35   | 1320 Hz  | Spectral
F2        | 0.28   | 87.3 Hz  | Virtual

The explanation for this is probably the unknown mix of partials; Terhardt recorded the chord being played on a piano and analyzed that, but didn't publish the spectra, except to say the sound was 70 dB. (I might also remark here that the mix of partials in a piano tone is quite complicated and constantly changing as the tones are sustained.) It is also possible that Terhardt's algorithm changed slightly from 1982 to 1994, which is the date of the code I got.

Description of the algorithm

The raw C language source code is available from Prof. Richard Parncutt's website:
http://www-gewi.uni-graz.at/staff/parncutt/ptp2svpCode.html (see also the references at the end).

Here, Terhardt's original code is reincarnated in the Java classes PartTonePattern, SpectralPitchPattern, VirtualPitchPattern, and CombinedPitchPattern.

The algorithm takes a set of R input frequencies, in Hz, { f_0, f_1, ..., f_{R-1} } and R sound pressure level (SPL) values, in dB. Once input, these are sorted and put into the Part Tone Pattern object.

The Spectral Pitch Pattern creates a set of weights, one for each frequency: { w_0, w_1, ..., w_{R-1} }. It does masking and computation of pitch shifts, based on experimentally measured acoustical parameters.

The Virtual Pitch Pattern is where the virtual pitches are calculated. Here are the criteria that Terhardt says he uses to give a numerical weight to the virtual pitch candidates:

  1. The number of relevant spectral components which provide the same (or nearly the same) virtual pitch. The weight should increase with the number of components. This doesn't seem to be present in the code, however! [See the function sortIntoVP( )].
  2. The spectral pitch weight of the relevant components: Higher spectral weight supporters should imply higher virtual weight for the candidate. This is accounted for in the formula for C_ij.
  3. The weight should decrease as the subharmonic numbers get higher (and thus the subharmonics more distant from the actual frequencies present). This is accounted for in the formula for C_ij.
  4. The weight should increase with the accuracy of the subharmonic coincidences, attaining a maximum value for perfect matching. This is accounted for in the formula for C_ij, specifically in the factor (1 - γ/δ).

One idea for a future enhancement would be to allow the user to alter the amount of importance given to each of these items.

The heart of the algorithm is the function subCoincidence():

[Note that we only care about how the harmonics of the sub-harmonics of f_i match the fundamentals f_j; we don't try to match upper harmonics of a given f_j.]

Since formatting equations in HTML is problematic at best, here are some of the expressions for the above quantities written separately from the list:

$$n = \mathrm{Int}\left[ \frac{f_j}{\frac{1}{m} f_i} \right]$$

The expression for γ, which Terhardt calls the degree of inharmonicity, is the amount by which (1/m)·f_i misses being a subharmonic of f_j.

We want $\left| n \cdot \frac{1}{m} f_i - f_j \right| < \delta \cdot f_j$   [8% of $f_j$]

$$\gamma_{i,j,m} = \gamma = \frac{1}{f_j} \left| n \cdot \frac{1}{m} f_i - f_j \right|$$

We require γ < δ = 0.08 for the 2 frequencies to be considered a "match". In this case, we compute C_ij, which Terhardt calls the coincidence coefficient. (It is a geometric mean of the weights, but I don't fully understand its justification):

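$$C\left( \tfrac{1}{m} f_i ,\ f_j \right) = C_{ij} = \begin{cases} \left[ \dfrac{W_i}{m} \cdot \dfrac{W_j}{n} \right]^{1/2} \cdot \left( 1 - \gamma / \delta \right) & \text{if } \gamma \le \delta \text{ and } n \le 20 \\ 0 & \text{if } \gamma > \delta \text{ or } n > 20 \end{cases}$$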

Note that we do not use the shifted frequencies to do the subharmonic matching! The shifts are only accounted for later in forming the Combined Pitch Pattern.
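
As a rough check of the formulas for n, γ and C_ij, here is a minimal Java sketch of a single coincidence test. It is not the applet's subCoincidence() function. The class and method names are hypothetical, Int is assumed to mean truncation, and the guard against n < 1 is my own addition to avoid dividing by zero.

```java
class SubCoincidenceSketch {
    static final double DELTA = 0.08; // maximum allowed inharmonicity (8%)
    static final int MAX_N = 20;      // harmonic numbers above this are ignored

    // Coincidence of the m-th subharmonic of fi with fj.
    // wi and wj are the spectral pitch weights of fi and fj.
    static double coincidence(double fi, double wi, double fj, double wj, int m) {
        double sub = fi / m;      // the subharmonic (1/m) * fi
        int n = (int) (fj / sub); // n = Int[ fj / ((1/m) * fi) ], assumed truncation
        if (n < 1 || n > MAX_N) {
            return 0.0;           // n < 1 guard is an assumption, not from the source
        }
        double gamma = Math.abs(n * sub - fj) / fj; // degree of inharmonicity
        if (gamma > DELTA) {
            return 0.0;           // not a match
        }
        return Math.sqrt((wi / m) * (wj / n)) * (1.0 - gamma / DELTA);
    }

    public static void main(String[] args) {
        // 4th subharmonic of 440 Hz is 110 Hz; 550 Hz is exactly its 5th harmonic.
        System.out.println(coincidence(440.0, 1.0, 550.0, 1.0, 4));
        // 440 Hz itself (m = 1) does not match 660 Hz: gamma is about 0.33.
        System.out.println(coincidence(440.0, 1.0, 660.0, 1.0, 1));
    }
}
```

With unit weights, the first call gives the square root of 1/20 because γ = 0, and the second gives 0. The sketch also shows how criteria 2, 3 and 4 from the list above enter through the weights, the divisors m and n, and the factor (1 - γ/δ).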

The Rameau root is a theoretical construction; this pitch is not necessarily perceived by listeners. It is the greatest common divisor of all the input frequencies, with a little fudge factor built in.
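
One way to compute a greatest common divisor of frequencies "with a fudge factor" is to run Euclid's algorithm on real numbers and treat small remainders as zero. The sketch below takes that approach. It is an assumption about how such a root could be found, not the applet's own routine; the names RameauRootSketch, approxGcd, root and the tolerance tolHz are hypothetical.

```java
class RameauRootSketch {
    // Euclid's algorithm on real numbers; a remainder smaller than tolHz
    // counts as zero, which is the "fudge factor" in this sketch.
    static double approxGcd(double a, double b, double tolHz) {
        while (b > tolHz) {
            double r = a % b;
            if (b - r < tolHz) {
                r = 0.0; // a falls just short of a whole multiple of b
            }
            a = b;
            b = r;
        }
        return a;
    }

    // Approximate common divisor of all the input frequencies.
    static double root(double[] freqs, double tolHz) {
        double g = freqs[0];
        for (int i = 1; i < freqs.length; i++) {
            g = approxGcd(g, freqs[i], tolHz);
        }
        return g;
    }

    public static void main(String[] args) {
        System.out.println(root(new double[] {440.0, 550.0, 660.0}, 1.0)); // prints 110.0
    }
}
```

For the default triad 440, 550, 660 this gives 110 Hz, which is A2. Small errors can accumulate through the successive remainders, so a tolerance fixed in Hz is a crude choice; a relative tolerance might serve better.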

Things I have added that were not in Terhardt's original C code:

  1. Of course, the graphical interface
  2. Rameau root
  3. I disabled the cutoff of virtual pitches at 500 Hz in the subroutine truVP.

Ideas for further work:

  1. Mathematically viewing the virtual pitch algorithm as a map:
    $\mathfrak{T} : (\text{Spectra}) \to (\text{Spectra})$

    We can ask: Does this map have any fixed points? Can the map be inverted to give a spectrum that maps to a single tone?
  2. Understand the meaning of the pitch weights. In [Terhardt, Stoll, Seewann 1982b] p.687, various possible interpretations are given as "pitch strength", "salience", and "probability of perceiving a particular pitch".
  3. Express the various pitch weights as percentages of their total, rather than just a raw number.
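
Expressing weights as percentages of their total, as in item 3, amounts to a simple normalization. A small Java sketch follows; the class WeightPercent and the method toPercent are hypothetical names, and the sample values are the raw weights from the example output table earlier on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WeightPercent {
    // Convert raw pitch weights to percentages of their sum,
    // keeping the original ordering of the entries.
    static Map<String, Double> toPercent(Map<String, Double> weights) {
        double total = 0.0;
        for (double w : weights.values()) {
            total += w;
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            // Guard against an empty or all-zero result set.
            out.put(e.getKey(), total > 0 ? 100.0 * e.getValue() / total : 0.0);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> raw = new LinkedHashMap<>();
        raw.put("A4", 0.395);
        raw.put("A2", 0.242);
        raw.put("A3", 0.223);
        raw.put("??(5)", 0.216);
        raw.put("E5", 0.188);
        raw.put("A1", 0.121);
        toPercent(raw).forEach((note, pct) ->
            System.out.printf("%-6s %.1f%%%n", note, pct));
    }
}
```

A percentage only says how a pitch ranks against the other candidates in the same run; it does not settle what the raw weights mean psychoacoustically, which is the question raised in item 2.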

References


Send me email: jjensen14@hotmail.com Advisory: messages with keywords typical of spam in the subject line (including "!" as in "Get out of debt!") get automatically discarded before I see them.