Fundamental Frequency Measures
Rate of vocal fold vibration — perceived as pitch.
-
Mean Fundamental Frequency
f0_mean_hz
- Average rate of vocal fold vibration. Indicates habitual speaking pitch, which varies with age and sex.
-
Min / Max Fundamental Frequency
f0_min_hz, f0_max_hz
- Lowest and highest detected pitch; useful for range assessment.
-
Standard Deviation of f0
f0_std_hz
- Pitch variability. High values may indicate instability or vocal tremor.
-
Pitch Frequency Range
pfr_semitone
- Range expressed in musical semitones. Narrow ranges can indicate pathology (paralysis, muscle tension).
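As a rough illustration, the four measures above can be derived from a list of per-frame f0 estimates. The function name `f0_summary`, the use of the sample standard deviation, and the choice to compute the semitone range from the detected minimum and maximum (rather than from percentiles) are assumptions made for this sketch, not a description of how the identifiers above are actually produced.

```python
import math
import statistics


def f0_summary(f0_values_hz):
    """Summarize voiced-frame f0 estimates (Hz); unvoiced frames must be excluded."""
    values = [float(v) for v in f0_values_hz]
    if len(values) < 2 or min(values) <= 0:
        raise ValueError("need at least two positive f0 values")
    f0_min, f0_max = min(values), max(values)
    return {
        "f0_mean_hz": statistics.fmean(values),
        "f0_min_hz": f0_min,
        "f0_max_hz": f0_max,
        # Sample standard deviation (n - 1 denominator); an assumption.
        "f0_std_hz": statistics.stdev(values),
        # 12 semitones per octave, one octave per doubling of frequency.
        "pfr_semitone": 12.0 * math.log2(f0_max / f0_min),
    }
```

A range from 110 Hz to 220 Hz is one octave, so `f0_summary([110.0, 220.0])["pfr_semitone"]` evaluates to 12 semitones.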
Voice Quality — Noise / Harmonics
-
Harmonics-to-Noise Ratio
hnr_db
- Ratio of periodic (harmonic) energy to noise, in dB. Lower HNR indicates breathy or hoarse voice.
Normal > 20 dB; pathological < 15 dB
-
Harmonics-to-Noise Ratio (Voice Report)
hnr_voice
- Alternate HNR calculation using Praat's voice report algorithm.
-
Noise-to-Harmonics Ratio
nhr
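- Noise energy relative to harmonic energy, i.e. the reciprocal of HNR when both are expressed as linear ratios rather than dB. Higher NHR = more aperiodic/noisy energy.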
-
NHR (Harmonicity)
nhr_harmonicity
- NHR derived from the harmonicity (autocorrelation) method.
-
NHR (Voice Report)
nhr_voice
- NHR calculated via Praat's voice report.
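The relationship between HNR in dB and a linear NHR can be sketched as follows. The autocorrelation formula is the standard harmonicity definition (the normalized autocorrelation peak is treated as the harmonic share of the energy). The function names are hypothetical, and whether the `nhr*` fields above are computed this way is an assumption; a mean NHR averaged over frames will generally not equal the reciprocal of a mean HNR averaged separately.

```python
import math


def hnr_db_from_autocorrelation(r_max):
    """HNR in dB from the normalized autocorrelation peak r_max (0 < r_max < 1)."""
    if not 0.0 < r_max < 1.0:
        raise ValueError("r_max must lie strictly between 0 and 1")
    # r_max is treated as the harmonic share of the energy, 1 - r_max as noise.
    return 10.0 * math.log10(r_max / (1.0 - r_max))


def nhr_from_hnr_db(hnr_db):
    """Linear noise-to-harmonics ratio corresponding to an HNR given in dB."""
    return 10.0 ** (-hnr_db / 10.0)
```

For example, an HNR of 20 dB corresponds to a linear NHR of 0.01, meaning noise carries about 1% of the harmonic energy.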
Voicing Break Measures
-
Degree of Unvoiced Frames
duv
- Percentage of frames where pitch was not detected (voice stopped). High values may indicate severe voice issues.
-
Number of Voice Breaks
nvb
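- Total count of phonation interruptions. In Praat's voice report, a break is any gap between consecutive glottal pulses longer than 1.25 divided by the pitch floor (about 17 ms at a 75 Hz floor).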
-
Degree of Voice Breaks
dvb
- Total duration of all voice breaks combined, as a percentage of the duration of the analysed signal.
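A minimal sketch of how these three measures could be computed from per-frame voicing decisions and glottal pulse times. It assumes the break rule used in Praat's voice report (an inter-pulse gap longer than 1.25 divided by the pitch floor); the function names are hypothetical, and the denominator for the degree of voice breaks is left to the caller because the reference duration depends on the tool.

```python
def degree_unvoiced(voiced_flags):
    """Percentage of analysis frames with no detected pitch."""
    frames = list(voiced_flags)
    if not frames:
        raise ValueError("no frames supplied")
    unvoiced = sum(1 for is_voiced in frames if not is_voiced)
    return 100.0 * unvoiced / len(frames)


def voice_breaks(pulse_times_s, total_duration_s, pitch_floor_hz=75.0):
    """Return (number of breaks, break duration as % of total_duration_s)."""
    if len(pulse_times_s) < 2 or total_duration_s <= 0:
        raise ValueError("need at least two pulses and a positive duration")
    # Assumed rule: a gap counts as a break when it exceeds 1.25 / pitch floor.
    threshold_s = 1.25 / pitch_floor_hz
    gaps = [later - earlier
            for earlier, later in zip(pulse_times_s, pulse_times_s[1:])]
    breaks = [gap for gap in gaps if gap > threshold_s]
    return len(breaks), 100.0 * sum(breaks) / total_duration_s
```

With pulses at 0.00, 0.01, 0.02, 0.07 and 0.08 s, only the 50 ms gap exceeds the roughly 16.7 ms threshold, so the sketch reports a single break.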
Frequency Perturbation — Jitter
Cycle-to-cycle frequency variability — correlates with perceived roughness.
-
Local Jitter
jitter_local
- Average absolute difference between consecutive pitch periods, divided by average period.
Normal < 0.5%; pathological > 1.0%
-
Local Absolute Jitter
jitter_abs
- Average absolute difference between consecutive pitch periods, without dividing by the average period; expressed in microseconds (µs).
-
Relative Average Perturbation
jitter_rap
- Smoothed jitter averaging over 3 consecutive cycles — reduces random noise.
-
Pitch Perturbation Quotient
jitter_ppq5
- Jitter averaged over 5 consecutive cycles.
-
Difference of Differences
jitter_ddp
- Average absolute difference between consecutive differences of pitch periods, divided by the average period; measures jitter "acceleration." Numerically equal to three times RAP.
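The jitter definitions above can be written out directly from a sequence of glottal period durations. This is a sketch under stated assumptions: the function name `jitter_measures` is hypothetical, relative measures are returned in percent, and the period-validity checks that tools such as Praat apply (period floor, period ceiling, maximum period factor) are omitted.

```python
def jitter_measures(periods_s):
    """Jitter measures from consecutive glottal period durations, in seconds."""
    t = [float(p) for p in periods_s]
    n = len(t)
    if n < 5 or min(t) <= 0:
        raise ValueError("need at least five positive periods")
    mean_period = sum(t) / n

    # Mean absolute difference between neighbouring periods (seconds).
    abs_jitter = sum(abs(t[i] - t[i - 1]) for i in range(1, n)) / (n - 1)

    def smoothed(window):
        """Mean |T_i - mean of the `window` periods centred on T_i|."""
        half = window // 2
        devs = [
            abs(t[i] - sum(t[i - half:i + half + 1]) / window)
            for i in range(half, n - half)
        ]
        return sum(devs) / len(devs)

    # Mean absolute second difference of the period sequence.
    ddp_abs = sum(
        abs(t[i + 1] - 2.0 * t[i] + t[i - 1]) for i in range(1, n - 1)
    ) / (n - 2)

    return {
        "jitter_local": 100.0 * abs_jitter / mean_period,  # percent
        "jitter_abs": 1e6 * abs_jitter,                    # microseconds
        "jitter_rap": 100.0 * smoothed(3) / mean_period,   # percent
        "jitter_ppq5": 100.0 * smoothed(5) / mean_period,  # percent
        "jitter_ddp": 100.0 * ddp_abs / mean_period,       # percent
    }
```

Because the second difference of three neighbouring periods is three times the deviation of the middle period from their mean, `jitter_ddp` always comes out as three times `jitter_rap` in this sketch.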
Amplitude Perturbation — Shimmer
Cycle-to-cycle amplitude variability — correlates with perceived harshness or strain.
-
Local Shimmer
shimmer_local_percent
- Average absolute difference between consecutive peak amplitudes, divided by average amplitude.
Normal < 3.0%; pathological > 5.0%
-
Local dB Shimmer
shimmer_db
- Amplitude perturbation expressed in decibels.
-
APQ3
shimmer_apq3
- Smoothed shimmer averaging over 3 consecutive cycles.
-
APQ5
shimmer_apq5
- Smoothed shimmer averaging over 5 consecutive cycles.
-
APQ11
shimmer_apq11
- Smoothed shimmer averaging over 11 consecutive cycles.
-
Difference of Differences
shimmer_dda
- Average absolute difference between consecutive differences of amplitudes, divided by the average amplitude; shimmer "acceleration." Numerically equal to three times APQ3.
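The shimmer family follows the same pattern, applied to per-cycle peak amplitudes instead of periods. The sketch below is illustrative only: `shimmer_measures` is a hypothetical name, amplitudes are assumed to be positive linear values (not dB), and at least 11 cycles are required so that APQ11 is defined.

```python
import math


def shimmer_measures(amplitudes):
    """Shimmer measures from per-cycle peak amplitudes (positive, linear scale)."""
    a = [float(x) for x in amplitudes]
    n = len(a)
    if n < 11 or min(a) <= 0:
        raise ValueError("need at least eleven positive amplitudes")
    mean_amp = sum(a) / n

    def apq(window):
        """Amplitude perturbation quotient over `window` cycles, in percent."""
        half = window // 2
        devs = [
            abs(a[i] - sum(a[i - half:i + half + 1]) / window)
            for i in range(half, n - half)
        ]
        return 100.0 * (sum(devs) / len(devs)) / mean_amp

    # Mean absolute amplitude difference between neighbouring cycles.
    local_abs = sum(abs(a[i] - a[i - 1]) for i in range(1, n)) / (n - 1)
    # Mean absolute level difference between neighbouring cycles, in dB.
    local_db = sum(
        abs(20.0 * math.log10(a[i] / a[i - 1])) for i in range(1, n)
    ) / (n - 1)

    return {
        "shimmer_local_percent": 100.0 * local_abs / mean_amp,
        "shimmer_db": local_db,
        "shimmer_apq3": apq(3),
        "shimmer_apq5": apq(5),
        "shimmer_apq11": apq(11),
        "shimmer_dda": 3.0 * apq(3),  # DDA is three times APQ3 by definition
    }
```

For amplitudes that alternate between 1.0 and 1.1, every neighbouring pair differs by 20·log10(1.1), roughly 0.83 dB, which is the value the sketch returns for `shimmer_db`.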
Temporal & Voicing Measures
-
Maximum Phonation Time
MPT
- Longest sustained vowel duration (seconds). Indicates respiratory support and glottal closure efficiency.
Normal > 15 s (adults); > 10 s (children)
-
Lowest Intensity
ILow
- Quietest dB level detected — useful for assessing dynamic range.
-
Number of Unvoiced Frames
nuv
- Raw count of analysis frames in which no pitch was detected (silent or aperiodic).
-
Total Speech Segments
seg
- Number of detected speech chunks (separated by silent pauses).
-
Amplitude Variation
vAm
- Coefficient of variation of peak-to-peak amplitude (standard deviation divided by mean, in %), following the MDVP definition of vAm. A marker of amplitude instability.
-
Frequency Variation
vF0
- Coefficient of variation of f0 (standard deviation divided by mean, in %), following the MDVP definition of vF0. A marker of pitch instability.
-
Pitch Frames
per_pitch
- Total number of analysis frames where pitch was detected. Higher counts = more reliable analysis.
-
Number of Periods
per_pp
- Total number of individual glottal cycles counted.
-
Average Pitch Period
avg_period_ms
- Average time (ms) between consecutive glottal pulses (approximately 1000 / mean f0 in Hz; the two agree exactly only when pitch is steady).
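The relation between the average pitch period and mean f0 can be checked numerically. In this sketch (both function names are hypothetical) the first function averages the periods directly, while the second converts each period to a frequency, averages those, and converts back. The two results coincide for steady pitch and drift apart as pitch varies.

```python
def average_period_ms(periods_ms):
    """Arithmetic mean of glottal period durations, in milliseconds."""
    if not periods_ms or min(periods_ms) <= 0:
        raise ValueError("need positive periods")
    return sum(periods_ms) / len(periods_ms)


def period_from_mean_f0_ms(periods_ms):
    """Period implied by the mean of the per-cycle f0 values (1000 / mean f0)."""
    if not periods_ms or min(periods_ms) <= 0:
        raise ValueError("need positive periods")
    f0_hz = [1000.0 / p for p in periods_ms]
    mean_f0_hz = sum(f0_hz) / len(f0_hz)
    return 1000.0 / mean_f0_hz
```

For periods of 5 ms and 10 ms the direct average is 7.5 ms, whereas the mean f0 of 150 Hz implies about 6.67 ms, so the shortcut should be read as an approximation for stable phonation.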
Spectral & Cepstral Measures
Measures of the overall frequency spectrum that capture vocal quality, resonance, and vocal tract filtering.
-
Mean Cepstral Peak Prominence
meanCPP
- Height of the pitch-related cepstral peak above the cepstral regression line, in dB. Widely regarded as one of the most reliable acoustic markers of dysphonia severity.
Normal > 12 dB; pathological < 8 dB
-
Std Dev of CPPS
stdevCPP
- Variability of cepstral peak prominence over time.
-
CPPS-derived F0
mean_cppF0
- Average pitch derived from the cepstrum (vs. traditional autocorrelation).
-
Std Dev of CPPS F0
stdev_cppF0
- Pitch variability from cepstral analysis.
-
Low-to-High Ratio
meanLH_ratio
- Energy ratio below vs. above ~1000 Hz. Higher = "dark"/muffled; lower = "bright"/thin.
Useful for assessing hyponasality vs. hypernasality
-
Low-to-High Ratio (dB)
meanLH_ratio_dB
- Same ratio expressed in decibels.
-
Std Dev of LH Ratio
stdevLH_ratio
- How much the low-to-high ratio changes over time — high variability = poor resonance stability.
-
Std Dev of LH Ratio (dB)
stdevLH_ratio_dB
- LH ratio variability in decibels.
-
LTAS Slope
mLTAS
- Long-term average spectrum slope. Steeper = softer/breathier; flatter = louder/pressed.
-
LTAS Tilt
tiltLTAS
- Difference between low and high frequency slopes. Helps differentiate vocal fold stiffness from edema.
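The low-to-high ratio described above reduces to a comparison of summed spectral energy on either side of a cutoff. The sketch below assumes a power spectrum given as parallel lists of bin frequencies and power values, and uses the roughly 1000 Hz cutoff stated above as a default; the function name and signature are hypothetical, and other tools place the cutoff elsewhere.

```python
import math


def low_high_ratio(freqs_hz, power, cutoff_hz=1000.0):
    """Return (linear ratio, ratio in dB) of energy below vs. at or above cutoff_hz."""
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    low = sum(p for f, p in zip(freqs_hz, power) if f < cutoff_hz)
    high = sum(p for f, p in zip(freqs_hz, power) if f >= cutoff_hz)
    if low <= 0 or high <= 0:
        raise ValueError("both bands must contain positive energy")
    ratio = low / high
    # Power quantities use 10 * log10 when converted to decibels.
    return ratio, 10.0 * math.log10(ratio)
```

Applying the function to each analysis frame and taking the mean and standard deviation of the results would give frame-level statistics analogous to the mean and Std Dev entries above.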
Clinical Voice Indices
Mathematical combinations of raw acoustic parameters into single severity scores.
-
Dysphonia Severity Index
DSI
- Combines MPT, highest f0, lowest intensity, and jitter.
Scale: +5 (healthy) to −5 (severe dysphonia)
-
Acoustic Voice Quality Index
AVQI
- Aggregates smoothed CPP, HNR, local shimmer (% and dB), and LTAS slope and tilt, measured on a sustained vowel combined with continuous speech.
Scale: 0 (normal) to 10 (severely pathological)
-
Cepstral Spectral Index of Dysphonia
CSID
- Uses CPP and L/H ratio measures, including their standard deviations, to estimate overall dysphonia severity.
Scale: 0 (normal) to 100 (severe)
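As an example of how such an index combines raw parameters, the DSI can be sketched with the weights published by Wuyts et al. (2000). Whether the `DSI` field above uses exactly these weights is an assumption, and the function and parameter names are hypothetical.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_percent):
    """DSI from MPT (s), highest f0 (Hz), lowest intensity (dB), and jitter (%)."""
    # Longer phonation and a higher top pitch raise the score;
    # a louder softest level and more jitter lower it.
    return (
        0.13 * mpt_s
        + 0.0053 * f0_high_hz
        - 0.26 * i_low_db
        - 1.18 * jitter_percent
        + 12.4
    )
```

With an MPT of 20 s, a highest f0 of 800 Hz, a lowest intensity of 50 dB and 0.5% jitter, the formula gives about 5.65, slightly above the nominal +5 end of the scale; the range is a typical span rather than a hard bound.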