Emotion Perception in Music: Study with 1,563 Participants on Genre and Affective Responses
Participants
The study recruited 1,563 participants from Amazon Mechanical Turk (MTurk). All participants were located in the United States, spoke English, and were at least 18 years old. Informed consent was obtained from all participants after the survey’s aims had been explained to them.
Sampling Procedure
Participants were randomly assigned to one of 50 survey subsets, each featuring 15 music excerpts. Random assignment ensured that each participant experienced a diverse range of genres, with no genre appearing more than three times in any subset.
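One way to build subsets that satisfy this genre cap can be sketched as follows. The study does not describe how the subsets were constructed, so the round-robin dealing, the name `build_subsets`, and the invented genre sizes are assumptions made purely for illustration. Because each genre's excerpts are dealt consecutively across the 50 subsets, a genre with at most 150 excerpts can land in any one subset at most three times.

```python
import random
from collections import Counter

N_SUBSETS = 50
SUBSET_SIZE = 15
MAX_PER_GENRE = 3


def build_subsets(excerpts, rng):
    """Partition (excerpt_id, genre) pairs into N_SUBSETS subsets.

    Hypothetical construction, not the authors' documented procedure.
    """
    if len(excerpts) != N_SUBSETS * SUBSET_SIZE:
        raise ValueError("expected exactly 750 excerpts")

    by_genre = {}
    for excerpt_id, genre in excerpts:
        by_genre.setdefault(genre, []).append(excerpt_id)

    # The dealing trick needs every genre to hold at most 50 * 3 = 150 excerpts.
    if any(len(ids) > N_SUBSETS * MAX_PER_GENRE for ids in by_genre.values()):
        raise ValueError("a genre is too large for the per-subset cap")

    # Randomise genre order and the order of excerpts within each genre.
    genres = list(by_genre)
    rng.shuffle(genres)
    ordered = []
    for genre in genres:
        ids = by_genre[genre]
        rng.shuffle(ids)
        ordered.extend((excerpt_id, genre) for excerpt_id in ids)

    # Deal round-robin: a genre's excerpts are contiguous in `ordered`,
    # so each subset receives at most ceil(n_genre / 50) of them.
    subsets = [[] for _ in range(N_SUBSETS)]
    for position, item in enumerate(ordered):
        subsets[position % N_SUBSETS].append(item)
    rng.shuffle(subsets)
    return subsets


# Invented genre sizes for 16 genres, summing to 750 (not the study's real counts).
sizes = [150, 120] + [34] * 10 + [35] * 4
toy_excerpts = [
    (f"g{g}_e{i}", f"genre{g}") for g, n in enumerate(sizes) for i in range(n)
]
subsets = build_subsets(toy_excerpts, random.Random(0))
worst = max(max(Counter(g for _, g in s).values()) for s in subsets)
print(len(subsets), worst)
```

The dealing step trades some randomness for a guaranteed cap. A rejection-sampling approach (shuffle, check, retry) would be more random but offers no guarantee of finishing quickly.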
Materials and Measures
Stimulus Set
A total of 750 five-second instrumental music excerpts were selected from an existing database. The excerpts spanned 16 genres, such as Alternative, Ambient, and Classical, giving a broad representation of musical styles.
Randomisation
Survey subsets were allocated to participants using constrained randomisation: every subset was allocated once before any subset was repeated for another participant. This kept the distribution of participants, and therefore of responses to each music excerpt, even across subsets.
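The allocation rule described here resembles block randomisation, which can be sketched as below. The function name `allocate_subsets` and the fixed seed are illustrative assumptions; the survey platform's actual mechanism is not described in the source.

```python
import random
from collections import Counter


def allocate_subsets(n_participants, n_subsets, rng):
    """Return one subset index per participant, in arrival order.

    Every subset is handed out once, in random order, before any is reused.
    """
    allocation = []
    while len(allocation) < n_participants:
        block = list(range(n_subsets))
        rng.shuffle(block)  # fresh random order for each pass through the subsets
        allocation.extend(block)
    return allocation[:n_participants]


allocation = allocate_subsets(1563, 50, random.Random(42))
counts = Counter(allocation)
print(min(counts.values()), max(counts.values()))
```

Under this scheme, 1,563 participants and 50 subsets give 31 full passes plus 13 extra allocations, so each subset is seen by either 31 or 32 participants before any screening.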
Acoustic Measures
Acoustic analyses were performed using the MIRtoolbox, focusing on various audio features, including:
- RMSE (Root Mean Square Energy): Indicates amplitude ranging from 0 to 1.
- Brightness: Measures high-frequency energy, also scored from 0 to 1.
- Roughness: Reflects sensory dissonance, with scores ranging from 3 to 1,390.
- Rolloff: The frequency below which a fixed proportion of the total spectral energy lies (85% by default in MIRtoolbox); higher values indicate more high-frequency energy.
- Zero-Crossing Rate: Indicates noisiness; higher values reflect noisier samples.
- Mode: Determines major or minor modality, scoring between -1 and 1.
- Pitch: Measures mean pitch frequency; higher values indicate greater presence.
- Pulse Clarity: Scores indicate how easy it is to discern a rhythmic beat.
- Event Density: Measures the average frequency of events in an audio sample.
Other spectral measures from MIRtoolbox (spread, skewness, kurtosis, flatness, and entropy) were included to assess spectral dispersion.
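To make two of the simpler features above concrete, the sketch below computes RMS energy and zero-crossing rate from raw samples. MIRtoolbox is a MATLAB toolbox, so this is not its implementation: these are textbook whole-excerpt definitions, and the function names, the 22,050 Hz sample rate, and the synthetic test signals are all assumptions.

```python
import math
import random


def rms_energy(samples):
    """Root mean square of the samples; within 0..1 for audio scaled to [-1, 1]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples, sample_rate):
    """Sign changes per second between consecutive samples."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(samples) - 1) / sample_rate
    return crossings / duration


sample_rate = 22050          # assumed sample rate
n = 5 * sample_rate          # five seconds, matching the excerpt length

# A pure 440 Hz tone (small phase offset avoids samples that are exactly zero).
tone = [math.sin(2 * math.pi * 440 * i / sample_rate + 0.1) for i in range(n)]

# Uniform white noise as a deliberately "noisy" comparison signal.
rng = random.Random(1)
noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]

print(rms_energy(tone), zero_crossing_rate(tone, sample_rate))
print(rms_energy(noise), zero_crossing_rate(noise, sample_rate))
```

The tone crosses zero about twice per cycle, while the noise changes sign on roughly every other sample, which illustrates why a higher zero-crossing rate is read as a noisier sample.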
Response Measures
Participants rated the emotional qualities of the music excerpts across several dimensions:
- Energy Arousal (EA): Ranging from tired to energetic.
- Tension Arousal (TA): Ranging from relaxed to tense.
- Valence: Ranging from unpleasant to pleasant.
- Dominance: Ranging from submissive to dominant.
- Affiliation: Ranging from antisocial to highly social.
The ratings used a 1 to 7 scale, with ratings above 5 indicating strong emotional perception.
Modified Self-Assessment Manikin (SAM)
SAM was used to visually measure levels of pleasure, arousal, and dominance, assisting participants in understanding the emotional dimensions.
Tempo
Participants rated the tempo of each excerpt on a 7-point scale to validate the MIRtoolbox’s tempo extraction.
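The source does not say which statistic was used for this validation. One plausible check, sketched here as an assumption, is a Spearman rank correlation between the extracted tempo and the mean tempo rating per excerpt. The numbers below are invented toy values, not data from the study.

```python
from statistics import mean


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented toy values: extracted tempo (BPM) and mean 7-point tempo rating.
extracted_bpm = [62.0, 88.5, 104.0, 120.0, 151.5]
mean_tempo_rating = [2.1, 3.4, 3.2, 5.0, 6.3]
rho = spearman(extracted_bpm, mean_tempo_rating)
print(round(rho, 3))
```

A rank-based statistic suits this comparison because the 7-point ratings are ordinal and need not scale linearly with beats per minute.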
Procedure
The survey was conducted via MTurk. Participants read a project description and then provided informed consent. They received financial compensation only after rating all 15 excerpts. Each music excerpt was accompanied by instructions and the option to listen to it twice. After each excerpt, participants completed six ratings focusing on its emotional dimensions.
Data Screening
Of the 750 music excerpts, one had an unmeasurable pitch value and was excluded, leaving 749 excerpts for analysis. After response screening, 1,513 of the 1,563 participants were retained as valid. Because the survey subsets were allocated by constrained randomisation, each music excerpt received a balanced number of responses, which supported the subsequent statistical analysis.
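The two screening steps can be sketched as simple filters. The data layout, the function names, and the use of NaN or `None` to mark an unmeasurable pitch are assumptions; the source does not describe how the exclusion was implemented or which criteria made a participant valid.

```python
import math
from collections import Counter


def drop_unmeasurable(features, key="pitch"):
    """Keep excerpts whose `key` feature is present and finite."""
    return {
        excerpt_id: row
        for excerpt_id, row in features.items()
        if row.get(key) is not None and math.isfinite(row[key])
    }


def responses_per_excerpt(responses, valid_participants, valid_excerpts):
    """Count responses per retained excerpt, from retained participants only."""
    counts = Counter({excerpt_id: 0 for excerpt_id in valid_excerpts})
    for participant_id, excerpt_id in responses:
        if participant_id in valid_participants and excerpt_id in valid_excerpts:
            counts[excerpt_id] += 1
    return counts


# Invented toy data, not the study's records.
features = {
    "e1": {"pitch": 220.0},
    "e2": {"pitch": float("nan")},   # unmeasurable pitch, to be excluded
    "e3": {"pitch": 330.0},
}
responses = [("p1", "e1"), ("p1", "e2"), ("p2", "e1"), ("p3", "e3"), ("p2", "e3")]
valid_participants = {"p1", "p2"}    # p3 assumed to have failed screening

kept = drop_unmeasurable(features)
counts = responses_per_excerpt(responses, valid_participants, kept)
print(sorted(kept), dict(counts))
```

Comparing the minimum and maximum of `counts` is a quick way to confirm that responses remain balanced across excerpts after screening.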
