Participant Profile
Masahiro Sato
Permanent Conductor of the Wagner Society Male Choir
Graduated from the Department of Vocal Music, Faculty of Music, Tokyo University of the Arts. Completed a Master's degree in Piano Accompanying at the Juilliard School. Active domestically and internationally as an opera conductor. Lecturer at Aichi Prefectural University of the Arts.
Tsuyoshi Moriyama
Associate Professor, Faculty of Engineering, Tokyo Polytechnic University
Graduate of the Faculty of Science and Technology and the Graduate School of Science and Technology
Completed the Major in Electrical Engineering at the Keio University Graduate School of Science and Technology in 1999. Ph.D. (Engineering). While a student, he was a member of the Keio University Wagner Society Male Choir. "Voice research" is one of his specialties. He supervised the "Mote-goe Diagnosis Tool VQ Checker," which drew wide attention.
Rie Uozumi
Former Nippon TV Announcer; Freelance Announcer; Speech and Voice Designer
Graduated from the Major in French Literature, Faculty of Letters, Keio University in 1995. Drawing on her many years of experience as an announcer, she practices the "Uozumi Method of Speech."
2019/02/25
One's "Way of Life" Creates the Timbre
Mr. Moriyama, have you been specializing in "voice research" for a long time now?
That's right. I have been researching emotions in human voices since my days at Keio's Faculty of Science and Technology. However, while research on speech recognition has a long tradition, research on emotions was not yet recognized as a field of study in the early 90s when I started.
I heard the singing of a tenor named Fritz Wunderlich and wondered, "Why is his voice so beautiful?" His emotional expression was also so rich. But at the time, research into beautiful voices was dismissed with the attitude of, "That's something for artists to do."
How do differences in voices manifest themselves?
Humans create the source of sound with the vocal cords located in the throat. If you take just this sound source, it's just a "drone" with no timbre, only pitch. But when that sound source resonates in the upper part of the skull, the noise components cancel each other out, and only the beautiful harmonic components emerge like the cream on top. That is where the timbre is created.
Therefore, as everyone works hard every day and accumulates wrinkles, the condition from the throat up changes, and all of that is reflected in the creation of this timbre. In other words, one's entire way of life is related to the timbre of their voice.
Does that mean you can change the timbre yourself by how you open your mouth? Do people who produce high voices tend to get wrinkles in certain parts of their faces?
Yes. High voices are difficult to produce without making good use of the nasal cavity. There are four large cavities in the head: the maxillary sinus, frontal sinus, ethmoid sinus, and sphenoid sinus. When the sound created by the vocal cords resonates there, it creates something like a pathway for the breath.
A key point in vocalization is that the air pathway must be well open. Specifically, with the "i" mouth shape, the tongue moves forward, but with "a," although the front opening of the mouth is large, the tongue pulls back, making the pathway surprisingly narrow.
Therefore, the "i" sound makes nasal resonance easier. It's like the feeling of a smile. When you make the "i" mouth shape, the timbre suddenly becomes bright and easy to hear. This is actually because the way it resonates has changed.
So it's about making it resonate in the upper part of the head?
Exactly. In the past, when elementary school music teachers said, "Produce your voice from the top of your head," everyone would say, "There's no way that's possible," but there actually is a resonance point at the top of the head, so they weren't wrong.
For example, even for men, do those who sing high-pitched parts tend to have the corners of their mouths turned up?
I am a bass, but when I have to produce a note a bit higher than my natural range, as a tenor would, I do use the corners of my mouth and raise the point where the nasal cavity resonates.
However, while nasal resonance is of course important in opera and choral singing, I think the most important thing is still the breath.
Starting first with taking air into the lungs.
That's right. It's about taking the breath in and how you push it out. Where we differ a bit from Ms. Uozumi is that we have to produce a voice that can be heard in every corner of a 2,000-seat hall without using a microphone for amplification, so a certain amount of volume is required.
In that case, it's not just about vibrating the vocal cords strongly, but about how much resonance you can create to make the voice carry, and breath pressure is vital. Without strong physical support and pressure, you can't produce a voice that resonates throughout the whole space.
Is the way you produce a loud voice different between singing and speaking normally? Quite a few people worry that they can't keep speaking in a loud voice when they talk.
Breath is important even when speaking. If you tell someone, "For now, just inhale and exhale deeply," a loud voice comes out naturally.
That is the foundation. Vocalization for things like opera starts with the breath, but when people go to learn singing, the conversation often turns to how to open the mouth and they forget about the breath. But how you control the breath is the most important part.
The word "aria" in opera means "air" in Italian. I believe Italian opera is an art of breath control. Resonance, vocalization, and beauty of voice are all necessary, but in the end, I think it's about how you connect the breath.
Connecting Breath to Voice
It's difficult, isn't it? You mean singing as long as possible in one breath after inhaling once.
To sing a long phrase, you need a long breath.
Was it called messa di voce? There is a way of singing that rides on the natural flow of the breath. When I listen to that, it feels comfortable. I feel a natural circulation.
It is said that speakers such as business executives come across as more persuasive and charismatic when they deliver a thought in one breath without pausing for air. Is there a trick to inhaling and then speaking all at once? I often teach people to "use their abdominal muscles."
How to connect the breath to the voice is actually very difficult. But fortunately, in Japanese, important sounds are at the beginning of words, so if you say the first and second characters clearly, it carries quite well.
As long as the breath is on the beginning of the word, the rest follows by inertia, so if you just say the first character clearly, it somehow connects with the breath.
Is that specifically regarding Japanese?
In Italian, with words like "buon-gio-rno" or "lonta-no," it might be easier to put the breath on the long/short accents. In German, with "Ich lie-be dich," it might be good to successfully put the breath where the accent is.
Actually, the other day while practicing, I was constantly complaining that I "couldn't understand the first word." In the case of Japanese, it's true that if you understand the first word, you can predict what comes next.
Enlisting the help of the listener as well.
That's part of it too. So, if the first sound isn't clear, the listener thinks "Huh?" and has to think about it. But if the first sound is perfectly clear, it enters the ear naturally. I'm always saying that in practice, but it's quite hard to do.
What Exactly is a "Good Voice"?
I think pronunciation is very important, but on the other hand, everyone says they "want to have a good voice." But what is a good voice? I think there are many types of attractive voices.
Perhaps a voice that matches the person's character and shows their humanity?
For example, if a very rugged-looking person had a very high, cute voice, it wouldn't match, so it probably wouldn't be called a "good voice."
I see, so when it matches the appearance and character, people think, "Oh, what a lovely voice."
In 2011, I supervised a web app called the "Mote-goe (Attractive Voice) Diagnosis Tool VQ Checker," and it easily exceeded 12 million hits. Since then, I've often been interviewed by radio and TV stations asking, "What is a good voice?"
First of all, it's about the time, place, and occasion (TPO). For example, a good voice for a shop clerk in Ameyoko is a spirited, "husky voice" shouting "Step right up, step right up!" On the other hand, people in small spaces like elevator operators speak softly, as if whispering at their lips. In this way, the yardstick for a good voice is completely different depending on the situation.
The situation matters too. When someone is discussing a private matter, there are people who talk so loudly that even the person behind them can hear (laughs).
It's also about whether you can control your way of speaking like that.
Also, listeners' preferences are infinitely varied.
Preferences are a difficult thing.
So, there isn't just one answer for a "good voice."
Even among singers, there are those with voices that make your heart tremble.
Like Professor Sato's low notes.
Exactly. For example, if someone with a low voice that perfectly fits their character spoke to me on the phone, I'd think, "Oh, I love this" (laughs).
So there are two types: those that depend on the TPO, and those that are unconditionally "What a great voice!"
Everyone is wondering how they can produce that "unconditionally good voice."
Correct Voice and Way of Speaking
Everyone has high ideals. A good voice is, after all, something influenced by talent.
But a "correct voice" is something anyone can do. Since the structure of the human body doesn't have that much individual variation, if you are conscious of points like the importance of the breath pathway and connecting the breath and voice by saying the first character clearly, everyone can produce a "correct voice."
Furthermore, in situations where you are trying to persuade people, the climax—as in music—is important. A way of speaking without a climax simply won't be listened to.
You mean bringing a peak to the way you speak?
Yes. After all, good music has a flow. It is often said that a beautiful melody repeats three times. Once, twice, and then it changes on the third time. Like "Sakura, sakura, yayoi no sora wa..." Apparently, this feels rhythmic.
Even in speeches, when executives give greetings, they often mention "three points regarding..."
I also think that as long as the vocal cords are healthy, the voice itself doesn't differ that much. It's really about resonance and how much good pressure you use to push out the breath. For persuasive politicians whose appeals stay in the heart, I feel that such "pressure" is strong.
In addition, from a vocal artist's perspective, nasal resonance is individual just like vocal cords, but a larger resonator provides better resonance for carrying in a large hall and results in a mellower sound.
So it's better to have a larger body.
Yes. The larger the resonance part, the better. Japanese women tend to be petite, so from a global perspective, the roles they can sing in operas and such are very limited. For example, Japanese people are often cast in roles like the soubrette, a lively young girl, or roles that only sing leggiero (light and graceful).
There is a wonderful young baritone, but I felt the sound and color were different from the role I had in mind. He can perform the role properly and has the character to sing it through, but when I talked to him, he said, "Actually, I used to be a tenor." He was told by a doctor that his "resonator is smaller than a normal baritone's."
So, even if he produces the low notes of a baritone, he can only use the resonator originally meant for the high notes of a tenor, so I felt the richness of the low notes was a bit lacking.
Vocal cords are sometimes compared to the strings of a string instrument, but there are individual differences in the body of the instrument as well, with its own strengths and weaknesses, so to speak.
The Secret of a "Carrying Voice"
There's a sense that for opera singers, a loud voice is everything, but music has pianissimo, and in fact, even if they aren't always singing loudly on stage, they can be heard clearly all the way to the back row of the audience. That's not about volume.
We call it a "carrying voice." What is considered not good is a voice that "rings nearby"—a voice so loud you want to cover your ears up close, but when heard from a distance in a large hall, you think, "Oh, it's not that much."
Conversely, there are voices that don't seem that loud up close, but even in a large hall with an orchestra, every single word can be heard clearly. I always wonder what the difference in those voices is.
The secret lies in the characteristics of the human ear. Sound is a mixture of various frequencies that combine into a single sound, but frequency components around 3,000 Hz are amplified as they pass through the ear canal before reaching the eardrum, making them resonate most sensitively in the human ear. Vocalists produce their voices so that they resonate well in the paranasal sinuses, which results in a voice with a very high concentration of frequencies around 3,000 Hz.
A Swedish researcher conducted an experiment on singers performing with an orchestra and found a large peak around 3,000 Hz. This peak, called the 'singer's formant,' appears only when they are singing.
I also conducted experiments with Kabuki actors, and when they switched to Kabuki vocalization, a peak appeared in that same range.
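The idea of a peak around 3,000 Hz can be illustrated with a small numerical sketch. This is not the measurement described by the speakers: it builds a synthetic tone from harmonics, optionally boosts the harmonics near 3,000 Hz, and compares how much of the spectral energy falls between 2,500 and 3,500 Hz. The function names, the 200 Hz fundamental, the 20 dB boost, and the band edges are all assumptions chosen for illustration.

```python
import cmath
import math

SAMPLE_RATE = 16000  # Hz; assumed for this sketch
N_SAMPLES = 800      # 0.05 s of signal, giving 20 Hz wide frequency bins


def synth_tone(f0=200.0, boost_db=0.0, center=3000.0, width=400.0):
    """Hypothetical stand-in for a sung note: harmonics of f0 whose
    amplitude falls off as 1/k, with an optional bump (in dB) near `center`."""
    samples = [0.0] * N_SAMPLES
    k = 1
    while k * f0 < SAMPLE_RATE / 2:  # stay below the Nyquist frequency
        freq = k * f0
        gain_db = boost_db * math.exp(-((freq - center) / width) ** 2)
        amp = (1.0 / k) * 10 ** (gain_db / 20)
        for n in range(N_SAMPLES):
            samples[n] += amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
        k += 1
    return samples


def band_energy_share(samples, lo=2500.0, hi=3500.0):
    """Fraction of spectral energy between lo and hi Hz, via a plain DFT."""
    n = len(samples)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        if lo <= k * SAMPLE_RATE / n <= hi:
            in_band += power
    return in_band / total


if __name__ == "__main__":
    plain = band_energy_share(synth_tone())
    boosted = band_energy_share(synth_tone(boost_db=20.0))
    print(f"share of energy near 3,000 Hz, plain tone:   {plain:.3f}")
    print(f"share of energy near 3,000 Hz, boosted tone: {boosted:.3f}")
```

In this toy setup the boosted tone carries a much larger share of its energy in the band where the ear is said to be most sensitive, which is one way to picture why such a voice could carry without being louder overall. A recorded voice would need proper windowing and averaging, which this sketch leaves out.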
I see, that makes sense. For example, when I go to a concert, at first it feels like the sound is coming from very far away, but after about ten minutes, it reaches a volume that transmits properly to my body, and I no longer think the sound is quiet. I've had that experience many times; is that related as well?
That is known as the 'cocktail party effect.' Humans have the mysterious ability to hear sounds clearly from wherever they direct their attention.
Does that mean the ears gradually 'open up'?
Exactly. At first, the sound of the person next to you coughing and the sound from the stage seem to be heard at the same level. Gradually, your hearing becomes specialized toward the sound on the stage.
On a negative note, in a coffee shop or similar place, once you start noticing the sound of the person at the next table typing on their laptop, it starts to sound incredibly loud. Is that also the cocktail party effect?
That's right. A mistake musicians make is moving their hands frequently or making various irrelevant movements that distract the audience; this reduces the cocktail party effect. It's called selective hearing, and they stop being 'selected.'
Therefore, if you want to be heard, the staging used to make the other person concentrate is very important. When a salesperson wants their opinion to be heard, they stand side-by-side with the client. Once they have the client's attention, speaking in a quieter voice actually makes the person listen more closely.
Mimicking Behavior
Many people also struggle with the inflection of their voice. If there is no inflection in the way you speak, the message doesn't get across. I have people practice by reading books. I've been doing recitations since high school, and I have them add variation—raising or lowering the pitch, pausing, reading softly, loudly, or quickly. I think it's very similar to music.
It is exactly the same.
After having them read like that intensely, when I have them do free talk afterward, their emotions start to flow naturally, and they become able to speak while controlling it themselves.
I do quite a lot of narration work, where I stay in a booth for about three hours and read text written by others with a lot of inflection. After that, I feel like I can talk endlessly. I wonder if some kind of circuit is formed in the brain.
In the world of psychology, it was believed until the 19th century that the inner self and behavior were the same thing, and that the inner self came first, leading to behavior.
However, William James and Carl Lange proposed the James-Lange theory, which holds that 'humans do not cry because they are sad, but are sad because they cry.' One recognizes 'Oh, I am sad' by observing one's own behavior of crying. In other words, they said behavior comes first.
Today, we know that both processes occur.
So it's okay for behavior to come first.
Yes, it's the idea of 'starting with the form.' If you want to acquire a voice full of confidence, it might take a lifetime if you have to wait until your inner self is full of confidence. But if you can just mimic someone who is full of confidence, it's much easier, and eventually, the inner self may follow.
There is something called 'behavioral therapy,' where patients with depression are treated by mimicking energetic people, and this is very popular in places like the United States.
Speaking of 'mimicking,' people who become good at singing are good at mimicking. They likely naturally sense how the body and breath are being used. Being able to do that is one of the secrets to becoming a good singer.
It's exactly 'manebu' (to mimic). The word 'manabu' (to learn) derives from 'manebu,' meaning to mimic.
Even for announcers, there is a process of mimicking while listening to the live broadcasts and reports of their seniors. First, you learn the form, and then confidence emerges.
Most people enter the field because they have an announcer they admire, so they start with mimicry and then get better. And they are good at singing and love karaoke. I suppose they naturally like holding a microphone (laughs).
In our practice as well, I sing for them saying 'it's like this' and have them mimic me.
Therefore, I think a big reason why the Wagner Society has maintained a certain level for so long is that the instructors, like Professor Tamotsu Kinoshita and Professor Ryosuke Hatanaka, were originally vocalists.
The reason I originally started the 'Mote-goe (Attractive Voice) Diagnosis' was precisely because of behavioral therapy. I wondered if information technology could help people who had lost confidence in their voices due to aging, or those who didn't have much confidence in their voices to begin with.
In the Mote-goe Diagnosis, if you speak into a computer, it evaluates your voice on five points and gives you an attractiveness score out of 100. Then, when you're told your score, you try harder to make it higher next time.
At the same time, advice appears such as 'Try to smile a bit more to increase clarity' or 'Improve your articulation,' so you follow that and think, 'I'll try to smile more than before,' and it's rewarding when the score goes up.
Once you gain confidence, you want to let others hear your voice, so before you know it, your inner self has changed. In that sense, I believe the voice is an alter ego of oneself. It's invisible, but being praised for it makes one immensely happy.
That's true. It's like being told 'You're handsome' or 'You're beautiful.'
If someone says 'I want to hear more,' well, I feel like I could sleep soundly for several nights without a worry (laughs). It gives you that much energy.
Male Voices, Female Voices
When people get angry, their voices get higher and louder. Anger is a response to a perceived threat where you have to show 'I am strong,' so you have to assert muscular or physical strength. It's said that the male hormone testosterone is also involved.
I think it's the same for humans and other animals: high sounds are symbols of being small, cute, and weak, while low voices represent things that are large or strong.
A masculine feeling.
So, to express anger and drive someone away, you need the resonance of the body, and the vocal cords must lengthen to produce low sounds.
On the other hand, children have cute, high voices that contain a lot of frequencies around 3,000 Hz so that adults can hear them well. They are intentionally made to be 'ear-piercing,' in a way. Because they need to be heard by adults.
It certainly draws attention.
In the case of men, when their voices change as they become adults, a kind of gruffness or thickness emerges instead.
Women's voices also get lower as they get older, don't they? Like Hikaru Utada—her key was incredibly high in her debut song 'Automatic,' but it has gradually become lower.
Mariah Carey is the same.
What is the reason for women's voices getting lower?
Vocalists also find it harder to produce high notes. The weakening of muscles is a big factor.
Also, the surface of the vocal cords has a mucous membrane that allows them to come together very delicately when moist, but if you speak using your throat in a way like 'No way!' (in a shrill tone), it damages the vocal cords. How is this way of speaking perceived among women? (laughs)
If women use high voices with each other and it starts to feel a bit 'condescending,' they end up trying to one-up each other, so women may be unconsciously using lower voices to keep things casual. They might be using their throats that way to avoid being disliked (laughs).
That's not very good for the throat.
Does that also contribute to aging?
Also, it's about how they heal after being damaged. When you're young, they heal quickly, but, like wounds on the skin, the vocal cords heal more slowly with age.
Starting with "Exhaling"
When I teach vocal music, as one form of expression, I tell students to use the vocal cords like the tip of a calligraphy brush, letting it down naturally. I say that is the proper way to bring the vocal cords together.
I believe that kind of use of breath and vocal cords, where it enters smoothly, is good. But as you get older, you can no longer do that. Even for vocalists, the vocal cords eventually wear out after long use.
What should one do to avoid tiring the vocal cords or to use them for a long time?
I mentioned earlier that breathing is important, and it applies here as well. Throat muscles weaken, but respiratory muscles also weaken. The act of inhaling and exhaling gradually becomes weaker, and one becomes unable to do abdominal breathing, breathing only from the chest up. Then, because you have to make yourself heard with less breath, you're forced to strain and produce a 'thin, forced' voice.
If you inhale without using your body, it leads to a poor way of speaking.
So, if you continue to produce your voice using your stomach, the vocal cords are more likely to be protected.
I believe so. Therefore, it's better not to speak too fast. You'll only be able to take shallow breaths.
It certainly seems like fast talkers have rougher voices.
Since the era is all about 'good, cheap, and fast,' there are times when you have to speak rapidly in quick succession, but you should restrain yourself and speak after taking a slow breath. Wouldn't that increase your persuasiveness?
It makes you feel like a living human being with a physical body, rather than a speaking machine.
That's true. But when I say 'please inhale,' everyone moves their chest area. Instead, it should be the image of expanding the stomach.
When you tell beginners to 'inhale,' their stomachs get stiff. That's why I always say, 'Let's start by exhaling.' First, exhale all your breath with a 'haaa' sound.
It's the same in singing. In any case, exhale completely. Then, breath will enter naturally.
That natural entry is very good. If you say 'let's inhale,' people try to inhale even more when there's already air, so it has no choice but to go into the chest, and they have to use force to expand it.
That's wonderful. You know everything that a vocalist needs to do.
Announcing and Narration
Announcers first learn how to use their breath by extending a long 'ah' sound for about 30 seconds. This builds the stamina to speak long sentences in one breath. We also do exercises in which we burst out short 'ah, ah' sounds.
Then there's articulation. We practice the 'Uiro Uri' (The Medicine Seller) monologue as a tongue twister. Reading a manuscript—turning what you see with your eyes into a voice—is also a special skill, so we practice reading things at first sight without making mistakes.
Just like music, we train in dynamics, tempo, and pausing. But there aren't many truly skilled announcers out there.
Is that so? (laughs)
Very few people can transition from being an announcer to a narrator. I've been doing recitations since high school and participated in competitions, so I trained extensively.
I've played the piano since I was little, so the flow of sound and melody was already in my body. I translated that into 'reading,' but there aren't many people who go from being an announcer to a narrator.
That's interesting. They also say that voice actors can't really become announcers either, right?
That's true. After all, narration is difficult if the body isn't involved. When you put the movement of your body into the sound, the listeners can be deeply moved.
So narration is like singing.
Voice actors focus on dialogue. In narration, you also read the stage directions; in fact, there's more of that kind of situational explanation.
They use the same voice and look like similar jobs, but they are completely different.
The Exquisite Charm of Choral Singing
Is there something different about the way you produce your voice when harmonizing compared to singing solo?
When you hold a long note, a vibrato naturally attaches to the voice, but when creating harmony, you reduce that vibrato. However, if there's none at all, it's hard to harmonize well, so it's easier to create harmony at a point where there's some leeway and you can feel each other. That balance is very difficult.
Also, it depends on how much solfège ability you have. Solfège is the ability to read a musical score correctly and hit the right pitches.
As is the case with the Wagner Society, I think it's a very high hurdle for people who are seeing a score for the first time to sing with the correct pitch. It's very important to be able to read the score correctly and train your body to know if your sound is at the right pitch; if you can do that, you can maintain the note.
Solfège ability also relates to 'imitation.' Even among people who have never played the piano, there are those with excellent pitch. I suppose they have a good ear.
Also, what's interesting about choral singing is that while many people sing the same part, there's an exquisite quality where the sound source produced by one person's vocal cords resonates in the body of the person next to them.
Therefore, if you use your bodies—the resonators—in the same way, the singing resonates with each other and rings out powerfully. It's not just one person's body that is sounding.
That's why, in a choir where the members use their bodies differently, they can't support each other even when they gather to sing.
In a choir that isn't very good, even if the pitch is right, it feels like the quality of the sound doesn't match, doesn't it?
Conversely, you start to hear each individual voice. In a good choir, you lose track of who is singing, and it blends in a way that a single personality seems to ring out from one part.
Everything is resonating together as one.
That's why choral singing is mysterious; even if individual abilities aren't that high, if the bodies resonate with each other, you get a pretty good sound. I think that's the accessibility—or rather, the charm—of choral singing.
That's true. So, if ten professional singers gather to sing, it doesn't necessarily mean it will result in a charming resonance. They would have the volume and hit the pitches correctly, so it would be a proper sound, but it might not necessarily have 'flavor.' Sometimes it's better when people with various voices and diverse backgrounds gather and sing using a method where everyone is looking in the same direction.
This is the same for orchestras as well.
Perhaps it's at the level of 'intent.' I heard that a conductor named Yoichiro Fukunaga used to only say 'gently' during certain rehearsals. 'Gently,' then he'd stop the sound and say, 'More gently' (laughs). When he did that, they would align perfectly, and the gentle sounds would become one. I think that at the moment the image of 'gently' was shared, everyone's resonators aligned and rang out together.
I feel like I've gained so much from our conversation today (laughs). Thank you very much.
*Affiliations and titles are those at the time of publication.