Hi,

I am posting here because there seem to be a lot of experts on all sorts of DSP topics, and I have found a lot of answers to other questions.

My problem is to calculate intensity contours of a speech waveform, with amplitude between +1 and -1, expressed on a decibel scale. I have spent 5-6 hours googling without results.

The intensity is defined as:

sound intensity = sound power / (4 pi R^2)

but here we have a radius that I can't get to fit in. The intensity of a sound wave depends on how far we are from the source. The only thing I have is the waveform, with no reference to a distance.

If I had managed to get this intensity, I think I could have used this:

sound intensity level = LI = 10 log10(I / I0), where I0 = 10^-12 W/m^2

to get the decibel scale.

I have done some feature extraction in an application called Praat, which does this intensity tracking based on a waveform. The description of the algorithm used there says:

"The values in the sound are first squared, then convolved with a Gaussian analysis window (Kaiser-20; sidelobes below -190 dB). The effective duration of this analysis window is 3.2 / (minimum_pitch), which will guarantee that a periodic signal is analysed as having a pitch-synchronous intensity ripple not greater than our 4-byte floating-point precision (i.e., < 0.00001 dB)."

I really don't understand this algorithm.

Are the values that are squared the amplitudes of the waveform? What does "Gaussian analysis window (Kaiser-20; sidelobes below -190 dB)" mean? It also seems that this calculation is done after the pitch extraction, since the analysis window is 3.2 / minimum_pitch.

Are there any MATLAB experts who can provide code for this algorithm?

Can anybody explain it to a DSP newbie, or are there other, easier methods to calculate the intensity contour of the waveform?

I really hope somebody will help me out here :)

Tommy, Norway

# intensity of sound/speech

Started by ●January 6, 2007

Reply by ●January 6, 2007

"stromhau" <stromhau@stud.ntnu.no> wrote in message news:tbCdncGzjomunz3YnZ2dnUVZ_tijnZ2d@giganews.com...

> My problem is to calculate intensity contours of a speech waveform with
> amplitude +1/-1 expressed in a decibel scale.
>
> [...]
>
> Can anybody explain it to a DSP newbie or is it other easier methods to
> calculate the intensity contour of the waveform ?

Do you want real dBs as defined by international standard (i.e. 10 dB is a squeak, 120 dB is pretty loud, 150 dB is deafening and so on, or whatever), or do you just want the definition of dB? That is 10*log10(power of signal), where power is just the sum of squares divided by the number of samples. The latter definition is what most (electrical) engineers use, and it is never absolute: it is used as a comparison with other dB signals, i.e. the noise is 20 dB down on the signal, and so on.

The acoustic decibel has a reference defined at 1 kHz and is so many Pascals of sound pressure (I forget what), agreed internationally. This is only needed if you are measuring sound power and comparing it, e.g. for some form of acoustic measuring device which checks on noisy neighbours!

Tam

--
Posted via a free Usenet account from http://www.teranews.com
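As a rough illustration of the relative definition Tam gives (sum of squares divided by the number of samples, then 10*log10), here is a minimal Python sketch. Python is an assumption, since the thread only mentions MATLAB. The function names, the `floor` guard and the test tone are made up for the example.

```python
import math

def mean_power(samples):
    """Mean power: sum of squares divided by the number of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(x * x for x in samples) / len(samples)

def power_db(samples, floor=1e-12):
    """Relative level 10*log10(power); 0 dB corresponds to a mean power of 1.0.

    `floor` is an assumed guard value so that digital silence does not
    hit log10(0).
    """
    return 10.0 * math.log10(max(mean_power(samples), floor))

# Hypothetical test signal: one second of a full-scale 100 Hz sine at 8 kHz.
sine = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(8000)]
print(round(power_db(sine), 2))  # mean power is 0.5, so about -3.01 dB
```

The number is only meaningful relative to other levels computed the same way, which is the point Tam makes about this definition never being absolute.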

Reply by ●January 6, 2007

> Do you want real dBs as defined by international standard [...] or
> do you want just the definition for dB? This is 10Log10(Power of signal)
> where Power is just the sum of squares divided by the number of samples.
>
> [...]
>
> Tam

Thank you for the reply.

The program I used to extract features says this:

"An Intensity object represents an intensity contour at linearly spaced time points t_i = t_1 + (i - 1) dt, with values in dB SPL, i.e. dB relative to 2·10^-5 Pascal, which is the normative auditory threshold for a 1000-Hz sine wave."

So I guess it is the real dBs I am looking for. And please, I am a newbie here, so don't be irritated.

Here is one definition I found:

"Sound intensity is defined as the sound power per unit area. The usual context is the measurement of sound intensity in the air at a listener's location"

But when I have just the speech waveform, I can't figure out how the "unit area" and the "listener's location" fit in.

Let's say I have a waveform of some speech; I then want to compute the intensity levels over short time frames. That's my key problem.

Because I did my feature extraction using Praat, and because I am planning to do some machine learning on that data, I was hoping to stick to this algorithm, but if there are better/easier methods I would gladly reorganize my plan.

"The values in the sound are first squared, then convolved with a Gaussian analysis window (Kaiser-20; sidelobes below -190 dB). The effective duration of this analysis window is 3.2 / (minimum_pitch), which will guarantee that a periodic signal is analysed as having a pitch-synchronous intensity ripple not greater than our 4-byte floating-point precision (i.e., < 0.00001 dB)."

The time step for the resulting intensity contour is 0.8 / (minimum_pitch).

Tommy

Reply by ●January 7, 2007

"stromhau" <stromhau@stud.ntnu.no> wrote in message news:reidnVW2qs4Aoz3YnZ2dnUVZ_q2pnZ2d@giganews.com...

> "An Intensity object represents an intensity contour at linearly spaced
> time points [...], with values in dB SPL, i.e. dB relative to 2·10^-5
> Pascal [...]"
>
> So i guess it is the real DBs i am looking for.
>
> [...]

I think you will need something as a reference sound to calibrate - dBs are all relative. I am not sure about your other ramblings.

Tam
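To make the calibration idea concrete, here is a hedged Python sketch of one way a reference recording could pin down the scale. Everything specific in it is assumed for illustration: the 94 dB SPL reference level, the recorded amplitudes and the helper names do not come from the thread.

```python
import math

def rms(samples):
    """Root-mean-square value of a block of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def calibration_offset_db(cal_samples, cal_level_db):
    """Offset that makes the calibration recording read its known level."""
    return cal_level_db - 20.0 * math.log10(rms(cal_samples))

def calibrated_level_db(samples, offset_db):
    """Level of `samples` on the scale fixed by the calibration recording."""
    return 20.0 * math.log10(rms(samples)) + offset_db

def tone(amplitude, n=8000, freq=100.0, rate=8000.0):
    """Hypothetical test tone used only for this illustration."""
    return [amplitude * math.sin(2 * math.pi * freq * k / rate) for k in range(n)]

# Assumed scenario: a reference tone known to be 94 dB SPL at the microphone
# shows up in the recording with peak amplitude 0.5.
offset = calibration_offset_db(tone(0.5), 94.0)

# A tone recorded ten times smaller sits 20 dB below the reference.
quieter = calibrated_level_db(tone(0.05), offset)
print(round(quieter, 2))
```

Without such a reference recording made through the same microphone and gain settings, only the differences between levels carry information. Note that `math.log10` raises an error on an all-zero block, so a real implementation would need a guard for silence.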

Reply by ●January 7, 2007

"stromhau" <stromhau@stud.ntnu.no> wrote in message news:tbCdncGzjomunz3YnZ2dnUVZ_tijnZ2d@giganews.com...

> Hi,
>
> My problem is to calculate intensity contours of a speech waveform with
> amplitude +1/-1 expressed in a decibel scale.

*** "+1/-1 expressed in a decibel scale" doesn't make much sense. Perhaps you mean "waveform with peak amplitudes of +1 (something units) to -1 (something units)", then converted to a dB scale - which requires that you set a reference level. dB is a measure of a ratio, thus the need for a reference.

*** Not sure what you mean by "contours" of a waveform. "Envelope" is a defined term. "Power spectrum" is a defined term. Both are "contours" of a sort....

> The intensity is defined as : sound intensity = sound power / (4 pi R2)

*** Only if you have the luxury of knowing the spreading law that applies - *if* there's an easy one. That's where the "R" comes from.

> , but here we have a radius that i cant get to fit in.

*** ??????

> The intensity of a sound wave depends on how far we are from a source.

*** ... and the intensity at the source, and the attenuation of the path, etc....

> The only thing i have is the waveform and no references to a distance.

*** Then you shouldn't be using a spreading formula.

> If i have managed to get this intensity i think i could have used this :
>
> sound intensity level = LI = 10 log ( I / I0 ) , where I0 = I0 = 10-12
> W/m2

*** You haven't defined W/m2 or where you got 10 or 12. It appears that I0 is the sound power level reference and I is the sound power level measured / estimated.

Answers will be more helpful if the situation is better described.

Fred

Reply by ●January 7, 2007

Thanks Fred,

I am sorry for my vague explanation.

> you mean "waveform with peak amplitudes of +1 (something units) to -1
> (something units) and then converted to a dB scale - which requires that you
> set a reference level. dB is a measure of a ratio, thus the need for a
> reference.

Yes, this is exactly what I want, and the reference is 10^-12 W/m^2, which is the threshold of hearing.

> ***Not sure what you mean by "contours" of a waveform. "envelope" is a
> defined term. "power spectrum" is a defined term. Both are "contours" of a
> sort....

OK, what I want is to get an intensity value for each frame, maybe 10 or 20 ms.

If it is still unclear, let me reformulate my question and skip the Praat algorithm and dB. I got too hung up on the way the application Praat does things.

What I want is to extract features from the waveform to find the emotional state of the speaker. According to the papers I have read, pitch and intensity are two of the most important features for identifying the emotional state of a speaker.

How can I calculate the intensity of short time frames (10 ms) of a waveform?

Tommy

Reply by ●January 7, 2007

"stromhau" <stromhau@stud.ntnu.no> wrote in message news:lcudnbJnY4PN2jzYnZ2dnUVZ_u-unZ2d@giganews.com...

> [...]
>
> How can i calculate the intensity of short time frames(10 ms) of a
> waveform?
>
> Tommy

Tommy,

OK - well, I'm not at all sure that you really care about absolute levels or an absolute reference for this application. dB, being a ratio, can be scaled by adding or subtracting a constant, in which case all measures remain the same distance apart in dB. So an arbitrary reference is quite likely fine for what you're doing. As long as the sensor is the same, intensities will vary according to their absolute levels and the attenuation from the source to the sensor - which it sounds like you don't really care about; you only care about the levels at the sensor (microphone or ....).

The simplest reference is no reference at all - that is, use an intensity of 1.0 on whatever scale you're using.
So, 20*log10(I1/I2) = 20*log10(I1) - 20*log10(I2).
If I2 = 1.0 then log10(I2) = 0 and you're left with 20*log10(I1).

Part of the problem you're having is that there can be a sound pressure level reference which will translate to volts according to the sensor's sensitivity (gain or transfer function), in volts per micropascal for example. At the output of a microphone you get volts, and the only way to go backwards to sound pressure level is to inversely apply the sensitivity to the output voltage: volts/(volts/micropascal). The other way is to simply relate the microphone output to 1.0 V as the zero dB level and forget about absolute sound pressure levels altogether, since the pressure is related to the volts by a constant factor.

You could calculate the rms level in the frame time - that's a measure of energy.
Square the values, add them together, divide by the number of them and take the square root for relative intensity (an amplitude, as micropascals or volts), or don't take the square root for relative power. In either case the dB level will be the same: 20*log10(ratio) if using intensities, or 10*log10(ratio) if using power.

If you care about comparing speakers (people) in a room, then intensity differences will be more difficult because of the distance / attenuation for each speaker and the need to differentiate one speaker from another. That would be a much tougher problem.

Fred
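The recipe Fred describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: non-overlapping 10 ms frames, a reference of 1.0, a made-up `floor` guard for silent frames, and a toy signal. Note that "intensity" in Fred's wording is an amplitude-like quantity (the rms value), which is why it pairs with 20*log10.

```python
import math

def frame_levels_db(samples, sample_rate, frame_ms=10.0, floor=1e-12):
    """Relative level in dB for each non-overlapping frame.

    The reference is 1.0 on whatever scale the samples use; `floor` is an
    assumed guard against log10(0) in silent frames.
    """
    frame_len = max(1, int(round(sample_rate * frame_ms / 1000.0)))
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len   # mean square
        rms = math.sqrt(power)                          # root mean square
        # 20*log10(rms) and 10*log10(power) give the same number.
        levels.append(20.0 * math.log10(max(rms, math.sqrt(floor))))
    return levels

# Hypothetical signal at 8 kHz: 10 ms at amplitude 1.0, then 10 ms at 0.1.
signal = [1.0] * 80 + [0.1] * 80
levels = frame_levels_db(signal, 8000)
print([round(v, 2) for v in levels])
```

The two frames come out 20 dB apart, and that difference would stay the same under any other choice of reference, which is the property Fred relies on when he suggests dropping the absolute reference.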

Reply by ●January 7, 2007

"Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote in message news:XZqdnbBk5Zkp_TzYnZ2dnUVZ_ualnZ2d@centurytel.net...

> OK - well I'm not at all sure that you really care about absolute levels nor
> an absolute reference for this application.
>
> [...]

I agree, forget about absolute measures - they are not required here. Besides, if you use the sound card then its settings will be different from PC to PC.

F.

Reply by ●January 7, 2007

stromhau wrote:

...

> What i want is to extract features from the waveform to find the emotional
> state of the speaker.
>
> After doing some research of papers the pitch and intensity are two of the
> most important features to identify the emotional state of a speaker.

http://www.newfreedownloads.com/find/voice-stress-analysis-lie-detector-software.html

http://news-info.wustl.edu/news/page/normal/669.html

http://arstechnica.com/news.ars/post/20061222-8485.html

Jerry

--
Engineering is the art of making what you want from things you can get.

Reply by ●January 8, 2007

> You could calculate the rms level in the frame time - that's a measure of
> energy.
> Square the values, add them together, divide by the number of them and take
> the square root for relative intensity (as micropascals or volts) or don't
> take the square root for relative power.
>
> [...]
>
> Fred

Thank you very much for helping me out here!

However, I have another thing on my mind, if you or someone else would be so kind as to help.

I see that when people calculate short-time energy they use a time frame (about 10 ms), multiply it by a window function (often Hanning), and overlap the frames. Could anyone explain why? What I don't understand is the window function and the overlap of frames.

Let me dig this algorithm back up :) :

"The algorithm used there says: The values in the sound are first squared, then convolved with a Gaussian analysis window (Kaiser-20; sidelobes below -190 dB)"

It is nearly the same thing, but here a convolution with the window function is used. Why?

Tommy
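One way to see how the two descriptions relate is a small Python sketch, offered as an illustration rather than as Praat's actual implementation. Sliding a symmetric window over the squared samples is a convolution, and taking overlapping windowed frames simply reads that smoothed curve every `hop` samples. The Hann window stands in for Praat's Gaussian window, and the signal, window length and hop are arbitrary assumptions.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def framed_energy(x, win, hop):
    """Window-weighted mean of x^2 in overlapping frames, one value per hop."""
    n = len(win)
    wsum = sum(win)
    return [sum(w * v * v for w, v in zip(win, x[s:s + n])) / wsum
            for s in range(0, len(x) - n + 1, hop)]

def smoothed_energy(x, win):
    """Square every sample, then slide the window over the squared signal.

    For a symmetric window this sliding weighted sum equals a convolution
    (only positions where the window fits entirely are kept).
    """
    sq = [v * v for v in x]
    n = len(win)
    wsum = sum(win)
    return [sum(win[k] * sq[s + k] for k in range(n)) / wsum
            for s in range(len(sq) - n + 1)]

# Hypothetical signal with a slowly growing amplitude.
x = [(0.2 + 0.02 * n) * math.sin(0.9 * n) for n in range(40)]
win = hann(9)
hop = 4

a = framed_energy(x, win, hop)
b = smoothed_energy(x, win)[::hop]   # sample the smoothed curve every hop
same = len(a) == len(b) and all(abs(p - q) < 1e-12 for p, q in zip(a, b))
print(same)
```

A tapered window gives little weight to samples near the frame edges, so overlapping the frames keeps every sample contributing to some frame; this is general practice rather than anything stated in the thread. With the figures quoted earlier from Praat and a hypothetical minimum pitch of 100 Hz, the window lasts 3.2/100 = 32 ms and the time step is 0.8/100 = 8 ms, so successive windows overlap heavily. Also note the order of operations: here the squared samples are weighted by the window, as in the Praat description, whereas multiplying the samples by the window before squaring would weight them by the window squared.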