Codec matrix

Page 1
Codec matrix

Michael Knappe, Co-chair, codec WG
IETF 77


Page 2
Voice transmission

[Figure labels: Transmission line; Transducers / Amplifiers]

Page 3
VoIP: Messaging vs. transmission


Page 4
VoIP transmission

[Figure labels: Encode; Decode; PLC / Comfort Noise; VAD; Jitter buffer; EC; TD; EC; Synchronous; Synchronous; Asynchronous]

Page 5
Interactive Quality

•  Quality
 –  Clarity, latency, echo
•  Clarity
 –  More than intelligibility
 –  "Ease of use"
 –  Factors include distortion, noise, frequency response, loudness
 –  Scale from barely intelligible through 'holographic'

[Figure: three orthogonal components define interactive audio quality: Clarity, Echo, Latency. Other labels: Intelligible, Real, Natural; Relative BW scale: 0.01- 1 100+; codec WG]


Page 6
Audio Transmission

Nomenclature     Sampling rate     Usable bandwidth
Narrowband       8 kHz             200 to 3400 Hz
Wideband         16 kHz            50 to 7000 Hz
Super wideband   32 kHz            50 to 14,000 Hz
Fullband         44.1 kHz and up   20 to 20,000 Hz

Useful comparisons:
AM radio is limited to 5000 Hz audio
FM radio is limited to 15,000 Hz audio
CD is limited to 20,000 Hz audio
Speed of sound in air: 343 m/s (approx. 3 ms/m)
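The sampling rates and usable bandwidths above can be checked against the Nyquist limit (half the sampling rate), and the "approx 3 ms/m" figure follows directly from the 343 m/s speed of sound. The sketch below is illustrative only: the function names are made up for this example, and the fullband row uses 44.1 kHz as a representative rate for "44.1 kHz and up".

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # value quoted on the slide

def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2

def acoustic_delay_ms(distance_m):
    """Time for sound to travel distance_m through air, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000

# (sampling rate in Hz, upper edge of usable bandwidth in Hz), from the table
BANDS = {
    "Narrowband": (8000, 3400),
    "Wideband": (16000, 7000),
    "Super wideband": (32000, 14000),
    "Fullband": (44100, 20000),  # assumption: 44.1 kHz stands in for "and up"
}

for name, (fs_hz, upper_hz) in BANDS.items():
    margin_hz = nyquist_limit_hz(fs_hz) - upper_hz
    print(f"{name}: Nyquist {nyquist_limit_hz(fs_hz):.0f} Hz, "
          f"headroom above usable band {margin_hz:.0f} Hz")

print(f"Acoustic delay over 1 m: {acoustic_delay_ms(1):.2f} ms")
```

In every row the usable bandwidth stays below the Nyquist limit, leaving room for the anti-aliasing filter roll-off.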

Page 7
Audio frequencies

http://www.podcomplex.com/images/podcomplex-frequency-overview-chart.gif

Page 8
Lossy Compression 101

•  Source model based coding
 –  Parameterizes source excitation, pitch and formants (a, e, i, o, u)
 –  Generally tied to human speech production mechanisms, with limited support for auditory perceptual weighting
 –  e.g. G.728, G.729

•  Perceptual audio coding
 –  Uses principles of psychoacoustics and the human auditory system to dynamically assign the most bits to the temporal and frequency characteristics most likely to be heard
 –  e.g. MP3, AAC
 –  Does an MP3 sound OK to a dog?

http://www.sungwh.freeserve.co.uk/sapienti/phon/headxsec.gif
http://www.skidmore.edu/~hfoley/images/AuditorySystem.jpg

Page 9
Subjective Testing

▪ MOS is both a method and metric for subjective quality scoring based on a five-point rating system:

MOS   Quality     Impairment
5     Excellent   Imperceptible
4     Good        Perceptible, but not annoying
3     Fair        Slightly annoying
2     Poor        Annoying
1     Bad         Very annoying

▪ The compressed 4.5 to 5 range makes MOS unsuitable for wideband+ quality determination
▪ MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor), with its 0-100 scale and more compact statistical requirements, is better suited
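As a metric, a MOS is simply the arithmetic mean of listener ratings on the five-point scale. A minimal sketch follows, assuming integer ratings from 1 to 5; the function name and the sample ratings are invented for illustration, and the normal-approximation 95% interval is a simplification rather than the procedure of any particular test standard.

```python
import math
from statistics import mean, stdev

LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}

def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r not in LABELS for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    mos = mean(ratings)
    if len(ratings) < 2:
        return mos, float("nan")  # no spread estimate from a single listener
    # Normal approximation: 1.96 standard errors on either side of the mean
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width

# Hypothetical ratings from eight listeners for one test condition
ratings = [5, 4, 4, 3, 4, 5, 4, 3]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With only eight listeners the interval is about half a MOS point wide, which illustrates why a scale whose top codecs all land between 4.5 and 5 struggles to separate them.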

Page 10
Application Drivers

Application         Channels   Bandwidth   End-to-end latency   Allowable complexity   Allowable bit rate
Speech              1-2        NB-WB       <150 ms              Low                    <64 kbps
Conference          1-2        NB-SWB      Activity driven      Medium                 <128 kbps
Telepresence        2+         SWB-FB      Activity driven      High                   <512 kbps
Gaming              2+         SWB-FB      <150 ms              High                   <320 kbps
Interactive music   2          SWB-FB      <25 ms               Medium                 <256 kbps

Content: even traditional phone calls handle signal types other than speech (e.g. music-on-hold), so as a baseline we must assume non-specific audio content
Other useful features: packet loss concealment, quality and bandwidth layering, joint multi-channel encoding
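
The latency and bit-rate limits in the application table can be treated as a simple filter on a candidate codec configuration. The sketch below is one possible encoding, not part of the original slides: the `Driver` class and `applications_served` function are hypothetical names, "activity driven" latency is modelled as "no fixed bound", and the latency argument is assumed to be an end-to-end estimate.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Driver:
    name: str
    max_latency_ms: Optional[float]  # None models "activity driven" (assumption)
    max_bitrate_kbps: float

# Limits copied from the application drivers table
DRIVERS = [
    Driver("Speech", 150, 64),
    Driver("Conference", None, 128),
    Driver("Telepresence", None, 512),
    Driver("Gaming", 150, 320),
    Driver("Interactive music", 25, 256),
]

def applications_served(latency_ms, bitrate_kbps):
    """List the applications whose limits a configuration stays under."""
    served = []
    for d in DRIVERS:
        latency_ok = d.max_latency_ms is None or latency_ms < d.max_latency_ms
        bitrate_ok = bitrate_kbps < d.max_bitrate_kbps
        if latency_ok and bitrate_ok:
            served.append(d.name)
    return served

# Hypothetical configuration: 60 ms end to end at 48 kbps
print(applications_served(latency_ms=60, bitrate_kbps=48))
```

Channel count, audio bandwidth and complexity are left out to keep the example short; they would be further fields on `Driver` checked the same way.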

Page 11
Narrowband matrix (8 kHz fs)

Codec     Bit rate (kbps)   Look ahead (ms)   Frame size (ms)   PSQM (zero impair)   DTX           PLC
G.711     64                0                 Arbitrary         4.45                 Appendix II   Appendix I
G.723.1   5.3, 6.3          7.5               30                3.6, 3.9 (MOS)       Yes           Yes
G.728     16                0                 0.562             3.6 (MOS)
G.729AB   8                 5                 10                4.04                 Yes           Yes
AMR       4.75-12.2         5                 20                4.14                 Yes           Yes
GSM-EFR   12.2              0                 20 or 30                               Yes
iLBC      13.33, 15.2       0                 20 or 30          4.14 (15.2)                        Yes

Sources: http://en.wikipedia.org/wiki/Comparison_of_audio_formats,
 Cable Labs PKT-SP-CODEC-MEDIA-I08-100120
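
Two quantities follow directly from the bit rate, look-ahead and frame size columns: the coded payload per frame, and a first-order algorithmic delay (frame size plus look-ahead). The sketch below uses a few rows from the narrowband matrix; the function names are illustrative, rounding up to whole bytes is an assumption, and RTP/UDP/IP headers and any codec-specific payload header are ignored.

```python
import math

# (bit rate in kbps, look-ahead in ms, frame size in ms), from the matrix
ROWS = {
    "G.723.1 (6.3)": (6.3, 7.5, 30),
    "G.729AB": (8, 5, 10),
    "AMR (12.2)": (12.2, 5, 20),
    "iLBC (15.2)": (15.2, 0, 20),
}

def payload_bytes(bitrate_kbps, frame_ms):
    """Coded bytes per frame: kbit/s times ms gives bits; round up to bytes."""
    bits = round(bitrate_kbps * frame_ms, 6)  # trim floating-point noise
    return math.ceil(bits / 8)

def algorithmic_delay_ms(frame_ms, look_ahead_ms):
    """First-order codec delay: one full frame plus the look-ahead."""
    return frame_ms + look_ahead_ms

for name, (kbps, look_ahead, frame) in ROWS.items():
    print(f"{name}: {payload_bytes(kbps, frame)} bytes/frame, "
          f"{algorithmic_delay_ms(frame, look_ahead)} ms algorithmic delay")
```

The delay figure is only the codec's contribution; packetization, jitter buffering and network transit add to it before it can be compared with an end-to-end latency budget.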

Page 12
Wideband +

Codec              Sample rate (kHz)   Bit rate (kbps)                   Algorithm latency (ms)         Comp. complexity   # Chan        PLC
G.711.1            8, 16               64, 80 (8 kHz); 80, 96 (16 kHz)   11.875                                            1
G.718              8, 16 (extens.)     8-32                              42.875-43.875 (20 ms frames)                      1             Yes
G.719              48                  32-64                             40 (20 ms frames)              18 FP-MIPS         1, MC (MP4)
G.722              16                  64                                4                              10 MIPS                          No
G.722.1(C)         16, 32 (C)          24, 32, 48 (32)                   40 (20 ms frames)              10 WMOPS                         Yes
G.722.2 (AMR-WB)   16                  6.6-23.85                         25                             38 WMOPS           1, MC (MP4)   Yes
G.729.1            8, 16               8-32                              48.9375                                                         Yes
Siren              16-48               16 (m) - 128 (s)                  40 (20 ms frames)                                 1 or 2
Speex              8-32                2-44                              30 NB, 34 WB                                      1, 2 opt.     Yes
AAC-ELD            ?-48?               24-64                             15 (64) - 32 (24)                                 1+            Yes
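
One way to read the algorithm latency column is as a lower bound to compare against an application's latency budget. The sketch below takes the smallest latency listed for each codec in the wideband matrix and filters by a budget; the dictionary and function names are illustrative, and since algorithmic latency is only part of the end-to-end delay, passing this filter is necessary but not sufficient.

```python
# Smallest algorithm latency (ms) listed per codec in the wideband matrix
WIDEBAND_LATENCY_MS = {
    "G.711.1": 11.875,
    "G.718": 42.875,
    "G.719": 40.0,
    "G.722": 4.0,
    "G.722.1(C)": 40.0,
    "G.722.2 (AMR-WB)": 25.0,
    "G.729.1": 48.9375,
    "Siren": 40.0,
    "Speex": 30.0,     # narrowband mode; wideband is listed as 34
    "AAC-ELD": 15.0,   # listed for 64 kbps; 32 at 24 kbps
}

def codecs_within(budget_ms):
    """Codecs whose algorithmic latency is strictly under the budget,
    lowest latency first."""
    fits = [c for c, ms in WIDEBAND_LATENCY_MS.items() if ms < budget_ms]
    return sorted(fits, key=WIDEBAND_LATENCY_MS.get)

# A 25 ms budget, for example, leaves only a handful of candidates
print(codecs_within(25))
```

Under a 150 ms budget every codec in the matrix passes, so for most applications the deciding factors are the other columns.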


Page 13
Summary

•  Goal 1: set codec application space -> define parameters of interest
•  Goal 2: survey current codecs and works-in-progress
•  Goal 3: define benchmark tools and performance goals
•  Goal 4: qualify codecs, make choice(s)
