[linrad] Speech processing (Linrad-01.25)



Hi All,

Working with the WSE hardware for Linrad (and any other 
SDR package that someone might want to use it with) has
diverted my focus from Linrad to performance of ham radio
transceivers in general.

As I see it, the use of ALC for speech processing is the
most severe limitation for dynamic range right now. It is
absolutely not the right way of compressing speech. Compression
should be done before the signal is sent to the
bandwidth-defining filter.

There seems to be a general consensus among amateurs that
RF clipping is very much better than audio clipping. That is
(I think) why the obvious modification, which is to use the
clipped signal intended for FM to feed the SSB generator,
has not become popular.

A voice peak that has been clipped to a flat top resembles
a square wave. The corners caused by clipping contain energy
over a wide frequency range. When the clipped signal is sent 
through a rectangular filter, there will be oscillations
corresponding to the removed signal energy outside the passband.
These oscillations are around the clipping level and they reach
as high as 3 dB above the clipping level for very hard clipping.
Using an ALC to flatten the waveform will of course restore the
energy outside the passband which is a really bad idea. Energy
outside the passband will be useless to the QSO partner.
A properly operating ALC will set the gain for the peak of the
oscillation to not saturate the power amplifier, but then the
average power suffers slightly.
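
A rough way to see the overshoot numerically is sketched below. It is
a simplified audio-domain model, not what voicelab does: a very hard
clipped tone is approximated by a +/-1 square wave, and the rectangular
filter is assumed to be an ideal low-pass that simply drops every
harmonic above the cutoff. The function name and the 300 Hz / 3000 Hz
values are made up for the illustration.

```python
import math

def band_limited_square_peak(f0_hz, cutoff_hz, samples=20000):
    """Peak of a +/-1 square wave after an ideal low-pass at cutoff_hz."""
    # A very hard clipped tone approaches a square wave. Its Fourier
    # series has only odd harmonics k with amplitude 4/(pi*k).
    harmonics = [k for k in range(1, int(cutoff_hz // f0_hz) + 1, 2)]
    peak = 0.0
    for n in range(samples):
        x = 2.0 * math.pi * n / samples
        # Sum only the harmonics that the (assumed ideal) filter passes.
        y = (4.0 / math.pi) * sum(math.sin(k * x) / k for k in harmonics)
        peak = max(peak, abs(y))
    return peak

peak = band_limited_square_peak(300.0, 3000.0)
overshoot_db = 20.0 * math.log10(peak)
print(f"peak = {peak:.3f} x clip level, overshoot = {overshoot_db:.2f} dB")
```

The filtered waveform rings around the clipping level and its peak
ends up above it. This real-valued case gives a smaller overshoot than
the 3 dB worst case mentioned above, but the mechanism is the same.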

There are many possible solutions to this problem. One wants to
limit the peaks in the speech in a way that does not create signals
outside the passband. One way is to have a chain of several 
band pass filters. The clipping then gradually becomes softer.
Another way is to use a soft limiter (the way vacuum tubes in hi-fi
amplifiers are supposed to work). It is also possible to find a
signal that fits the filter and to AM modulate the SSB signal
with it, generating a waveform that does not go above the
desired maximum level but retains as much as possible of the
information content.
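
To illustrate why a soft limiter puts less energy at high frequencies
than a hard clipper, here is a small sketch. The tanh curve is only an
assumed example of a soft limiter, and the 6 dB drive level is an
arbitrary choice. The code measures the amplitude of individual
harmonics of a limited sine wave by direct correlation over one period.

```python
import math

def hard_clip(x, limit=1.0):
    # Flat-top clipping: sharp corners at the clip level.
    return max(-limit, min(limit, x))

def soft_limit(x, limit=1.0):
    # Assumed soft-limiter curve: smooth and never above the limit.
    return limit * math.tanh(x / limit)

def harmonic_amplitude(limiter, drive, k, samples=4096):
    """Amplitude of the k-th harmonic of limiter(drive * sin(x))."""
    total = 0.0
    for n in range(samples):
        x = 2.0 * math.pi * n / samples
        total += limiter(drive * math.sin(x)) * math.sin(k * x)
    return abs(2.0 * total / samples)

drive = 2.0  # input peak is 6 dB above the limit
for k in (3, 5, 7, 9, 11):
    hard = harmonic_amplitude(hard_clip, drive, k)
    soft = harmonic_amplitude(soft_limit, drive, k)
    print(f"harmonic {k:2d}: hard clip {hard:.4f}, soft limit {soft:.4f}")
```

In this sketch the higher harmonics of the soft limiter (the 9th, for
example) come out much weaker than those of the hard clipper, so there
is less for the filter to remove and less ringing afterwards.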

I have just started to write a little about these things 
for Dubus. I do not know how to deduce what kind of processing
to prefer from theory so I decided to do it experimentally.
Linrad-01.25 contains a package "voicelab" which produces
random phonetics (Alfa, Bravo, Charlie, ...) with selectable
processing and a fixed peak signal level. To this processed
voice signal one can add white noise at suitable levels.
When running the program one should then press the correct
keyboard key for each letter or number. Linrad then gives
the percentage of correct key presses.
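
The principle of the test can be sketched in a few lines. This is not
the voicelab code, only an illustration: the function names are
hypothetical, the noise is Gaussian, and S/N is assumed here to mean
signal RMS over noise RMS, which may differ from how Linrad sets its
noise level.

```python
import math
import random

def add_white_noise(signal, snr_db, rng):
    """Add Gaussian white noise at the given S/N (RMS ratio, in dB)."""
    sig_rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    noise_rms = sig_rms / (10.0 ** (snr_db / 20.0))
    return [s + rng.gauss(0.0, noise_rms) for s in signal]

def percent_correct(sent, typed):
    """Percentage of characters that the listener keyed in correctly."""
    hits = sum(1 for s, t in zip(sent, typed) if s == t)
    return 100.0 * hits / len(sent)

rng = random.Random(1)  # fixed seed so a run can be repeated
tone = [math.sin(2.0 * math.pi * 0.01 * n) for n in range(20000)]
noisy = add_white_noise(tone, 6.0, rng)

# Five characters sent, one of them copied wrongly.
print(percent_correct("ABCDE", "ABXDE"))
```

Repeating the scoring at several noise levels gives a curve of
readability against S/N for each type of processing.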

When I do the testing with my own voice I find that there 
is a very small difference between RF clipping and audio
clipping. Audio clipping actually gives slightly better
intelligibility at the threshold. At high S/N the RF clipper
sounds much better than the audio clipper.

You can use your own voice to record a file containing the
phonetics and then you can evaluate what processing will give
the best readability in your own and in your friends' ears.

You may also find that although very hard clipped signals
can be copied at very low S/N, such signals sound very ugly
and cannot be copied to 100% at any signal level. In real life
it may be better to lose 1 or 2 dB on the detection limit in
order to have easy copy when QSB lifts the signal above the
threshold.

I would be interested in some feedback on your findings if you
are willing to take the time to do some statistics on
how well you can copy at different signal levels. My interest is
twofold. Firstly, I want to know for sure how it really is
for several different voices before I write what I think one
should do to avoid the ALC-generated splatter in modern
transceivers. Secondly, I want to know what types of processing
to implement for the Linrad transmitter.

The voicelab package is intended as an aid when setting up
the speech processing for the Linrad transmitter. It is far
too complicated and far too inefficient in terms of
CPU usage to be practically useful in the Linrad transmitter,
but once I know better what kind of processing to use I will
make fast code to produce the same result. The current filters
are implemented by FFTs that span the entire time of one
phonetic (2 seconds) and there are many filters at AF and RF
(operating on real or complex waveforms).

73

Leif / SM5BSZ
