Audience
s/w engineers familiar with DSP concepts who are interested
in building signal processing systems
Research, education in signal processing concepts

Introduction

Pspectra is a software development environment developed by the Software Systems and Devices Group at the Laboratory for Computer Science at MIT.

What is pspectra?
Move digital interface to antenna
Minimize the use of specialized h/w
investigate adaptive systems
actual situation, not worst case nor average
Statistical real time
use buffering to handle jitter
Apply principles of software engineering to signal processing
Use general purpose o/s - Linux

Who will use it?
signal processing application design
wireless - software radio
ultrasound

Advantages of pspectra
rapid deployment
integration
flexibility
reduced cost for specialized devices
easy, in-field upgrade
adaptation

Disadvantages of pspectra
power
size

Outline of system.
SMP GPP
special h/w
goal directed scheduling/processing
cache aware
 

  • Overall design

  • Lazy evaluation architecture
    Library of composable modules
    GUI design
    Out-of-band control processing
    Separation of processing, buffer management, control

    A Pspectra signal-processing program is separated into the following components:

  • main program to setup and control signal processing modules,
  • composable signal processing algorithm modules,
  • scheduling of signal processing calculations, and
  • data buffer management.
    Users of the Pspectra environment write the first two of these components; Pspectra supplies generic scheduling and data-management algorithms.
     
     
     
  • Main Program


    The top-level control program creates all the modules, configures their parameters, connects them in an appropriate graph, and enters a processing loop. It initiates processing by calling start() on a sink (or on VrMultiTask, which handles topologies with multiple sinks). Calling start() runs the initialization process and begins the actual signal processing.

    After calling start(), the control program calls process() repeatedly on the sink; each call runs one iteration of Pspectra’s scheduling algorithm. Between calls to process(), the control program may perform other tasks, such as running a GUI or dynamically changing parameters in the system. The program may end processing and clean up any remaining threads by calling stop() on the sink.

    Here is an example Pspectra program that demodulates an FM channel, filters the audio, and plays it on the computer’s speaker.

    int main() {
        /* Create the signal processing modules */
        VrGuppiSource<char>* source = new VrGuppiSource<char>();
        VrComplexFIRfilter<char>* channel_filter = new VrComplexFIRfilter<char>(
            CFIRdecimate, cTaps, chanFreq, chanGain);
        VrQuadratureDemod<float>* demod = new VrQuadratureDemod<float>(quadGain);
        VrRealFIRfilter<float,short>* if_filter = new VrRealFIRfilter<float,short>(
            RFIRdecimate, realCutoff, realTaps, realGain);
        VrAudioSink<short>* sink = new VrAudioSink<short>();

        /* Connect modules, working upstream from the sink to the source */
        CONNECT(sink, if_filter, audioRate, 16);
        CONNECT(if_filter, demod, quadRate, 32);
        CONNECT(demod, channel_filter, quadRate, 64);
        CONNECT(channel_filter, source, gupRate, 8);

        /* Start the system and process until SECONDS have elapsed */
        sink->start();
        while (sink->elapsedTime() < SECONDS)
            sink->process();
        sink->stop();
        return 0;
    }
     
     

  • Module Structure

  • Code for an individual signal-processing function (e.g., FM demodulation or NTSC sync pulse detection) is written as a single, reusable class, or "module".

    Modules may have multiple outputs and inputs, exchanging data with other modules through sample data streams.  Although data is generated by "sources," processed by some chain of modules, and consumed by "sinks," lazy evaluation enables the system to avoid generating intermediate data that is not necessary for the final output data.  Locations in a stream are identified by a 64-bit index, numbered sequentially beginning with zero. Modules use indices to reference ranges of data in their input and output streams. Pspectra provides modules with pointers to the actual data in memory. This allows the environment to separate the implementation of sample streams from both the signal processing and the scheduling of the various signal-processing tasks.
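    As a rough illustration of how index ranges can identify data in a stream, the sketch below models a range as a 64-bit starting index plus a sample count. The SampleRange type and the covers() helper are hypothetical stand-ins for illustration only; Pspectra's actual VrSampleRange may be defined differently.

```cpp
#include <cstdint>

// Hypothetical stand-in for a stream range: a 64-bit starting index plus
// a sample count. Pspectra's real VrSampleRange may differ.
struct SampleRange {
    std::uint64_t index;  // position of the first sample in the stream
    std::uint64_t size;   // number of consecutive samples

    // One past the last sample in the range.
    std::uint64_t end() const { return index + size; }
};

// True when every sample in `want` is also contained in `have`.
inline bool covers(const SampleRange& have, const SampleRange& want) {
    return want.index >= have.index && want.end() <= have.end();
}
```

    A check of this kind is one way an environment could decide whether the data a module asks for is already present in a buffer.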

    In the second phase of the scheduling algorithm, Pspectra calls work() on the appropriate modules to produce the marked data, re-marking the data as "completed" after each return from work(). This phase proceeds in the opposite direction from the marking phase, computing data first in the upstream modules (e.g., A) before proceeding downstream.

    Work() is passed pointers to input and output data and runs a tight signal-processing loop producing output data from input data.  All input data required to produce the requested output samples is computed before entering the work procedure.  When modules cannot always accurately predict their input requirements, work() may not complete all requested work. In this case, the scheduling process then begins again in the data-marking phase to reassess what data samples are necessary.

    This is simplified code for an FIR filter. Note that modules can have multiple inputs and multiple outputs, which explains the need for arrays in the arguments for forecast() and work(). Forecast() ensures that there will be enough input data that the expression inputArray[j] always corresponds to valid data.

    void VrFilter::initialize()
    {
        taps=buildFilter(numberOfTaps);
    }

    unsigned int VrFilter::mapSizeUp(int inputNumber,unsigned int size) {
        return size + history - 1;
    }

    void VrFilter::forecast(VrSampleRange output,
                            VrSampleRange inputs[]) {
        for(unsigned int i=0;i<numberInputs;i++) {
            inputs[i].index=output.index;
            inputs[i].size=output.size + numberOfTaps-1;
        }
    }

    int VrFilter::work(VrSampleRange output, oType *o[],
                       VrSampleRange inputs[], iType *i[]) {
        float result;
        unsigned int size = output.size;

        for (;size>0;size--,i[0]++) { //increment pointer to input # 0
            iType* inputArray = i[0];
            /* Compute the dot product of inputArray[] and taps[] */
            result = 0;
            for (int j=0; j < numberOfTaps; j++)
                result += taps[j] * inputArray[j];
            *o[0]++ = (oType) result; //output the result to output # 0
        }
        return output.size; //all outputs completed
    }
     

    Programmers implement modules as classes that extend the class VrSigProc. This section describes the functions that a module must implement to interact with the system, divided into functions that are run once during initialization and functions that are run repeatedly in the signal-processing loop.
    The environment calls initialize() and mapSizeUp() once during the initial setup of the signal-processing topologies. Then forecast() and work() are called alternately; the scheduler uses the requirements returned by forecast() to determine which upstream data must be computed first. Pspectra then computes this data by calling forecast() and work() on the upstream modules, finally instructing the original module to compute its output by calling work().

    The separation of forecast() and work() allows the system to perform all data management and scheduling functions, and it frees the signal-processing algorithm in work() from overhead such as checking that its input data has been computed or that space is available in the output buffer.

    Initialization begins at the sinks and propagates upstream through all modules.

    Scheduling proceeds in two phases: a "data-marking" phase, which proceeds upstream following data dependencies, and a "work" phase, which proceeds downstream from the head, producing output data.
     

  • Initialization

  • Before signal processing can begin, modules must be initialized. Initialization proceeds as follows:
  • Pspectra calls initialize() on each module to allow the module to initialize module-specific parameters,
  • Pspectra determines the size required for each buffer based on the number of processors, the block size for each sink, and the behavior of upstream and downstream modules (determined by mapSizeUp()), and
  • Pspectra attempts to determine the optimal block size for each sink.
  • initialize()

  • Pspectra calls initialize() to perform setup after the module has been connected to other modules. Because input and output sampling rates are properties of the individual streams in Pspectra, module parameters that depend on sampling rate should be configured in initialize() since the streams do not exist when the constructor is called. A typical example is a filter that computes its tap values in initialize().

    A module may also wish to call setOutputSize() in its initialize() procedure to fix the smallest number of units on which the module will run (for example, a particular interpolating filter may create three output points per input and thus have a natural outputSize of three). Pspectra guarantees work() and forecast() are called with a multiple of outputSize.
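    The following sketch illustrates why a guaranteed multiple of outputSize is convenient for the 1:3 interpolator mentioned above. The names kOutputSize, inputsNeeded(), and roundDownToUnit() are hypothetical, and rounding a request down (rather than up) is an assumption made for illustration; the source does not say how Pspectra adjusts request sizes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 1:3 interpolator: three output samples per input sample,
// so its natural output unit is three samples.
constexpr std::uint64_t kOutputSize = 3;

// Inputs consumed to produce `outputs` samples. Because the request is a
// multiple of kOutputSize, the division is exact and no partial unit is left.
inline std::uint64_t inputsNeeded(std::uint64_t outputs) {
    assert(outputs % kOutputSize == 0);
    return outputs / kOutputSize;
}

// One way a scheduler could trim a request to a whole number of units
// (assumption: rounds down).
inline std::uint64_t roundDownToUnit(std::uint64_t requested) {
    return requested - requested % kOutputSize;
}
```

    With these definitions, a request for 10 outputs would be trimmed to 9, which consumes exactly 3 inputs.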
     

  • mapSizeUp()

  • MapSizeUp() returns the maximum number of input samples required for the module to generate N output samples. MapSizeUp() allows Pspectra to determine the space required for data buffers to hold all the necessary data for the operation of the system.
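    To illustrate how mapSizeUp() results could be combined along a chain of modules, the sketch below walks from a sink toward the source and records the number of input samples each stage needs. The FIR rule (size + history - 1) follows the VrFilter example; the decimator rule, the MapSizeUp alias, and requiredInputSizes() are assumptions introduced here. Actual buffer sizing in Pspectra also depends on the number of processors and other factors not modeled in this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical per-module rule: output sample count -> input sample count.
using MapSizeUp = std::function<std::size_t(std::size_t)>;

// FIR-style stage, matching VrFilter::mapSizeUp: size + history - 1.
inline MapSizeUp firStage(std::size_t history) {
    return [history](std::size_t n) { return n + history - 1; };
}

// Decimating stage (assumption): `factor` inputs per output.
inline MapSizeUp decimatingStage(std::size_t factor) {
    return [factor](std::size_t n) { return n * factor; };
}

// `stages` is ordered from the sink toward the source. Returns the number
// of input samples each stage needs to support one sink block.
inline std::vector<std::size_t> requiredInputSizes(
        std::size_t sinkBlock, const std::vector<MapSizeUp>& stages) {
    std::vector<std::size_t> sizes;
    std::size_t n = sinkBlock;
    for (const auto& stage : stages) {
        n = stage(n);
        sizes.push_back(n);
    }
    return sizes;
}
```

    For a sink block of 100 samples fed by a 16-tap FIR filter that is itself fed by a decimate-by-4 stage, this yields 115 samples at the filter input and 460 at the decimator input.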
     
  • Type derivation tree

  • VrSigProc
    VrSink
    VrSource
    Decimating
    History for FIR calculations
    Interpolating
    VrMultiTask
    Sink processing
    Source processing
     
     
    1. Signal-Processing Loop


    After the initial setup of the modules and of the environment, Pspectra runs a two-phase loop on each available processor.  In the first phase, the scheduler walks the graph using forecast()’s return values to determine modules whose input data is available, marking this data to prevent any other processors from working on the same data. The second phase runs work() on the chosen modules to produce the marked data.  When a module finishes producing a data segment, Pspectra marks the data as completed so that in later iterations the scheduler knows which modules can be scheduled.

    We run this loop on a single-processor system without the use of extra threads by performing other tasks (e.g., the GUI) between iterations of the loop.

    On multiprocessor systems, this two-phase loop runs in parallel on each processor, with synchronization mechanisms only necessary in the scheduling phase. Because Pspectra spends the bulk of the processor time in the second phase (doing the actual signal-processing work), lock contention is low and overhead minimal. We reduce overhead by marking and working on larger blocks of data, as the cost of running the scheduler is independent of the size of the blocks marked.
     

  • Data Marking

  • Pspectra implements a lazy-evaluation paradigm; scheduling starts with the sinks and requests proceed upstream only by following data-dependencies. This enables Pspectra to avoid computing intermediate data that is not necessary for the production of the final output data.

    The scheduler attempts to schedule a consecutive string of modules to help keep data in the cache between the execution of different modules. The scheduler starts with a block of data at one of the sink modules. To determine if this block can be scheduled for computation, the scheduler recursively checks the block’s data dependencies, i.e., the data ranges returned by calling forecast() on the module that produces the block. To schedule a block, the conditions listed below must be met.

  • No more than one data dependency may be incomplete.
  • If one data dependency is not complete, then this data must also be scheduled for computation in this iteration (and thus must also meet these requirements).
  • If more than one data dependency is incomplete, this block cannot be scheduled, but the scheduler may choose one of the data dependencies to mark (provided it meets these requirements).
  • All other input data was completed in a previous iteration of the scheduling algorithm.
  • The input data ranges are determined by the return value of forecast(). The scheduler marks data in a consecutive string of modules. The first module in this string (A) is either a source, which has no inputs, or has inputs that are entirely complete. Subsequent modules (B and C) use data computed both in previous iterations and in this iteration. For example, module C may use data that B and D computed in previous iterations in addition to the data B will compute on this iteration. Scheduling a string of consecutive modules helps ensure that as much data as possible remains in the cache. In our example, most of the data module B reads will still be in the cache since module A has just written it.
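    The scheduling conditions above can be sketched as a small recursive check. The Block type and canSchedule() are hypothetical and greatly simplified: they model only the "at most one incomplete dependency, which must itself be schedulable" rule, and ignore marking, multiple processors, and the option of marking a single dependency when the block itself cannot be scheduled.

```cpp
#include <vector>

// Hypothetical model of a block of output data and the upstream blocks
// that forecast() says it depends on.
struct Block {
    bool complete = false;           // computed in a previous iteration
    std::vector<const Block*> deps;  // upstream data dependencies
};

// True if `b` can be scheduled in this iteration: at most one dependency
// is incomplete, and that dependency can itself be scheduled.
inline bool canSchedule(const Block& b) {
    const Block* pending = nullptr;
    for (const Block* d : b.deps) {
        if (d->complete) continue;
        if (pending != nullptr) return false;  // a second incomplete dependency
        pending = d;
    }
    return pending == nullptr || canSchedule(*pending);
}
```

    A source block has no dependencies and is always schedulable under this model, which is what lets a string such as A, B, C be marked in one pass.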
     
  • forecast()

  • The module developer must implement this method to inform Pspectra what input data is necessary to produce a particular range of output data. For example, a filter with a history of h would request n+h units of data to produce n outputs. Pspectra uses forecast() to determine what work needs to be done upstream and to schedule work appropriately on the available processors. If the required inputs are not precisely known, the module should return a best guess.

    If the range returned by forecast() does not include all the necessary data, work() will later determine that the guess was incorrect. In this case, Pspectra will call forecast() again. This mechanism results in a performance loss, so forecast() should err on the side of returning a larger range than necessary, in order to avoid rescheduling. Too large a range, however, will also result in performance loss and increased latency in the system.

    Note that Pspectra assumes that input data will be processed in a monotonically increasing fashion. Everything prior to the input data range returned by forecast() is discarded and unavailable later. If the module might need this data at a later time, it must include this data in the range returned by forecast().
     

  • work()

  • This is where the actual signal-processing takes place in a module. Pspectra provides as arguments pointers to input and output data locations in memory, along with indices corresponding to the location of the data in the sample streams. Work() then enters a loop to produce the requested outputs and returns the actual number of outputs it was able to produce from the available inputs, which should usually be equal to the number requested. If the module needs more input data than originally requested (with the return value of forecast()), work() returns without computing all the output data and forecast() is called again by Pspectra to determine what data is needed.
    For Pspectra’s implicit parallelism to work, a module’s work() procedure must produce the same outputs regardless of the order in which blocks are produced. For example, producing block N before block N-1 should result in the same output data as producing these blocks in order.  When a module’s work() procedure needs to force data to be computed serially instead of in parallel, it calls sync(), which allows work() to safely use state that the module modified during the production of past outputs. Sync() ensures that no other processors are computing (or will later compute) data that precedes the data this thread is computing.  A variable-rate compression module, for example, would have to call sync(), since such a module only knows exactly what input data is needed to produce a block of output data after it has finished the previous block of output data.

    Pspectra allows multiple threads to execute a particular module’s work() procedure simultaneously. This means multiple threads may be writing different, non-overlapping ranges of data in a buffer. Although there is only one module writing to a buffer, there may be many threads writing to the buffer at any one time. VrBuffer and its subclasses keep track of the ranges each thread is writing and what data has been completed. This allows Pspectra’s scheduling algorithm to determine what data can be computed downstream without blocking. VrConnect keeps track of threads that are reading data from a particular downstream module’s input. VrBuffer uses its associated VrConnect objects to determine when data is no longer being used downstream. VrBuffer then reuses this space within the circular buffer.  VrConnect and VrBuffer use a simple linked-list implementation for tracking reading and writing threads.

    To handle unpredictable, or data-dependent, input/output ratios, forecast() must guess which input data the module will need; work() returns the actual number of outputs (less than or equal to the number requested) produced using this input data. If Pspectra asked the module to compute n outputs and work() discovers that the input ranges forecast() requested are inadequate, then work() returns the number of outputs it was actually able to generate. The scheduler calls forecast() and work() again to complete the unfinished outputs.
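    The retry behaviour described above can be sketched with a toy module whose input/output ratio depends on the data: it drops zero-valued samples, so it cannot know in advance how many inputs it needs. ZeroStripper, its simplified forecast() and work() signatures, and the produce() driver are all hypothetical; they do not reproduce Pspectra's real interfaces, and the sketch is single-threaded (a real module carrying state like this would need sync()).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy module that drops zero samples, so inputs per output is data-dependent.
struct ZeroStripper {
    std::size_t nextInput = 0;  // state carried between calls

    // Best guess: one input sample per requested output.
    std::size_t forecast(std::size_t wanted) const { return wanted; }

    // Uses at most `available` inputs starting at nextInput, appends the
    // nonzero ones to `out`, and returns how many outputs were produced.
    std::size_t work(const std::vector<int>& in, std::size_t available,
                     std::size_t wanted, std::vector<int>& out) {
        std::size_t produced = 0;
        const std::size_t limit = std::min(in.size(), nextInput + available);
        while (produced < wanted && nextInput < limit) {
            const int sample = in[nextInput++];
            if (sample != 0) {
                out.push_back(sample);
                ++produced;
            }
        }
        return produced;
    }
};

// Scheduler-like driver: repeats forecast() and work() until the request
// is satisfied or the input is exhausted.
inline std::vector<int> produce(ZeroStripper& m, const std::vector<int>& in,
                                std::size_t wanted) {
    std::vector<int> out;
    while (out.size() < wanted && m.nextInput < in.size()) {
        const std::size_t remaining = wanted - out.size();
        const std::size_t guess = m.forecast(remaining);
        m.work(in, guess, remaining, out);  // may produce fewer than remaining
    }
    return out;
}
```

    Each short return from work() costs an extra scheduling pass, which is why a forecast() that guesses somewhat high is usually preferable.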

    Modules may have input streams that go unused during the production of a particular block of output data. For example, a particular multiplexing module may read 1000 bytes from one input module followed by 1000 bytes from another input module. In this case, forecast() must still return a VrSampleRange for the unused input(s). This range should be a zero length range to indicate no data is needed, with the starting index equal to the smallest index that the module may request in the future. Pspectra uses this information to determine what data can be discarded.
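    As a sketch of the zero-length range convention, the code below computes input ranges for a two-input multiplexer that alternates 1000-sample chunks. SampleRange, kChunk, and muxForecast() are hypothetical names, and the sketch assumes each requested output range lies within a single chunk (for example, because the module's output unit is one chunk).

```cpp
#include <array>
#include <cstdint>

struct SampleRange {
    std::uint64_t index;  // first sample
    std::uint64_t size;   // number of samples (0 means "no data needed")
};

constexpr std::uint64_t kChunk = 1000;  // samples taken from one input at a time

// Input ranges needed for `output`, assuming it lies within one chunk.
// The unused input gets a zero-length range that starts at the smallest
// index the module may request from it in the future.
inline std::array<SampleRange, 2> muxForecast(const SampleRange& output) {
    const std::uint64_t chunk = output.index / kChunk;
    const std::uint64_t offset = output.index % kChunk;
    const std::size_t active = static_cast<std::size_t>(chunk % 2);

    std::array<SampleRange, 2> inputs{};
    inputs[active] = {(chunk / 2) * kChunk + offset, output.size};
    inputs[1 - active] = {((chunk + 1) / 2) * kChunk, 0};
    return inputs;
}
```

    For output samples 2500 to 2599, input 0 supplies samples 1500 to 1599, while input 1 reports an empty range starting at 1000, telling the environment that everything before index 1000 on that input can be discarded.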

    Meta-modules can be built to encapsulate multiple signal-processing modules into a single larger module.  The meta-module does no processing itself, but simply creates the appropriate sub-modules, connects their inputs and outputs, and provides methods to access visible methods on the component modules.  A meta-module creates its component modules in its constructor and overrides connect(), which connects the module’s inputs to an upstream output buffer, and getOutputBuffer(), which returns the module’s output buffer. The meta-module’s connect() and getOutputBuffer() implementations redirect inputs and outputs, respectively, by calling the corresponding methods on the component modules.
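    A minimal sketch of the meta-module pattern follows. The Buffer and Module classes are hypothetical placeholders, and connect() is reduced to a single parameter for brevity; the real VrSigProc interface is not reproduced here.

```cpp
#include <memory>

struct Buffer {};  // placeholder for an output buffer

// Hypothetical base class with the two overridable hooks.
class Module {
public:
    virtual ~Module() = default;
    virtual void connect(Buffer* upstream) { input_ = upstream; }
    virtual Buffer* getOutputBuffer() { return &output_; }
    Buffer* input() const { return input_; }
private:
    Buffer* input_ = nullptr;
    Buffer output_;
};

// Meta-module wrapping two components in series. It does no processing;
// it only wires the components together and forwards the two hooks.
class FilterThenDemod : public Module {
public:
    FilterThenDemod() : first_(new Module), last_(new Module) {
        last_->connect(first_->getOutputBuffer());  // internal connection
    }
    // Inputs are redirected to the first contained module.
    void connect(Buffer* upstream) override { first_->connect(upstream); }
    // Output is taken from the last contained module.
    Buffer* getOutputBuffer() override { return last_->getOutputBuffer(); }

    Module& first() { return *first_; }
    Module& last() { return *last_; }
private:
    std::unique_ptr<Module> first_;
    std::unique_ptr<Module> last_;
};
```

    Code that connects a FilterThenDemod sees a single module, while the data actually flows through the two components it contains.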
     
     

  • Buffers

  • Signal-processing modules in the Pspectra system communicate data using sample streams for input and output. Each module can directly read and write a range of data in a stream using a pointer provided by Pspectra. The streams are implemented using two classes, VrBuffer "buffers" and VrConnect "connectors".

    Modules may have multiple outputs (e.g., a module that separates the left and right audio stream from a stereo input), each represented by a VrBuffer object. The "buffer" is simply a circular data buffer with a single writer (the module) and multiple readers (the "connectors"). A single writing module can have multiple, parallel instantiations which can simultaneously write disjoint portions of the buffer.

    Each input of a module is a VrConnect object associated with a particular VrBuffer object. Separating the sample stream functionality into these two objects makes it simple for multiple downstream modules to connect to a single VrBuffer object. Downstream modules share data produced by the upstream module, avoiding the recomputation of data used by multiple modules.  In addition, data produced and stored in the buffer by one processor may be read by the other processors.

    use of mmap to make buffers circular
    Buffer format, sizes, processing
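    One common way to use mmap to make a buffer circular is to map the same memory twice at adjacent addresses, so that a read or write running past the end of the buffer lands back at its start. The sketch below shows that technique on a POSIX system; it is an assumption about the general approach, not a description of VrBuffer's actual code, and mapCircular() is a hypothetical name.

```cpp
#include <cstddef>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Maps `size` bytes twice, back to back, so that p[i] and p[i + size]
// refer to the same byte. `size` must be a multiple of the page size.
// Returns nullptr on failure.
inline char* mapCircular(std::size_t size) {
    char path[] = "/tmp/ringbufXXXXXX";
    const int fd = mkstemp(path);  // backing file shared by both mappings
    if (fd < 0) return nullptr;
    unlink(path);                  // the file lives only while it is mapped
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return nullptr;
    }

    // Reserve 2 * size bytes of contiguous address space.
    void* base = mmap(nullptr, 2 * size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return nullptr;
    }

    // Map the file over both halves of the reservation.
    char* p = static_cast<char*>(base);
    void* lo = mmap(p, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    void* hi = mmap(p + size, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);
    if (lo == MAP_FAILED || hi == MAP_FAILED) {
        munmap(base, 2 * size);
        return nullptr;
    }
    return p;
}
```

    With such a mapping, a module can be handed a plain pointer to a contiguous range even when that range wraps around the end of the buffer, which keeps wrap-around checks out of work().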
     
     
     

  • Construction of an application

  • Application outline
    declare modules
    'connect' modules
    'start' sink
    loop on 'process'
    on termination, call 'stop' if needed
    Define processing computation
    Define user interface
    Construct processing chain
    Test with file data
    Test with live data
    Performance tuning
     
     
  • Algorithms

  • FIR vs IIR
    Complex FIR filter
    downconversion combined with bandpass filter
    Table driven transmit waveform generation
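    Combining a band-pass filter with the first step of downconversion is commonly done by rotating the taps of a real low-pass prototype to the channel frequency, which gives a complex band-pass filter in one pass. The sketch below shows that tap rotation under a convolution convention, y[n] = sum over k of taps[k] * x[n-k]. translateTaps() and response() are hypothetical helpers, and whether VrComplexFIRfilter builds its taps this way is an assumption.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Rotates real low-pass taps to `centerFreq`, producing complex band-pass
// taps: taps[k] = lowpass[k] * exp(+j * 2*pi * centerFreq * k / sampleRate).
inline std::vector<std::complex<float>> translateTaps(
        const std::vector<float>& lowpass, double centerFreq, double sampleRate) {
    const double pi = std::acos(-1.0);
    std::vector<std::complex<float>> taps(lowpass.size());
    for (std::size_t k = 0; k < lowpass.size(); ++k) {
        const double phase =
            2.0 * pi * centerFreq * static_cast<double>(k) / sampleRate;
        taps[k] = std::complex<float>(
            static_cast<float>(lowpass[k] * std::cos(phase)),
            static_cast<float>(lowpass[k] * std::sin(phase)));
    }
    return taps;
}

// Frequency response of the filter at `freq` (convolution convention):
// H(f) = sum over k of taps[k] * exp(-j * 2*pi * f * k / sampleRate).
inline std::complex<double> response(
        const std::vector<std::complex<float>>& taps, double freq,
        double sampleRate) {
    const double pi = std::acos(-1.0);
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t k = 0; k < taps.size(); ++k) {
        const double phase =
            -2.0 * pi * freq * static_cast<double>(k) / sampleRate;
        sum += std::complex<double>(taps[k].real(), taps[k].imag()) *
               std::complex<double>(std::cos(phase), std::sin(phase));
    }
    return sum;
}
```

    The rotated filter passes the channel at +centerFreq with the prototype's DC gain and attenuates the image at -centerFreq; the filtered channel can then be shifted to baseband, or decimated so that it aliases there.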
     
  • Hardware interface

  • Hardware design (simple)
    Wideband A/D and D/A
    Page-based DMA interface at PCI bus rate 900 Mbps,
    application rate 500 Mbps
    high sustained data rates
    Device driver design
    Device driver ioctl interface
    a file descriptor must be only for reading OR writing, not both
    GIOCSETBUFSIZE (unsigned int)
        Set standard buffer size to some number of pages
    GIOCSTART
        Start up the DMA/interrupt driver
    GIOCSTOP
        Stop the DMA/interrupt driver
    GIOCSETGETSTATUS (struct guppi_status)
        Tell the driver that the user is done with a range of pages;
        get the range of pages available for the user and
        the number of overruns since last call

    struct guppi_status {
        unsigned int index;  /* index of the first valid page
                                (this page is valid if num > 0) */
        unsigned int num;    /* number of valid pages starting at index */
        unsigned int lost;   /* number of pages thrown away because the buffer
                                was full since the last status check;
                                if this is > 0 then num should equal the
                                number of pages in the entire buffer */
    };
    Guppi source
    Guppi sink
     

  • System definitions

  • VrBuffer.h VrBuffer.cc
    VrConnect.h
    VrObj.h VrObj.cc
    VrSigProc.h VrSigProc_base.h

    VrSink.h
    VrSource.h

    VrCycleCount.h

    VrTypes.h
    VrComplex.h VrComplex.cc
     

    VrMultiTask.h VrMultiTask.cc

    I/O Module definitions

    VrSkippingSink.h

    VrAudioSink.h
    VrAudioSource.h
    VrFileSink.h
    VrFileSource.h
    VrGuppiBuffer.h
    VrGuppiSink.h
    VrPcvfoSink.h
    VrGuppiSource.h
    VrAR5000Source.h
    VrSigSource.h

    VrNullSink.h
    VrPerfGraph.h VrPerfGraph.cc

    VrGnuPlotSink.h
    VrFFTSink.h

    Processing Module definitions
    VrDecimatingSigProc.h
    VrHistoryProc.h
    VrInterpolatingSigProc.h

    VrAmplitudeDemod.h
    VrComplexFIRfilter.h
    VrRealFIRfilter.h
    VrMixer.h
    VrSum.h
    VrQuadratureDemod.h
    VrAmp.h

    VrAMMod.h
    VrAWGN.h
    VrComplexCascadefilter.h
    VrCobsZpeStuff.h VrCobsZpeUnStuff.h
    VrDownSample.h
    VrFMMod.h VrFHFSKMod.h VrFHFSKDemod.h
    VrFSKMod.h VrFSKDemod.h
    VrHoppingComplexFIRfilter.h
    VrIIRfilter.h
    VrSquelch.h

    MMX Module definitions
    VrAdd_MMX.s VrAdd4_MMX.s
    VrMMX.h VrMMX.s
    VrFMMul.s

    Softlink Module definitions
    VrSoftLinkSink.h
    VrSoftLinkSource.h
    VrIPpacket.h VrIPpacket.cc
    Softlink device driver

    Sample applications
    AMPS receiver
    multiband
    TV
    PAM
     

  • History

  • GSM receiver design
    Viewsystem
     
  • Future work

  • Rockwell receiver/transmitter
    Application of OSI RM for wireless
    Partition of functions
    'Layering' the physical layer
    Asynchronous packets - tight integration w/ TCP/IP
    Cache oblivious
    change the scheduler so that if a module does not finish
    its requested calculation, the sink is rerun with the
    requested size cut in half. Note that this can only
    be used on sync()'ed streams (threading won't work)
     
  • exported functions

  • getMarkedWP() returns the first timestamp after currently scheduled work
    (for synchronizing computation source/sink)
    overridable functions for recursive module definitions
    connect(3 parameter)
    override in containing module to call connect() in
    contained modules
    getOutputBuffer()
    override in containing module to call getOutputBuffer()
    in last contained module
     
  • List of files???

    Brett W. Vasconcellos and John Ankcorn