# Speex Encoder

Video Capture SDK .Net, Video Edit SDK .Net, Media Blocks SDK .Net

Speex is a patent-free audio compression format designed specifically for speech. VisioForge provides a flexible Speex encoder with several operating modes and configuration options for tuning speech compression to different use cases.

# Cross-platform Speex output

VideoCaptureCoreX, VideoEditCoreX, MediaBlocksPipeline

# Encoder Modes

The Speex encoder supports four modes. Auto lets the encoder choose, and each of the three band modes is matched to a specific sampling rate:

- Auto (0): Automatically selects the most appropriate band mode based on the input
- Ultra Wide Band (1): Optimized for a 32 kHz sampling rate
- Wide Band (2): Optimized for a 16 kHz sampling rate
- Narrow Band (3): Optimized for an 8 kHz sampling rate

Use the SpeexEncoderSettings class to configure the encoder mode and other parameters.

# Supported Audio Parameters

## Sample Rates

The encoder supports three standard sampling rates:

- 8,000 Hz (Narrow Band)
- 16,000 Hz (Wide Band)
- 32,000 Hz (Ultra Wide Band)
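
The pairing between sampling rates and band modes can be sketched as a small lookup. This is only an illustration: `BandMode` and `SpeexBandSelector` are hypothetical local names, not SDK types, and the numeric values simply mirror the ones listed under "Encoder Modes". Rejecting any other rate with an exception is a design choice made for this sketch.

```csharp
using System;

// Hypothetical stand-in for the SDK's mode enum; values follow the "Encoder Modes" list.
public enum BandMode
{
    Auto = 0,
    UltraWideBand = 1,
    WideBand = 2,
    NarrowBand = 3
}

public static class SpeexBandSelector
{
    // Maps one of the three documented sample rates to its band mode.
    // Any other rate is rejected, since the encoder lists only these three.
    public static BandMode FromSampleRate(int sampleRateHz) => sampleRateHz switch
    {
        8000 => BandMode.NarrowBand,
        16000 => BandMode.WideBand,
        32000 => BandMode.UltraWideBand,
        _ => throw new ArgumentOutOfRangeException(
            nameof(sampleRateHz), sampleRateHz,
            "Speex expects 8000, 16000, or 32000 Hz.")
    };
}
```

For example, `SpeexBandSelector.FromSampleRate(16000)` yields `BandMode.WideBand`. In an application, the result would be translated to the SDK's own mode value before being assigned to the encoder settings.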

## Channel Configuration

The encoder supports both mono and stereo audio:

- Mono (1 channel)
- Stereo (2 channels)

# Rate Control Methods

The Speex encoder implements several rate control mechanisms that can be used independently or in combination:

## Fixed Quality Mode

Uses the Quality parameter to maintain consistent quality:

var settings = new SpeexEncoderSettings {
    Quality = 8.0f, // Range: 0-10, default: 8
    VBR = false    // Disable VBR for pure quality-based encoding
};

## Variable Bit Rate (VBR)

Dynamically adjusts bitrate based on content complexity:

var settings = new SpeexEncoderSettings {
    VBR = true,
    Quality = 8.0f  // Acts as the target quality for VBR
};

## Average Bit Rate (ABR)

Maintains a target average bitrate over time:

var settings = new SpeexEncoderSettings {
    ABR = 15.0f,   // Target bitrate in kbps
    VBR = true     // ABR requires VBR to be enabled
};

## Fixed Bitrate

Uses a constant bitrate throughout encoding:

var settings = new SpeexEncoderSettings {
    Bitrate = 24.6f,  // Fixed bitrate in kbps
    VBR = false
};

The supported bitrates range from 2.15 kbps to 24.6 kbps:

- 2.15 kbps
- 3.95 kbps
- 5.95 kbps
- 8.00 kbps
- 11.0 kbps
- 15.0 kbps
- 18.2 kbps
- 24.6 kbps
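
Because only these discrete values are listed, an application that accepts an arbitrary bitrate from the user may want to snap it to the closest supported one before configuring the encoder. The following is a minimal sketch of that idea; `SpeexBitrates` and `Nearest` are hypothetical helper names and are not part of the SDK.

```csharp
using System;
using System.Linq;

public static class SpeexBitrates
{
    // Bitrates in kbps, copied from the list above.
    public static readonly float[] SupportedKbps =
        { 2.15f, 3.95f, 5.95f, 8.00f, 11.0f, 15.0f, 18.2f, 24.6f };

    // Returns the listed bitrate closest to the requested value.
    // OrderBy is a stable sort, so on an exact tie the lower bitrate wins.
    public static float Nearest(float requestedKbps) =>
        SupportedKbps.OrderBy(b => Math.Abs(b - requestedKbps)).First();
}
```

A request for 12 kbps would resolve to 11.0 kbps, and anything above the top of the list resolves to 24.6 kbps. The returned value could then be assigned to the Bitrate property shown earlier.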

# Advanced Features

## Voice Activity Detection (VAD)

Detects the presence of speech in the audio:

var settings = new SpeexEncoderSettings {
    VAD = true,    // Enable voice activity detection
    DTX = true     // Usually enabled with VAD for bandwidth efficiency
};

## Discontinuous Transmission (DTX)

Reduces bandwidth usage during silence periods:

var settings = new SpeexEncoderSettings {
    DTX = true     // Enable discontinuous transmission
};

## Encoding Complexity

Controls the trade-off between encoding quality and CPU usage:

var settings = new SpeexEncoderSettings {
    Complexity = 3  // Range: 1-10, default: 3
};
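
The ranges and dependencies mentioned so far (Quality 0-10, Complexity 1-10, ABR needing VBR) can be checked before the settings object is built. The sketch below is one possible way to do that with plain values; `SpeexSettingsCheck` is a hypothetical helper, not an SDK class, and treating an ABR value of 0 as "disabled" is an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class SpeexSettingsCheck
{
    // Returns human-readable problems; an empty list means the values
    // fit the ranges documented on this page.
    // Assumption: abrKbps == 0 means ABR is not in use.
    public static List<string> Validate(float quality, int complexity, bool vbr, float abrKbps)
    {
        var problems = new List<string>();

        if (quality < 0f || quality > 10f)
            problems.Add("Quality must be within 0-10.");

        if (complexity < 1 || complexity > 10)
            problems.Add("Complexity must be within 1-10.");

        if (abrKbps > 0f && !vbr)
            problems.Add("ABR requires VBR to be enabled.");

        return problems;
    }
}
```

Calling `SpeexSettingsCheck.Validate(8.0f, 3, true, 15.0f)` returns an empty list, while passing `vbr: false` with a non-zero ABR value reports the dependency problem.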

# Complete Usage Example

Here's a comprehensive example showing how to configure and use the Speex encoder:

// Check if Speex encoder is available
if (!SpeexEncoderSettings.IsAvailable())
{
    throw new InvalidOperationException("Speex encoder is not available on this system.");
}

// Create encoder settings
var encoderSettings = new SpeexEncoderSettings
{
    // Basic configuration
    Mode = SpeexEncoderMode.UltraWideBand,
    SampleRate = 32000,
    Channels = 1,
    
    // Quality settings
    Quality = 8.0f,
    Complexity = 3,
    
    // Rate control
    VBR = true,
    ABR = 15.0f,
    
    // Voice optimization
    VAD = true,
    DTX = true,
    
    // Frame configuration
    NFrames = 1
};

Add the Speex output to the VideoCaptureCoreX instance:

// Create a Video Capture SDK core instance
var core = new VideoCaptureCoreX();

// Add the Speex output
core.Outputs_Add(encoderSettings, true);

Set the output format for the Video Edit SDK core instance:

// Create a Video Edit SDK core instance
var core = new VideoEditCoreX();

// Set the output format
core.Output_Format = encoderSettings;

Create a Media Blocks Speex encoder block:

// Create a Speex encoder instance
var speexEncoder = new SpeexEncoderBlock(encoderSettings);

# Performance Considerations

When configuring the Speex encoder, consider these performance factors:

- Higher Complexity values improve quality but require more CPU resources
- Combining VBR with VAD and DTX generally gives the lowest bandwidth usage for speech, because silent stretches are encoded at very low rates or skipped
- The NFrames parameter affects latency and processing efficiency: packing more frames into each buffer reduces per-buffer overhead but adds delay
- Ultra Wide Band mode provides the highest quality but requires more bandwidth
- ABR keeps the average bandwidth close to a target while allowing quality to vary with the content
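
To put rough numbers on the latency and bandwidth points above, here is a back-of-the-envelope sketch. It relies on two assumptions that do not come from this page: that Speex frames cover 20 ms of audio (a property of the codec itself), and that NFrames is the number of frames packed into each output buffer. `SpeexEstimates` is a hypothetical helper, and the size estimate ignores container overhead.

```csharp
public static class SpeexEstimates
{
    // Assumption: each Speex frame covers 20 ms of audio.
    public const int FrameDurationMs = 20;

    // Audio duration carried by one output buffer, assuming NFrames
    // is the number of frames packed per buffer.
    public static int BufferDurationMs(int nFrames) => nFrames * FrameDurationMs;

    // Approximate encoded payload size in bytes for an average bitrate
    // (1 kbps = 1000 bits per second); container overhead is not included.
    public static double PayloadBytes(double kbps, double seconds) =>
        kbps * 1000.0 / 8.0 * seconds;
}
```

Under these assumptions, NFrames = 4 means each buffer holds 80 ms of audio, and one minute of speech at an average of 15 kbps comes to about 112,500 bytes of payload.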

This implementation of the Speex encoder is particularly well-suited for VoIP applications, podcast encoding, and other speech-focused audio applications where bandwidth efficiency and speech quality are primary concerns.