STreaM32

Aim

The project aims to solve the challenge of deploying real-time video compression on highly resource-constrained embedded systems.

The core problems include:

Deploying machine learning models on microcontrollers with limited memory and compute capacity
Designing a compact and efficient video encoder using lightweight neural architectures (e.g., autoencoders/CNNs)
Achieving real-time frame acquisition, compression, and transmission under strict latency constraints
Enabling reliable wireless streaming over low-bandwidth communication channels
Developing a decoder capable of reconstructing compressed frames in real time on a host machine

Methodology

The system is implemented as a pipeline consisting of model design, embedded deployment, and system integration.

Model Design

A lightweight autoencoder-based architecture is used for video compression. The encoder compresses each input frame into a compact latent representation. The model is trained offline using standard deep learning frameworks. After training, the model is quantized (e.g., to INT8) using TensorFlow Lite to reduce memory and compute requirements.
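
The effect of INT8 quantization on a tensor can be illustrated with the affine mapping used by TensorFlow Lite's integer scheme, real ≈ scale × (q − zero_point). The following is a minimal pure-Python sketch of that mapping, not the project's conversion script; the function names and the sample latent values are hypothetical.

```python
def quant_params(lo, hi, qmin=-128, qmax=127):
    """Derive (scale, zero_point) mapping the float range [lo, hi] onto INT8."""
    # The range is widened to include 0.0 so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero tensor
    zero_point = int(round(qmin - lo / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to clamped INT8 codes."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Recover approximate float values from INT8 codes."""
    return [scale * (q - zero_point) for q in codes]


# Hypothetical latent values, used only for illustration.
latent = [-0.8, -0.1, 0.0, 0.35, 1.2]
scale, zp = quant_params(min(latent), max(latent))
codes = quantize(latent, scale, zp)
restored = dequantize(codes, scale, zp)
max_err = max(abs(a - b) for a, b in zip(latent, restored))
```

Each value is stored in one byte instead of four, and for in-range values the round-trip error is bounded by half the quantization step (scale / 2).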

Embedded Deployment

The quantized encoder model is deployed on an STM32N6 board. Input frames are captured via a camera module interfaced with the MCU. Each frame is processed in real time through the encoder.

Optimization techniques include:

Fixed-point arithmetic
Memory reuse
Layer-wise optimization for inference speed
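
To illustrate why memory reuse matters on an MCU, the sketch below compares the activation memory of a purely sequential network when every layer output keeps its own buffer against a scheme that releases a buffer as soon as the next layer has consumed it. The layer shapes are hypothetical and do not describe the project's architecture; weights and in-place operations are ignored.

```python
def activation_bytes(shape, bytes_per_element=1):
    """Size in bytes of one activation tensor of shape (H, W, C); INT8 by default."""
    h, w, c = shape
    return h * w * c * bytes_per_element


def naive_peak(shapes):
    """Every tensor keeps a dedicated buffer for the whole inference."""
    return sum(activation_bytes(s) for s in shapes)


def reuse_peak(shapes):
    """Only a layer's input and output are alive at the same time,
    so the peak is the largest input + output pair."""
    sizes = [activation_bytes(s) for s in shapes]
    return max(a + b for a, b in zip(sizes, sizes[1:]))


# Hypothetical encoder: input frame followed by three layer outputs.
shapes = [(64, 64, 1), (32, 32, 8), (16, 16, 16), (8, 8, 8)]
print(naive_peak(shapes), reuse_peak(shapes))
```

With these example shapes the dedicated-buffer layout needs 16896 bytes, while reuse brings the peak down to 12288 bytes.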

Communication Pipeline

Compressed data is transmitted to the host over a serial link. Data framing and synchronization mechanisms are implemented to ensure reliable streaming. Bandwidth constraints are handled through aggressive compression.
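
One common way to frame and synchronize a byte stream is a sync marker, a small header, and a checksum. The sketch below shows that approach under stated assumptions: the frame layout (2-byte marker, 16-bit sequence number, 16-bit length, CRC-32 trailer), the `MAX_PAYLOAD` limit, and all names are hypothetical and are not taken from the project's firmware.

```python
import struct
import zlib

SYNC = b"\xAA\x55"              # assumed sync marker
HEADER = struct.Struct("<HH")   # sequence number, payload length (little-endian)
CRC = struct.Struct("<I")       # CRC-32 over header + payload
MAX_PAYLOAD = 1024              # assumed upper bound; rejects corrupted length fields


def pack_frame(seq, payload):
    """Wrap one compressed latent payload in a frame."""
    header = HEADER.pack(seq & 0xFFFF, len(payload))
    return SYNC + header + payload + CRC.pack(zlib.crc32(header + payload))


class FrameParser:
    """Incremental parser that resynchronizes after noise or corruption."""

    def __init__(self):
        self.buf = bytearray()

    def feed(self, data):
        """Append received bytes; return (seq, payload) for every valid frame."""
        self.buf += data
        frames = []
        while True:
            start = self.buf.find(SYNC)
            if start < 0:
                del self.buf[:-1]  # keep the last byte: it may begin a sync marker
                return frames
            del self.buf[:start]
            if len(self.buf) < len(SYNC) + HEADER.size:
                return frames      # header not complete yet
            seq, length = HEADER.unpack_from(self.buf, len(SYNC))
            if length > MAX_PAYLOAD:
                del self.buf[:1]   # implausible length: treat as a false sync
                continue
            end = len(SYNC) + HEADER.size + length + CRC.size
            if len(self.buf) < end:
                return frames      # wait for the rest of the frame
            body = bytes(self.buf[len(SYNC):end - CRC.size])
            (crc,) = CRC.unpack_from(self.buf, end - CRC.size)
            if zlib.crc32(body) == crc:
                frames.append((seq, body[HEADER.size:]))
                del self.buf[:end]
            else:
                del self.buf[:1]   # corrupted frame: skip one byte and rescan
```

On the host, bytes read from the serial port would be passed to `feed()` in whatever chunk sizes they arrive; the sequence number lets the receiver detect dropped frames.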

Decoder Implementation

A Python-based decoder runs on a host system. It receives compressed latent representations. The decoder reconstructs frames using the corresponding neural network decoder. Real-time display is achieved using efficient buffering and rendering.
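
The buffering between reception and display can be sketched as a small bounded queue that drops the oldest frames when the display falls behind, which keeps latency bounded. This is an illustrative sketch only: it assumes a receiver thread that pushes decoded frames and a display loop that pops them, and the class name and capacity are hypothetical.

```python
import threading
from collections import deque


class LatestFrameBuffer:
    """Bounded FIFO: when full, the oldest frame is discarded on push."""

    def __init__(self, capacity=2):
        self._frames = deque(maxlen=capacity)
        self._lock = threading.Lock()
        self.dropped = 0  # number of frames discarded because the display lagged

    def push(self, frame):
        """Called by the receiver after a frame has been decoded."""
        with self._lock:
            if len(self._frames) == self._frames.maxlen:
                self.dropped += 1  # deque drops the oldest entry on append
            self._frames.append(frame)

    def pop(self):
        """Called by the display loop; returns None when no frame is waiting."""
        with self._lock:
            return self._frames.popleft() if self._frames else None


# Usage: the producer outpaces the consumer, so older frames are dropped.
buf = LatestFrameBuffer(capacity=2)
for frame_id in range(5):
    buf.push(frame_id)
shown = [buf.pop(), buf.pop(), buf.pop()]
```

A small capacity favors low latency over completeness, which suits live viewing where a stale frame is less useful than a skipped one.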

Model Architecture

Results

Conclusion

Viability on Edge Devices: Deep-learning-based compression is a strong alternative for resource-constrained hardware.

Scalable Blueprint: Established a foundation for deploying high-quality, low-latency video streaming on the STM32.

Future Work

Hybrid Architecture: Combine Method 1's residual logic with Method 2's codebook to encode only the changes in discrete indices for maximum compression.
Perceptual Optimization: Replace the standard MSE loss with an SSIM-based or adversarial loss to improve visual sharpness and reduce blur at low bitrates.
Wireless Transmission: Transfer data wirelessly using an ESP Wi-Fi module.

Mentees:

Varun Uthej Reddy

Hardhik Thiriveedi

Joseph

Mentors:

Asrith Singampalli

Guhan Balaji

Dammu Chaitanya

Virtual Expo 2026

Abstract