Overview

The primary objective of the SeeMos project is to provide a research platform dedicated to active vision, and in particular to the early vision process. The SeeMos platform is an embedded system built from modular hardware and software. Its main feature is an architecture based on an FPGA, a CMOS imager, inertial devices, high-speed communication, and a co-designed DSP-based board.

 


Architectural features

The main purpose of our architecture is to allow the implementation of early vision processes like those of the human or primate visual system. In these systems it is well known that the first neural layers (in the retina) pre-filter the visual data flow in order to select only the conspicuous information.
This pre-filtering step requires strong parallelization, on the one hand to meet real-time constraints, and on the other hand because of the intrinsic characteristics of the algorithms. For example, classical algorithms used in an attention task to build an efficient saliency map include motion detection, Gabor filters, and color segmentation. However, particular visual tasks may require dedicated image processing, and only an FPGA approach offers such flexibility. For this reason, a Stratix EP1S60 from Altera was chosen. The need for strong parallelization also led us to connect five 2 MB synchronous SRAM blocks. In our approach, high-level processing is performed on a host computer rather than on the embedded system.
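As an illustration of why such pre-filtering parallelizes well, here is a minimal software sketch of one of the algorithms named above, motion detection, using simple frame differencing. This is not the SeeMos implementation: the function name `motion_map`, the threshold value, and the use of frame differencing itself are assumptions made for the example. Each output pixel depends only on the same pixel in two frames, so every pixel could in principle be computed independently.

```python
def motion_map(prev_frame, curr_frame, threshold=20):
    """Return a binary map: 1 where the gray level changed by more than threshold.

    Frames are lists of rows of integer gray levels (hypothetical format).
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Two tiny 2x3 frames; two pixels change noticeably between them.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 90, 10], [10, 10, 60]]
moving = motion_map(prev_frame, curr_frame)
```

On an FPGA the same per-pixel operation would typically be expressed as a pipelined hardware block rather than nested loops; the sketch only models the data dependency.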

 

Architecture Motivations: The active vision approach

An artificial active vision system uses observer-controlled input sensors. Its main goal is to actively extract the information required to solve a given task. In this spirit, an active vision system can be divided into two main parts: the early vision step and the cognitive processes. In the design of our sensor, we consider only the early vision step, which can be divided into two successive tasks:

  • Attention: This is the initializing step of the process. Whole images are grabbed while waiting for the saliency maps to be built. These maps are built in parallel and encode conspicuity within the visual field along particular dimensions (e.g. color, orientation, or motion). The result of this step is a set of ROIs (Regions Of Interest).
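The attention step can be sketched in software as follows. This is only a hedged illustration: the weighted-sum combination, the fixed threshold, and the names `combine_maps` and `bounding_roi` are assumptions, and for brevity the sketch returns a single bounding box rather than a set of ROIs.

```python
def combine_maps(maps, weights=None):
    """Weighted per-pixel sum of equally sized conspicuity maps."""
    if weights is None:
        weights = [1.0 / len(maps)] * len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, maps)) for c in range(cols)]
        for r in range(rows)
    ]


def bounding_roi(saliency, threshold):
    """Return (top, left, bottom, right) enclosing all pixels >= threshold, or None."""
    hits = [
        (r, c)
        for r, row in enumerate(saliency)
        for c, value in enumerate(row)
        if value >= threshold
    ]
    if not hits:
        return None
    row_ids = [r for r, _ in hits]
    col_ids = [c for _, c in hits]
    return (min(row_ids), min(col_ids), max(row_ids), max(col_ids))


# Two hypothetical conspicuity maps (color and motion) on a 3x4 field.
color = [[0, 0, 0, 0], [0, 8, 8, 0], [0, 0, 0, 0]]
motion = [[0, 0, 0, 0], [0, 4, 8, 0], [0, 0, 4, 0]]
saliency = combine_maps([color, motion])
roi = bounding_roi(saliency, threshold=4.0)
```

Each conspicuity map is computed independently of the others, which is what allows them to be built in parallel before being merged.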


Tracking of a grayscale template (32x32) at 2000 frames/s

Implementation approach

Implementing such an approach requires managing both the sequential and the concurrent execution of the routines described above. Indeed, every task-oriented execution (attention, focusing) is driven by the results supplied, and these layers may have to share areas of interest. Moreover, the information bottleneck located at the imager level should be continuously optimized to ensure efficient performance. In our hardware architecture, these functions are carried out by what we term a "Sequencer" (labeled M0 in the figure below), which runs on the NIOS soft-core processor. This solution has two main advantages:

  • We benefit from software flexibility to define the routines' interactions, and
  • The soft core processor allows an efficient architectural matching with the other parts of the supervision unit.

An internal RAM (labeled R0 in the figure below) stores the instruction sequences that define the sequencer's behavior for the task under consideration. The host computer that uses our embedded system communicates with it through a standard communication bus (currently USB 2.0) and sends requests that tell the sequencer which behavior to adopt.

More precisely, the corresponding commands (and possibly a set of parameters) are stored in a dedicated stack; the sequencer M0 then selects pre-established interactions between the modules (labeled P4 in the figure below), which together constitute a dedicated processing chain.
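The behavior described above can be modeled with a short software sketch. Everything named here (the `Sequencer` class, the `CHAINS` table, and the two toy modules) is hypothetical and only illustrates the idea of a command stack selecting a pre-established processing chain; the actual sequencer runs on the soft-core processor and drives hardware modules.

```python
# Hypothetical processing modules: each transforms a list of pixel values.
def subtract_offset(pixels, offset=0):
    return [max(p - offset, 0) for p in pixels]


def binarize(pixels, level=128):
    return [1 if p >= level else 0 for p in pixels]


# Pre-established chains, keyed by task name (names are illustrative only).
CHAINS = {
    "attention": [subtract_offset, binarize],
    "passthrough": [],
}


class Sequencer:
    """Toy model: pops a host request and runs the matching module chain."""

    def __init__(self, chains):
        self.chains = chains
        self.stack = []  # pending (command, parameters) requests from the host

    def push(self, command, params=None):
        self.stack.append((command, params or {}))

    def run_next(self, pixels):
        command, params = self.stack.pop()
        for module in self.chains[command]:
            # Per-module parameters are looked up by the module's function name.
            pixels = module(pixels, **params.get(module.__name__, {}))
        return pixels


seq = Sequencer(CHAINS)
seq.push("attention", {"subtract_offset": {"offset": 10}, "binarize": {"level": 100}})
result = seq.run_next([5, 120, 105, 200])
```

Keeping the chains in a table rather than in code mirrors the flexibility argument made earlier: changing the task changes which entry is selected, not the modules themselves.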

 


Some of these modules, dedicated to environmental adaptation (Processing No. 1 to i), modify the pixel flow that is then used by the attention and focusing modules (Processing No. j to j+1). The different data flows (corrected windows of interest and inertial measurements) can be used by these modules to perform computation. The results provided by these processing modules are collected in a buffer, from which the sequencer selects the results to send to the host computer. The sequencer also uses part of these results to perform visual feedback on the sensing devices (labeled S0 in the figure below) through dedicated control modules (labeled P0, P1 and P3 in the figure below). The figure also shows module P2, which works with the external RAM R1. This module performs the Fixed Pattern Noise correction, which is absolutely essential for the image sensor technology we use. Lastly, the dedicated communication module P5 is a multiplexer that synchronizes the corrected pixel flow and the sequencer result flow for transmission to the host computer.
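To make the role of the Fixed Pattern Noise correction concrete, here is a minimal sketch assuming a common offset-only scheme, in which a per-pixel dark reference frame held in memory is subtracted from each raw frame. The text does not specify the correction actually used by P2, so the scheme, the function name `correct_fpn`, and the 8-bit range are all assumptions.

```python
def correct_fpn(raw, dark_reference, max_value=255):
    """Subtract the per-pixel dark offset and clamp the result to [0, max_value].

    Both arguments are lists of rows of integer gray levels (hypothetical format).
    """
    return [
        [min(max(p - d, 0), max_value) for p, d in zip(raw_row, dark_row)]
        for raw_row, dark_row in zip(raw, dark_reference)
    ]


# A uniform scene of level 100 seen through pixels with different offsets.
dark = [[3, 0], [7, 2]]
raw = [[103, 100], [107, 1]]
corrected = correct_fpn(raw, dark)
```

Because the reference holds one value per pixel, a correction of this kind needs a memory as large as the image, which is consistent with a module that relies on an external RAM.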