System Architecture

SARAS is built on a modular architecture that emphasizes reliability, cross-platform compatibility, and a clear separation of concerns.

Main Controller

The central orchestrator that initializes all subsystems, manages the main event loop, and coordinates between components through the RobotController class.

main.py
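The listen-think-speak loop the orchestrator runs can be sketched as below. The subsystem classes and method names here are illustrative stand-ins, not the project's actual API:

```python
class RobotController:
    """Minimal sketch of the orchestrator; real wiring lives in main.py."""

    def __init__(self, stt, ai, tts):
        self.stt = stt  # speech-to-text subsystem
        self.ai = ai    # AI processor
        self.tts = tts  # text-to-speech subsystem

    def handle_one_utterance(self):
        """Run one pass of the listen -> think -> speak loop."""
        text = self.stt.listen()
        if not text:
            return None  # nothing heard; skip this cycle
        reply = self.ai.process(text)
        self.tts.speak(reply)
        return reply


# Stub subsystems, purely for demonstration:
class EchoSTT:
    def listen(self):
        return "hello"

class EchoAI:
    def process(self, text):
        return f"You said: {text}"

class ListTTS:
    def __init__(self):
        self.spoken = []
    def speak(self, text):
        self.spoken.append(text)
```

In the real system each stub would be replaced by the corresponding module described below.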

Speech-to-Text

Converts voice input to text using a local Vosk model, featuring continuous listening and voice activity detection.

speech_to_text.py
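Vosk performs the actual recognition; a continuous-listening loop typically gates incoming audio frames with a voice activity check so the recognizer only sees frames that contain speech. A minimal energy-based sketch (the threshold value is illustrative, not the module's actual setting):

```python
import math

def frame_rms(samples):
    """Root-mean-square energy of one audio frame of 16-bit PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_speech(samples, threshold=500.0):
    """Simple energy-based voice activity check: frame is 'speech' if its
    RMS energy clears the threshold. Real VADs are more sophisticated."""
    return frame_rms(samples) >= threshold
```

Frames that pass the check would be fed to Vosk's recognizer; silent frames are dropped, keeping the recognizer responsive.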

Text-to-Speech

Converts text responses to high-quality speech using the Piper TTS engine, ensuring a natural and responsive voice.

text_to_speech.py
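Piper is commonly driven as a command-line tool that reads text on stdin and writes a WAV file. A sketch of a thin wrapper, assuming the `piper` binary is on PATH (the model filename is a placeholder, and the runner is injectable so the command can be tested without Piper installed):

```python
import subprocess

def build_piper_command(model_path, wav_path):
    """Assemble the Piper CLI invocation; the text itself goes on stdin."""
    return ["piper", "--model", model_path, "--output_file", wav_path]

def synthesize(text, model_path, wav_path, runner=subprocess.run):
    """Pipe text into Piper. `runner` defaults to subprocess.run but can be
    swapped out (e.g. for a fake in tests or a retry-wrapping runner)."""
    cmd = build_piper_command(model_path, wav_path)
    return runner(cmd, input=text.encode("utf-8"), check=True)
```

The injectable runner also makes it easy to add timeouts or logging without touching the synthesis logic.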

AI Processor

Manages AI interactions, switching between the primary OpenAI API and a local GGUF model for fallback processing.

ai_processor.py
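The primary/fallback switch can be reduced to a small pattern: try the remote API first and, on any failure, route the same prompt to the local model. A sketch with the two backends passed in as callables (names are illustrative):

```python
def process_with_fallback(prompt, primary, fallback):
    """Try the primary backend (e.g. the OpenAI API); on any error,
    fall back to the local GGUF model so the robot stays responsive."""
    try:
        return primary(prompt)
    except Exception:
        return fallback(prompt)
```

In practice the real module would narrow the caught exceptions (network errors, rate limits) and log which backend answered.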

Motor Controller

Abstracts motor control, handling GPIO on Raspberry Pi and simulating movement on Windows for cross-platform development.

motor_controller.py
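The cross-platform abstraction usually comes down to a guarded import: use real GPIO when the RPi.GPIO library is available, otherwise fall back to a simulator that just records commands. A sketch under those assumptions (class names are illustrative):

```python
import platform

class SimulatedMotors:
    """Records movement commands instead of driving hardware; used for
    development on Windows or any machine without GPIO."""
    def __init__(self):
        self.log = []
    def move(self, direction):
        self.log.append(direction)

class GpioMotors:
    """Placeholder for the real GPIO-backed controller (sketch only)."""
    def move(self, direction):
        raise NotImplementedError("real GPIO control lives in motor_controller.py")

def make_motor_controller():
    """Pick a backend: real GPIO on a Pi, simulation everywhere else."""
    if platform.system() == "Linux":
        try:
            import RPi.GPIO  # noqa: F401 -- only present on a Raspberry Pi
            return GpioMotors()
        except ImportError:
            pass
    return SimulatedMotors()
```

Because both backends expose the same `move` interface, the rest of the system never needs to know which one it is talking to.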

Sensor Manager

Reads and processes data from three ultrasonic sensors for real-time obstacle detection and environmental awareness.

sensors.py
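An ultrasonic sensor reports the round-trip time of an echo; distance is that time multiplied by the speed of sound, halved. A sketch of the conversion and a simple three-sensor obstacle check (the sensor names and the 30 cm limit are illustrative):

```python
SPEED_OF_SOUND_CM_S = 34300  # ~343 m/s at room temperature

def echo_to_distance_cm(echo_seconds):
    """Convert a round-trip echo time to a one-way distance in cm."""
    return echo_seconds * SPEED_OF_SOUND_CM_S / 2

def nearest_obstacle(readings_cm, limit_cm=30.0):
    """Given per-sensor distances, name the closest sensor that sees an
    obstacle inside the limit, or None if the path is clear."""
    close = {name: d for name, d in readings_cm.items() if d <= limit_cm}
    return min(close, key=close.get) if close else None
```

With three sensors (e.g. left, center, right), the result tells the motor controller which way to steer around the obstacle.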

Data Flow

The system follows a clear, sequential data flow for handling user interaction and responding to environmental stimuli.

Voice Command Pipeline

User Voice -> Speech-to-Text (Vosk) -> Command Processor -> AI Processor -> Response Generation
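The pipeline above is a straight chain: each stage's output becomes the next stage's input. A sketch with stand-in stages (the lambdas mimic Vosk, the command processor, and the AI processor; they are not the real implementations):

```python
def run_pipeline(stages, value):
    """Feed a value through each pipeline stage in order."""
    for stage in stages:
        value = stage(value)
    return value

# Illustrative stand-ins for the real stages:
stages = [
    lambda audio: "turn left",                      # speech-to-text (Vosk)
    lambda text: {"command": text},                 # command processor
    lambda cmd: f"Okay, I will {cmd['command']}.",  # AI processor + response
]
```

Keeping the stages as plain callables makes it easy to test each one in isolation or swap an implementation without touching the others.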

Response Pipeline

AI Response -> Text-to-Speech (Piper) -> Audio Output & Face Display Update
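Unlike the command pipeline, the response pipeline fans out: the same reply goes to both the speaker and the face display. A minimal sketch (the `speak`/`show` interface is illustrative):

```python
class Recorder:
    """Stand-in sink that records what it receives, for demonstration."""
    def __init__(self):
        self.items = []
    def speak(self, text):
        self.items.append(("audio", text))
    def show(self, text):
        self.items.append(("face", text))

def deliver_response(reply, tts, face):
    """Fan the synthesized reply out to the speaker and the face display."""
    tts.speak(reply)
    face.show(reply)
```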