System Architecture
SARAS is built on a modular architecture that emphasizes reliability, cross-platform compatibility, and a clear separation of concerns.
Main Controller
The central orchestrator (main.py), implemented as the RobotController class; it initializes all subsystems, manages the main event loop, and coordinates the other components.
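A minimal sketch of what such an orchestrator can look like. The listen/think/speak loop shape and the constructor signature here are illustrative assumptions, not the actual contents of main.py:

```python
class RobotController:
    """Illustrative orchestrator: wires up subsystems and runs the event loop."""

    def __init__(self, stt, ai, tts, motors, sensors):
        # Subsystems are passed in already constructed, which keeps the
        # controller decoupled from any one backend and easy to test.
        self.stt, self.ai, self.tts = stt, ai, tts
        self.motors, self.sensors = motors, sensors
        self.running = False

    def run_once(self):
        """One iteration of the event loop: listen, think, speak."""
        text = self.stt()        # blocking listen for one utterance
        if not text:
            return None
        reply = self.ai(text)    # generate a response
        self.tts(reply)          # speak it
        return reply

    def run(self):
        self.running = True
        while self.running:
            self.run_once()
```

Injecting the subsystems also makes the loop trivially testable with stub functions in place of Vosk, Piper, and the AI backend.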
Speech-to-Text
Converts voice input to text using a local Vosk model, featuring continuous listening and voice activity detection.
speech_to_text.py
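Vosk handles the transcription itself; the voice-activity-detection half can be as simple as an energy threshold over incoming audio frames. A hedged sketch of that idea (the 16-bit PCM frame format and the threshold value are assumptions, not values taken from speech_to_text.py):

```python
import math
import struct

def frame_rms(frame: bytes) -> float:
    """RMS energy of a frame of 16-bit little-endian PCM samples."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)

def is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Crude voice activity detection: flag a frame as speech when its
    energy exceeds a fixed threshold. The threshold is illustrative and
    would need tuning to the microphone and room."""
    return frame_rms(frame) > threshold
```

In a continuous-listening loop, frames that pass `is_speech` would be buffered and fed to the Vosk recognizer; silent frames are dropped cheaply without waking the model.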
Text-to-Speech
Converts text responses to high-quality speech using the Piper TTS engine, ensuring a natural and responsive voice.
text_to_speech.py
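To keep the voice responsive, synthesis is best run off the main loop. A sketch of a background speaker thread, with the actual Piper call abstracted behind an injected `synthesize` callable (the class and method names are illustrative, not the real text_to_speech.py API):

```python
import queue
import threading

class Speaker:
    """Runs synthesis on a background thread so the main event loop
    never blocks on audio output."""

    def __init__(self, synthesize):
        # `synthesize` stands in for the Piper invocation in this sketch.
        self._synthesize = synthesize
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def say(self, text: str) -> None:
        """Queue text for speaking; returns immediately."""
        self._queue.put(text)

    def flush(self) -> None:
        """Block until everything queued so far has been spoken."""
        self._queue.join()

    def _run(self):
        while True:
            text = self._queue.get()
            if text is None:  # shutdown sentinel
                break
            self._synthesize(text)
            self._queue.task_done()

    def close(self):
        self._queue.put(None)
        self._worker.join()
```

The queue preserves utterance order while letting `say` return instantly, so listening can resume while the robot is still speaking.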
AI Processor
Manages AI interactions, using the primary OpenAI API and falling back to a local GGUF model when the API is unavailable.
ai_processor.py
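The primary/fallback switch reduces to a simple pattern: try the remote API, and on failure retry the same prompt against the local model. A simplified sketch (the real ai_processor.py may also track backend health, time out requests, and so on):

```python
def with_fallback(primary, fallback):
    """Return a callable that tries `primary` first and, on any
    exception, transparently retries the same prompt on `fallback`."""
    def ask(prompt: str) -> str:
        try:
            return primary(prompt)      # e.g. the OpenAI API call
        except Exception:
            return fallback(prompt)     # e.g. the local GGUF model
    return ask
```

Wrapping the two backends behind one callable means the rest of the system never needs to know which model actually answered.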
Motor Controller
Abstracts motor control, handling GPIO on Raspberry Pi and simulating movement on Windows for cross-platform development.
motor_controller.py
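The cross-platform abstraction can be sketched as a backend selected at construction time: real GPIO on the Pi, a logging simulator elsewhere. Everything below (class names, the `drive` interface) is illustrative, not the actual motor_controller.py:

```python
import platform

class SimulatedMotors:
    """Windows/dev backend: logs commands instead of toggling pins."""
    def __init__(self):
        self.log = []

    def drive(self, left: float, right: float) -> str:
        cmd = f"drive left={left} right={right}"
        self.log.append(cmd)
        return cmd

class MotorController:
    """Picks a backend once; callers never branch on the platform."""

    def __init__(self, backend=None):
        if backend is not None:
            self.backend = backend          # explicit override, e.g. for tests
        elif platform.system() == "Linux":
            self.backend = self._gpio_backend()
        else:
            self.backend = SimulatedMotors()  # e.g. a Windows dev box

    def _gpio_backend(self):
        # On the Pi this would import RPi.GPIO and drive real pins;
        # stubbed with the simulator here so the sketch runs anywhere.
        return SimulatedMotors()

    def forward(self, speed: float) -> str:
        return self.backend.drive(speed, speed)
```

Keeping the platform check inside the constructor means the same high-level code (`forward`, turns, stops) runs unchanged on both platforms.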
Sensor Manager
Reads and processes data from three ultrasonic sensors for real-time obstacle detection and environmental awareness.
sensors.py
Data Flow
The system follows a clear, sequential data flow for handling user interaction and responding to environmental stimuli.
Voice Command Pipeline
User Voice -> Speech-to-Text (Vosk) -> Command Processor -> AI Processor -> Response Generation
Response Pipeline
AI Response -> Text-to-Speech (Piper) -> Audio Output & Face Display Update
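Both pipelines above are straight-line compositions, so they can be sketched as a chain of single-purpose stages. The stage functions here are toy stand-ins for the real components (Vosk, the command processor, the AI processor):

```python
def pipeline(*stages):
    """Compose stages left to right: each stage's output feeds the next."""
    def run(value):
        for stage in stages:
            value = stage(value)
        return value
    return run

# Illustrative stand-ins for the voice command pipeline stages.
transcribe = str.strip                    # speech-to-text stand-in
parse_command = str.lower                 # command processor stand-in
def respond(cmd: str) -> str:             # AI processor stand-in
    return f"ok: {cmd}"

handle_voice = pipeline(transcribe, parse_command, respond)
```

The response pipeline fans out at the end (audio plus face display), so its last stage would dispatch to both outputs rather than return a single value.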