Project
# | Title | Team Members | TA | Documents | Sponsor |
---|---|---|---|---|---|
45 | AI-based Meeting Transcription Device | Chang Liu, Gao Gao, Ziyang Huang | Jiankun Yang | proposal1.pdf | |

## Team Members:
- **Ziyang Huang** (ziyangh3)
- **Gao Gao** (xgao54)
- **Chang Liu** (changl21)

## Problem
During the pandemic, we found Zoom’s live transcription very useful, as it helped the audience catch up quickly with the lecturer. In many professional and academic settings, real-time transcription of spoken communication is essential for note-taking. Additionally, individuals with hearing impairments face challenges in following spoken conversations, especially in environments where captions are unavailable. Existing solutions, such as Zoom’s live transcription or mobile speech-to-text apps, require an internet connection and are often tied to specific platforms.

To address this, we propose a standalone, portable transcription device that can capture, transcribe, and display spoken text in real time. The device will be helpful since it provides a distraction-free way to record and review conversations without relying on a smartphone or laptop.

## Solution
Our **Smart Meeting Transcription Device** will be a portable, battery-powered device that records with a microphone, converts speech into real-time text, and displays it on an LCD screen. The system consists of the following key components:

1. **A microphone module** to capture audio input.
2. **A speech processing unit** (Jetson Nano/Raspberry Pi/Arduino) running the Vosk speech-to-text model to transcribe the captured speech.
3. **An STM32 microcontroller**, which serves as the central controller for managing user interactions, processing text display, and storing transcriptions.
4. **An LCD screen** to display transcriptions in real-time.
5. **External memory** (SD card or NOR flash) for saving transcribed conversations.
6. **A power system** (battery with efficient power management) to enable portability.

---

## Solution Components

### **Subsystem 1: Speech Processing Unit**
- **Function:** Captures audio and converts speech into text using an embedded speech-to-text model.
- **Microphone Module:** Adafruit Electret Microphone Amplifier (MAX9814)
- **Processing Board:** Jetson Nano / Raspberry Pi 4B
- **Speech Recognition Model:** Vosk Speech-to-Text Model
- **Memory Expansion (if required):** SD card (SanDisk Ultra 32GB)

### **Subsystem 2: STM32 Central Controller**
- **Function:** Manages the user interface, processes the transcribed text, and sends data to the LCD screen.
- **Microcontroller:** STM32F4 Series MCU
- **Interface Components:** Buttons for navigation and text saving
- **Memory Module:** SPI-based NOR Flash (W25Q128JV)

### **Subsystem 3: Display Module**
- **Function:** Displays real-time transcriptions and allows users to scroll through previous text.
- **LCD Screen:** 2.8-inch TFT Display (ILI9341)
- **Controller Interface:** SPI Communication with STM32

### **Subsystem 4: Power Management System**
- **Function:** Provides reliable and portable power for all components.
- **Battery:** 3.7V Li-ion Battery (Adafruit 2500mAh)
- **Power Regulation:** TP4056 Li-ion Charger + 5V Boost Converter
- **Power Optimization:** Sleep mode for STM32 to enhance battery life

---

## **Criterion for Success**
1. The device must accurately transcribe speech to text with reasonable latency.
2. The LCD screen must display real-time transcriptions clearly.
3. The STM32 must successfully manage system operations and communicate with peripheral components.
4. The system should support local storage for saving transcriptions.
5. The battery life should last at least **2-3 hours** under normal usage conditions.

|
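
The battery-life criterion can be sanity-checked with simple energy arithmetic. The sketch below estimates runtime from the 3.7 V, 2500 mAh cell listed in Subsystem 4. The boost-converter efficiency and the total system load values are illustrative assumptions, not measurements from this project, and the function names are hypothetical; measured values should replace them once hardware is available.

```python
# Rough battery-runtime estimate for the 3.7 V, 2500 mAh Li-ion cell.
# ASSUMPTIONS (not from the proposal): the boost-converter efficiency and
# the example load powers below are placeholders, not measured values.

BATTERY_CAPACITY_MAH = 2500.0  # from the proposal (Subsystem 4)
BATTERY_NOMINAL_V = 3.7        # from the proposal (Subsystem 4)
BOOST_EFFICIENCY = 0.85        # assumed efficiency of the 5 V boost stage


def battery_energy_wh(capacity_mah: float, nominal_v: float) -> float:
    """Nominal stored energy in watt-hours: Ah * V."""
    return capacity_mah / 1000.0 * nominal_v


def usable_energy_wh(efficiency: float = BOOST_EFFICIENCY) -> float:
    """Energy available at the 5 V rail after conversion losses."""
    return battery_energy_wh(BATTERY_CAPACITY_MAH, BATTERY_NOMINAL_V) * efficiency


def runtime_hours(load_w: float, efficiency: float = BOOST_EFFICIENCY) -> float:
    """Estimated runtime for a constant average load in watts."""
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    return usable_energy_wh(efficiency) / load_w


def max_load_w(target_hours: float, efficiency: float = BOOST_EFFICIENCY) -> float:
    """Largest average load that still reaches the target runtime."""
    if target_hours <= 0:
        raise ValueError("target_hours must be positive")
    return usable_energy_wh(efficiency) / target_hours


if __name__ == "__main__":
    # Hypothetical average total system loads in watts.
    for load in (2.0, 3.0, 4.0):
        print(f"{load:.1f} W load: {runtime_hours(load):.2f} h")
    for hours in (2.0, 3.0):
        print(f"{hours:.0f} h target: at most {max_load_w(hours):.2f} W")
```

Under these assumed numbers the cell stores 9.25 Wh nominally, so reaching 2 hours requires the average total draw to stay below roughly 3.9 W, and reaching 3 hours requires roughly 2.6 W. Comparing those budgets against the measured draw of the chosen processing board would show whether a single cell is sufficient.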