Introduction
With the rapid development of artificial intelligence technologies, voice recognition has greatly improved over the years. Many applications have integrated this new functionality to enhance user experience. In this project, you will build a Raspberry Pi-based smartphone that functions like a typical phone but includes additional voice control abilities.
You will call your device a “Piphone,” as it uses a Raspberry Pi as the core computing component. The key functions you aim to achieve through voice commands include making phone calls and playing music. To realize the concept of a portable device, you’ll utilize off-the-shelf electronics and simple circuit designs to construct a compact body that minimizes wiring tangles. Overall, your DIY Piphone will provide a fun demonstration of integrating voice interactions into common smartphone tasks.
Hardware Design
Circuit Design
The first step is to design the necessary circuitry. A critical component is the FONA GSM module, which provides the cellular network connectivity needed for calls. You’ll also use a 1200mAh lithium polymer battery for power. Since the Pi requires 5V but the battery, which also powers the FONA module, supplies only 3.7V, you’ll add a DC boost converter to step up the voltage.
To minimize size, all electronics will be assembled onto a compact PCB. Connectivity will be established through appropriate pin breakouts and headers. You’ll integrate the antenna, SIM slot, audio jack, and battery connector as well. Thanks to onboard battery management circuitry, only a single battery will be needed.
Hardware Implementation
A DC-DC boost converter is an important inclusion to step up the 3.7V battery voltage to the 5V required by the Raspberry Pi, allowing you to utilize a single portable battery. Header pins will facilitate quick prototyping and modifications during testing and debugging. SMD soldering skills will be needed for the final miniaturized board assembly.
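As a rough check on the single-battery design, the sketch below estimates how much current the boost converter draws from the 3.7V cell and how long the 1200mAh battery might last. The 0.5A load on the 5V rail and the 90% converter efficiency are hypothetical placeholders, not measurements from this build; substitute your own figures.

```python
def boost_input_current(v_out, i_out, v_in, efficiency):
    """Current drawn from the battery by an ideal-ish boost converter.

    Output power divided by efficiency gives input power; dividing
    by the battery voltage gives the input current in amps.
    """
    return (v_out * i_out) / (efficiency * v_in)


def runtime_hours(capacity_mah, input_current_a):
    """Very rough runtime estimate, ignoring voltage sag and cutoff."""
    return (capacity_mah / 1000.0) / input_current_a


if __name__ == "__main__":
    # Hypothetical load: 0.5 A at 5 V, converter assumed 90% efficient.
    current = boost_input_current(5.0, 0.5, 3.7, 0.9)
    print("Battery current: %.2f A" % current)
    print("Estimated runtime: %.1f h" % runtime_hours(1200, current))
```

Because the battery voltage is lower than the output voltage, the battery current is always higher than the load current, which is why even a modest 5V load drains a small cell quickly.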
Status indicator LEDs will help you gauge battery and connectivity levels and troubleshoot issues without console access. Additionally, keeping the antenna clear of nearby metal structures will aid performance, as they can degrade wireless reception.
Circuit Wiring
You’ll create wiring diagrams to properly connect all the components. The FONA module will interface with the Pi via UART serial. An external antenna will be required along with a functioning SIM card inserted at the back. Debugging tools like a USB console cable will also be incorporated for initial testing.
Key aspects of the wiring are:
- The Vio pin, which sets the logic-level reference for the module’s serial lines. Tie it to the Pi’s 3.3V rail, since the Pi’s GPIO pins use 3.3V logic.
- The key pin that keeps the module permanently powered on.
- A reset pin to enable software module resets.
- Proper orientation of battery and charger connections.
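Once the UART lines are wired, the link can be exercised from Python with pyserial, the library the appendix code also imports. This is a minimal sketch: send_at and open_fona are hypothetical helper names, and the /dev/ttyS0 device and 9600 baud rate are taken from the appendix and may differ on your Pi.

```python
def open_fona(device="/dev/ttyS0", baud=9600):
    """Open the Pi's UART. Imported lazily so the rest of the file
    can be tested on a machine without pyserial installed."""
    import serial  # pyserial
    return serial.Serial(device, baud, timeout=0.5)


def send_at(port, command, max_lines=10):
    """Send one AT command and collect reply lines until OK/ERROR.

    `port` is any object with write() and readline(), such as a
    pyserial Serial instance.
    """
    port.write((command + "\r").encode("ascii"))
    lines = []
    for _ in range(max_lines):
        raw = port.readline()
        if not raw:  # read timed out with no data
            break
        text = raw.decode("ascii", errors="replace").strip()
        if text:
            lines.append(text)
        if text in ("OK", "ERROR"):
            break
    return lines


if __name__ == "__main__":
    fona = open_fona()
    print(send_at(fona, "AT"))  # a healthy module normally answers OK
```

Passing the port in as an argument keeps the helper independent of real hardware, so it can be checked with a stand-in object before the board is assembled.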
Hardware Testing
You will conduct various tests to verify hardware functionality, including:
- Battery voltage readings
- SIM card identification
- Network signal strength checks
- Battery charging status monitoring
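Several of these checks map onto standard GSM AT commands: AT+CBC reports battery state and AT+CSQ reports signal strength. The sketch below only parses the reply lines; parse_cbc and parse_csq are hypothetical helper names, and the sample strings in the comments are illustrative rather than captured from this build.

```python
def parse_cbc(line):
    """Parse a battery reply such as '+CBC: 0,82,4100'.

    Fields are charge state, charge percentage, and voltage in mV.
    Returns None if the line is not a +CBC reply.
    """
    prefix = "+CBC:"
    line = line.strip()
    if not line.startswith(prefix):
        return None
    state, percent, millivolts = [int(f) for f in line[len(prefix):].split(",")]
    return {
        "charging": state == 1,
        "percent": percent,
        "volts": millivolts / 1000.0,
    }


def parse_csq(line):
    """Parse a signal reply such as '+CSQ: 18,0' into dBm.

    The first field is an index from 0 to 31 (99 means unknown);
    each step is 2 dB above a floor of -113 dBm.
    """
    prefix = "+CSQ:"
    line = line.strip()
    if not line.startswith(prefix):
        return None
    rssi = int(line[len(prefix):].split(",")[0])
    if rssi == 99:
        return None
    return -113 + 2 * rssi
```

SIM identification can be checked in the same spirit by sending AT+CCID and confirming that the module returns the card's ID rather than an error.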
Assembly
All components will be assembled onto a prototyping board with headers. If desired, you can 3D print or laser cut additional supporting brackets. The compact multi-layer design will allow your DIY phone prototype to function portably.
Voice Recognition
Online Speech API
You will first explore utilizing Google’s speech recognition API for voice control. Its cloud-based deep learning models provide high accuracy, but internet dependency may slow response times. To address this, you’ll implement a button trigger instead of continuous listening. The system will support basic commands for calling names, numbers, and simple phrases.
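The button-triggered flow can be sketched as two steps: record a fixed-length clip, then send it to the online recognizer. The sketch follows the appendix in using arecord and the speech_recognition package; the function names, the 8-second default, and the plughw:1,0 device are assumptions that depend on your microphone setup.

```python
import subprocess


def record_clip(wav_path, seconds=8, device="plughw:1,0",
                run=subprocess.check_call):
    """Record a fixed-length clip with arecord.

    `run` is injectable so the command can be inspected without
    audio hardware. Returns the command that was executed.
    """
    cmd = ["arecord", "-d", str(seconds), "-D", device, wav_path]
    run(cmd)
    return cmd


def transcribe(wav_path):
    """Send a WAV file to Google's recognizer; return '' on failure."""
    import speech_recognition as sr  # imported lazily; needs network
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file
    try:
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible or the service was unreachable.
        return ""


def on_button_press(wav_path="output.wav"):
    """Called once per trigger press instead of listening continuously."""
    record_clip(wav_path)
    return transcribe(wav_path)
```

Recording only after a button press keeps the slow network round trip out of the main loop until the user actually asks for recognition.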
Offline Implementation
For offline capability, you will develop your own speech recognition in Python. Raw audio will be recorded and compared against stored samples using FFT-based cross-correlation in the frequency domain. Because distinguishing a large vocabulary may prove challenging without training, you will primarily rely on the online API.
- The offline cross-correlation analysis involves raw audio sampling, normalization, FFT conversion, and smoothing before the frequency-domain matching calculations.
- Vocabulary recognition is limited, partly because of the small amount of sample training data and partly because the correlation-coefficient approach has no classification layers.
- Further machine learning techniques such as Hidden Markov Models, LSTM RNNs, or CNNs could boost language modeling capacity if implemented on-device.
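One possible reading of the offline matching pipeline (normalize, transform, smooth, correlate) is sketched below. It is not the project's actual code: the function names are hypothetical, a plain discrete Fourier transform from the standard library stands in for a real FFT (numpy.fft.rfft would be the usual choice on the Pi), and the match score is a Pearson correlation coefficient between smoothed magnitude spectra.

```python
import cmath
import math


def normalize(samples):
    """Scale samples so the loudest one has magnitude 1."""
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]


def dft_magnitude(samples):
    """Magnitude spectrum via a naive O(n^2) DFT (stand-in for an FFT)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)))
        for k in range(n // 2)
    ]


def smooth(values, window=3):
    """Moving average to blur small frequency shifts between takes."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half):i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def spectrum(samples):
    return smooth(dft_magnitude(normalize(samples)))


def correlation(a, b):
    """Pearson correlation coefficient of two equal-purpose sequences."""
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_match(recording, templates):
    """Return the template name whose spectrum correlates best."""
    target = spectrum(recording)
    scores = dict((name, correlation(target, spectrum(samples)))
                  for name, samples in templates.items())
    return max(scores, key=scores.get), scores
```

Because the score compares whole spectra rather than learned features, it can separate a handful of clearly different words but degrades quickly as the vocabulary grows, which matches the limitation noted above.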
Phone Application
Phone App Functionality
Your phone app will feature additional menus for dialing voicemail, viewing call logs, and configuring advanced settings for the FONA module. Monitoring the serial port and the module’s interrupt pin will provide incoming call alerts and status queries. A database of simulated contacts will make storage and retrieval simpler than a linear list with duplicate entries.
Hardware Setup
After inserting the SIM card and ensuring adequate network registration, you’ll configure the FONA GSM module. UART communication to send AT commands will be verified using the serial monitor tool Minicom.
Design Overview
Your phone app will feature basic call functionality through various GUI screens for number dialing, connecting states, and incoming calls. Information like call times and phone numbers will be dynamically displayed.
Interface Implementation
You will use Pygame to code GUI elements and interactions. Touch buttons for dial-pad keys, additional controls, and updated status displays will be created on a resized surface clipped to the PiTFT size. Scrolling will allow navigation through longer contact lists.
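Touch handling for the dial pad comes down to testing a tap position against a table of key rectangles. The sketch below uses the key coordinates from the appendix code (50-pixel keys in three columns and four rows) but is otherwise an assumed restructuring with hypothetical names; in a Pygame program the same check could be done with pygame.Rect and its collidepoint method.

```python
KEY_SIZE = 50
COLUMNS = (30, 100, 170)          # left edge of each key column
ROWS = (65, 125, 185, 245)        # top edge of each key row
LAYOUT = ("123", "456", "789", " 0 ")  # spaces are non-digit slots


def build_keypad():
    """Map each digit label to its (left, top, width, height) box."""
    keys = {}
    for top, labels in zip(ROWS, LAYOUT):
        for left, label in zip(COLUMNS, labels):
            if label != " ":
                keys[label] = (left, top, KEY_SIZE, KEY_SIZE)
    return keys


def key_at(pos, keys):
    """Return the digit under a touch position, or None."""
    x, y = pos
    for label, (left, top, width, height) in keys.items():
        if left < x < left + width and top < y < top + height:
            return label
    return None
```

In the event loop, the position from pygame.mouse.get_pos() would be passed to key_at and any returned digit appended to the number being dialed.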
Voice Integration
Voice commands can initiate dialing by converting speech to text through your button-trigger function, complementing touch input. Pressing the “call” button initiates dial-out through AT commands sent over serial to the FONA module.
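The path from a recognized phrase to a dial command can be sketched as follows. The helper names are hypothetical; the digit filtering mirrors the returnnumber function in the appendix, and the sketch assumes the recognizer returns spoken numbers as digits, so number words such as "five" would be ignored.

```python
def digits_from_transcript(text):
    """Keep only the digit characters from a recognized phrase."""
    return "".join(ch for ch in text if ch.isdigit())


def dial_command(number):
    """Build the AT command for a voice call.

    The trailing semicolon tells the modem to place a voice call
    rather than a data call.
    """
    if not number:
        raise ValueError("no digits to dial")
    return "ATD" + number + ";\r"


if __name__ == "__main__":
    spoken = "call 607 555 0100"  # illustrative transcript
    print(repr(dial_command(digits_from_transcript(spoken))))
```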
Call Management
You will monitor active calls by polling the module with AT+CLCC and reading the response strings from the serial port. Status codes will distinguish between dialing, connected, and call-ended states, allowing screen transitions at appropriate times. A hang-up command (ATH) will end the call.
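A CLCC reply carries the call index, direction, and state as its first three comma-separated fields, with the phone number further along. The sketch below decodes one such line using the standard GSM state codes; parse_clcc is a hypothetical helper, and the sample number in the usage lines is made up.

```python
# Standard call state codes reported in the third CLCC field.
CALL_STATES = {
    0: "active",
    1: "held",
    2: "dialing",
    3: "alerting",
    4: "incoming",
    5: "waiting",
}


def parse_clcc(line):
    """Decode one '+CLCC: ...' line; return None for anything else."""
    prefix = "+CLCC:"
    line = line.strip()
    if not line.startswith(prefix):
        return None
    fields = line[len(prefix):].strip().split(",")
    if len(fields) < 3:
        return None
    try:
        state_code = int(fields[2])
    except ValueError:
        return None
    number = fields[5].strip('"') if len(fields) > 5 else ""
    return {
        "state": CALL_STATES.get(state_code, "unknown"),
        "incoming": fields[1].strip() == "1",
        "number": number,
    }


if __name__ == "__main__":
    print(parse_clcc('+CLCC: 1,0,2,0,0,"5550100",129,""'))
```

When a poll returns no CLCC line at all, there is no current call, which is the cue to leave the in-call screen.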
Music Player App
Music Player Features
Your music player will store song collections locally on the SD card instead of streaming external media. Additional database fields will track play counts, ratings, etc., while SQL aggregation functions will compile statistics. Sound effects will enhance the app interface experience beyond the core voice and GUI implementation.
Database Setup
A MySQL database will organize the music library and playlists more efficiently than raw file scanning. Tables will store song metadata and playlists as customizable entities.
Python Connector
The mysql-python module will facilitate database queries and management directly from your code. Specialized functions will perform SQL operations on the tables for various app needs.
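The library table and a play-count aggregate can be sketched as below. To keep the example runnable without a MySQL server, it swaps in Python's built-in sqlite3 module, so the placeholder syntax (a question mark) differs from mysql-python, which uses %s. The songs table, its columns, and the function names are all hypothetical.

```python
import sqlite3


def create_library(conn):
    """Create a minimal song table with a play counter."""
    conn.execute(
        "CREATE TABLE songs ("
        "id INTEGER PRIMARY KEY, title TEXT, artist TEXT, "
        "genre TEXT, play_count INTEGER DEFAULT 0)"
    )


def add_song(conn, title, artist, genre):
    """Insert one song and return its new row ID."""
    cur = conn.execute(
        "INSERT INTO songs (title, artist, genre) VALUES (?, ?, ?)",
        (title, artist, genre),
    )
    return cur.lastrowid


def record_play(conn, song_id):
    """Increment the play counter each time a song starts."""
    conn.execute(
        "UPDATE songs SET play_count = play_count + 1 WHERE id = ?",
        (song_id,),
    )


def plays_by_artist(conn):
    """Aggregate play counts per artist with SUM and GROUP BY."""
    rows = conn.execute(
        "SELECT artist, SUM(play_count) FROM songs "
        "GROUP BY artist ORDER BY artist"
    )
    return rows.fetchall()
```

A quick trial is to open an in-memory database with sqlite3.connect(":memory:"), call create_library, add a few songs, and print plays_by_artist after some record_play calls.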
GUI Design
You will implement a scrolling surface approach to handle overflow lists since PiTFT screen space is limited. Progress bars will track playback position updates. Volume control will utilize PWM signals sent over GPIO, and menus will switch between lists, playlists, and playback views.
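The scrolling list and progress bar both reduce to small calculations that can be kept separate from the drawing code. The helpers below are a sketch with hypothetical names; the row count and bar width would come from your own screen layout.

```python
def visible_rows(items, offset, rows_per_screen):
    """Clamp the scroll offset and return the slice to draw."""
    max_offset = max(0, len(items) - rows_per_screen)
    offset = min(max(offset, 0), max_offset)
    return offset, items[offset:offset + rows_per_screen]


def progress_width(elapsed, duration, bar_width):
    """Width in pixels of the filled part of the progress bar."""
    if duration <= 0:
        return 0
    fraction = min(max(elapsed / float(duration), 0.0), 1.0)
    return int(fraction * bar_width)
```

Clamping in one place means swipe handlers can add or subtract from the offset freely without ever drawing past either end of the list.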
Voice Commands
Voice commands will allow common music queries like genre and artist searches to extract relevant song IDs from the database for instant playlist generation. Playback commands will also use voice triggers for contact-free control.
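Turning a recognized phrase into a database lookup can be sketched as a keyword search followed by a parameterized query. The phrase format ("play artist ..." or "play genre ..."), the songs table, and both function names are assumptions for illustration; the %s placeholder matches the style used by mysql-python.

```python
SEARCHABLE = ("artist", "genre")  # columns a voice query may target


def parse_music_command(transcript):
    """Find 'artist <name>' or 'genre <name>' in a phrase.

    Returns (field, value) in lowercase, or None if no keyword
    with a following value is present.
    """
    words = transcript.lower().split()
    for i, word in enumerate(words):
        if word in SEARCHABLE and i + 1 < len(words):
            return word, " ".join(words[i + 1:])
    return None


def build_query(field, value):
    """Build a parameterized query that selects matching song IDs.

    The column name is checked against a whitelist because it cannot
    be passed as a query parameter.
    """
    if field not in SEARCHABLE:
        raise ValueError("unsupported search field: %r" % (field,))
    sql = "SELECT id FROM songs WHERE LOWER(%s) = %%s" % field
    return sql, (value,)
```

The returned pair can be handed to a cursor's execute method, and the resulting IDs become the instant playlist.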
Results and Conclusion
Your Piphone will successfully demonstrate functional voice-controlled apps in action. The online speech API will provide sufficient accuracy while avoiding training costs. The device’s portability will be realized through its compact design and single-cell operation.
Potential improvements could include expanded vocabulary support with data-driven speech models, cloud database syncing, contact list integration, and additional voice commands. Overall, you will be pleased with your DIY smartphone prototype and gain valuable experience with embedded Linux, circuit design, and cross-discipline project work.
Code Appendix
import serial
import pygame
from pygame.locals import *
import RPi.GPIO as GPIO
import time
import subprocess
import os
import speech_recognition as sr
from os import path

os.putenv('SDL_VIDEODRIVER', 'fbcon')  # Display on piTFT
os.putenv('SDL_FBDEV', '/dev/fb1')
os.putenv('SDL_MOUSEDRV', 'TSLIB')  # Track mouse clicks on piTFT
os.putenv('SDL_MOUSEDEV', '/dev/input/touchscreen')

GPIO.setmode(GPIO.BCM)
GPIO.setup(27, GPIO.IN, pull_up_down=GPIO.PUD_UP)

#quit
def GPIO27_callback(channel):
    exit (0)

GPIO.add_event_detect(27,GPIO.FALLING,callback=GPIO27_callback)

def returnnumber(st):
    n=""
    for char in st:
        if char.isdigit():
            n=n+char
    return n

# get incoming number
def get_incoming(st):
    n=""
    in_flag = 0
    for char in st:
        if in_flag<2 and in_flag>0:
            n=n+char
        if char == '"':
            in_flag = in_flag +1
    return n[:-1]

# get connecting status
def get_status(st):
    n=0
    for char in st:
        if char.isdigit():
            n=n+1
            if n==3:
                return char

def check_over(st):
    return st[0]

# get battery status
def get_battery(st):
    n=""
    in_flag = 0
    for char in st:
        if in_flag<2 and in_flag>0:
            n=n+char
        if char == ',':
            in_flag = in_flag +1
    return n[:-1]

# translate time to minute:second mode
def get_progress(t):
    minutes=int(t/60)
    second=int(t-minutes*60)
    if second<10:
        time_string=str(minutes)+" : 0"+str(second)
    else:
        time_string=str(minutes)+" : "+str(second)
    return time_string

class Background(pygame.sprite.Sprite):
    def __init__(self, image_file, location):
        pygame.sprite.Sprite.__init__(self)  #call Sprite initializer
        self.image = pygame.image.load("/home/pi/project/"+image_file)
        self.rect = self.image.get_rect()
        self.rect.left, self.rect.top = location

pygame.init()
pygame.mouse.set_visible(False)
clock=pygame.time.Clock()
size = width, height = 240,320
black = 0, 0, 0
WHITE = 255, 255, 255
screen = pygame.display.set_mode(size)

background = Background("background.png",[0,0])
call = Background("call.png",[10,10])
music = Background("music.png",[80,10])
one = Background("1.png",[30,70])
two = Background("2.png",[100,70])
three = Background("3.png",[170,70])
four = Background("4.png",[30,130])
five = Background("5.png",[100,130])
six = Background("6.png",[170,130])
seven = Background("7.png",[30,190])
eight = Background("8.png",[100,190])
nine = Background("9.png",[170,190])
zero = Background("0.png",[100,250])
micro = Background("microphone.png",[40,25])
micro2 = Background ("microphone2.png",[40,25])
back = Background("back.png",[0,10])
call2 = Background("call2.png",[40,260])
delete = Background("delete.png",[180,260])
hang_up = Background("hang_up.png",[90,220])
hang_up2 = Background("hang_up.png",[130,220])
answer = Background("answer.png",[50,220])
battery = Background("battery.png",[200,10])

calling={'Calling...':(100,70)}
ring={'Ring...':(100,70)}
x1 = [20,40,60,80,100,120,140,160,180,200,220]
y1 = 40
my_font = pygame.font.Font(None, 15)
my_font2 = pygame.font.Font(None,25)

menu = 0
number = " "
press = 0
voice_recog=0
old_time=0
status='1'
previous_status=0

serialport = serial.Serial("/dev/ttyS0", 9600, timeout=0.5)
serialport.write("AT\r")
response = serialport.readlines(None)
serialport.write("ATE0\r")
response = serialport.readlines(None)
serialport.write("AT\r")
response = serialport.readlines(None)
serialport.write("AT+CBC\r")
response = serialport.readline()
while len(response)<5:
    response = serialport.readline()
battery_n=get_battery(response)

while 1:
    #main menu
    if menu == 0:
        screen.blit(background.image,background.rect)
        screen.blit(call.image,call.rect)
        screen.blit(music.image,music.rect)
        screen.blit(battery.image,battery.rect)
        screen.blit(my_font.render(battery_n,True,(0,0,0)),(208,35))
        pygame.display.flip()
        for event in pygame.event.get():
            if(event.type is MOUSEBUTTONDOWN):
                pos = pygame.mouse.get_pos()
                x,y = pos
                if y>10 and y<70 and x>10 and x<70:
                    menu = 1
                if y>10 and y<70 and x>80 and x<145:
                    GPIO.cleanup()
                    import m_player2

    #phone call menu
    if menu == 1:
        screen.fill(WHITE)
        screen.blit(one.image,one.rect)
        screen.blit(two.image,two.rect)
        screen.blit(three.image,three.rect)
        screen.blit(four.image,four.rect)
        screen.blit(five.image,five.rect)
        screen.blit(six.image,six.rect)
        screen.blit(seven.image,seven.rect)
        screen.blit(eight.image,eight.rect)
        screen.blit(nine.image,nine.rect)
        screen.blit(zero.image,zero.rect)
        screen.blit(back.image,back.rect)
        screen.blit(call2.image,call2.rect)
        screen.blit(delete.image,delete.rect)
        if voice_recog == 0:
            screen.blit(micro.image,micro.rect)
        screen.blit(my_font2.render(number,True,(0,0,0)),(80,30))
        pygame.display.flip()

        #voice recognition to dial
        if voice_recog == 1:
            screen.blit(micro2.image,micro2.rect)
            pygame.display.flip()
            cmd = 'arecord -d 8 -D plughw:1,0 output.wav'
            print subprocess.check_output(cmd, shell=True)
            voice_recog = 0
            AUDIO_FILE = path.join(path.dirname('\home\pi\project'), 'output.wav')
            r=sr.Recognizer()
            with sr.AudioFile(AUDIO_FILE) as source:
                audio = r.record(source)  # read the entire audio file
            try:
                number = number + returnnumber(r.recognize_google(audio))
            except sr.UnknownValueError:
                number = number
            except sr.RequestError as e:
                number = number

        for event in pygame.event.get():
            if(event.type is MOUSEBUTTONDOWN):
                pos = pygame.mouse.get_pos()
                x,y = pos
                if x>0 and x<20 and y>0 and y<30:
                    menu = 0
                if x>30 and x<80 and y>65 and y<115:
                    number = number+'1'
                if x>100 and x<150 and y>65 and y<115:
                    number = number+'2'
                if x>170 and x<220 and y>65 and y<115:
                    number = number+'3'
                if x>30 and x<80 and y>125 and y<175:
                    number = number+'4'
                if x>100 and x<150 and y>125 and y<175:
                    number = number+'5'
                if x>170 and x<220 and y>125 and y<175:
                    number = number+'6'
                if x>30 and x<80 and y>185 and y<235:
                    number = number+'7'
                if x>100 and x<150 and y>185 and y<235:
                    number = number+'8'
                if x>170 and x<220 and y>185 and y<235:
                    number = number+'9'
                if x>100 and x<150 and y>245 and y<295:
                    number = number+'0'
                if x>170 and x<220 and y>245 and y<295:
                    number = number[:-1]
                #make call
                if x>30 and x<80 and y>245 and y<295:
                    menu = 2
                    voice_recog=0
                    print("Calling " + number)
                    serialport.write("AT\r")
                    response = serialport.readlines(None)
                    serialport.write("ATD " + number + ';\r')
                    response = serialport.readlines(None)
                    print response
                if x>40 and x<70 and y>20 and y<60:
                    voice_recog = 1

    #calling menu
    if menu == 2:
        while len(s)<5:
            serialport.write("AT+CLCC\r")
            s=serialport.readline()
        status=get_status(s)
        if status == '0':
            previous_status=1
            if old_time==0:
                old_time=time.time()
            else:
                current_time=time.time()-old_time
                screen.fill(WHITE)
                screen.blit(my_font2.render(number,True,(0,0,0)),(80,30))
                screen.blit(my_font2.render(get_progress(current_time),True,(0,0,0)),(100,60))
                screen.blit(hang_up.image,hang_up.rect)
                pygame.display.flip()
        else:
            if previous_status==1:
                menu=1
                previous_status=0
            else:
                old_time=0
                screen.fill(WHITE)
                screen.blit(my_font2.render(number,True,(0,0,0)),(80,30))
                for my_text, text_pos in calling.items():
                    text_surface = my_font2.render(my_text, True, black)
                    rect = text_surface.get_rect(center=text_pos)
                    screen.blit(text_surface, rect)
                screen.blit(hang_up.image,hang_up.rect)
                pygame.display.flip()
        #hang up
        for event in pygame.event.get():
            if(event.type is MOUSEBUTTONDOWN):
                pos = pygame.mouse.get_pos()
                x,y = pos
                print (x,y)
                if x>100 and x<160 and y>220 and y<270:
                    old_time=0
                    previous_status=0
                    menu = 1
                    n=0
                    while n<2000:
                        serialport.write("ATH\r")
                        n=n+1

    #incoming call detection
    s = serialport.readline()
    if (s =='RING\r\n'):
        menu = 3
        serialport.write("AT+CLCC\r")
        s = serialport.readline()
        while len(s)<10:
            s = serialport.readline()
        incoming_number=get_incoming(s)
        number=incoming_number

    #incoming menu
    if menu == 3:
        screen.fill(WHITE)
        screen.blit(my_font2.render(incoming_number,True,(0,0,0)),(60,90))
        for my_text, text_pos in ring.items():
            text_surface = my_font2.render(my_text, True, black)
            rect = text_surface.get_rect(center=text_pos)
            screen.blit(text_surface, rect)
        screen.blit(hang_up2.image,hang_up2.rect)
        screen.blit(answer.image,answer.rect)
        pygame.display.flip()
        for event in pygame.event.get():
            if(event.type is MOUSEBUTTONDOWN):
                pos = pygame.mouse.get_pos()
                x,y = pos
                if x>50 and x<110 and y>220 and y<270:
                    menu = 2
                    n=0
                    while n<800:
                        serialport.write("ATA\r")
                        n=n+1
                    status='0'
                if x>130 and x<190 and y>220 and y<270:
                    menu = 0
                    n=0
                    while n<2000:
                        serialport.write("ATH\r")
                        n=n+1
For further details or future updates on this project visit: A DIY Voice Controlled Smartphone using Raspberry Pi