In this tutorial, we will build a smart speaker using Google Assistant, Google Cloud, Actions on Google, and the ReSpeaker Core v2.0. This post is part of a multi-part tutorial series on developing and building your own smart speaker with various tools, SDKs, APIs, and hardware.
Seeed’s ReSpeaker Core v2.0 is designed for voice interface applications. It is based on the Rockchip RK3229, a quad-core ARM Cortex-A7 running at up to 1.5GHz, with 1GB of RAM. The board also features a six-microphone array with speech algorithms including DoA (Direction of Arrival), BF (Beamforming), and AEC (Acoustic Echo Cancellation). We will use it as the foundation for our smart speaker.
Step 1: Download the latest Debian image for the ReSpeaker Core v2.0. This tutorial uses respeaker-debian-9-lxqt-sd-20180801-4gb.img.xz.
You can download the images for the ReSpeaker Core v2.0 here.
Step 2: Plug the SD card into your PC or Mac with an SD card reader. You need an SD card with a capacity of more than 4GB.
Step 3: Download Etcher here, and burn the *.img.xz file directly to your SD card with it. Alternatively, decompress the *.img.xz file to a *.img file, then burn it to the SD card with another image-writing tool.
In Etcher, click the Plus icon to add the image you just downloaded; the software will automatically select the SD card you plugged in. Then click Flash! to start burning. It takes about 10 minutes to finish.
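If you prefer the manual route over Etcher, the two steps (decompress, then write) can be sketched as follows on a Linux machine. This is only a dry run that prints the commands: the function name print_flash_commands and the device /dev/sdX are placeholders introduced here, and you should confirm the real device name (for example with lsblk) before running anything, because dd overwrites its target.

```bash
#!/usr/bin/env bash
# Print (do not run) the commands for flashing a compressed image on Linux.
# Both arguments are placeholders: pass your own archive and SD card device.
print_flash_commands() {
  local archive="$1"            # e.g. respeaker-debian-9-lxqt-sd-20180801-4gb.img.xz
  local device="$2"             # e.g. /dev/sdX (hypothetical; check with lsblk)
  local image="${archive%.xz}"  # strip the .xz suffix to get the .img name

  # Step 1: decompress, keeping the original archive.
  printf 'xz --decompress --keep %s\n' "$archive"
  # Step 2: write the raw image to the card and flush it to disk.
  printf 'sudo dd if=%s of=%s bs=4M conv=fsync status=progress\n' "$image" "$device"
}

print_flash_commands respeaker-debian-9-lxqt-sd-20180801-4gb.img.xz /dev/sdX
```

Printing the commands first makes it easy to double-check the target device before copying them into the terminal.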
Step 4: After writing the image, insert the SD card into your ReSpeaker Core v2.0. Power the board through the PWR_IN micro USB port, and DO NOT remove the SD card after powering on. The ReSpeaker Core v2.0 will boot from the SD card, and the USER1 and USER2 LEDs will light up. USER1 is typically configured at boot to blink in a heartbeat pattern, and USER2 to light during SD card accesses.
Step 1: Connect the ReSpeaker Core v2.0 board to a monitor, keyboard, and mouse using the HDMI port and USB ports respectively. Once the image is booted up, you will see a screen similar to the following.
Step 2: Open QTerminal from the System Tools menu to set up Wi-Fi.
Step 3: Configure your ReSpeaker's network with the NetworkManager tool nmtui, which is already installed on the ReSpeaker image. Type the following command.
sudo nmtui
You will be prompted for a password. The default username and password are both respeaker, so enter respeaker to continue.
Select the Activate a connection option from the list and press Enter.
Select your Wi-Fi network, press Enter, type your Wi-Fi password, and press Enter again. A * mark next to the network means that your ReSpeaker has successfully connected to it. Press Esc twice to leave the network manager configuration tool.
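The * mark can also be checked from the command line. The sketch below assumes nmcli (the command-line companion to nmtui in NetworkManager) is available on the image, and that its terse output for the ACTIVE and SSID fields looks like yes:HomeWiFi. The helper name active_ssid and the sample network names are made up for illustration.

```bash
#!/usr/bin/env bash
# Print the SSID of the active Wi-Fi connection.
# Reads lines such as "yes:HomeWiFi" or "no:Neighbour" on stdin.
# Limitation: an SSID that itself contains ":" is escaped by nmcli and
# would need extra handling.
active_ssid() {
  awk -F: '$1 == "yes" { print $2 }'
}

# On the board (assumption: nmcli is installed alongside nmtui):
#   nmcli -t -f ACTIVE,SSID device wifi | active_ssid

# Demo with made-up sample data:
printf 'no:Neighbour\nyes:HomeWiFi\n' | active_ssid
```

If the command prints nothing, no Wi-Fi network is currently active and the nmtui step should be repeated.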
Step 4: Once the Wi-Fi connection is successful, we can access the ReSpeaker Core v2.0 board remotely using VNC Viewer. Before we do that, we need to know the IP address of the device.
Step 5: In the terminal, type the following command and copy the IP address.
ip address
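The full ip address output lists every interface, so picking out the right address can take a moment. As a rough sketch, the one-line-per-address form (ip -o -4 address show) can be filtered to print only the non-loopback IPv4 addresses. The helper name extract_ipv4 and the sample addresses below are illustrative, not taken from the board.

```bash
#!/usr/bin/env bash
# Print non-loopback IPv4 addresses.
# Expects `ip -o -4 address show` output on stdin: one line per address,
# with the fields: index, interface, "inet", address/prefix, ...
extract_ipv4() {
  awk '$3 == "inet" && $2 != "lo" { split($4, parts, "/"); print parts[1] }'
}

# On the board:
#   ip -o -4 address show | extract_ipv4

# Demo with made-up sample output:
sample='1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
3: wlan0    inet 192.168.1.42/24 brd 192.168.1.255 scope global dynamic wlan0\       valid_lft 86000sec preferred_lft 86000sec'
printf '%s\n' "$sample" | extract_ipv4
```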
Step 6: Download VNC Viewer to your PC or Mac, enter the IP address, and connect to the board remotely with the following details.
IP Address: <ENTER THE IP ADDRESS OF THE RESPEAKER BOARD>
Username: respeaker
Password: respeaker
Step 7: We should now be connected to the ReSpeaker Core v2.0 board through VNC Viewer. You can now access and control the device remotely instead of connecting a monitor, keyboard, and mouse directly to the board.
For this tutorial, we will connect an ordinary active speaker to the 3.5mm audio jack of the ReSpeaker Core v2.0 board to output audio. You can plug active speakers or headphones into this port. You can also connect the board to Bluetooth speakers or to a speaker with a JST 2.0 connector.
Step 1: List the capture devices (microphones) with the following command in the terminal.
arecord -l
Find the capture card whose name has the seeed prefix, and note down its card number and device number. In this case, the card number is 0 and the device number is 0, so the capture device is hw:0,0 (card 0, device 0).
Step 2: List all playback devices on the board with the following command in the terminal.
aplay -l
Find the sound card whose name has the seeed prefix, and note down its card number and device number. In this case, the card number is 0 and the device number is 1, so the playback device is hw:0,1 (card 0, device 1).
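Reading the card and device numbers off the listing by eye is easy to get wrong, so here is a small sketch that turns arecord -l or aplay -l output into hw:card,device identifiers for cards whose line mentions seeed. The helper name seeed_hw_ids is hypothetical, and the sample line only imitates the general shape of ALSA's listing; the names on your board will differ.

```bash
#!/usr/bin/env bash
# Convert `arecord -l` / `aplay -l` lines for seeed cards into hw:C,D form.
# Only lines starting with "card N:" that mention "seeed" are printed.
seeed_hw_ids() {
  sed -n 's/^card \([0-9][0-9]*\):.*seeed.*, device \([0-9][0-9]*\):.*/hw:\1,\2/p'
}

# On the board:
#   arecord -l | seeed_hw_ids    # capture devices
#   aplay -l   | seeed_hw_ids    # playback devices

# Demo with an illustrative (not captured) listing:
seeed_hw_ids <<'EOF'
**** List of PLAYBACK Hardware Devices ****
card 0: seeed8micvoicec [seeed-8mic-voicecard], device 1: example-playback []
  Subdevices: 1/1
  Subdevice #0: subdevice #0
EOF
```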
Step 3: Record and play. Test recording and playback with the following commands in the terminal.
[snippet slug=record-playback-audio-using-arecord-aplay-respeaker-core-v2-0 lang=bash]
Step 4: Create a new file named .asoundrc in the home directory (/home/respeaker). Make sure it has the right slave definitions for the microphone and speaker: use the configuration below, but replace <card number> and <device number> with the numbers you wrote down in the previous steps. Do this for both pcm.mic and pcm.speaker.
sudo nano .asoundrc
Copy and paste the configuration below, replace <card number> and <device number> with the values for your setup, and save the file.
[snippet slug=configure-audio-soundrc lang=bash]
In this example, based on our configuration, the .asoundrc file looks like this.
Now that the hardware is set up, our smart speaker is almost ready. Let's move ahead and enable Google Assistant for our ReSpeaker hardware.
For this tutorial and the upcoming tutorials in this series, we will add a new project in Google Cloud Platform.
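As a sketch of how the placeholders get filled in, the function below prints an .asoundrc for a given capture and playback device. The layout is an assumption: it follows the asym/plug pattern commonly shown in the Google Assistant SDK documentation, with pcm.mic and pcm.speaker as named in this tutorial, and it may not match the snippet above line for line. The function name render_asoundrc is made up for this example.

```bash
#!/usr/bin/env bash
# Print an .asoundrc that routes capture to one device and playback to another.
# Arguments are ALSA hardware ids such as hw:0,0 (mic) and hw:0,1 (speaker).
render_asoundrc() {
  local mic="$1" speaker="$2"
  cat <<EOF
pcm.!default {
  type asym
  capture.pcm "mic"
  playback.pcm "speaker"
}
pcm.mic {
  type plug
  slave {
    pcm "$mic"
  }
}
pcm.speaker {
  type plug
  slave {
    pcm "$speaker"
  }
}
EOF
}

# Values from this tutorial: capture on hw:0,0, playback on hw:0,1.
render_asoundrc hw:0,0 hw:0,1
```

To write the result to the file, redirect the output, for example render_asoundrc hw:0,0 hw:0,1 > ~/.asoundrc, after checking that the printed text matches what you expect.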
Step 1: Sign in to your Google account.
Step 2: Create a Google Cloud Platform project at https://console.cloud.google.com/
Step 1: Import a new project in the Actions on Google Console. Click Add/import project.
Step 2: Choose the project that we created in the Google Cloud console, in this case MySmartSpeaker.
Step 3: Click on Device registration: Enable Google Assistant for your hardware.
Step 4: Click REGISTER MODEL to embed Google Assistant in your hardware.
Step 5: Then, fill in your product info and click REGISTER MODEL.
Step 6: Download OAuth 2.0 credentials and click NEXT.
Step 7: Select all 7 traits and click SAVE TRAITS.
Step 8: Copy the Model ID. On the next screen, click the project name and take note of your Model ID, because we will need it in the next steps. In this case, our Model ID is something like mysmartspeaker-xxxx-my-smart-speaker-xxxxx. Yours will differ, so use the one shown in your Actions on Google Console.
Step 9: Copy the Project ID. In the Actions on Google Console, click the gear icon in the upper-left corner, click Project Settings, and note the Project ID.
Step 10: Rename the client_secret_xxxxxxx.json file that we downloaded to our computer in Step 6 to credentials.json.
Step 11: Move the credentials.json file from your computer to /home/respeaker on the ReSpeaker Core v2.0 board. Since we already know the board's IP address from the previous steps, we can connect to it directly and copy the file with the following command, or with any tool you prefer. In this case, the command is run from my MacBook's Terminal.
[snippet slug=copy-credentials-json-files-mac-respeaker-device lang=bash]
Enter yes to continue, then enter the password respeaker.
We have successfully copied the credentials.json file from our computer to the ReSpeaker Core v2.0 device in the /home/respeaker path.
At this point, we have the following.
Project ID: mysmartspeaker-xxxxx
Model ID: mysmartspeaker-xxxx-my-smart-speaker-xxxxx
credentials.json file in path /home/respeaker of the ReSpeaker Core v2.0 device.
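Before moving on, it can help to confirm that all three items are really in place. The following is a minimal sketch of such a check; the function name check_prereqs is hypothetical, and it only tests that the two IDs are non-empty and that the credentials file exists and is not empty. It does not validate the IDs against Google.

```bash
#!/usr/bin/env bash
# Check the three prerequisites collected so far.
# Returns 0 when everything is present, 1 otherwise, printing what is missing.
check_prereqs() {
  local project_id="$1" model_id="$2" credentials="$3"
  local status=0
  [ -n "$project_id" ] || { echo "missing Project ID"; status=1; }
  [ -n "$model_id" ] || { echo "missing Model ID"; status=1; }
  [ -s "$credentials" ] || { echo "missing or empty file: $credentials"; status=1; }
  return "$status"
}

# Example call on the board, with the placeholder IDs used in this tutorial:
#   check_prereqs mysmartspeaker-xxxxx \
#     mysmartspeaker-xxxx-my-smart-speaker-xxxxx \
#     /home/respeaker/credentials.json && echo "ready"
```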
Step 1: Enable the Google Assistant API on the project you selected. We do this in the Google Cloud Platform Console.
Click here to enable the Google Assistant API, or go to Navigation Menu -> APIs and Services -> Library, search for Google Assistant, and click ENABLE.
Make sure to visit Activity Controls and ensure the following toggle switches are enabled (blue):
For the rest of this tutorial, we run the commands directly on the ReSpeaker Core v2.0 device through VNC Viewer.
This phase can be done with either Python 2.7 or Python 3. In this tutorial, we use Python 3. You can refer to the Google Assistant SDK documentation for more details. This will be the heart of what makes our device an intelligent smart speaker.
For Python 3
Step 1: Configure the environment. Open QTerminal again on the ReSpeaker Core v2.0 device.
Step 2: Run the following commands one by one. You will be prompted for the respeaker password, which is respeaker.
[snippet slug=python-3-configure-environment-google-assistant-sdk lang=bash]
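A common stumbling block is installing the SDK outside the environment that was just configured. The sketch below assumes the environment is a Python virtual environment (as in the Google Assistant SDK documentation), in which case activation sets the VIRTUAL_ENV variable. If your setup does not use a virtual environment, this check does not apply. The function name venv_status is made up for this example.

```bash
#!/usr/bin/env bash
# Report whether a Python virtual environment is active in this shell.
# Relies on the VIRTUAL_ENV variable that the activate script exports.
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "active: $VIRTUAL_ENV"
  else
    echo "inactive"
    return 1
  fi
}

# Run it before installing packages; "|| true" keeps the script going
# even when no environment is active.
venv_status || true
```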
The Google Assistant SDK package contains all the code required to get the Google Assistant running on the device, including the sample code.
Step 1: Install the package's system dependencies. Make sure to run the commands one by one.
[snippet slug=get-package-google-assistant lang=bash]
Install or update the authorization tool:
[snippet slug=install-update-authorization-tool lang=bash]
Make the target folder.
[snippet slug=make-target-directory-assistant-sdk lang=bash]
Use the command below to copy credentials.json to the target location.
[snippet slug=copy-credentials-json-file-target-location-assistant-sdk lang=bash]
Run the command below to generate the authorization code. A URL should be displayed in the terminal:
[snippet slug=generate-token-google-assistant-sdk lang=bash]
Copy the URL and paste it into a browser (this can be done on any machine). The page will ask you to sign in to your Google account. Sign in to the Google account that created the developer project.
Copy the code and paste it in the Terminal.
If the authorization was successful, you will see a response similar to the following:
credentials saved: /home/respeaker/.config/google-oauthlib-tool/credentials.json.
If you see InvalidGrantError instead, an invalid code was entered. Try again, taking care to copy and paste the entire code.
Enter the following commands to install respeakerd:
[snippet slug=install-respeakerd lang=bash]
Remember the two IDs we noted before? Now it's time to use them.
In the command googlesamples-assistant-pushtotalk --project-id <my-dev-project> --device-model-id <my-model>, replace <my-dev-project> with your Project ID and <my-model> with your Model ID.
For this demo, it looks like this:
googlesamples-assistant-pushtotalk --project-id mysmartspeaker-xxxxx --device-model-id mysmartspeaker-xxxx-my-smart-speaker-xxxxx
Finally, press Enter and ask the Assistant a question such as "What's the weather in San Francisco", "Sing a song", or "Tell me a joke".
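To avoid retyping the long IDs, the substitution can be sketched with two shell variables. The variable names PROJECT_ID and MODEL_ID and the helper assistant_command are introduced here for illustration; the command name and its two flags are the ones shown in this tutorial, and the values are the placeholder IDs from this demo.

```bash
#!/usr/bin/env bash
# Build the push-to-talk command line from a Project ID and a Model ID.
# Printing it first lets you verify the IDs before actually running it.
assistant_command() {
  local project_id="$1" model_id="$2"
  printf 'googlesamples-assistant-pushtotalk --project-id %s --device-model-id %s\n' \
    "$project_id" "$model_id"
}

# Placeholder IDs from this tutorial; replace them with your own.
PROJECT_ID="mysmartspeaker-xxxxx"
MODEL_ID="mysmartspeaker-xxxx-my-smart-speaker-xxxxx"

assistant_command "$PROJECT_ID" "$MODEL_ID"

# To run the sample directly instead of printing the command:
#   googlesamples-assistant-pushtotalk --project-id "$PROJECT_ID" --device-model-id "$MODEL_ID"
```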
To conclude, we have successfully embedded Google Assistant into our ReSpeaker Core v2.0 using Google Assistant as a service. To summarize, we installed, set up, and configured the ReSpeaker Core v2.0. We also created a project in the Google Cloud Platform and the Actions on Google Console, enabled the Google Assistant API, installed the SDK and sample code, and ran Google Assistant as a service. With these steps, we have built a good foundation for our smart speaker.
We are just scratching the surface of the full capabilities of the ReSpeaker Core v2.0, so watch out for Part 2 of this smart speaker tutorial series. Also check out my other tutorial posts on Medium and on my website, techwithsach.com.