Building Serverless Actions for Google Assistant with Google Cloud Functions, Cloud Datastore, and Cloud Storage

In this post, we will create an Action for Google Assistant using the ‘Actions on Google’ development platform, Google Cloud Platform’s serverless Cloud Functions, Cloud Datastore, and Cloud Storage, and the current LTS version of Node.js. According to Google, Actions are pieces of software, designed to extend the functionality of the Google Assistant, Google’s virtual personal assistant, across a multitude of Google-enabled devices, including smartphones, cars, televisions, headphones, watches, and smart-speakers.

Here is a brief YouTube video preview of the final Action for Google Assistant that we will explore in this post, running on an Apple iPhone 8.

If you want to compare the development of an Action for Google Assistant with those of AWS and Azure, in addition to this post, please read my previous two posts in this series, Building and Integrating LUIS-enabled Chatbots with Slack, using Azure Bot Service, Bot Builder SDK, and Cosmos DB and Building Asynchronous, Serverless Alexa Skills with AWS Lambda, DynamoDB, S3, and Node.js. The demonstrations in all three articles are written in Node.js, all three leverage their cloud platform’s machine learning-based Natural Language Understanding services, and all three take advantage of NoSQL database and storage services available on their respective cloud platforms.

Google Technologies

Here is a brief overview of the key technologies we will incorporate into our architecture.

Actions on Google

Dialogflow

We will use the Dialogflow web-based development platform and version 2 of the Dialogflow API, which became GA in April 2018, to build our Action for Google Assistant’s rich, natural-language conversational interface.

Google Cloud Functions

Node.js LTS

Node 8, also known as Project Carbon, was the first Long Term Support (LTS) version of Node to support async/await with Promises. Async/await is the new way of handling asynchronous operations in Node.js. We will make use of async/await and Promises within our Action’s Cloud Function.
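As a minimal sketch of the pattern (not the Action’s actual code; lookupFact and its sample data are hypothetical stand-ins), the same Promise-returning function can be consumed either with .then() or with await:

```javascript
'use strict';

// Hypothetical stand-in for an asynchronous lookup, such as a database query.
function lookupFact(name) {
  const facts = {released: 'sample fact text'};
  return new Promise((resolve, reject) => {
    // setImmediate simulates work that completes later.
    setImmediate(() => {
      if (facts[name]) {
        resolve(facts[name]);
      } else {
        reject(new Error(`Unknown fact: ${name}`));
      }
    });
  });
}

// Promise style: chain a callback onto the returned Promise.
function describeWithThen(name) {
  return lookupFact(name).then(fact => `Fact: ${fact}`);
}

// async/await style: the same logic, written as sequential code.
// A rejected Promise surfaces as an exception that try/catch can handle.
async function describeWithAwait(name) {
  try {
    const fact = await lookupFact(name);
    return `Fact: ${fact}`;
  } catch (err) {
    return `Sorry: ${err.message}`;
  }
}
```

Both functions return a Promise; the async version simply lets the caller’s logic read top to bottom.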

Google Cloud Datastore

Google Cloud Storage

Demonstration

Source Code

Development Process

Building the Action will involve the following steps.

  • Design the Action’s conversation model;
  • Import the Azure Facts Entities into Cloud Datastore on GCP;
  • Create and upload the images to Cloud Storage on GCP;
  • Create the new Actions project using the Actions on Google console;
  • Develop the Action’s Intent using the Dialogflow console;
  • Bulk import the Action’s Entities using the Dialogflow console;
  • Configure the Dialogflow Actions on Google Integration;
  • Develop and deploy the Cloud Function to GCP;
  • Test the Action using the Actions on Google Simulator.

Let’s explore each step in more detail.

Conversational Model

Each fact returned by Google Assistant will include Simple Response, Basic Card, and Suggestions response types for devices with a display, as shown below. The user may continue to ask for additional facts or choose to cancel the Action at any time.

Lastly, as part of the conversational model, we will include the option of asking for a random fact, as well as asking for help. Examples of both are shown below. Again, Google Assistant responds to the user, vocally and, optionally, visually, for display-enabled devices.

GCP Account and Project

# Authenticate with the Google Cloud SDK
export PROJECT_ID="<your_project_id>"

gcloud beta auth login
gcloud config set project ${PROJECT_ID}

# Update components or new runtime nodejs8 may be unknown
gcloud components update

Google Cloud Storage

#!/usr/bin/env sh

# author: Gary A. Stafford
# site: https://programmaticponderings.com
# license: MIT License

set -ex

# Set constants
PROJECT_ID="<your_project_id>"
REGION="<your_region>"
IMAGE_BUCKET="<your_bucket_name>"

# Create GCP Storage Bucket
gsutil mb \
  -p ${PROJECT_ID} \
  -c regional \
  -l ${REGION} \
  gs://${IMAGE_BUCKET}

# Upload images to bucket
for file in pics/image-*; do
  gsutil cp ${file} gs://${IMAGE_BUCKET}
done

# Make all images public in bucket
gsutil iam ch allUsers:objectViewer gs://${IMAGE_BUCKET}

From the Storage Console on GCP, you should observe the images all have publicly accessible URLs. This will allow the Cloud Function to access the bucket, and retrieve and display the images. There are more secure ways to store and display the images from the function. However, this is the simplest method since we are not concerned about making the images public.

We will need the URL of the new Storage bucket later, when we develop our Action’s Cloud Function. The bucket URL can be obtained from the Storage Console on GCP, as shown below in the Link URL.
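For orientation, publicly readable objects are generally addressable under https://storage.googleapis.com/ followed by the bucket and object names. The helper below is only a sketch under that assumption; publicImageUrl and its parameters are illustrative names, and the Link URL shown in the Storage Console remains the authoritative value.

```javascript
'use strict';

// Builds the public URL of an object in a Cloud Storage bucket.
// bucketName and objectName are illustrative parameters, not names from the post.
function publicImageUrl(bucketName, objectName) {
  // Encode the object name so spaces and other special characters are URL-safe.
  return `https://storage.googleapis.com/${bucketName}/${encodeURIComponent(objectName)}`;
}

console.log(publicImageUrl('<your_bucket_name>', 'image-16.png'));
```

In the Cloud Function, the part of the URL before the image name is what the IMAGE_BUCKET environment variable is assumed to hold.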

Google Cloud Datastore

There are a number of ways to create the Datastore entities for our Action, including manually from the Datastore console on GCP. However, to automate the process, we will use a script, written in Node.js and using the Google Cloud Datastore Node.js Client, to create the entities. We will use the Client API’s Datastore class upsert method, which creates or updates an entire collection of entities with one call and reports the result through a callback or a returned Promise. The script, upsert-entities.js, is included in source control and can be run with the following command. Below is a snippet of the script, which shows the structure of the entities (gist).

# Upload Google Datastore entities
cd data
npm install
node ./upsert-entities.js
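The original script is not reproduced here. As a rough sketch of what such a script might look like, assuming the 1.x client used elsewhere in this post and the title, image, and response properties described for each entity: buildEntities, SAMPLE_FACTS, and main are hypothetical names, and the fact text is a placeholder.

```javascript
'use strict';

// Kind name taken from the post; the property values below are placeholders.
const KIND = 'AzureFact';

// Placeholder data; the real script defines one entry per fact.
const SAMPLE_FACTS = [
  {name: 'functions', title: 'Azure Functions', image: 'image-14.png', response: '<fact text>'},
];

// Builds the {key, data} objects that the client's upsert() method accepts.
// keyFactory turns a path array into a Datastore key,
// for example: path => datastore.key(path).
function buildEntities(facts, keyFactory) {
  return facts.map(fact => ({
    key: keyFactory([KIND, fact.name]),
    data: {
      title: fact.title,
      image: fact.image,
      response: fact.response,
    },
  }));
}

// Creates or updates all entities with a single upsert() call.
async function main() {
  // Required lazily so buildEntities can be exercised without the client installed.
  const Datastore = require('@google-cloud/datastore'); // 1.x default export
  const datastore = new Datastore({});
  await datastore.upsert(buildEntities(SAMPLE_FACTS, path => datastore.key(path)));
  console.log(`Upserted ${SAMPLE_FACTS.length} entities of kind ${KIND}`);
}

// Uncomment to run against your own project:
// main().catch(console.error);
```

Using the fact’s name as the key name is what later allows the Cloud Function to look up an entity directly by the value Dialogflow extracts from the user’s request.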

Once the upsert command completes successfully, you should observe a collection of Datastore Entities of Kind ‘AzureFact’ in the Datastore console on GCP.

Below, we see the structure of a single Datastore Entity, the ‘certifications’ Entity, containing the fact response, title, and name of the image, which is stored in our Google Storage bucket.

New Actions on Google Project

The Directory Information tab is where we define all the metadata about the Action. This information determines how the Action will look in the Actions directory and is required to publish your Action. The directory is where users discover published Actions on the web and mobile devices.

Actions and Intents

The first thing you will notice when switching to Dialogflow is what was referred to as Actions in the Actions on Google console now appears to be referred to as Intents in Dialogflow. According to Google, an Action is ‘an interaction you build for the Assistant that supports a specific intent and has a corresponding fulfillment that processes the intent.’ There is a direct relationship between an Action and an Intent. The word Intent, used by Dialogflow, is standard terminology across other voice-assistant platforms, such as Alexa and LUIS. All we need to know is that we are building Intents to support Actions — the Azure Facts Intent, Welcome Intent, and the Fallback Intent.

Below, we see the Azure Facts Intent. The Azure Facts Intent is the main Intent, responsible for handling our user’s requests for facts about Azure. The Intent includes a fair number, but certainly not an exhaustive list, of training phrases. These represent some of the many ways a user might express intent when invoking the Action. According to Google, the greater the number of natural language examples in the Training Phrases section of Intents, the better the classification accuracy.

Intent Entities

Synonyms

Although our Azure Facts Action is a simple example, typical Actions might contain hundreds of entities or more, each with several synonyms. Dialogflow provides the option of copying and pasting bulk entities, in either JSON or CSV format. The project’s source code includes both JSON and CSV formats, which may be input in this manner.
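To give a feel for the two formats, here is a small sketch that prints the same entries as JSON and as CSV. It assumes the commonly used layout of a reference value followed by its synonyms; the entries and synonyms are made-up examples rather than the project’s actual list, and csvField and toCsv are hypothetical helpers.

```javascript
'use strict';

// Hypothetical entity entries: a reference value plus its synonyms.
const ENTRIES = [
  {value: 'certifications', synonyms: ['certifications', 'certification', 'certs']},
  {value: 'released', synonyms: ['released', 'release date', 'launched']},
];

// Quotes a CSV field, doubling any embedded double quotes.
function csvField(text) {
  return `"${String(text).replace(/"/g, '""')}"`;
}

// One line per entry: the reference value first, followed by its synonyms.
function toCsv(entries) {
  return entries
    .map(entry => [entry.value, ...entry.synonyms].map(csvField).join(','))
    .join('\n');
}

// JSON form of the entries.
console.log(JSON.stringify(ENTRIES, null, 2));

// CSV form of the same entries.
console.log(toCsv(ENTRIES));
```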

Automated Expansion

In order to allow this, you must enable Allow Automated Expansion. According to Google, this option allows an Agent to recognize values that have not been explicitly listed in the entity. Google describes Agents as NLU (Natural Language Understanding) modules.

Actions on Google Integration

The Dialogflow Actions on Google integration configuration is simple: choose the Azure Facts Intent as our Action’s Implicit Invocation intent, in addition to the default Welcome Intent, which is our Action’s Explicit Invocation intent. According to Google, integration allows our Action to reach users on every device where the Google Assistant is available.

Action Fulfillment

Google Cloud Functions

Our function, index.js, is divided into four sections: constants, intent handlers, helper functions, and the function’s entry point. The Cloud Function attempts to follow many of the coding practices from Google’s code examples on GitHub.

Constants

// author: Gary A. Stafford
// site: https://programmaticponderings.com
// license: MIT License

'use strict';

/* CONSTANTS */

const {
  dialogflow,
  Suggestions,
  BasicCard,
  SimpleResponse,
  Image,
} = require('actions-on-google');

const functions = require('firebase-functions');
const Datastore = require('@google-cloud/datastore');
const datastore = new Datastore({});

const app = dialogflow({debug: true});

app.middleware(conv => {
  conv.hasScreen =
    conv.surface.capabilities.has('actions.capability.SCREEN_OUTPUT');
  conv.hasAudioPlayback =
    conv.surface.capabilities.has('actions.capability.AUDIO_OUTPUT');
});

const IMAGE_BUCKET = process.env.IMAGE_BUCKET;

const SUGGESTION_1 = 'tell me a random fact';
const SUGGESTION_2 = 'help';
const SUGGESTION_3 = 'cancel';

The npm package dependencies declared in the constants section are defined in the dependencies section of the package.json file. Function dependencies include Actions on Google, Firebase Functions, and Cloud Datastore (gist).

"dependencies": {
  "@google-cloud/datastore": "^1.4.1",
  "actions-on-google": "^2.2.0",
  "dialogflow": "^0.6.0",
  "dialogflow-fulfillment": "^0.5.0",
  "firebase-admin": "^6.0.0",
  "firebase-functions": "^2.0.2"
}

Intent Handlers

/* INTENT HANDLERS */

app.intent('Welcome Intent', conv => {
  const WELCOME_TEXT_SHORT = 'What would you like to know about Microsoft Azure?';
  const WELCOME_TEXT_LONG = `What would you like to know about Microsoft Azure? ` +
    `You can say things like: \n` +
    ` _'tell me about Azure certifications'_ \n` +
    ` _'when was Azure released'_ \n` +
    ` _'give me a random fact'_`;
  const WELCOME_IMAGE = 'image-16.png';

  conv.ask(new SimpleResponse({
    speech: WELCOME_TEXT_SHORT,
    text: WELCOME_TEXT_SHORT,
  }));

  if (conv.hasScreen) {
    conv.ask(new BasicCard({
      text: WELCOME_TEXT_LONG,
      title: 'Azure Tech Facts',
      image: new Image({
        url: `${IMAGE_BUCKET}/${WELCOME_IMAGE}`,
        alt: 'Azure Tech Facts',
      }),
      display: 'WHITE',
    }));

    conv.ask(new Suggestions([SUGGESTION_1, SUGGESTION_2, SUGGESTION_3]));
  }
});

app.intent('Fallback Intent', conv => {
  const FACTS_LIST = "Certifications, Cognitive Services, Competition, Compliance, First Offering, Functions, " +
    "Geographies, Global Infrastructure, Platforms, Categories, Products, Regions, and Release Date";
  const WELCOME_TEXT_SHORT = 'Need a little help?';
  const WELCOME_TEXT_LONG = `Current facts include: ${FACTS_LIST}.`;
  const WELCOME_IMAGE = 'image-15.png';

  conv.ask(new SimpleResponse({
    speech: WELCOME_TEXT_LONG,
    text: WELCOME_TEXT_SHORT,
  }));

  if (conv.hasScreen) {
    conv.ask(new BasicCard({
      text: WELCOME_TEXT_LONG,
      title: 'Azure Tech Facts Help',
      image: new Image({
        url: `${IMAGE_BUCKET}/${WELCOME_IMAGE}`,
        alt: 'Azure Tech Facts',
      }),
      display: 'WHITE',
    }));

    conv.ask(new Suggestions([SUGGESTION_1, SUGGESTION_2, SUGGESTION_3]));
  }
});

app.intent('Azure Facts Intent', async (conv, {facts}) => {
  let factToQuery = facts.toString();
  let fact = await buildFactResponse(factToQuery);

  const AZURE_TEXT_SHORT = `Sure, here's a fact about ${fact.title}`;

  conv.ask(new SimpleResponse({
    speech: fact.response,
    text: AZURE_TEXT_SHORT,
  }));

  if (conv.hasScreen) {
    conv.ask(new BasicCard({
      text: fact.response,
      title: fact.title,
      image: new Image({
        url: `${IMAGE_BUCKET}/${fact.image}`,
        alt: fact.title,
      }),
      display: 'WHITE',
    }));

    conv.ask(new Suggestions([SUGGESTION_1, SUGGESTION_2, SUGGESTION_3]));
  }
});

The Welcome Intent handler handles explicit invocations of our Action. The Fallback Intent handler handles both help requests and cases in which Dialogflow cannot match any of the user’s input. Lastly, the Azure Facts Intent handler handles implicit invocations of our Action, returning a fact to the user from Cloud Datastore, based on the user’s requested fact.

Helper Functions

/* HELPER FUNCTIONS */

function selectRandomFact() {
  const FACTS_ARRAY = ['description', 'released', 'global', 'regions',
    'geographies', 'platforms', 'categories', 'products', 'cognitive',
    'compliance', 'first', 'certifications', 'competition', 'functions'];

  return FACTS_ARRAY[Math.floor(Math.random() * FACTS_ARRAY.length)];
}

function buildFactResponse(factToQuery) {
  return new Promise((resolve, reject) => {
    if (factToQuery.toString().trim() === 'random') {
      factToQuery = selectRandomFact();
    }

    const query = datastore
      .createQuery('AzureFact')
      .filter('__key__', '=', datastore.key(['AzureFact', factToQuery]));

    datastore
      .runQuery(query)
      .then(results => {
        resolve(results[0][0]);
      })
      .catch(err => {
        console.log(`Error: ${err}`);
        reject(`Sorry, I don't know the fact, ${factToQuery}.`);
      });
  });
}


/* ENTRY POINT */

exports.functionAzureFactsAction = functions.https.onRequest(app);

Async/Await, Promises, and Callbacks

The buildFactResponse function returns a Promise, as seen on line 28 of the gist below (the return new Promise statement). The Promise wraps the Datastore API’s runQuery function, which itself returns a Promise when no callback is supplied. When the query succeeds, its results are passed to resolve, as seen on line 40, and the intent handler receives them through await (gist).

app.intent('Azure Facts Intent', async (conv, {facts}) => {
  let factToQuery = facts.toString();
  let fact = await buildFactResponse(factToQuery);

  const AZURE_TEXT_SHORT = `Sure, here's a fact about ${fact.title}`;

  conv.ask(new SimpleResponse({
    speech: fact.response,
    text: AZURE_TEXT_SHORT,
  }));

  if (conv.hasScreen) {
    conv.ask(new BasicCard({
      text: fact.response,
      title: fact.title,
      image: new Image({
        url: `${IMAGE_BUCKET}/${fact.image}`,
        alt: fact.title,
      }),
      display: 'WHITE',
    }));

    conv.ask(new Suggestions([SUGGESTION_1, SUGGESTION_2, SUGGESTION_3]));
  }
});

function buildFactResponse(factToQuery) {
  return new Promise((resolve, reject) => {
    if (factToQuery.toString().trim() === 'random') {
      factToQuery = selectRandomFact();
    }

    const query = datastore
      .createQuery('AzureFact')
      .filter('__key__', '=', datastore.key(['AzureFact', factToQuery]));

    datastore
      .runQuery(query)
      .then(results => {
        resolve(results[0][0]);
      })
      .catch(err => {
        console.log(`Error: ${err}`);
        reject(`Sorry, I don't know the fact, ${factToQuery}.`);
      });
  });
}
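One thing the code above does not cover is a key that matches no entity: the query then succeeds with an empty result, so the handler receives undefined. The following is a sketch of one possible alternative, not the post’s implementation. It assumes the client’s get method, which resolves to an array holding the entity (or undefined when nothing matches); lookupFact and answerWithFact are hypothetical names, and the datastore client is passed in as a parameter so the logic can be exercised with a stub.

```javascript
'use strict';

// Looks up a single fact entity by its key name.
// `datastore` is any object exposing key() and get() in the style of the
// Cloud Datastore Node.js client; injecting it keeps the function testable.
async function lookupFact(datastore, factName) {
  const key = datastore.key(['AzureFact', factName]);
  const [entity] = await datastore.get(key);
  if (!entity) {
    // No entity with this key: fail explicitly instead of returning undefined.
    throw new Error(`Sorry, I don't know the fact, ${factName}.`);
  }
  return entity;
}

// Hypothetical handler body: replies with the fact, or with the error text
// when the lookup fails, so the conversation continues either way.
async function answerWithFact(conv, datastore, factName) {
  try {
    const fact = await lookupFact(datastore, factName);
    conv.ask(fact.response);
  } catch (err) {
    conv.ask(err.message);
  }
}
```

Because an async function already returns a Promise, there is no need to wrap the client call in new Promise, and a failed lookup can be handled with an ordinary try/catch in the handler.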

The payload returned by Google Datastore, through the resolved Promise to the intent handler, will resemble the example response, shown below. Note the image, response, and title key/value pairs in the textPayload section of the response payload. These are what are used to format the SimpleResponse and BasicCard responses (gist).

{
  title: 'Azure Functions',
  image: 'image-14.png',
  response: 'According to Microsoft, Azure Functions is a serverless compute service that enables you to run code on-demand without having to explicitly provision or manage infrastructure.',
  [Symbol(KEY)]: Key {
    namespace: undefined,
    name: 'functions',
    kind: 'AzureFact',
    path: [Getter]
  }
}

Cloud Function Deployment

#!/usr/bin/env sh

# author: Gary A. Stafford
# site: https://programmaticponderings.com
# license: MIT License

set -ex

# Set constants
REGION="<your_region>"
FUNCTION_NAME="<your_function_name>"

# Deploy the Google Cloud Function
gcloud beta functions deploy ${FUNCTION_NAME} \
  --runtime nodejs8 \
  --region ${REGION} \
  --trigger-http \
  --memory 256MB \
  --env-vars-file .env.yaml

The creation or update of the Cloud Function can take up to two minutes. Note the .gcloudignore file referenced in the verbose output below. This file is created the first time you deploy a new function. Using the .gcloudignore file, you can limit the deployed files to just the function (index.js) and the package.json file. There is no need to deploy any other files to GCP.

If you recall, the URL endpoint of the Cloud Function is required in the Dialogflow Fulfillment tab. The URL can be retrieved from the deployment output (shown above), or from the Cloud Functions Console on GCP (shown below). The Cloud Function is now deployed and will be called by the Action when a user invokes the Action.
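HTTP-triggered functions deployed this way are typically reachable at a URL composed of the region, project ID, and function name. The helper below only illustrates that shape; functionUrl is a hypothetical name, and the URL printed in the deployment output remains the authoritative value to paste into Dialogflow.

```javascript
'use strict';

// Composes the expected HTTPS endpoint of an HTTP-triggered Cloud Function.
// All three parameters are placeholders for your own values.
function functionUrl(region, projectId, functionName) {
  return `https://${region}-${projectId}.cloudfunctions.net/${functionName}`;
}

console.log(functionUrl('<your_region>', '<your_project_id>', '<your_function_name>'));
```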

Simulation Testing and Debugging

Below, in the Action Simulation console, we see the successful display of the initial Azure Tech Facts containing the expected Simple Response, Basic Card, and Suggestions, triggered by a user’s explicit invocation of the Action.

The simulated response indicates that the Google Cloud Function was called, and it responded successfully. It also indicates that the Google Cloud Function was able to successfully retrieve the correct image from Google Cloud Storage.

Below, we see the successful response to the user’s implicit invocation of the Action, in which they are seeking a fact about Azure’s Cognitive Services. The simulated response indicates that the Google Cloud Function was called, and it responded successfully. It also indicates that the Google Cloud Function was able to successfully retrieve the correct Entity from Google Cloud Datastore, as well as the correct image from Google Cloud Storage.

If we had issues with the testing, the Action Simulation console also contains tabs containing the request and response objects sent to and from the Cloud Function, the audio response, a debug console, and any errors.

Logging and Analytics

We also have the ability to view basic Analytics about our Action from within the Dialogflow Analytics console. Analytics displays metrics, such as the number of sessions, the number of queries, the number of times each Intent was triggered, how often users exited the Action from an intent, and Session flows, shown below.

In a simple Action such as this one, the Session flow is not very beneficial. However, in more complex Actions, with multiple Intents and a variety of potential user interactions, being able to visualize Session flows becomes essential to understanding the user’s conversational path through the Action.

Conclusion

We have seen how Google is quickly maturing its serverless functions to compete with AWS and Azure, with the recently announced support for Python and for LTS version 8 of Node.js, and how those functions can be used to create an Action for Google Assistant.

Impact of Serverless

¹Azure is a trademark of Microsoft

All opinions expressed in this post are my own and not necessarily the views of my current or past employers, their clients, or Google and Microsoft.

Originally published at programmaticponderings.com on August 11, 2018.

AWS Senior Solutions Architect | AWS Certified Pro | Polyglot Developer | Data Analytics | DataOps | DevOps