Implementing chatbots in your Unity project with Dialogflow V2
12/01/2022 UPDATE: this little project seems to have gained more attention than I expected, and I am very happy about that. Unfortunately, Dialogflow has changed a lot over the years, and I no longer have the time to keep the project up to date or help with troubleshooting. I am not maintaining this project anymore (as you have probably already noticed), but if you manage to implement a missing feature or fix an error, feel free to make a pull request on GitHub. Usually, a missing feature (e.g., voice selection) just needs the corresponding C# classes implemented to match the data structures sent by Google in JSON format.
A few months ago I started working with the Google Dialogflow service to implement a conversational agent in a Unity project. Dialogflow is a powerful tool for easily designing and deploying chatbots, and I think it could be used to enhance the user experience in some games. The problem is that, in October 2019, Dialogflow will officially switch to version 2 (V2) of its API, making all the SDKs built on the previous API version obsolete. There is already a DFV2 client library project for Unity, but it lacks many of the features I needed for my work, such as the ability to specify additional input contexts. Also, the project looks abandoned. For these reasons, I decided to implement my own Unity DFV2 client library. I also take the opportunity to talk about the Google JWT authentication policy, in case you need to use other Google Cloud services that, like Dialogflow, rely on it.
If you are just interested in the plugin I developed for Unity, you can download it from my GitHub. If you also want to know about JWT authentication and chatbot design with Dialogflow, keep reading.
Dialogflow V2 client: the easy way
There actually is a way to implement DFV2 client functionality that is a lot faster than the work I did with my plugin: you simply download the DFV2 SDK via the NuGet console in Visual Studio and copy the DLLs into your Unity project's Plugins folder. This process would have caused issues with older versions of Unity, which did not support libraries built for a .NET version greater than 3.5, but Unity now supports most of the C# libraries available from the NuGet package manager, so you can do this without problems. So, why did I decide to implement my own library?
- I wanted my DFV2 client implementation to be lightweight and to have as few dependencies as possible. The Google.Cloud.Dialogflow.V2 DLL has a long chain of dependencies, because it makes use of Google protocol buffers. At the end of the day, we would end up filling the Plugins folder with more than 7 MB of binaries (a lot of DLLs) and we would use only a small part of them. It is not a big problem (actually, for non-mobile projects it is not a problem at all), but it still feels… dirty and, most of all, I don’t like having so many dependencies around. It is possible to query Dialogflow chatbots through simple HTTP requests and eliminate the dependencies with minimal additional coding effort, so why shouldn’t we do this?
- The official SDK is general-purpose, not designed for games. Even if I had decided to use it, I would have had to add gamedev-friendly features on top of it. Moreover, I was only interested in a subset of the DLL code: the set of methods that allow querying a bot and receiving an answer. All the other features (creating intents and entities, deleting them, etc.) should not be accessible to the players, in my opinion. Finally, taking complete control of the DFV2 API invocation lets me focus my code on efficiency.
- The official SDK requires setting an environment variable with the path of a JSON file that exposes some authentication data, which consequently has to be shipped with the game. That JSON file holds sensitive authentication keys for accessing the bot: yes, of course you would create a bot client account with limited privileges and ship its key instead of the one associated with your admin account, but you would still need to expose private keys in a plain text file. Again, it is not a big problem, but… urgh, exposing information that the user does not need to know does not feel right. Why don’t we try to hide that file by placing it in the Resources folder of our Unity project?
Creating a smart RPG shop
Let’s create a chatbot for an RPG shop, which should allow players to buy and sell items through a chat. For example, we want the user to be able to have this kind of conversation with the bot:
User: "hi"
Bot: "Greetings, sir. How can I help?"
User: "Give me 3 potions, please"
Bot: "3 units of potion? That will be 300 coins."
I will not go into the details of creating the chatbot on Dialogflow, since there are already good tutorials for that. Note that I will use many typical Dialogflow terms in this tutorial: if you are not familiar with the terms “context”, “intent” and “entity”, I suggest you have a look at this tutorial before you go on with this post.
The Google JWT authentication procedure
The JSON Web Token (JWT) authentication procedure provides a secure signing mechanism for exchanging data between your client and a Google Cloud service like DF. This data exchange is based on a preliminary request for an access token, which can be requested from Google only if you have a service account with permission to use the GCP service. Let’s start by creating a new bot and a service account.
Reach the Dialogflow console at https://console.dialogflow.com/api-client/#/login and sign in with your Google account. After creating the bot and testing the conversation (I will skip this step, as mentioned above), let’s set up the authentication for our client. From the Dialogflow bot Settings page (accessible through the gear-shaped icon in the menu on the left side of the DF bot page), click on the service account name, right under the “Google Project” section.
In order to limit the client’s power over our chatbot, we will create a new service account with limited permissions. From the page you reach after clicking the service account name, select “Create Service Account”.
The Service Account creation procedure will start. In the first step, you will be required to choose a name (e.g., “rpgshopclient”).
In the second step you will need to select a role for the service account. If you want limited permissions, select “Dialogflow API Client”.
In the last step, select a service account for user permissions and one for admin permissions, if you want. After that, click “Create key” and select the .P12 format. Take note of the password (you can leave ’notasecret’) and save the .p12 file somewhere in your project Resources folder.
IMPORTANT: in order to make Unity correctly read this file, you have to change its extension from .p12 to .bytes, otherwise Unity will not be able to read it as a binary file.
This file contains the information required by our DF client to make a JWT request in order to obtain the token necessary to interact with the bot. What the DF client should do is:
- Make a JWT request by using the .p12 file;
- Receive the token;
- Include the token in all the subsequent HTTP requests to the chatbot, until the token expires.
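The first step in the list above can be sketched as follows. This is a minimal illustration of how a Google service-account JWT assertion is assembled and signed, not the code used in the plugin: the class and method names are hypothetical, the JSON is concatenated by hand only to keep the sketch dependency-free, and the RSA key is assumed to have already been extracted from the .p12 file.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class JwtAssertionSketch
{
    // JWTs use base64url: the URL-safe alphabet, without '=' padding.
    public static string Base64Url(byte[] data) =>
        Convert.ToBase64String(data).TrimEnd('=').Replace('+', '-').Replace('/', '_');

    // Builds the unsigned "header.claims" part of the assertion.
    // Values are assumed to need no JSON escaping (plain e-mail address and URL).
    public static string BuildUnsigned(string serviceAccountEmail, string scope, DateTimeOffset now)
    {
        long iat = now.ToUnixTimeSeconds();
        long exp = iat + 3600; // Google accepts a validity of at most one hour.
        string header = "{\"alg\":\"RS256\",\"typ\":\"JWT\"}";
        string claims =
            "{\"iss\":\"" + serviceAccountEmail + "\"," +
            "\"scope\":\"" + scope + "\"," +
            "\"aud\":\"https://oauth2.googleapis.com/token\"," +
            "\"iat\":" + iat + ",\"exp\":" + exp + "}";
        return Base64Url(Encoding.UTF8.GetBytes(header)) + "." +
               Base64Url(Encoding.UTF8.GetBytes(claims));
    }

    // Signs the unsigned part with RSA-SHA256 (RS256) and appends the signature.
    public static string Sign(string unsigned, RSA privateKey)
    {
        byte[] signature = privateKey.SignData(
            Encoding.ASCII.GetBytes(unsigned),
            HashAlgorithmName.SHA256,
            RSASignaturePadding.Pkcs1);
        return unsigned + "." + Base64Url(signature);
    }
}
```

The signed assertion is then sent in a form-encoded POST to https://oauth2.googleapis.com/token, with grant_type set to urn:ietf:params:oauth:grant-type:jwt-bearer and assertion set to the signed JWT; the JSON response carries the access token and its lifetime in seconds. For Dialogflow, the scope would be https://www.googleapis.com/auth/dialogflow or the broader https://www.googleapis.com/auth/cloud-platform.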
Leonardo Cavaletti wrote a very useful script for implementing JWT requests in Unity. I used it in my DF client library project, with some minor edits. Firstly, the scope address provided in the script will not work with Dialogflow, so I replaced it: this page lists all the available scopes for each Google Cloud service. Secondly, I based the requests on the UnityWebRequest class instead of the WWW class, which is deprecated. I also implemented a new class for providing more detailed output (e.g., the token expiration time). You can have a look at my version of the script in the GitHub project.
First HTTP requests to the chatbot
Once we have retrieved a valid access token, all we need to do is include it in all our HTTP requests. Let’s make an intent detection request with Postman, starting by setting the Content-Type header to “application/json”:
In the Authorization tab, set the Bearer Token to the access token received in response to the JWT request. Remember that each token has a default lifetime of one hour.
Finally, write the body. In this test, we are interested in testing the default welcome intent. The JSON format must follow the specifications provided in the DF2 documentation.
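For reference, a minimal detectIntent body for a plain text query can be produced as in the sketch below. The class and method names are hypothetical, and System.Text.Json is used only so that the example runs outside Unity; the library described in this post serializes its own C# classes with Json.NET instead.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, for illustration only.
public static class DetectIntentBodySketch
{
    // Builds the JSON body of a detectIntent request carrying a text query.
    // For ("hi", "en") this yields:
    // {"queryInput":{"text":{"text":"hi","languageCode":"en"}}}
    public static string BuildTextQuery(string text, string languageCode)
    {
        var body = new
        {
            queryInput = new
            {
                text = new { text = text, languageCode = languageCode }
            }
        };
        return JsonSerializer.Serialize(body);
    }
}
```

Sending “hi” in the agent’s language should normally be matched by the default welcome intent, assuming the agent still has its default training phrases.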
It works! All we need to do now is translate these Postman operations into a C# script. Let’s implement this as a Unity coroutine, so that we can deal with the HTTP request asynchronously. The coroutine first tries to retrieve an access token from the cache: if it does not find one, it asks for a new one, thus refreshing the cache. After that, the coroutine sets the proper HTTP parameters, makes the POST request and reads the response.
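using System.Collections;
using System.Text;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;
using UnityEngine;
using UnityEngine.Networking;

IEnumerator DetectIntent(DF2QueryInput queryInput, string session)
{
    // Gets the JWT access token from the cache, requesting a new one if needed.
    // Note: this loops until a token is available, so a persistent
    // authentication failure would keep the coroutine retrying.
    string accessToken = string.Empty;
    while (!JwtCache.TryGetToken(accessSettings.ServiceAccount, out accessToken))
        yield return JwtCache.GetToken(
            accessSettings.CredentialsFileName,
            accessSettings.ServiceAccount);

    // Serializes the request body: null fields are skipped and property
    // names are converted to camelCase, as expected by the DFV2 REST API.
    var settings = new JsonSerializerSettings();
    settings.NullValueHandling = NullValueHandling.Ignore;
    settings.ContractResolver = new CamelCasePropertyNamesContractResolver();
    DF2Request request = new DF2Request(session, queryInput);
    string jsonInput = JsonConvert.SerializeObject(request, settings);
    byte[] body = Encoding.UTF8.GetBytes(jsonInput);

    // Prepares the HTTP request.
    string url = string.Format(
        "https://dialogflow.googleapis.com/v2/projects/{0}/agent/sessions/{1}:detectIntent",
        accessSettings.ProjectId, session);
    using (UnityWebRequest df2Request = new UnityWebRequest(url, "POST"))
    {
        df2Request.SetRequestHeader("Authorization", "Bearer " + accessToken);
        df2Request.SetRequestHeader("Content-Type", "application/json");
        df2Request.uploadHandler = new UploadHandlerRaw(body);
        df2Request.downloadHandler = new DownloadHandlerBuffer();
        yield return df2Request.SendWebRequest();

        // Processes the response. Newer Unity versions deprecate these two
        // flags in favor of df2Request.result.
        if (df2Request.isNetworkError || df2Request.isHttpError)
        {
            // On a network error the body may be empty, so log the transport error too.
            Debug.LogError(df2Request.error);
            if (!string.IsNullOrEmpty(df2Request.downloadHandler.text))
                Debug.LogError(JsonConvert.DeserializeObject<DF2ErrorResponse>(
                    df2Request.downloadHandler.text));
        }
        else
        {
            string response = Encoding.UTF8.GetString(df2Request.downloadHandler.data);
            Debug.Log(response);
        }
    }
}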
The JwtCache static class stores the access token retrieved for each service account and is used to prevent the client from requesting a new token for each HTTP request to the bot.
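To illustrate the caching idea, here is a minimal sketch of a per-service-account token cache with expiration. It is not the actual JwtCache implementation: the class name, the explicit clock parameter and the 60-second safety margin are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache, for illustration only.
public static class TokenCacheSketch
{
    private struct Entry
    {
        public string Token;
        public DateTime ExpiresAtUtc;
    }

    private static readonly Dictionary<string, Entry> entries =
        new Dictionary<string, Entry>();

    // Assumed safety margin, so that a token is not used right before it expires.
    private static readonly TimeSpan Margin = TimeSpan.FromSeconds(60);

    // Stores a token together with the moment it stops being valid.
    public static void Store(string serviceAccount, string token,
                             int expiresInSeconds, DateTime nowUtc)
    {
        entries[serviceAccount] = new Entry
        {
            Token = token,
            ExpiresAtUtc = nowUtc.AddSeconds(expiresInSeconds)
        };
    }

    // Returns true only if a token exists for the account and is still fresh.
    public static bool TryGetToken(string serviceAccount, DateTime nowUtc, out string token)
    {
        if (entries.TryGetValue(serviceAccount, out Entry entry) &&
            nowUtc + Margin < entry.ExpiresAtUtc)
        {
            token = entry.Token;
            return true;
        }
        token = null;
        return false;
    }
}
```

Passing the current time as a parameter keeps the sketch easy to test; a real client would more likely read DateTime.UtcNow internally.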
The library
I designed the library not simply to re-map all the DF2 functionality to a Unity package, but to provide an easy-to-use tool designed for integration in video games. It is possible to provide input to the chatbot by sending an explicit dialog string or an event ID, as well as additional input contexts and entities. Moreover, it is possible to set callbacks that make an agent react to a specific output context.
The data model has been reproduced through a set of C# classes that avoid the use of Google protocol buffers: they are a great tool, but we do not need their complexity for simple intent detection.
The library is available on my GitHub. As I mentioned above, it does not implement all the typical features of Dialogflow, but it provides a set of essential methods for implementing a simple DFV2 client. Feel free to make a pull request if you wish to implement additional functionality.
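The idea of reacting to output contexts can be sketched as a small dispatcher. This is a hypothetical illustration, not the API exposed by the library: it assumes that the output contexts of a response arrive as full resource names of the form projects/<project>/agent/sessions/<session>/contexts/<context id>, and it matches handlers on the last path segment, ignoring case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dispatcher, for illustration only.
public class ContextCallbackSketch
{
    private readonly Dictionary<string, Action<string>> handlers =
        new Dictionary<string, Action<string>>(StringComparer.OrdinalIgnoreCase);

    // Registers a handler for a context ID (the last segment of the full name).
    public void On(string contextId, Action<string> handler)
    {
        handlers[contextId] = handler;
    }

    // Invokes the handler registered for each output context, if any,
    // and returns how many handlers were called.
    public int Dispatch(IEnumerable<string> outputContextNames)
    {
        int invoked = 0;
        foreach (string fullName in outputContextNames)
        {
            // LastIndexOf returns -1 when there is no '/', so the whole name is used.
            string contextId = fullName.Substring(fullName.LastIndexOf('/') + 1);
            if (handlers.TryGetValue(contextId, out Action<string> handler))
            {
                handler(fullName);
                invoked++;
            }
        }
        return invoked;
    }
}
```

In the RPG shop example, a handler registered for a hypothetical “purchase-confirmed” context could subtract coins from the player and add the items to the inventory.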
2019-04-20 11:00
Did you like this post? Did you hate it? Wanna talk about it? Contact me at alessandro.tironi7 at gmail dot com.