I love when I work on one demo, hit an issue, discover something else, and get joyfully distracted into learning something completely different. In this case, it was a suggestion to help with an issue I was having with output from a prompt, and while it wasn't a good solution for what I was doing, it was an eye-opening look at a really cool feature of Generative AI - Function Calling.
Function calling is a bit different. Instead of returning text, the idea is to return the intent of your prompt mapped to a particular function.
That probably doesn't make sense, but I think a good way of thinking about it is how Alexa works. Now, I haven't done any Alexa development in a couple of years, but it was incredibly cool. When building an Alexa skill, you define the various "intents" it should support, along with sample utterances for each - things like "order a coffee" or "buy a coffee" for an ordering intent.
Alexa would map all of those utterances to one intent - making an order. Even better, it would then determine the product:
{
  "intent": "order",
  "product": "coffee"
}
This worked really well (at least when I last used it) and kinda maps to what function calling does.
The docs for the Gemini API demonstrate this with a set of sample movie-related functions. Each of these has arguments: get_showtimes, for example, has arguments for the location, movie, theater, and date, while find_theaters is simpler and just requires a location and movie.
If I use a prompt like Which theaters in Mountain View show Barbie movie?
then the API attempts to map that to a function and figure out the arguments. Here is a portion of the result for that call:
"content": {
  "parts": [
    {
      "functionCall": {
        "name": "find_theaters",
        "args": {
          "movie": "Barbie",
          "location": "Mountain View, CA"
        }
      }
    }
  ]
},
The important thing to note here is that the result is not the end! Rather, you are expected, much like in Alexa, to take this and implement that logic yourself. The API has handled the parsing and figured out the intent from the prompt, so the hard parts are done. Now it's up to you to actually do the boring logic bit.
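To make that concrete, here's a minimal sketch of handling the find_theaters result above. The findTheaters implementation is hypothetical - the API only hands you the intent and arguments, and the actual lookup is yours to write:

```javascript
// Hypothetical implementation of find_theaters - the API never runs this
// for you; it only tells you this is the function the user wanted.
function findTheaters({ movie, location }) {
  // A real app would query a showtimes service or database here.
  return [`${movie} is playing at the Example Cinema in ${location}`];
}

// Pull the functionCall out of a candidate's content and run it.
function handleFindTheaters(content) {
  const { args } = content.parts[0].functionCall;
  return findTheaters(args);
}
```

Given the response above, handleFindTheaters would pass { movie: "Barbie", location: "Mountain View, CA" } to your own lookup logic.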
So I played with this a bit, of course, and built a simple demo. First off, note that the SDK does not support this feature yet. However, the REST API was so trivial I'm almost tempted to never return to the SDK.
const API_KEY = process.env.GOOGLE_AI_KEY;

async function runGenerate(prompt) {
  let data = {
    contents: {
      role: "user",
      parts: {
        "text": prompt
      }
    },
    tools: {
      function_declarations: [
        {
          "name": "order_product",
          "description": "order or buy a product",
          "parameters": {
            "type": "object",
            "properties": {
              "product": {
                "type": "string",
                "description": "The product to be ordered."
              },
              "quantity": {
                "type": "number",
                "description": "The amount of the product to be ordered."
              }
            },
            "required": ["product"]
          }
        }
      ]
    }
  };

  let resp = await fetch(`https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=${API_KEY}`, {
    method: 'post',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(data)
  });
  return await resp.json();
}
In the tools section, you can see one function, order_product, with a description and two parameters - product and quantity. I've also specified that only the product is required.
const prompts = [
  'I want to order a coffee.',
  'I want to buy an espresso.',
  'I want to get two espressos',
  'Whats on the menu?',
  'What time is love?',
  'May I buy ten cats?',
  'I want to order ten cats'
];

for (let p of prompts) {
  let result = await runGenerate(p);
  //console.log(JSON.stringify(result, null, '\t'));
  console.log(`For prompt: ${p}\nResponse: ${JSON.stringify(result.candidates[0].content, null, '\t')}\n`);
  console.log('------------------------------------------');
}
For prompt: I want to order a coffee.
Response: {
  "parts": [
    {
      "functionCall": {
        "name": "order_product",
        "args": {
          "product": "coffee"
        }
      }
    }
  ],
  "role": "model"
}
Note that quantity is not present. It's optional so that's ok, and I'd expect you would just default to 1. I tried to set a default in my function declaration, but it wasn't supported (or, more likely, I did it wrong).
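Since I couldn't get a default into the declaration itself, handling it in code is easy enough. A small sketch of what I mean by defaulting to 1:

```javascript
// The model may omit optional args like quantity - default it to 1
// in our own code rather than in the function declaration.
function normalizeOrder(args) {
  return { product: args.product, quantity: args.quantity ?? 1 };
}
```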
For prompt: I want to buy an espresso.
Response: {
  "parts": [
    {
      "functionCall": {
        "name": "order_product",
        "args": {
          "quantity": 1,
          "product": "espresso"
        }
      }
    }
  ],
  "role": "model"
}
Notice this time it did supply a quantity.
For prompt: I want to get two espressos
Response: {
  "parts": [
    {
      "functionCall": {
        "name": "order_product",
        "args": {
          "quantity": 2,
          "product": "espresso"
        }
      }
    }
  ],
  "role": "model"
}
For prompt: Whats on the menu?
Response: {
  "parts": [
    {
      "text": "Sorry, I do not have access to that information."
    }
  ],
  "role": "model"
}
So this is a good example of a failure. It did return a human-like textual response. I suppose how you handle this depends on the application. A simple thing to do would be to treat any response with a text part (instead of a functionCall) as an error in general, and perhaps ask for a new prompt.
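One way to sketch that check - anything without a functionCall is treated as a miss (the fallback message wording here is mine, not the API's):

```javascript
// Check the first part of a response: a functionCall part means the prompt
// was mapped to a function; a plain text part is treated as a miss.
function classifyResponse(content) {
  const part = content.parts[0];
  if (part.functionCall) {
    return { ok: true, call: part.functionCall };
  }
  // Fallback wording is my own, not something the API returns.
  return { ok: false, message: 'Sorry, I didn\'t catch that. Could you rephrase?' };
}
```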
For prompt: What time is love?
Response: {
  "parts": [
    {
      "text": "I cannot fulfill this request because I lack the corresponding tools."
    }
  ],
  "role": "model"
}
For prompt: May I buy ten cats?
Response: {
  "parts": [
    {
      "text": "I am sorry but that is beyond my capabilities."
    }
  ],
  "role": "model"
}
For prompt: I want to order ten cats
Response: {
  "parts": [
    {
      "text": "I am sorry, but I cannot process your request. I am not able to order or sell pets such as cats."
    }
  ],
  "role": "model"
}
Out of curiosity, I then added a second function declaration, product_price, to see how the API would route between two functions:

{
  "name": "product_price",
  "description": "return the price of a product",
  "parameters": {
    "type": "object",
    "properties": {
      "product": {
        "type": "string",
        "description": "The product to be queried."
      }
    },
    "required": ["product"]
  }
}
Unfortunately, adding this made every previously good response switch to the new function, even before I added prompts asking for a price instead of an order. I think... maybe... I didn't differentiate the two functions enough for the API to properly route between them, but I'd love to hear from others what they think.
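Whatever the cause, once you have more than one declaration you need to route on the name that comes back. A minimal sketch of that routing, with placeholder handler bodies standing in for real logic:

```javascript
// Route a functionCall result to the matching local handler by name.
// Both handler bodies here are placeholders, not real implementations.
const handlers = {
  order_product: ({ product, quantity = 1 }) => `Ordered ${quantity} x ${product}`,
  product_price: ({ product }) => `Price lookup for ${product}`
};

function dispatch(functionCall) {
  const handler = handlers[functionCall.name];
  if (!handler) throw new Error(`No handler for ${functionCall.name}`);
  return handler(functionCall.args);
}
```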