OpenAI Function Calling, iOS Implementation

James Rochabrun
If let swift = Programming!
12 min read · Nov 15, 2023


Updates on Function Calling were a major highlight at OpenAI DevDay. The new models can now perform multiple function calls in a single response.

But what exactly is a function call in this context? Function calling is simply a way to connect large language models to external tools. If you are new to the concept, the OpenAI DevDay demo (starting at 33:32) will give you a better understanding of function calls.

How does this work? When making a call to the Chat Completions API, you have the ability to specify functions, allowing the AI model to intelligently determine the optimal way to execute them. Rather than directly calling the function or providing a text response, the model generates a JSON object. This JSON includes all necessary arguments for one or more function calls, containing structured data from your prompt. It’s important to remember that the Chat Completions API doesn’t execute the function itself; instead, it provides the JSON output, which you can then utilize in your own code to make actual calls to external APIs.
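For example, when the model decides that a prompt matches one of your functions, the relevant part of the Chat Completions response looks roughly like this (abridged and illustrative; note that content is null, finish_reason is "tool_calls", and arguments is a JSON string, not a nested object):

{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "create_image",
              "arguments": "{\"prompt\": \"a cat wearing a hat\", \"count\": 1}"
            }
          }
        ]
      }
    }
  ]
}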

A simple example: in a previous version of the ChatGPT app, you had to use the dropdown to select the DALL·E model to generate images. If you asked the chat for an image, it would respond that it does not generate images. Now that has changed: the app can generate images directly from the chat, without selecting the DALL·E model from a dropdown. I doubt this means the new GPT model can generate images itself; it is more likely that the OpenAI team is relying on function calling for this update.

In this blog post, we will explore how to use function calling with the latest updates to the OpenAI chat completions API, which now uses the tools parameter instead of the deprecated functions parameter. We will build a chat app that mimics the current ChatGPT app experience when an image is requested and, in addition, shows a fun fact about the object in the image. The final result looks like this:

The basic sequence of steps for function calling is:

a - Define the functions for the tools parameter.

b - Call the model with the user prompt, passing the tools defined in step a.

c - If the model decides that the user prompt matches the description of one of your functions, the returned content won't be a regular chat message; instead, the model returns a stringified JSON object adhering to your custom schema (note: the model may hallucinate parameters).

d - Parse the string into JSON in your code and use the provided arguments, if any, as parameters for your external APIs.

e - Call the model again, using the response from your external API to create a ‘tool’ message. At this step, you MUST pass three kinds of messages to the model: the user message, the assistant message, and now the tool message.

It might seem a bit confusing, but we’ll go through it step by step. We’ll use this Python example from the OpenAI documentation as a reference for our implementation.

import openai
import json

# Example dummy function hard coded to return the same weather
# In production, this could be your backend API or an external API
def get_current_weather(location, unit="fahrenheit"):
    """Get the current weather in a given location"""
    if "tokyo" in location.lower():
        return json.dumps({"location": location, "temperature": "10", "unit": "celsius"})
    elif "san francisco" in location.lower():
        return json.dumps({"location": location, "temperature": "72", "unit": "fahrenheit"})
    else:
        return json.dumps({"location": location, "temperature": "22", "unit": "celsius"})

def run_conversation():
    # Step 1: send the conversation and available functions to the model
    messages = [{"role": "user", "content": "What's the weather like in San Francisco, Tokyo, and Paris?"}]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
            },
        }
    ]
    response = openai.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=messages,
        tools=tools,
        tool_choice="auto",  # auto is default, but we'll be explicit
    )
    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls
    # Step 2: check if the model wanted to call a function
    if tool_calls:
        # Step 3: call the function
        # Note: the JSON response may not always be valid; be sure to handle errors
        available_functions = {
            "get_current_weather": get_current_weather,
        }  # only one function in this example, but you can have multiple
        messages.append(response_message)  # extend conversation with assistant's reply
        # Step 4: send the info for each function call and function response to the model
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            function_response = function_to_call(
                location=function_args.get("location"),
                unit=function_args.get("unit"),
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )  # extend conversation with function response
        second_response = openai.chat.completions.create(
            model="gpt-3.5-turbo-1106",
            messages=messages,
        )  # get a new response from the model where it can see the function response
        return second_response

print(run_conversation())

As demonstrated in the example above, we need to adhere to the previously outlined steps to convert this into Swift. The complete iOS implementation for this post is available here. To execute the demo project, you must supply a valid OpenAI API key.

This demo uses `gpt-3.5-turbo-1106` and `gpt-4-1106-preview`, the latest models and the ones that also support parallel function calling. It also uses the SwiftOpenAI wrapper to communicate with the OpenAI endpoints. This wrapper provides models you can use to create parameters for your requests, e.g., ChatCompletionParameters.
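If you want to try the snippets below outside the demo project, you first need a service instance. A minimal setup sketch, assuming SwiftOpenAI's factory entry point at the time of writing (check the package README for the current API, and never hardcode keys in a shipping app):

import SwiftOpenAI

// The service used for every request in this post.
let service = OpenAIServiceFactory.service(apiKey: "YOUR_OPENAI_API_KEY")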

In this post, we’re not going to dive deep into the sample project. We’ll focus more on how the function call works. Ok, let's go step by step.

Step 1, we need to define the function tools and send them along with the user message in the request.

You can model your functions as an enum like this…

enum FunctionCallDefinition: String, CaseIterable {

  case createImage = "create_image"
  // Add more functions if needed, parallel function calling is supported.

  var functionTool: ChatCompletionParameters.Tool {
    switch self {
    case .createImage:
      return .init(function: .init(
        name: self.rawValue,
        description: "call this function if the request asks to generate an image",
        parameters: .init(
          type: .object,
          properties: [
            "prompt": .init(type: .string, description: "The exact prompt passed in."),
            "count": .init(type: .integer, description: "The number of images requested")
          ],
          required: ["prompt", "count"])))
    }
  }
}

This will allow you to extend this enum by adding more functions as needed. Let’s examine what is required for a function in detail:

  • Name: The name of the function that will be triggered if the user’s prompt matches the description.
  • Description: A short description for the model. In this example, we are telling the model to trigger this function whenever the user requests an image.
  • Parameters: The schema of the arguments you want the model to return.
  • Required: Defines which parameters must always be present in the response.

Now you can craft a user message and include it in the parameters using the tools we created earlier.


/// To be used on a new request when extending the conversation is needed.
private var chatMessageParameters: [ChatCompletionParameters.Message] = []
/// Stores the function calls that we want to execute.
private var availableFunctions: [FunctionCallDefinition: (@MainActor (String) async throws -> String)] = [:]

func startChat(
  prompt: String)
  async throws
{
  defer {
    chatMessageParameters = []
  }

  await startNewUserDisplayMessage(prompt) // UI update, not relevant for this tutorial.

  await startNewAssistantEmptyDisplayMessage() // UI update, not relevant for this tutorial.

  /// # Step 1: send the conversation and available functions to the model
  let userMessage = ChatCompletionParameters.Message(role: .user, content: .text(prompt))

  chatMessageParameters.append(userMessage)

  let tools = FunctionCallDefinition.allCases.map { $0.functionTool }

  let parameters = ChatCompletionParameters(
    messages: chatMessageParameters,
    model: .gpt41106Preview,
    toolChoice: ChatCompletionParameters.ToolChoice.auto,
    tools: tools)

  do {
    /// The SwiftOpenAI service request.
    let chat = try await service.startChat(parameters: parameters)
    /// More code to come.
  }

Here, you can see how to create a user message with the given prompt, set up the completion parameters, and include the function call tools. The startChat function is triggered when the user submits a prompt and concludes upon receiving a response from the assistant. Throughout this flow, we need to keep track of the messages; to do this, we append them to the chatMessageParameters array. This lets us use the array to provide context to the model in a new request if a function call is executed.

Step 2, check if the model intends to call a function. If the message returned by the model includes tool calls, the model has determined that the user’s prompt matches the description of one of our functions.

do {
  let chat = try await service.startChat(parameters: parameters)

  guard let assistantMessage = chat.choices.first?.message else { return }

  let content = assistantMessage.content ?? ""

  /// UI update not relevant for this tutorial.
  await updateLastAssistantMessage(.init(content: .content(.init(text: content)), origin: .received(.gpt)))

  /// # Step 2: check if the model wanted to call a function
  if let toolCalls = assistantMessage.toolCalls {
    /// Handle tool calls...
  }

Step 3, we have already defined a function call parameter for the request, but this alone is not sufficient. The model doesn’t trigger an action; it simply signals that the user’s prompt matches criteria we can use to initiate one. Now we need to define a function to execute in response to a tool call. This is how function calling operates: the model returns a signal with structured data matching the schema we defined in the parameters, and we can use that payload as input for an external API. This approach extends the model’s capabilities and improves the user experience. Here, we will use the payload to request an image from the DALL·E API.

First, we will define the DALL·E API request function:

@MainActor
func generateImage(arguments: String) async throws -> String {
  // The `arguments` string is the JSON payload returned by the assistant.
  guard
    let dictionary = arguments.toDictionary(),
    let prompt = dictionary["prompt"] as? String
  else { return "" }
  let count = (dictionary["count"] as? Int) ?? 1

  /// UI update, not relevant for this demo.
  let assistantMessage = ChatMessageDisplayModel(
    content: .content(.init(text: "Generating images...")),
    origin: .received(.gpt))
  updateLastAssistantMessage(assistantMessage)

  let urls = try await service.createImages(
    parameters: .init(prompt: prompt, numberOfImages: count)).data.compactMap(\.url)

  let dalleAssistantMessage = ChatMessageDisplayModel(
    content: .content(.init(text: nil, urls: urls)),
    origin: .received(.dalle))
  updateLastAssistantMessage(dalleAssistantMessage)

  return prompt // We return the prompt so we can use it later for a tool message.
}

This code makes a request to the DALL·E API, which is also supported by the SwiftOpenAI package. The arguments passed to this function are the arguments returned by the assistant: a JSON string that we need to convert into a dictionary to extract the values.
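The toDictionary() helper used above is not part of Foundation. A minimal sketch of what it could look like, assuming the arguments payload is UTF-8 encoded JSON:

import Foundation

extension String {
  /// Parses a JSON string (such as the `arguments` payload of a tool call)
  /// into a dictionary. Returns nil if the string is not valid JSON.
  func toDictionary() -> [String: Any]? {
    guard let data = data(using: .utf8) else { return nil }
    return (try? JSONSerialization.jsonObject(with: data)) as? [String: Any]
  }
}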

With this defined, we can now move on to Step 4. Here, we’ll create tool messages and append them to the chatMessageParameters array for our next request, which aims to obtain a new message from the assistant.

if let toolCalls = assistantMessage.toolCalls {

  /// # Step 3: Define the available functions
  availableFunctions = [.createImage: generateImage(arguments:)]

  // Append the assistant message to `chatMessageParameters` to extend the conversation.
  let parameterAssistantMessage = ChatCompletionParameters.Message(
    role: .assistant,
    content: .text(content),
    toolCalls: assistantMessage.toolCalls)

  chatMessageParameters.append(parameterAssistantMessage)

  /// # Step 4: send the info for each function call and function response to the model
  for toolCall in toolCalls {
    if
      let name = toolCall.function.name,
      let function = FunctionCallDefinition(rawValue: name),
      let functionToCall = availableFunctions[function]
    {
      let id = toolCall.id
      let arguments = toolCall.function.arguments
      let content = try await functionToCall(arguments)
      let toolMessage = ChatCompletionParameters.Message(
        role: .tool,
        content: .text(content),
        name: name,
        toolCallID: id)
      chatMessageParameters.append(toolMessage)
    }
  }
}

Lastly, we can use the chatMessageParameters array to initiate a new chat request. The final implementation of the startChat method looks like this…

func startChat(
  prompt: String)
  async throws
{
  defer {
    chatMessageParameters = []
  }

  await startNewUserDisplayMessage(prompt)

  await startNewAssistantEmptyDisplayMessage()

  /// # Step 1: send the conversation and available functions to the model
  let userMessage = ChatCompletionParameters.Message(role: .user, content: .text(prompt))
  chatMessageParameters.append(userMessage)

  let tools = FunctionCallDefinition.allCases.map { $0.functionTool }

  let parameters = ChatCompletionParameters(
    messages: chatMessageParameters,
    model: .gpt41106Preview,
    toolChoice: ChatCompletionParameters.ToolChoice.auto,
    tools: tools)

  do {
    let chat = try await service.startChat(parameters: parameters)

    guard let assistantMessage = chat.choices.first?.message else { return }

    let content = assistantMessage.content ?? ""

    await updateLastAssistantMessage(.init(content: .content(.init(text: content)), origin: .received(.gpt)))

    /// # Step 2: check if the model wanted to call a function
    if let toolCalls = assistantMessage.toolCalls {

      /// # Step 3: call the function
      availableFunctions = [.createImage: generateImage(arguments:)]

      // Append the assistant message to `chatMessageParameters` to extend the conversation.
      let parameterAssistantMessage = ChatCompletionParameters.Message(
        role: .assistant,
        content: .text(content),
        toolCalls: assistantMessage.toolCalls)

      chatMessageParameters.append(parameterAssistantMessage)

      /// # Step 4: send the info for each function call and function response to the model
      for toolCall in toolCalls {
        if
          let name = toolCall.function.name,
          let function = FunctionCallDefinition(rawValue: name),
          let functionToCall = availableFunctions[function]
        {
          let id = toolCall.id
          let arguments = toolCall.function.arguments
          let content = try await functionToCall(arguments)
          let toolMessage = ChatCompletionParameters.Message(
            role: .tool,
            content: .text(content),
            name: name,
            toolCallID: id)
          chatMessageParameters.append(toolMessage)
        }
      }

      /// # Get a new response from the model where it can see the function response
      await continueChat()
    }
  } catch let error as APIError {
    // If an error occurs, update the UI to display the error message.
    await updateLastAssistantMessage(.init(content: .error("\(error.displayDescription)"), origin: .received(.gpt)))
  }
}

The continueChat function looks like this…

func continueChat() async {
  /// Some prompt engineering so we can get a better answer.
  let systemMessage = ChatCompletionParameters.Message(
    role: .system,
    content: .text("You are an artist powered by AI, if the messages has a tool message you will weight that bigger in order to create a response, and you are providing me an image, you always respond in readable language and never providing URLs of images, most of the times you add an emoji on your responses if makes sense, do not describe the image. also always offer more help"))

  /// We insert it in `chatMessageParameters` before the already stored user, assistant, and tool messages.
  chatMessageParameters.insert(systemMessage, at: 0)

  /// Notice that this time we are not passing function call parameters, as we want
  /// a conversational response.
  let paramsForChat = ChatCompletionParameters(
    messages: chatMessageParameters,
    model: .gpt41106Preview)
  do {
    let chat = try await service.startChat(parameters: paramsForChat)
    guard let assistantMessage = chat.choices.first?.message else { return }
    await updateLastAssistantMessage(.init(content: .content(.init(text: assistantMessage.content)), origin: .received(.gpt)))
  } catch {
    // If an error occurs, update the UI to display the error message.
    await updateLastAssistantMessage(.init(content: .error("\(error)"), origin: .received(.gpt)))
  }
}

This is a significant amount of information, so let me summarize the key points:

  • In this context, function calls are operations we predefine and send as parameters with a request to an AI model. They direct the model on how to process user inputs and, based on the criteria in their definitions, enable it to extend its capabilities beyond basic responses by interacting with external APIs and systems.
  • If a function call definition matches the user’s prompt, the model does not execute any action. Instead, it returns structured data as defined in the function: no conversational response, only the data containing the arguments.
  • You must define an action in your code to handle the signal provided by the model. This action can trigger a request to an external API, using the parameters from the tool call.
  • If you wish to continue the conversation with the model after executing an action, such as calling an external API, you must store the messages from the user, the assistant, and the tools, and use them as parameters for a new chat request. Doing so gives the model context and enables it to respond appropriately. This is the sequence of operations for a function call chat ‘session’ (see the sketch after this list):
  1. The user creates a prompt, and a message with the role ‘user’ is created, which we store locally.
  2. The assistant responds with a tool call instead of content. We can use this tool call to trigger an external API, but there won’t be a message to show the user, since the content is empty. We also store this message to extend the conversation after the custom action is executed.
  3. We create messages with the role ‘tool’ that contain relevant information, such as the tool call ID, which the model will use for context later. These are also stored in the messages array.
  4. We initiate a new chat request using these messages as parameters. With this context, the model can now return an answer like, ‘Here is an image of a cat! Hope you like it…’
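Put together, the messages array for that final request looks roughly like this. This is a sketch with illustrative values (the prompt, tool content, and tool call ID are made up), using the same Message initializers shown throughout this post:

let messages: [ChatCompletionParameters.Message] = [
  // Inserted by `continueChat` for prompt engineering.
  .init(role: .system, content: .text("You are an artist powered by AI...")),
  // 1. The original user prompt.
  .init(role: .user, content: .text("Generate an image of a cat")),
  // 2. The assistant reply that carried the tool call; its content is empty.
  .init(role: .assistant, content: .text(""), toolCalls: assistantMessage.toolCalls),
  // 3. One tool message per executed call, linked back by the tool call ID.
  .init(role: .tool, content: .text("a cat"), name: "create_image", toolCallID: "call_abc123"),
]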

And that is how function calls work, and how you can use them in your iOS apps. A whole demo can be found here. You can also find the SwiftOpenAI wrapper that supports parallel function calls here.

As a bonus, I also added a demo to handle function calls in a streamed chat API request. The whole implementation can be found here and looks like this…

Chat completion stream API.
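The core difference in the streamed case is that a tool call arrives in fragments: the arguments string is delivered as partial JSON across many chunks and has to be accumulated before it can be parsed. A rough sketch of the idea, assuming SwiftOpenAI’s streamed chat API (method and property names may differ from the current package; the linked implementation is the source of truth):

var accumulatedToolCalls: [Int: (name: String, arguments: String)] = [:]

let stream = try await service.startStreamedChat(parameters: parameters)
for try await chunk in stream {
  guard let delta = chunk.choices.first?.delta else { continue }
  // Tool call fragments are keyed by index so parallel calls don't mix.
  for toolCall in delta.toolCalls ?? [] {
    let index = toolCall.index ?? 0
    var current = accumulatedToolCalls[index] ?? (name: "", arguments: "")
    if let name = toolCall.function.name { current.name = name }
    current.arguments += toolCall.function.arguments
    accumulatedToolCalls[index] = current
  }
}
// Once the stream finishes, each accumulated `arguments` string is complete
// JSON and can be parsed and dispatched exactly like in the non-streamed flow.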

