Anthropic iOS SDK
In this post, we are going to learn how to interact with Claude, Anthropic's AI model, and how we can use it within our iOS applications.
The end goal of this tutorial is not to learn how to build a UI chat interface; instead, we aim to familiarize ourselves with Anthropic’s APIs and how to use the SwiftAnthropic Swift Package. 🚀
Disclaimer: An API Key is required to access Claude, and currently, you must join a waitlist. Visit https://www.anthropic.com/earlyaccess 🥲
If you don’t have an API key yet, I still encourage you to keep reading this post, as access may become more widely available soon! 🤞
On their website, you can view the available clients; currently, they offer only Python and TypeScript libraries. That’s why I built a Swift version that iOS developers can easily integrate into their apps, eliminating the need to construct their own. (It is open-source, so contributions are welcome!) Let’s start by exploring the available APIs to access Claude.
Anthropic currently offers two APIs: Text Completions and Messages.
As you can see, both Text Completions and Messages offer options for streaming and non-streaming. Text Completions is a legacy API, and Anthropic recommends using the Messages API instead.
Therefore, we will focus on explaining how to use the Messages API.
To create a message, we send a POST request to the messages endpoint: https://api.anthropic.com/v1/messages.
We can customize the parameters to specify which model we want to use, the messages we wish to send, the maximum token limit, and so forth. The list of parameters is as follows:
/// The model that will complete your prompt.
/// As we improve Claude, we develop new versions of it that you can query. The model parameter controls which version of Claude responds to your request. Right now we offer two model families: Claude and Claude Instant. You can use them by setting model to "claude-2.1" or "claude-instant-1.2", respectively.
/// See [models](https://docs.anthropic.com/claude/reference/selecting-a-model) for additional details and options.
let model: String
/// Input messages.
/// Our models are trained to operate on alternating user and assistant conversational turns. When creating a new Message, you specify the prior conversational turns with the messages parameter, and the model then generates the next Message in the conversation.
/// Each input message must be an object with a role and content. You can specify a single user-role message, or you can include multiple user and assistant messages. The first message must always use the user role.
/// If the final message uses the assistant role, the response content will continue immediately from the content in that message. This can be used to constrain part of the model's response.
let messages: [Message]
/// The maximum number of tokens to generate before stopping.
/// Note that our models may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate.
/// Different models have different maximum values for this parameter. See [input and output](https://docs.anthropic.com/claude/reference/input-and-output-sizes) sizes for details.
let maxTokens: Int
/// System prompt.
/// A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role. See our [guide to system prompts](https://docs.anthropic.com/claude/docs/how-to-use-system-prompts).
let system: String?
/// An object describing metadata about the request.
let metadata: MetaData?
/// Custom text sequences that will cause the model to stop generating.
/// Our models will normally stop when they have naturally completed their turn, which will result in a response stop_reason of "end_turn".
/// If you want the model to stop generating when it encounters custom strings of text, you can use the stop_sequences parameter. If the model encounters one of the custom sequences, the response stop_reason value will be "stop_sequence" and the response stop_sequence value will contain the matched stop sequence.
let stopSequences: [String]?
/// Whether to incrementally stream the response using server-sent events.
/// See [streaming](https://docs.anthropic.com/claude/reference/messages-streaming) for details.
var stream: Bool
/// Amount of randomness injected into the response.
/// Defaults to 1. Ranges from 0 to 1. Use temp closer to 0 for analytical / multiple choice, and closer to 1 for creative and generative tasks.
let temperature: Double?
/// Only sample from the top K options for each subsequent token.
/// Used to remove "long tail" low probability responses. [Learn more technical details here](https://towardsdatascience.com/how-to-sample-from-language-models-682bceb97277).
let topK: Int?
/// Use nucleus sampling.
/// In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by top_p. You should either alter temperature or top_p, but not both.
let topP: Double?
struct Message: Encodable {
    let role: String
    let content: String

    enum Role {
        case user
        case assistant
    }
}

struct MetaData: Encodable {
    /// An external identifier for the user who is associated with the request.
    /// This should be a uuid, hash value, or other opaque identifier. Anthropic may use this id to help detect abuse. Do not include any identifying information such as name, email address, or phone number.
    let userId: UUID
}
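One detail worth noticing: these Swift properties are camelCase, while the API expects snake_case JSON keys (max_tokens, stop_sequences, and so on). SwiftAnthropic takes care of this mapping for you, but if you are curious how such a mapping can work, here is a minimal sketch using made-up stand-in types (DemoParameters and DemoMessage are not part of the library):

import Foundation

// Made-up stand-ins for the parameter types above, for illustration only.
struct DemoMessage: Encodable {
    let role: String
    let content: String
}

struct DemoParameters: Encodable {
    let model: String
    let messages: [DemoMessage]
    let maxTokens: Int
}

let demo = DemoParameters(
    model: "claude-2.1",
    messages: [.init(role: "user", content: "Hello, Claude")],
    maxTokens: 1024)

let encoder = JSONEncoder()
// Turns maxTokens into max_tokens, matching the API's expected keys.
encoder.keyEncodingStrategy = .convertToSnakeCase
encoder.outputFormatting = .prettyPrinted
print(String(decoding: try! encoder.encode(demo), as: UTF8.self))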
A request might look something like this:
curl -X POST https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--header "content-type: application/json" \
--data \
'{
  "model": "claude-2.1",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "Hello, Claude"}
  ]
}'
In the headers, you need to include the API key, the content type, and the anthropic-version. The last one is added automatically if you use one of their client libraries or if you use SwiftAnthropic 😎
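For the curious, here is roughly what that same request looks like in plain Swift with URLSession, with no client library involved (a minimal sketch: error handling is omitted and the key is hardcoded for brevity):

import Foundation

// A minimal sketch of the raw POST request, without any client library.
func sendRawMessage() async throws -> String {
    var request = URLRequest(url: URL(string: "https://api.anthropic.com/v1/messages")!)
    request.httpMethod = "POST"
    request.setValue("YOUR_ANTHROPIC_API_KEY", forHTTPHeaderField: "x-api-key")
    request.setValue("2023-06-01", forHTTPHeaderField: "anthropic-version")
    request.setValue("application/json", forHTTPHeaderField: "content-type")
    request.httpBody = """
    {
      "model": "claude-2.1",
      "max_tokens": 1024,
      "messages": [{"role": "user", "content": "Hello, Claude"}]
    }
    """.data(using: .utf8)

    let (data, _) = try await URLSession.shared.data(for: request)
    return String(decoding: data, as: UTF8.self)
}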
So, how do we interact with Claude using SwiftAnthropic? It's very straightforward, and all the details for streaming and non-streaming, and even for legacy text completions, are available in the SwiftAnthropic README file.
You just need to follow these simple steps:
- Add the package dependency to your app and include the necessary import in your source file.
- Provide an API key.
- Create a service.
- Define the parameters for the request.
- Call the create message or stream message method.
// 1
import SwiftAnthropic
// 2
let apiKey = "YOUR_ANTHROPIC_API_KEY"
// 3
let service = AnthropicServiceFactory.service(apiKey: apiKey)
// 4
let model = "claude-2.1"
let maxTokens = 1024
let messageParameter = MessageParameter.Message(role: "user", content: "Hello, Claude")
let parameters = MessageParameter(model: model, messages: [messageParameter], maxTokens: maxTokens)
// 5
let message = try await service.createMessage(parameters)
// If you want to stream 🚀 the response, call this method instead:
// let message = try await service.streamMessage(parameters)
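In a real app, you will probably want to catch errors explicitly (invalid API keys, network failures, and so on). A minimal sketch using plain Swift error handling:

do {
    let message = try await service.createMessage(parameters)
    print(message)
} catch {
    // Invalid API keys, network failures, rate limits, etc. surface here.
    print("Claude request failed: \(error)")
}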
That is all you need to interact with Claude! Now, let’s explore what the response will look like:
public struct MessageResponse: Decodable {

    /// Unique object identifier.
    ///
    /// The format and length of IDs may change over time.
    public let id: String

    /// e.g. "message"
    public let type: String

    /// The model that handled the request.
    public let model: String

    /// Conversational role of the generated message.
    ///
    /// This will always be "assistant".
    public let role: String

    /// Array of Content objects representing blocks of content generated by the model.
    ///
    /// Each content block has a `type` that determines its structure, with "text" being the currently available type.
    ///
    /// - Example:
    /// ```
    /// [{"type": "text", "text": "Hi, I'm Claude."}]
    /// ```
    ///
    /// The response content seamlessly follows from the last turn if the request input ends with an assistant turn. This allows for a continuous output based on the last interaction.
    ///
    /// - Example Input:
    /// ```
    /// [
    ///   {"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
    ///   {"role": "assistant", "content": "The best answer is ("}
    /// ]
    /// ```
    ///
    /// - Example Output:
    /// ```
    /// [{"type": "text", "text": "B)"}]
    /// ```
    ///
    /// This structure facilitates the integration and manipulation of model-generated content within your application.
    public let content: [Content]

    /// Indicates why the process was halted.
    ///
    /// This property can hold one of the following values to describe the stop reason:
    /// - `"end_turn"`: The model reached a natural stopping point.
    /// - `"max_tokens"`: The requested `max_tokens` limit or the model's maximum token limit was exceeded.
    /// - `"stop_sequence"`: A custom stop sequence provided by you was generated.
    ///
    /// It's important to note that the values for `stopReason` here differ from those in `/v1/complete`, specifically in how `end_turn` and `stop_sequence` are distinguished.
    ///
    /// - In non-streaming mode, `stopReason` is always non-null, indicating the reason for stopping.
    /// - In streaming mode, `stopReason` is null in the `message_start` event and non-null in all other cases, providing context for the stoppage.
    ///
    /// This design allows for a detailed understanding of the process flow and its termination points.
    public let stopReason: String?

    /// Which custom stop sequence was generated.
    ///
    /// This value will be non-null if one of your custom stop sequences was generated.
    public let stopSequence: String?

    /// Container for the number of tokens used.
    public let usage: Usage

    public struct Content: Decodable {
        public let type: String
        public let text: String
    }

    public struct Usage: Decodable {
        /// The number of input tokens which were used.
        public let inputTokens: Int

        /// The number of output tokens which were used.
        public let outputTokens: Int
    }
}
In this response, the “answer” to your prompt is located in the text property of the content objects.
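For example, assuming the `message` value returned by `createMessage` above, you can join all the text blocks into a single string:

// Joins every text block of the response into one string.
let reply = message.content.map(\.text).joined()
print(reply)

Now, let’s take a look at what a stream response looks like…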
public struct MessageStreamResponse: Decodable {

    public let type: String
    public let index: Int?
    public let contentBlock: ContentBlock?
    public let message: MessageResponse?
    public let delta: Delta?

    public struct Delta: Decodable {
        public let type: String?
        public let text: String?
        public let stopReason: String?
        public let stopSequence: String?
    }

    public struct ContentBlock: Decodable {
        public let type: String
        public let text: String
    }
}
Each MessageStreamResponse is one chunk of the overall response. SwiftAnthropic uses the AsyncThrowingStream API to deliver these chunks as an asynchronous sequence; you don’t need to worry about the underlying details because they are handled by the library!
This is all you need to retrieve the text message output:
let params = MessageParameter(
    model: "claude-2.1",
    messages: [.init(role: "user", content: "How much is 2 + 2?")],
    maxTokens: 1024)

let messageRequest = try await service.streamMessage(params)

var messageTextResponse = ""
for try await result in messageRequest {
    let content = result.delta?.text ?? ""
    messageTextResponse += content
    print("chunked output: \(messageTextResponse)")
}

print("The final output is: \(messageTextResponse)")
/// This will print:
/// chunked output: 2
/// chunked output: 2 +
/// chunked output: 2 + 2
/// chunked output: 2 + 2 =
/// chunked output: 2 + 2 = 4
/// The final output is: 2 + 2 = 4
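The demo above only accumulates text. If you also want to know why the stream ended, the final chunks carry a stopReason in their delta. A small sketch, reusing the params and service values from above:

var text = ""
for try await result in try await service.streamMessage(params) {
    text += result.delta?.text ?? ""
    if let stopReason = result.delta?.stopReason {
        print("Stream finished with reason: \(stopReason)") // e.g. "end_turn"
    }
}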
And that’s it; that’s all you need to build your apps using Claude.
You can also run the demo project located in the Examples folder of the package.
If you’re interested in learning more about Anthropic and Claude, I strongly recommend checking out these resources:
- Nova DasSarma on why information security may be critical to the safe development of AI systems
- Chris Olah on what the hell is going on inside neural networks