Run LLMs locally in Swift

James Rochabrun

I was not expecting it to be so easy to run open-source models locally in a macOS app. This will be a short tutorial on how to do so using Ollama and the Swift library SwiftOpenAI.

This year, Ollama added compatibility with the OpenAI Chat Completions API. This allows you to use local models through the same API that you would use to interact with OpenAI models.

You can use powerful models such as Llama 3 or Mistral in your apps by following these simple steps!

Step 1:

In order to use Ollama, you first need to download it from the official website, https://ollama.com.

Step 2:

Now you need to download the model you want to use. For example, this will download Llama 3:

ollama pull llama3

Step 3:

If you want to interact with the LLM from your terminal, you just need to run the following:

ollama run llama3
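
With Ollama running, the OpenAI-compatible endpoint mentioned earlier is already reachable, so you can sanity-check it straight from Swift before adding any dependency. Here is a minimal sketch, assuming Ollama's default address http://localhost:11434 (the same one used in Step 4 below) and its /v1/chat/completions route:

import Foundation

// A plain URLSession request against Ollama's OpenAI-compatible route.
struct ChatRequest: Encodable {
    struct Message: Encodable { let role: String; let content: String }
    let model: String
    let messages: [Message]
}

var request = URLRequest(url: URL(string: "http://localhost:11434/v1/chat/completions")!)
request.httpMethod = "POST"
request.setValue("application/json", forHTTPHeaderField: "Content-Type")
request.httpBody = try JSONEncoder().encode(
    ChatRequest(model: "llama3",
                messages: [.init(role: "user", content: "Tell me a joke")]))

// Top-level await works in a command-line target's main.swift (macOS 12+).
let (data, _) = try await URLSession.shared.data(for: request)
print(String(data: data, encoding: .utf8) ?? "")

Step 4 removes this boilerplate: SwiftOpenAI builds and decodes these requests for you.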

Step 4:

If you want to use this in an app instead, you can use SwiftOpenAI in your client. All you need to do is add the package as a dependency in your project and then…

import SwiftOpenAI

// Instantiate a service and use the localhost URL provided by Ollama.
let service = OpenAIServiceFactory.service(baseURL: "http://localhost:11434")

Then you can use the completions API as follows:

let prompt = "Tell me a joke"
let parameters = ChatCompletionParameters(
    messages: [.init(role: .user, content: .text(prompt))],
    model: .custom("llama3"))
// The call is async and throwing, so it needs `try await` inside an async context.
let stream = try await service.startStreamedChat(parameters: parameters)
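
startStreamedChat returns an asynchronous stream of partial results that you consume with for try await. Here is a minimal sketch, assuming the chunk objects mirror OpenAI's streaming format, where each element of choices exposes a delta.content fragment:

// Print the response as it arrives, piece by piece.
for try await chunk in stream {
    print(chunk.choices.first?.delta.content ?? "", terminator: "")
}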

That is all you need to run local models inside your own apps! For demos on how to use it in iOS, check the examples project in the SwiftOpenAI repo.
