Connecting a local model
There are several ways to deploy a language model locally; popular options include Ollama, LM Studio, and KoboldCpp. As an example, let’s walk through the setup using KoboldCpp.
Download the LLM you want to deploy in .gguf format. You can find models, for example, at https://huggingface.co/.

Download and open “KoboldCpp.” In the launcher, specify the path to the downloaded model, check the “Remote Tunnel” option, and click “Launch.”
After launching, a command-line window will appear. Find the line that says “Your remote OpenAI Compatible API...”; it contains a temporary URL (for example, https://john-loving-cm-lows.trycloudflare.com/v1). Copy it.
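Before returning to the site, you can optionally check that the tunnel is reachable. The snippet below is a minimal sketch in Python, assuming the “requests” package is installed and using the example URL above (substitute your own tunnel address); it queries the OpenAI-compatible /v1/models endpoint that KoboldCpp should expose.

```python
# Minimal reachability check for the KoboldCpp tunnel.
# Assumes: pip install requests; the URL below is the example from this guide.
import requests

BASE_URL = "https://john-loving-cm-lows.trycloudflare.com/v1"  # replace with your own tunnel URL

response = requests.get(f"{BASE_URL}/models", timeout=30)
response.raise_for_status()
print(response.json())  # should list the model currently loaded in KoboldCpp
```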

Return to our site, go to the model catalog, and select the “Hosts” tab. Click “Add Host.”

In the window that opens, paste the copied link into the “Endpoint URL” field and add /chat/completions at the end. In my example, the link will be https://john-loving-cm-lows.trycloudflare.com/v1/chat/completions. Fill out the other fields as you prefer.
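If you want to confirm that the endpoint responds before registering it, you can send a test request to it directly. The sketch below assumes Python with the “requests” package and the example URL above; the model name is only a placeholder (use the name of the .gguf file you loaded).

```python
# Minimal sketch: send a test chat completion to the endpoint before adding it as a host.
# Assumes: pip install requests; the URL below is the example from this guide.
import requests

ENDPOINT = "https://john-loving-cm-lows.trycloudflare.com/v1/chat/completions"

payload = {
    "model": "your-model.gguf",  # placeholder: the .gguf file loaded in KoboldCpp
    "messages": [{"role": "user", "content": "Say hello in one short sentence."}],
    "max_tokens": 64,
}

response = requests.post(ENDPOINT, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

If you get a sensible reply, the endpoint is ready to be used as a host.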
Select the “Models” tab and click “Add Model.”

In the “Host” field, select the host you created earlier. In “Display Name,” enter the name that will appear in the catalog. In “Model Name,” enter the exact name of the .gguf file you downloaded. In “Description,” describe the model’s strengths and weaknesses.
Below, specify the maximum context size, privacy settings, one or more functionality tags, and additional settings supported by the model.

Click “Create Model.” After a moment, the model will appear in the list.
