Tune Studio is a playground and sandbox for developers integrating large language models (LLMs) into their workflows. Providing a unified dashboard and API, Tune Studio gives you access to over 100 pre-integrated LLMs, including popular models like Llama 3, Qwen, Open Hermes, and Gemma, and it’s easily integrated with additional services like Weights & Biases and Hugging Face.

With Tune Studio, you can:

  • Manage and train custom models in the dynamic sandbox environment.
  • Test multiple models in the playground.
  • Upload custom datasets for fine-tuning.
  • Download usage and performance logs.

When you’re satisfied with a model’s performance, you can deploy it and integrate it with your services using the Tune Studio API.

Security is a priority at Tune Studio and the platform is SOC 2 Type 2, HIPAA, and ISO 27001 compliant.

For enterprises looking for tailored solutions, Tune Studio Enterprise can be privately hosted to align seamlessly with your hybrid cloud strategy.

For further inquiries or information, please reach out to our sales team.

Signing up

You can use Tune Studio model APIs and integrations with a free account. You will need a Pro account to fine-tune and deploy private models.

Sign up to use Tune Studio here.

To set up a team, navigate to the members page. With a free account, you can invite up to three members to a team. Upgrade to a Pro account if you need more members in your team. Take a look at the Tune Studio pricing for more information.

Getting your Organization ID

You will need your Tune Studio Organization ID to use the API. Certain API methods require the Organization ID in an HTTP header.

Find your Organization ID on the settings page, which you can navigate to by clicking your organization icon at the top right and selecting Organization settings from the dropdown.

Getting an API key

You’ll need an API key to use the Tune Studio API to integrate prompts, deploy tuned models, and configure settings.

Tune Studio generates a default API key for your account on sign-up. Find your API key by clicking your organization icon at the top right and selecting View API keys from the dropdown or by clicking Access Keys from the sidebar in the Organization view.

Create a new API key by clicking + Create new key, entering a name for the new API key, and clicking Create.

Copy an API key by clicking the Copy icon next to the key in the API key table. Remember to securely save the API key and never commit it to a public repository.
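
As a minimal sketch of keeping the key out of your source code, you could read it from an environment variable at runtime. The variable name TUNE_API_KEY and the helper load_api_key are our own choices for illustration, not names defined by Tune Studio.

```python
import os

# Hypothetical variable name chosen for this sketch; use whatever name suits your setup.
API_KEY_ENV_VAR = "TUNE_API_KEY"


def load_api_key(env=None):
    """Return the API key stored in the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_ENV_VAR, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {API_KEY_ENV_VAR} environment variable first")
    return api_key
```

With this in place, you would export the variable in your shell before running a script and call load_api_key() instead of pasting the key into the file.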

Using the Tune Studio API

The Tune Studio API can be used to:

  • Produce chat and text completions.
  • Retrieve and list models (public models and custom deployed models).
  • Create, update, terminate, and delete custom models.
  • Start and terminate models.
  • Log usage and performance metrics for custom models.

In the following examples, we’ll call the HTTP API directly using the third-party Python requests package (installed with pip install requests) and not the tuneapi Python package.

Each request to the API is sent as an HTTP POST request.

Take a look at the API reference page for a comprehensive description of the methods available to use with the Tune Studio API.

Creating a list models request

You can view the public LLMs available to use with Tune Studio in the Models tab of the Tune Studio dashboard.

You can also list the public models available to use with Tune Studio using the API.

The list models request needs only one HTTP header: Content-Type, set to application/json to indicate that the request body contains JSON data.

The body of the list models request has the following required fields:

  • A page object containing pagination details.
    • limit specifying the number of rows to return.
    • prevPageToken containing the previous page token (empty allowed).
    • nextPageToken containing the next page token (empty allowed).
    • totalPage specifying the total number of pages to return.

We can use the curl CLI to test the request to the API:

curl --request POST \
  --url https://studio.tune.app/tune.Studio/ListPublicModels \
  --header 'Content-Type: application/json' \
  --data '{
  "page": {
    "limit": 123,
    "prevPageToken": "",
    "nextPageToken": "",
    "totalPage": 123
  }
}'

The API returns a list of the models that are publicly available to use. Here’s an example of one model:

{
    "models": [
        {
            "id": "o0vfjaz6",
            "name": "Meta-Llama-3-8B-Instruct",
            "createdAt": "2023-04-18T16:32:24.231Z",
            "updatedAt": "2024-05-03T06:28:08.835Z",
            "meta": {
                "metadata": {
                    "base_model_id": "meta-llama/Meta-Llama-3-8B-Instruct",
                    "description": {
                        "model_size": "8B",
                        "model_tags": [ ... ],
                        "model_type": "instruct"
                    },
                    "extra_args": {
                        "download-dir": "/root/.cache/model/huggingface/hub_cache",
                        "max-total-tokens": "4096",
                        "num-shard": "1",
                        "served-model-name": "model"
                    }
                },
                "quantization": "QUANTIZATION_FP16",
                "shutdownSettings": {
                    "autoShutdownDuration": "-1s"
                },
                "modality": "MODALITY_TEXT"
            },
            "resource": {
                "gpu": "nvidia-l4",
                "gpuCount": "1",
                "maxRetries": 1
            },
            "replicas": 1,
            "state": {
                "state": "READY"
            },
            "public": true,
            "uri": "rohan/Meta-Llama-3-8B-Instruct",
            "endpoint": {
                "id": "rohan/Meta-Llama-3-8B-Instruct",
                "url": "https://proxy.tune.app/rohan/Meta-Llama-3-8B-Instruct",
                "host": "https://proxy.tune.app"
            }
        }
    ]
}

We can identify the model’s URI (the ID to use to specify the model in requests) from this JSON object, in this case, rohan/Meta-Llama-3-8B-Instruct. We will use this URI when we build the request for a chat completion in the next section.
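
If you’d rather not read through the JSON by hand, the URIs can be pulled out with a few lines of Python. This is a sketch that assumes the response has the shape of the example above (a models array whose items carry uri and state.state fields); the function name extract_model_uris is hypothetical.

```python
def extract_model_uris(response_body, ready_only=True):
    """Collect the "uri" of each model in a list models response.

    Assumes the response shape shown in the example above. When ready_only
    is True, only models whose state.state is "READY" are returned.
    """
    uris = []
    for model in response_body.get("models", []):
        state = model.get("state", {}).get("state")
        if ready_only and state != "READY":
            continue
        if "uri" in model:
            uris.append(model["uri"])
    return uris


# Trimmed-down sample shaped like the example response above.
sample_response = {
    "models": [
        {
            "id": "o0vfjaz6",
            "name": "Meta-Llama-3-8B-Instruct",
            "state": {"state": "READY"},
            "uri": "rohan/Meta-Llama-3-8B-Instruct",
        }
    ]
}

print(extract_model_uris(sample_response))  # ['rohan/Meta-Llama-3-8B-Instruct']
```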

List models using Python

Here’s how you can list available models using Python.

  1. In your working directory, create a new file named list_models.py.
  2. Paste the following code into the new file:
import json
import requests

def list_models():
    try:
        # Only the Content-Type header is required for this request
        headers = {
            "Content-Type": "application/json",
        }

        # Pagination details expected in the request body
        data = {
            "page": {
                "limit": 123,
                "prevPageToken": "",
                "nextPageToken": "",
                "totalPage": 123
            }
        }

        url = "https://studio.tune.app/tune.Studio/ListPublicModels"
        response = requests.post(url, headers=headers, json=data)
        print(json.dumps(response.json(), sort_keys=True, indent=4))
    except IOError:
        # requests' network exceptions are subclasses of IOError
        print("Error in list_models: I/O error with API occurred")

if __name__ == '__main__':
    print("Listing available public models")
    list_models()
  3. Run the Python script with the following command:
python list_models.py

You should get a successful response with the list of models:

Listing available public models
{
    "models": [
        {
            "createdAt": "2022-02-14T11:54:32.214Z",
            "endpoint": {
                "host": "https://proxy.tune.app",
                "id": "kaushikaakash04/tune-blob",
                "url": "https://proxy.tune.app/kaushikaakash04/tune-blob"
            },
            "id": "32w5s7az",
            "meta": {
                "metadata": {
                    "base_model_id": "mixtral-8x7b-instruct",
                    "description": {
                        "model_size": "Assistant",
                        "model_tags": [
                        ...

You can list your privately deployed models in a similar way. See the list models API reference for more info.

Creating a chat completion request

A chat completion request must include the following HTTP headers:

  • Authorization containing your Tune Studio API key.
  • Content-Type set to application/json to indicate JSON data in the body of the request.

The JSON body of the request must have the following structure:

  • messages - An array of message objects that require completion. Each message requires role and content fields.
  • model - The model to use for the chat completion, specified as a string.
  • stream - A boolean indicating how the API should return responses. When set to true, messages are streamed (returned one at a time). When set to false, a complete list of messages is returned.
  • penalty - A float that penalizes new tokens that have already appeared in the previously generated text.
  • max_tokens - The maximum number of tokens to generate per output sequence.

Here’s an example chat completion request sent to the Tune Studio API with the curl CLI (replace <YOUR_API_KEY> with your actual API key):

curl -X POST "https://proxy.tune.app/chat/completions" \
-H "Authorization: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{
  "temperature": 0.8,
  "messages": [
    {
      "role": "system",
      "content": "Your role is to supply the weather info (for the city Barcelona) when a user asks for it. For context, today will be sunny, no rain and a maximum of 30 degrees celsius is expected."
    },
    {
      "role": "user",
      "content": "Hi, can you tell me if it is going to rain today in Barcelona?"
    }
  ],
  "model": "rohan/Meta-Llama-3-8B-Instruct",
  "stream": false,
  "penalty": 0,
  "max_tokens": 900
}'

Here we specify one of the public models available in Tune Studio by its model URI, rohan/Meta-Llama-3-8B-Instruct, and supply the weather forecast for Barcelona as context. Note that the temperature field controls the level of randomness in the LLM response (in Tune Studio, the available range for temperature is 0-2).
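
To catch mistakes before a request leaves your machine, you might assemble the body with a small helper that checks the constraints described above: every message needs role and content, and temperature must fall in the 0-2 range. This is only a sketch; build_chat_payload is a hypothetical helper, and its default values simply mirror the curl example.

```python
def build_chat_payload(messages, model, temperature=0.8, stream=False,
                       penalty=0, max_tokens=900):
    """Build a chat completion request body, validating the documented constraints."""
    for index, message in enumerate(messages):
        # Each message item requires both a role and a content field.
        missing = {"role", "content"} - set(message)
        if missing:
            raise ValueError(f"message {index} is missing fields: {sorted(missing)}")
    # The available temperature range in Tune Studio is 0-2.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return {
        "temperature": temperature,
        "messages": list(messages),
        "model": model,
        "stream": stream,
        "penalty": penalty,
        "max_tokens": max_tokens,
    }


payload = build_chat_payload(
    [{"role": "user", "content": "Hi, can you tell me if it is going to rain today in Barcelona?"}],
    model="rohan/Meta-Llama-3-8B-Instruct",
)
```

The resulting dictionary can be passed as the json argument of requests.post.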

We get the following response from the model:

{
    "id": "cmpl-ccdedd0ec34e49aa9b1162b0c7b5c37e",
    "object": "chat.completion",
    "created": 1715164162,
    "model": "model",
    "usage": {
        "prompt_tokens": 75,
        "completion_tokens": 41,
        "total_tokens": 116
    },
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "No, you're in luck! According to the forecast, it's going to be a beautiful day in Barcelona with clear skies and not a cloud in sight. You can expect plenty of sunshine today!"
            },
            "finish_reason": "stop"
        }
    ]
}

We know the API request is working successfully, as we get a valid response from the model.
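
In an application, you will usually want just the reply text and perhaps the token usage rather than the whole JSON object. The sketch below assumes a non-streamed response with the shape shown above; summarize_completion is a hypothetical helper name.

```python
def summarize_completion(response_body):
    """Pull the reply text, finish reason, and token count out of a chat completion response.

    Assumes the non-streamed response shape shown above.
    """
    first_choice = response_body["choices"][0]
    usage = response_body.get("usage", {})
    return {
        "reply": first_choice["message"]["content"],
        "finish_reason": first_choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


# Abbreviated copy of the example response above.
sample_completion = {
    "object": "chat.completion",
    "usage": {"prompt_tokens": 75, "completion_tokens": 41, "total_tokens": 116},
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "No, you're in luck!"},
            "finish_reason": "stop",
        }
    ],
}

summary = summarize_completion(sample_completion)
print(summary["reply"])         # No, you're in luck!
print(summary["total_tokens"])  # 116
```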

Creating a chat completion request using Python

Here’s how you can implement a chat completion request using Python.

  1. Create a file named tunestudio_api.py and open it in your IDE or editor.
  2. Paste the following code in the new file (Replace YOUR-API-KEY with your actual API key):
import json
import requests

def send_receive_req(headers, data, stream):
    response_msgs = []
    try:
        url = "https://proxy.tune.app/chat/completions"
        response = requests.post(url, headers=headers, json=data, stream=stream)
        if stream:
            # Streamed messages arrive line by line; drop the 6-byte prefix of each line
            for line in response.iter_lines():
                if line:
                    msg = line[6:]
                    # The stream is terminated by a [DONE] marker
                    if msg != b'[DONE]':
                        response_msgs.append(json.loads(msg))
        else:
            # A non-streamed response is a single JSON object
            response_msgs.append(response.json())
    except IOError:
        # requests' network exceptions are subclasses of IOError
        print("Error in send_receive_req: I/O error with API occurred")

    return response_msgs

def build_request_data(system_context, user_question, stream):
    # For security, load the API key from your environment in real projects.
    # It is hard-coded here only to keep the example simple.
    api_key = 'YOUR-API-KEY'

    headers = {
        "Authorization": api_key,
        "Content-Type": "application/json",
    }

    data = {
        "temperature": 0.80,
        "messages": [
            {
                "role": "system",
                "content": system_context
            },
            {
                "role": "user",
                "content": user_question
            }
        ],
        "model": "rohan/Meta-Llama-3-8B-Instruct",
        "stream": stream,
        "penalty": 0,
        "max_tokens": 900
    }

    return headers, data

if __name__ == '__main__':
    print("Welcome to Getting Started with the TuneStudio API")
    # Let's give our model some real-life context: today's expected weather in Barcelona.
    real_life_context = ('Your role is to supply the weather info (for the city of Barcelona) when a user asks. '
                         'For context - today will be sunny, no rain and a maximum of 30 degrees celsius is expected.')
    # Question from user
    user_question_example = 'Hi, can you tell me if it is going to rain today?'
    # Define whether we want the messages streamed one by one in the response
    stream_msgs = False
    # Let's build our request's headers and data
    built_headers, built_data = build_request_data(real_life_context, user_question_example, stream_msgs)

    print("==> Sending user question: ")
    print(user_question_example)
    # Send the HTTP request to the API
    responses = send_receive_req(built_headers, built_data, stream_msgs)

    # Print out the first msg in the response (non-streamed response shape)
    if len(responses) > 0:
        print("<== Model response: ")
        print(responses[0]["choices"][0]["message"]["content"])
    else:
        print("Could not retrieve a valid response")
  3. Run the script with the following command:
python tunestudio_api.py

LLM responses will differ each time they are generated. Here is the output we received when we ran the script:

Welcome to Getting Started with the TuneStudio API
==> Sending user question:
Hi, can you tell me if it is going to rain today?
<== Model response:
Good morning! According to the forecast, it's looking like a beautiful day in Barcelona! You won't need your umbrella today, as there is no chance of rain expected. The sky will be sunny with plenty of sunshine.
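
If you set stream to true, the script’s streaming branch strips a six-byte prefix from each line and stops collecting at a [DONE] marker. The sketch below isolates that parsing step so it can be tried without a network call. It assumes each streamed line starts with the bytes "data: " (inferred from the six-byte slice in the script) and carries one JSON object; it makes no assumption about the fields inside each object. The name parse_stream_lines is hypothetical.

```python
import json

# Assumed line prefix, inferred from the six-byte slice used in the script above.
DATA_PREFIX = b"data: "
DONE_MARKER = b"[DONE]"


def parse_stream_lines(lines):
    """Yield one decoded JSON object per streamed line, stopping at the [DONE] marker."""
    for line in lines:
        # Skip keep-alive blank lines and anything without the expected prefix.
        if not line or not line.startswith(DATA_PREFIX):
            continue
        payload = line[len(DATA_PREFIX):]
        if payload == DONE_MARKER:
            break
        yield json.loads(payload)


# Made-up lines for illustration; real chunks will carry the API's own fields.
sample_lines = [b'data: {"n": 1}', b"", b'data: {"n": 2}', b"data: [DONE]"]
print(list(parse_stream_lines(sample_lines)))  # [{'n': 1}, {'n': 2}]
```

In the script, you could pass response.iter_lines() to this generator in place of the inline loop.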

Tune API Python package

The Tune API Python package makes integrating Tune Studio into your Python projects easy.

Install the Tune API Python package with the following command:

pip install tuneapi

At the time of writing, the tuneapi Python package is still in beta. Feel free to test it out, but we recommend calling the HTTP API directly with requests in production.

You can request new features and report bugs on the Tune Studio GitHub issues repository.

Tune Studio advanced use

Let’s take a closer look at some of the more advanced functionality of Tune Studio.

Chat completion vs. text completion

In the Tune Studio playground, you can switch between Chat and Text Complete interaction modes with LLMs. Selecting Chat sets the interaction mode to chat completion, and Text Complete to text completion.

Chat completion is optimized for back-and-forth interactions, similar to a conversation. This interaction mode maintains context over a thread of messages and can generate responses that depend on the conversation history, so it’s ideal for applications needing more complex dialog, like interactive chatbots or virtual assistants.

Text completion generates single responses based on a prompt, and is useful for tasks that need a single, concise output, for example, completing a story, generating a list, or answering specific questions.
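
When you use chat completion through the API, one way to carry the conversation history is to resend the growing messages array with each request. The sketch below shows that bookkeeping only, with no network calls; the Conversation class is a hypothetical helper, and the assistant role name is taken from the example response earlier in this guide.

```python
class Conversation:
    """Keep the running list of messages for a chat completion thread."""

    def __init__(self, system_context):
        # The system message sets the context for the whole thread.
        self.messages = [{"role": "system", "content": system_context}]

    def add_user_message(self, content):
        self.messages.append({"role": "user", "content": content})

    def add_assistant_message(self, content):
        # Store the model's reply so the next request includes it as history.
        self.messages.append({"role": "assistant", "content": content})


conversation = Conversation("Your role is to supply the weather info for Barcelona.")
conversation.add_user_message("Hi, can you tell me if it is going to rain today?")
conversation.add_assistant_message("No rain is expected today.")
conversation.add_user_message("And how warm will it get?")

print(len(conversation.messages))  # 4
```

Each new request would then send conversation.messages as the messages field, so the model sees the earlier turns.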

What are custom or private models?

Your Tune Studio account gives you access to the available public models that are predeployed on the platform, or you can deploy models from OpenAI, Anthropic, Mistral, Open Router, Groq, Bucket, or Hugging Face if you have an API key for the provider.

Alternatively, you can deploy pre-trained custom models that are tailored to meet your unique needs. These pre-trained language models (LMs) are private, and you can use Tune Studio to manage, train, and fine-tune your LM to enhance performance for specific tasks or domains.

Explore the model options available or deploy a custom model on the models page.

Fine-tuning custom models

Tune Studio provides streamlined fine-tuning tools that help you leverage your own data to train your custom model so that it’s optimized for its specific purpose. Fine-tuning improves custom models’ performance and optimizes latency to reduce costs.

When your fine-tuning job is complete, you can download the fine-tuned weights for further use or deploy your fine-tuned model to use in the Tune Studio playground and integrate into your applications and workflows.

Hugging Face integration

Integrate Tune Studio with your Hugging Face account to access your custom model data and automatically push fine-tuned weights to your repository.

Take a look at the integrations documentation for more information.

Getting and using logs

Logs are useful resources for assessing your models’ performance, identifying problems, and helping to fine-tune results.

Enable logs for your model in the model Settings section.

Generated logs for custom private models that you have deployed can be accessed using the API or in the dashboard. See the get logs documentation in the Tune Studio API reference for more information.

Logs for integrated models (like ChatGPT and Groq) are currently only accessible in the dashboard.

Logging is not enabled for public models that are accessible to all users.

How to get support

If you have any issues while using Tune Studio, need further information, or need help with your account or billing, click the Support link at the top-right of the docs to send us an email.

We’d also love to hear your feedback about Tune Studio. Get in touch using the Support link, or find us on the following channels:

  • Join our vibrant community of users and developers on our Discord server.
  • Follow and engage with us on X.
  • Connect with us on LinkedIn.