completions

Also known as chat completions. See the litellm documentation.

The two required arguments for completions are model and messages. Optional arguments, such as temperature, max_tokens, and stream, pass through to litellm.
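
For example, a call with a couple of the optional sampling parameters might look like this (the values are illustrative):

completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello."}],
    temperature=0.2,  # lower values make the output more deterministic
    max_tokens=50,    # cap the number of generated tokens
)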

Properties of messages

Each message in the messages array can include the following fields:

  • role: str (required) - The role of the message’s author. Roles can be: system, user, assistant, function or tool.
  • content: Union[str,List[dict],None] (required) - The contents of the message. It is required for all messages, but may be null for assistant messages with function calls.
  • name: str - The name of the author of the message. It is required if the role is “function”, in which case it should match the name of the function represented in the content. It may contain letters (a-z, A-Z), digits (0-9), and underscores, with a maximum length of 64 characters.
  • function_call: object - The name and arguments of a function that should be called, as generated by the model.
  • tool_call_id: str - The ID of the tool call that this message is responding to. It is required if the role is “tool”.

Explanation of roles

  • system: Sets assistant context. Example: { "role": "system", "content": "You are a helpful assistant." }
  • user: End user input. Example: { "role": "user", "content": "What's the weather like today?" }
  • assistant: AI response. Example: { "role": "assistant", "content": "The weather is sunny and warm." }
  • function: Function call/result (name required). Example: { "role": "function", "name": "get_weather", "content": "{\"location\": \"San Francisco\"}" }
  • tool: Tool/plugin interaction (tool_call_id required). Example: { "role": "tool", "tool_call_id": "abc123", "content": "Tool response here" }
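
Putting these together, a messages array for a tool round-trip might look like the following (this follows the OpenAI message format that litellm mirrors; the tool-call id is illustrative):

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,  # content may be null when the assistant issues a tool call
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"location\": \"Paris\"}"},
        }],
    },
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp_c\": 21}"},
]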

Simplified completions

Use single (async: async_single), documented below, to perform a simplified single-turn completion.

completion

completion(
   *args,
   cache_enabled: bool,
   cache_path: typing.Union[str, pathlib.Path, NoneType],
   cache_key_prefix: typing.Optional[str],
   include_model_in_cache_key: bool,
   return_cache_key: bool,
   enable_retries: bool,
   retry_on_exceptions: typing.Optional[list[Exception]],
   retry_on_all_exceptions: bool,
   max_retries: typing.Optional[int],
   retry_delay: typing.Optional[int],
   **kwargs
)

This function is a wrapper around the corresponding function in the litellm library; see the litellm documentation for a full list of the available arguments.


response = completion(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
)
response.choices[0].message.content
'The capital of France is Paris.'
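
The cache_* and retry arguments in the signature above belong to this wrapper rather than to litellm. A minimal sketch of how they might be combined, assuming the defaults behave as their names suggest (the cache path is illustrative):

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    cache_enabled=True,         # serve a cached response for repeated identical calls
    cache_path="./.llm_cache",  # illustrative location for the on-disk cache
    enable_retries=True,        # retry transient failures
    max_retries=3,
    retry_delay=2,              # seconds to wait between attempts
)
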
from typing import List
from pydantic import BaseModel

class Recipe(BaseModel):
    name: str
    ingredients: List[str]
    steps: List[str]

response = completion(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful cooking assistant."},
        {"role": "user", "content": "Give me a simple recipe for pancakes."}
    ],
    response_format=Recipe
)

Recipe.model_validate_json(response.choices[0].message.content).model_dump()
{'name': 'Simple Pancakes',
 'ingredients': ['1 cup all-purpose flour',
  '2 tablespoons sugar',
  '2 teaspoons baking powder',
  '1/2 teaspoon salt',
  '1 cup milk',
  '1 egg',
  '2 tablespoons melted butter (or vegetable oil)',
  '1 teaspoon vanilla extract (optional)'],
 'steps': ['In a large bowl, whisk together the flour, sugar, baking powder, and salt.',
  'In another bowl, combine the milk, egg, melted butter, and vanilla extract, and whisk until smooth.',
  'Pour the wet ingredients into the dry ingredients and stir until just combined. Do not overmix; a few lumps are okay.',
  'Preheat a non-stick pan or griddle over medium heat and lightly grease it with butter or oil.',
  'Pour about 1/4 cup of batter for each pancake onto the pan.',
  'Cook until bubbles form on the surface of the pancake, about 2-3 minutes, then flip and cook for an additional 1-2 minutes until golden brown.',
  'Serve warm with your favorite toppings such as syrup, fruits, or whipped cream.']}
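
Note that response_format with a Pydantic model relies on the provider supporting structured outputs; the result still arrives as a JSON string in message.content, which is why model_validate_json is used to turn it into a Recipe instance.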

You can save costs during testing by using mock responses:

response = completion(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of Sweden?"}
    ],
    mock_response="Stockholm"
)
response.choices[0].message.content
'Stockholm'

async_completion (async)

async_completion(
   *args,
   cache_enabled: bool,
   cache_path: typing.Union[str, pathlib.Path, NoneType],
   cache_key_prefix: typing.Optional[str],
   include_model_in_cache_key: bool,
   return_cache_key: bool,
   enable_retries: bool,
   retry_on_exceptions: typing.Optional[list[Exception]],
   retry_on_all_exceptions: bool,
   max_retries: typing.Optional[int],
   retry_delay: typing.Optional[int],
   timeout: typing.Optional[int],
   **kwargs
)

response = await async_completion(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
)
response.choices[0].message.content
'The capital of France is Paris.'

single

single(prompt: str, model: str|None, system: str|None, *args, **kwargs)

single(
    model='gpt-4o-mini',
    system='You are a helpful assistant.',
    prompt='What is the capital of France?',
)
'The capital of France is Paris.'
class Recipe(BaseModel):
    name: str
    ingredients: List[str]
    steps: List[str]

response = single(
    model="gpt-4o-mini",
    system="You are a helpful cooking assistant.",
    prompt="Give me a simple recipe for pancakes.",
    response_format=Recipe
)

Recipe.model_validate_json(response)
Recipe(name='Simple Pancakes', ingredients=['1 cup all-purpose flour', '2 tablespoons sugar', '2 teaspoons baking powder', '1/2 teaspoon salt', '1 cup milk', '1 egg', '2 tablespoons melted butter (or vegetable oil)', '1 teaspoon vanilla extract (optional)'], steps=['In a large bowl, whisk together the flour, sugar, baking powder, and salt.', 'In another bowl, combine the milk, egg, melted butter, and vanilla extract, and whisk until smooth.', 'Pour the wet ingredients into the dry ingredients and stir until just combined. Do not overmix; a few lumps are okay.', 'Preheat a non-stick pan or griddle over medium heat and lightly grease it with butter or oil.', 'Pour about 1/4 cup of batter for each pancake onto the pan.', 'Cook until bubbles form on the surface of the pancake, about 2-3 minutes, then flip and cook for an additional 1-2 minutes until golden brown.', 'Serve warm with your favorite toppings such as syrup, fruits, or whipped cream.'])

You can do multi-turn completions by passing multi=True, which returns the response together with the conversation context, and then passing that context to the multi argument of the next call:

res, _ctx = single(
    model='gpt-4o-mini',
    system='You are a helpful assistant.',
    prompt='Add 1 and 1',
    multi=True
)
print(res)

res, _ctx = single(
    prompt='Multiply that by 10',
    multi=_ctx,
)
print(res)
1 and 1 equals 2.
2 multiplied by 10 equals 20.

async_single (async)

async_single(prompt: str, model: str|None, system: str|None, *args, **kwargs)

await async_single(
    model='gpt-4o-mini',
    system='You are a helpful assistant.',
    prompt='What is the capital of France?',
)
'The capital of France is Paris.'
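
The await syntax above assumes an environment with top-level await, such as a notebook; in a plain script you would wrap the call with asyncio.run:

import asyncio

async def main():
    answer = await async_single(
        model='gpt-4o-mini',
        system='You are a helpful assistant.',
        prompt='What is the capital of France?',
    )
    print(answer)

asyncio.run(main())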

You can execute a batch of prompt calls using adulib.asynchronous.batch_executor; the concurrency_limit argument caps how many calls run concurrently.

results = await batch_executor(
    func=async_single,
    constant_kwargs=as_dict(model='gpt-4o-mini', system='You are a helpful assistant.'),
    batch_kwargs=[
        { 'prompt': 'What is the capital of France?' },
        { 'prompt': 'What is the capital of Germany?' },
        { 'prompt': 'What is the capital of Italy?' },
        { 'prompt': 'What is the capital of Spain?' },
        { 'prompt': 'What is the capital of Portugal?' },
    ],
    concurrency_limit=2,
)

print("Results:", results)

Results: ['The capital of France is Paris.', 'The capital of Germany is Berlin.', 'The capital of Italy is Rome.', 'The capital of Spain is Madrid.', 'The capital of Portugal is Lisbon.']