Ollamex: use LLMs on self-hosted ollama API from Elixir

10 January 2024 · 2 min read
elixir, software, open source, API, LLM

For a few months now I have been using ollama to self-host LLMs (Large Language Models) on my development server and to occasionally use the ollama chat interface. The real motivation for self-hosting, however, has been to use LLM text generation in other projects, such as Changelogrex or Pragmata (a global assets/spare-parts management Phoenix LiveView web app in development).

Concretely, the first use case for me would be to prompt Mistral-7b to “explain the following text of a commit to the Linux kernel in a succinct paragraph, determining why it is important, and generate a comma-separated list of tags that describe it”, thus providing users of Changelogrex with a user-friendly interpretation of commit body texts that are often cryptic, especially when they contain code.

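As a sketch of what that call looks like against ollama's REST API, here is the prompt posted to the /generate endpoint using the Req HTTP client; the model tag, server address and commit text are assumptions for illustration:

# Placeholder commit text -- in Changelogrex this would come from the
# kernel changelog itself.
commit_body = "mm: fix race between munmap() and page table teardown"

prompt = """
Explain the following text of a commit to the Linux kernel in a succinct
paragraph, determining why it is important, and generate a comma-separated
list of tags that describe it:

#{commit_body}
"""

# stream: false makes ollama return the whole completion in one JSON body
response =
  Req.post!("http://localhost:11434/api/generate",
    json: %{model: "mistral:7b", prompt: prompt, stream: false},
    receive_timeout: 120_000
  )

IO.puts(response.body["response"])
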
Ollama provides a REST API, and the following endpoints are relevant for my needs:

POST /api/generate: generate a completion for a single prompt
POST /api/chat: generate the next message in a chat, given the message history

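The /chat endpoint works analogously but takes a message history instead of a single prompt, which is what makes multi-turn conversations possible. A minimal sketch, with the same assumptions about host and model tag as above:

messages = [
  %{role: "system", content: "You are a concise assistant."},
  %{role: "user", content: "Why is the sky blue?"}
]

# The reply comes back as one more message with role "assistant"
response =
  Req.post!("http://localhost:11434/api/chat",
    json: %{model: "mistral:7b", messages: messages, stream: false},
    receive_timeout: 120_000
  )

IO.puts(response.body["message"]["content"])
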
One problem I noticed while trying these out is that a call to the /generate or /chat endpoint sometimes never returns: the LLM keeps generating content, such as an endless stream of \n or \t, often separated by whitespace.

The main motivation for writing an Elixir wrapper was therefore to have a built-in safeguard against this failure mode. The solution came from Elixir's documentation for the Task module, specifically Task.yield/2, which shows how to implement a timeout:

If you intend to shut the task down if it has not responded within timeout milliseconds, you should chain this together with shutdown/1, like so:

case Task.yield(task, timeout) || Task.shutdown(task) do
  {:ok, result} ->
    result

  nil ->
    Logger.warning("Failed to get a result in #{timeout}ms")
    nil
end

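Applied to an ollama call, the pattern looks roughly like this; module and function names here are illustrative, not Ollamex's actual API:

defmodule LLMWithTimeout do
  require Logger

  # Run the blocking HTTP call in a Task and give up after `timeout`
  # milliseconds, shutting the task down so no orphaned process keeps
  # waiting on a runaway generation.
  def generate(prompt, timeout \\ 120_000) do
    task =
      Task.async(fn ->
        Req.post!("http://localhost:11434/api/generate",
          json: %{model: "mistral:7b", prompt: prompt, stream: false},
          receive_timeout: :infinity
        ).body["response"]
      end)

    case Task.yield(task, timeout) || Task.shutdown(task) do
      {:ok, result} ->
        {:ok, result}

      nil ->
        Logger.warning("LLM did not return within #{timeout}ms, task shut down")
        {:error, :timeout}
    end
  end
end

If the model gets stuck emitting whitespace forever, the caller receives {:error, :timeout} after the deadline instead of hanging indefinitely.
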
In any case, Ollamex v0.1.0 is now released under the Apache 2.0 license.