For a few months now I have been using ollama to self-host LLMs (Large Language Models) on my development server and to occasionally use the ollama chat interface. The real motivation for self-hosting, however, has been to be able to use LLM text generation in other projects, such as Changelogrex or Pragmata (a global assets/spare-parts management Phoenix LiveView web app in development).
Concretely, the first use case for me would be to prompt Mistral-7b to “explain the following text of a commit to the Linux kernel in a succinct paragraph, determining why it is important, and generate a comma-separated list of tags that describe it”, thus providing users of Changelogrex with a user-friendly interpretation of commit body texts, which are often cryptic, especially when they contain code.
Ollama provides a REST API, and the following endpoints are relevant for my needs:

- `POST /api/generate` to generate a completion for a given prompt
- `POST /api/chat` to generate the next message in a chat
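As a quick sanity check of those endpoints, here is a minimal sketch of a non-streaming call to `/api/generate`; it assumes a local Ollama instance on the default port 11434 and uses the Req HTTP client, which is my choice for the example rather than anything Ollama requires:

```elixir
Mix.install([{:req, "~> 0.4"}])

# Single, non-streaming completion request against a local Ollama
# instance on its default port (11434); the model name is an example.
response =
  Req.post!("http://localhost:11434/api/generate",
    json: %{
      model: "mistral:latest",
      prompt: "Why is the sky blue?",
      stream: false
    }
  )

# With stream: false, Ollama returns a single JSON object whose
# "response" field holds the generated text.
IO.puts(response.body["response"])
```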
One problem I noticed while trying these endpoints out is that sometimes a call to `/generate` or `/chat` will never return, as the LLM keeps generating content such as an endless stream of `\n` or `\t`, often separated by whitespace.
The main motivation for writing an Elixir wrapper was therefore to have a built-in solution to this possibility, which I found by looking at Elixir’s documentation for the `Task` module, and specifically for `Task.yield/2`, where I found out how to implement a timeout:
> If you intend to shut the task down if it has not responded within `timeout` milliseconds, you should chain this together with `shutdown/1`, like so:

```elixir
case Task.yield(task, timeout) || Task.shutdown(task) do
  {:ok, result} ->
    result

  nil ->
    Logger.warning("Failed to get a result in #{timeout}ms")
    nil
end
```
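Applied to the runaway-generation problem above, the idea is to run the API call inside a `Task` and abandon it once the deadline passes. Here is a minimal sketch under the same assumptions as the earlier example (Req as the HTTP client, a local Ollama instance), with an arbitrary two-minute timeout:

```elixir
require Logger

prompt = "Why is the sky blue?"
timeout = 120_000

# Run the potentially never-returning API call in a separate Task.
task =
  Task.async(fn ->
    Req.post!("http://localhost:11434/api/generate",
      # Disable the client's own receive timeout so that the Task
      # deadline below is the only one that applies.
      json: %{model: "mistral:latest", prompt: prompt, stream: false},
      receive_timeout: :infinity
    ).body["response"]
  end)

# Wait up to `timeout` ms; if the LLM is stuck in an endless stream,
# shut the task down and return nil instead of hanging forever.
case Task.yield(task, timeout) || Task.shutdown(task) do
  {:ok, result} ->
    result

  nil ->
    Logger.warning("Failed to get a result in #{timeout}ms")
    nil
end
```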
In any case, Ollamex v0.1.0 is now released under the Apache 2.0 license.
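To give a flavor of the wrapper in use, here is a hypothetical sketch; the struct and function names are my assumptions for illustration, so the HexDocs linked below are the authoritative reference:

```elixir
# Hypothetical usage sketch; exact names may differ, see the HexDocs.
api = %Ollamex.API{}
request = %Ollamex.PromptRequest{model: "mistral:latest", prompt: "Why is the sky blue?"}

case Ollamex.generate_with_timeout(request, api) do
  {:ok, response} -> IO.puts(response.response)
  {:error, reason} -> IO.inspect(reason)
end
```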
- Source: on GitHub
- Package: in Hex
- Documentation: on HexDocs