# Ollama JavaScript Library

The Ollama JavaScript library provides the easiest way to integrate your JavaScript project with Ollama.

## Getting Started

```sh
npm i ollama
```

## Usage

```javascript
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
```

### Browser Usage

To use the library without Node.js, import the browser module.

```javascript
import ollama from 'ollama/browser'
```

## Streaming responses

Response streaming can be enabled by setting `stream: true`, which changes the function call to return an `AsyncGenerator` where each part is an object in the stream.

```javascript
import ollama from 'ollama'

const message = { role: 'user', content: 'Why is the sky blue?' }
const response = await ollama.chat({ model: 'llama3.1', messages: [message], stream: true })
for await (const part of response) {
  process.stdout.write(part.message.content)
}
```

## API

The Ollama JavaScript library's API is designed around the Ollama REST API.

### chat

```javascript
ollama.chat(request)
```

- `request`: The request object containing chat parameters.
  - `model`: The name of the model to use for the chat.
  - `messages`: Array of message objects representing the chat history.
    - `role`: The role of the message sender (`'user'`, `'system'`, or `'assistant'`).
    - `content`: The content of the message.
    - `images`: (Optional) Images to be included in the message, either as `Uint8Array` or base64 encoded strings.
  - `format`: (Optional) Set the expected format of the response (`json`).
  - `stream`: (Optional) When true an `AsyncGenerator` is returned.
  - `keep_alive`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.).
  - `tools`: (Optional) A list of tools the model may call.
  - `options`: (Optional) Options to configure the runtime.
- Returns: `ChatResponse`
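For example, here is a minimal sketch of the `tools` parameter in use; the tool definition and model name are illustrative, and any tool-capable model will work:

```javascript
import ollama from 'ollama'

// Illustrative tool definition: the model may request a call to this function.
const addTool = {
  type: 'function',
  function: {
    name: 'add',
    description: 'Add two numbers together',
    parameters: {
      type: 'object',
      properties: {
        a: { type: 'number', description: 'The first number' },
        b: { type: 'number', description: 'The second number' },
      },
      required: ['a', 'b'],
    },
  },
}

const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'What is 11 plus 31?' }],
  tools: [addTool],
})

// Any tool calls the model decided to make appear on message.tool_calls.
for (const call of response.message.tool_calls ?? []) {
  console.log(call.function.name, call.function.arguments)
}
```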
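Similarly, a sketch of the `images` field with a multimodal model; the model name and file path here are hypothetical, and the image is read into a `Uint8Array` as the parameter list above requires:

```javascript
import ollama from 'ollama'
import { readFile } from 'node:fs/promises'

// './cat.jpg' is a hypothetical path; 'llava' is one multimodal model choice.
const imageBytes = await readFile('./cat.jpg') // Buffer is a Uint8Array subclass

const response = await ollama.chat({
  model: 'llava',
  messages: [
    {
      role: 'user',
      content: 'Describe this image in one sentence.',
      images: [imageBytes],
    },
  ],
})
console.log(response.message.content)
```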
### generate

```javascript
ollama.generate(request)
```

- `request`: The request object containing generate parameters.
  - `model`: The name of the model to use for generation.
  - `prompt`: The prompt to send to the model.
  - `suffix`: (Optional) Text that comes after the inserted text.
  - `system`: (Optional) Override the model system prompt.
  - `template`: (Optional) Override the model template.
  - `raw`: (Optional) Bypass the prompt template and pass the prompt directly to the model.
  - `images`: (Optional) Images to be included, either as `Uint8Array` or base64 encoded strings.
  - `format`: (Optional) Set the expected format of the response (`json`).
  - `stream`: (Optional) When true an `AsyncGenerator` is returned.
  - `keep_alive`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.).
  - `options`: (Optional) Options to configure the runtime.
- Returns: `GenerateResponse`

### pull

```javascript
ollama.pull(request)
```

- `request`: The request object containing pull parameters.
  - `model`: The name of the model to pull.
  - `insecure`: (Optional) Pull from servers whose identity cannot be verified.
  - `stream`: (Optional) When true an `AsyncGenerator` is returned.
- Returns: `ProgressResponse`

### push

```javascript
ollama.push(request)
```

- `request`: The request object containing push parameters.
  - `model`: The name of the model to push.
  - `insecure`: (Optional) Push to servers whose identity cannot be verified.
  - `stream`: (Optional) When true an `AsyncGenerator` is returned.
- Returns: `ProgressResponse`

### create

```javascript
ollama.create(request)
```

- `request`: The request object containing create parameters.
  - `model`: The name of the model to create.
  - `from`: The base model to derive from.
  - `stream`: (Optional) When true an `AsyncGenerator` is returned.
  - `quantize`: Quantization precision level (`q8_0`, `q4_K_M`, etc.).
  - `template`: (Optional) The prompt template to use with the model.
  - `license`: (Optional) The license(s) associated with the model.
  - `system`: (Optional) The system prompt for the model.
  - `parameters`: (Optional) Additional model parameters as key-value pairs.
  - `messages`: (Optional) Initial chat messages for the model.
  - `adapters`: (Optional) A key-value map of LoRA adapter configurations.
- Returns: `ProgressResponse`

Note: The `files` parameter is not currently supported in `ollama-js`.

### delete

```javascript
ollama.delete(request)
```

- `request`: The request object containing delete parameters.
  - `model`: The name of the model to delete.
- Returns: `StatusResponse`

### copy

```javascript
ollama.copy(request)
```

- `request`: The request object containing copy parameters.
  - `source`: The name of the model to copy from.
  - `destination`: The name of the model to copy to.
- Returns: `StatusResponse`

### list

```javascript
ollama.list()
```

- Returns: `ListResponse`

### show

```javascript
ollama.show(request)
```

- `request`: The request object containing show parameters.
  - `model`: The name of the model to show.
  - `system`: (Optional) Override the model system prompt returned.
  - `template`: (Optional) Override the model template returned.
  - `options`: (Optional) Options to configure the runtime.
- Returns: `ShowResponse`

### embed

```javascript
ollama.embed(request)
```

- `request`: The request object containing embedding parameters.
  - `model`: The name of the model used to generate the embeddings.
  - `input`: The input used to generate the embeddings, as a string or an array of strings.
  - `truncate`: (Optional) Truncate the input to fit the maximum context length supported by the model.
  - `keep_alive`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.).
  - `options`: (Optional) Options to configure the runtime.
- Returns: `EmbedResponse`
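A minimal sketch of embedding several inputs in one call; `nomic-embed-text` is just one embedding model choice:

```javascript
import ollama from 'ollama'

// Passing an array of strings produces one embedding vector per input.
const { embeddings } = await ollama.embed({
  model: 'nomic-embed-text',
  input: ['Why is the sky blue?', 'Why is the grass green?'],
})

console.log(embeddings.length)    // 2 vectors, one per input
console.log(embeddings[0].length) // dimensionality of the model's embeddings
```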
### ps

```javascript
ollama.ps()
```

- Returns: `ListResponse`

### abort

```javascript
ollama.abort()
```

This method will abort all streamed generations currently running with the client instance. If there is a need to manage streams with timeouts, it is recommended to have one Ollama client per stream.

All asynchronous threads listening to streams (typically the `for await (const part of response)`) will throw an `AbortError` exception. See `examples/abort/abort-all-requests.ts` for an example.

## Custom client

A custom client can be created with the following fields:

- `host`: (Optional) The Ollama host address. Default: `"http://127.0.0.1:11434"`.
- `fetch`: (Optional) The fetch library used to make requests to the Ollama host.

```javascript
import { Ollama } from 'ollama'

const ollama = new Ollama({ host: 'http://127.0.0.1:11434' })
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
```

## Building

To build the project files run:

```sh
npm run build
```

## Structured Outputs

The example below uses Ollama's structured outputs capability: a JSON Schema (generated here from a Zod schema) is passed as the `format` field, and the model's response is parsed into a structured JSON object and validated with Zod.

```typescript
import ollama from 'ollama';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

// Define the schema for friend info
const FriendInfoSchema = z.object({
  name: z.string().describe('The name of the friend'),
  age: z.number().int().describe('The age of the friend'),
  is_available: z.boolean().describe('Whether the friend is available'),
});

// Define the schema for friend list
const FriendListSchema = z.object({
  friends: z.array(FriendInfoSchema).describe('An array of friends'),
});

async function run(model: string) {
  // Convert the Zod schema to JSON Schema format
  const jsonSchema = zodToJsonSchema(FriendListSchema);

  /* A manually defined schema can be used directly instead:
  const schema = {
    type: 'object',
    properties: {
      friends: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            name: { type: 'string' },
            age: { type: 'integer' },
            is_available: { type: 'boolean' },
          },
          required: ['name', 'age', 'is_available'],
        },
      },
    },
    required: ['friends'],
  }
  */

  const messages = [{
    role: 'user',
    content:
      'I have two friends. The first is Ollama 22 years old busy saving the world, and the second is Alonso 23 years old and wants to hang out. Return a list of friends in JSON format',
  }];

  const response = await ollama.chat({
    model: model,
    messages: messages,
    format: jsonSchema, // or format: schema
    options: {
      temperature: 0, // Make responses more deterministic
    },
  });

  // Parse and validate the response
  try {
    const friendsResponse = FriendListSchema.parse(JSON.parse(response.message.content));
    console.log(friendsResponse);
  } catch (error) {
    console.error('Generated invalid response:', error);
  }
}

run('llama3.1:8b').catch(console.error);
```
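To try the structured outputs example, save it as a TypeScript file (the filename below is illustrative) and run it with a TypeScript runner such as `tsx`, with `zod` and `zod-to-json-schema` installed alongside `ollama`:

```sh
npm i ollama zod zod-to-json-schema
npx tsx structured-outputs.ts
```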