Building a Motoku LLM Retrieval System Using Internet Computer Protocol, Motoko, and Node.js
The Internet Computer Protocol (ICP) lets developers host application logic and data in canisters, smart contracts that run on a decentralized network. Pairing Motoko, a language designed specifically for ICP, with Node.js gives you the storage layer of a Large Language Model (LLM) retrieval system: a canister that holds text embeddings and a REST API in front of it. This article walks through building that system, focusing on embedding storage and retrieval.
Prerequisites
Before diving into the implementation, ensure you have the following tools and knowledge:
- Basic understanding of Motoko and Node.js.
- Node.js and npm installed on your machine.
- DFINITY SDK (dfx) installed.
- Basic knowledge of RESTful APIs.
Step 1: Setting Up the Motoko Canister
First, we'll create a Motoko canister to store and retrieve embeddings.
1.1 Define the EmbeddingStore Actor
Create a new file named EmbeddingStore.mo and define the EmbeddingStore actor as follows:
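import Array "mo:base/Array";
import Time "mo:base/Time";
import Error "mo:base/Error";

actor EmbeddingStore {
  // One stored item: the source text, its vector, and a creation timestamp.
  type Embedding = {
    text: Text;
    embedding: [Float];
    createdAt: Int; // nanoseconds since the Unix epoch, from Time.now()
  };

  // Stable variables survive canister upgrades.
  stable var embeddings: [Embedding] = [];
  stable let secretKey: Text = "8529741360"; // Replace with your actual secret key

  // Appends an embedding if the caller supplies the shared secret key.
  public shared func storeEmbedding(key: Text, text: Text, embedding: [Float]) : async () {
    if (key == secretKey) {
      let timestamp = Time.now();
      // Array.append copies the whole array on each call; acceptable for a small demo.
      embeddings := Array.append(embeddings, [{
        text = text;
        embedding = embedding;
        createdAt = timestamp;
      }]);
    } else {
      throw Error.reject("Invalid key. Access denied.");
    }
  };

  // Returns every stored embedding. Query calls are read-only.
  public query func getEmbeddings() : async [Embedding] {
    return embeddings;
  };
};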
1.2 Deploy the Canister
Deploy the EmbeddingStore canister to the Internet Computer:
- Open a terminal and navigate to your project directory.
- Run dfx start to start the local replica.
- Create a new canister by adding the EmbeddingStore configuration to your dfx.json file.
- Deploy the canister using dfx deploy.
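The Node.js server in Step 2 loads Candid bindings from ./idl/embedding_store.did.js. Running dfx generate normally produces such bindings, but they are typically emitted as ES modules and may need adapting before require() can load them. As a rough guide, a hand-written CommonJS equivalent for the actor above might look like the sketch below. The file location and the CommonJS export are assumptions made to match the server script.

```javascript
// Hand-written sketch of Candid bindings for the EmbeddingStore actor.
// Assumption: saved as ./idl/embedding_store.did.js and loaded with require().
const idlFactory = ({ IDL }) => {
  // Mirrors the Motoko record type `Embedding`.
  const Embedding = IDL.Record({
    text: IDL.Text,
    embedding: IDL.Vec(IDL.Float64), // Motoko Float is a 64-bit float
    createdAt: IDL.Int,              // Time.now() returns nanoseconds as Int
  });

  return IDL.Service({
    // Update call: (key, text, embedding) -> ()
    storeEmbedding: IDL.Func([IDL.Text, IDL.Text, IDL.Vec(IDL.Float64)], [], []),
    // Query call: () -> [Embedding]
    getEmbeddings: IDL.Func([], [IDL.Vec(Embedding)], ['query']),
  });
};

module.exports = { idlFactory };
```

Actor.createActor calls this factory with the real IDL object, so the method names and argument order here must match the Motoko actor exactly.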
Step 2: Setting Up the Node.js Server
Next, we'll set up a Node.js server to interact with the Motoko canister.
2.1 Initialize the Project
- Create a new directory for your Node.js project.
- Initialize the project by running npm init -y.
- Install the necessary dependencies:
npm install express body-parser @dfinity/agent dotenv
2.2 Create the Server Script
Create a new file named server.js and add the following code:
const express = require('express');
const bodyParser = require('body-parser');
const { HttpAgent, Actor } = require('@dfinity/agent');
// Candid bindings for the canister (must be loadable with require()).
const { idlFactory } = require('./idl/embedding_store.did.js');
require('dotenv').config();

const app = express();
const port = 3000;
app.use(bodyParser.json());

const canisterId = process.env.CANISTER_ID;
const host = process.env.HOST;

// Initialize the agent
const agent = new HttpAgent({ host });
// Fetch the replica's root key. Necessary for local development only;
// do not do this against mainnet. Handlers await this promise before calling the canister.
const agentReady = agent.fetchRootKey();
agentReady.catch((error) => console.error(`Could not fetch root key: ${error.message}`));

// Create an actor instance
const embeddingStore = Actor.createActor(idlFactory, {
  agent,
  canisterId,
});

// Helper function to convert BigInt to a string for JSON serialization
const serializeBigInt = (obj) => {
  if (typeof obj === 'bigint') {
    return obj.toString();
  } else if (Array.isArray(obj)) {
    return obj.map(serializeBigInt);
  } else if (typeof obj === 'object' && obj !== null) {
    return Object.fromEntries(
      Object.entries(obj).map(([k, v]) => [k, serializeBigInt(v)])
    );
  }
  return obj;
};

app.post('/storeEmbedding', async (req, res) => {
  const { key, text, embedding } = req.body;
  // Reject bad credentials and malformed input before calling the canister.
  if (key !== process.env.SECRET_KEY) {
    return res.status(403).send('Error: Invalid key');
  }
  if (typeof text !== 'string' || !Array.isArray(embedding)) {
    return res.status(400).send('Error: text must be a string and embedding must be an array');
  }
  try {
    await agentReady;
    // Convert embedding to float64 if not already
    const embeddingFloat64 = embedding.map(Number);
    await embeddingStore.storeEmbedding(key, text, embeddingFloat64);
    res.status(200).send('Embedding stored successfully.');
  } catch (error) {
    res.status(500).send(`Error: ${error.message}`);
  }
});

app.get('/getEmbeddings', async (req, res) => {
  try {
    await agentReady;
    const embeddings = await embeddingStore.getEmbeddings();
    // createdAt arrives as a BigInt, which JSON cannot represent directly.
    res.status(200).json(serializeBigInt(embeddings));
  } catch (error) {
    res.status(500).send(`Error: ${error.message}`);
  }
});

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`);
});
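The serializeBigInt helper is needed because the createdAt field is a Candid Int, which the agent decodes to a JavaScript BigInt, and JSON.stringify throws a TypeError on BigInt values. A shorter alternative is a JSON.stringify replacer function. The sketch below uses a made-up sample record; bigintReplacer is a hypothetical name.

```javascript
// Replacer that turns BigInt values into decimal strings during serialization.
const bigintReplacer = (_key, value) =>
  typeof value === 'bigint' ? value.toString() : value;

// Hypothetical record shaped like one entry returned by getEmbeddings().
const sample = {
  text: 'example text',
  embedding: [0.1, 0.2, 0.3],
  createdAt: 1700000000000000000n, // nanoseconds since the Unix epoch
};

// Without a replacer, serialization fails on the BigInt field.
let plainFailed = false;
try {
  JSON.stringify(sample);
} catch (err) {
  plainFailed = err instanceof TypeError;
}

// With the replacer, createdAt is emitted as a string.
const json = JSON.stringify(sample, bigintReplacer);
console.log(plainFailed, json);
```

In an Express app, the same function could be registered once with app.set('json replacer', bigintReplacer) so that res.json applies it to every response. Either way, clients receive createdAt as a string and must parse it themselves if they need arithmetic on it.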
2.3 Environment Configuration
Create a .env file in your project directory and add the following environment variables:
CANISTER_ID=<your-canister-id>
HOST=http://localhost:8000
SECRET_KEY=8529741360
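Replace <your-canister-id> with the actual canister ID obtained from the deployment step. Set HOST to the address your local replica listens on, which dfx start prints on startup; recent dfx releases default to http://127.0.0.1:4943 rather than port 8000. SECRET_KEY must match the secretKey value hard-coded in the canister.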
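A missing or misspelled variable in .env otherwise surfaces only as a confusing agent error at request time. One option is to validate the configuration at startup, as in this minimal sketch. The loadConfig helper and the isLocal flag are hypothetical additions, not part of the server script above.

```javascript
// Returns a validated config object, or throws listing what is missing.
// Variable names match the .env file; loadConfig itself is a hypothetical helper.
function loadConfig(env) {
  const required = ['CANISTER_ID', 'HOST', 'SECRET_KEY'];
  const missing = required.filter((name) => !env[name] || env[name].trim() === '');
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  // new URL() throws if HOST is not a well-formed URL.
  const host = new URL(env.HOST);

  return {
    canisterId: env.CANISTER_ID,
    host: host.origin,
    secretKey: env.SECRET_KEY,
    // fetchRootKey() should only be called against a local replica.
    isLocal: ['localhost', '127.0.0.1'].includes(host.hostname),
  };
}
```

It would be called as loadConfig(process.env) right after require('dotenv').config(), so the process exits early with a readable message instead of failing on the first request.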
Step 3: Testing the System
With the canister deployed and the server set up, it's time to test the embedding storage and retrieval functionality.
3.1 Storing an Embedding
Use curl or a tool like Postman to store an embedding:
curl -X POST http://localhost:3000/storeEmbedding \
-H "Content-Type: application/json" \
-d '{"key": "8529741360", "text": "example text", "embedding": [0.1, 0.2, 0.3]}'
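The same request can be sent from Node.js, which is convenient when the embeddings come from another script. The sketch below mirrors the curl call; buildStoreRequest and storeEmbedding are hypothetical helper names, and the sending function assumes Node.js 18 or later (for the global fetch) and a running server from Step 2.

```javascript
// Builds the fetch() arguments for POST /storeEmbedding.
// The path and body fields come from the curl example; the helper name is hypothetical.
function buildStoreRequest(baseUrl, key, text, embedding) {
  if (!Array.isArray(embedding) || !embedding.every((x) => Number.isFinite(x))) {
    throw new TypeError('embedding must be an array of finite numbers');
  }
  return {
    url: new URL('/storeEmbedding', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ key, text, embedding }),
    },
  };
}

// Sends the request and returns the server's text response.
async function storeEmbedding(baseUrl, key, text, embedding) {
  const { url, options } = buildStoreRequest(baseUrl, key, text, embedding);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Server responded ${response.status}: ${await response.text()}`);
  }
  return response.text();
}
```

For example, storeEmbedding('http://localhost:3000', process.env.SECRET_KEY, 'example text', [0.1, 0.2, 0.3]) sends the same payload as the curl command while keeping the key out of shell history.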
3.2 Retrieving Embeddings
Retrieve stored embeddings by accessing the following endpoint:
curl http://localhost:3000/getEmbeddings
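The endpoint returns every stored record, so any ranking has to happen on the caller's side. A common choice for comparing embeddings is cosine similarity. The sketch below shows one way to pick the closest records to a query vector; it is not part of the system built above, cosineSimilarity and topK are hypothetical helpers, and the query embedding is assumed to come from the same model that produced the stored ones.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new RangeError('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k records most similar to the query, highest score first.
function topK(records, queryEmbedding, k = 3) {
  return records
    .map((record) => ({
      ...record,
      score: cosineSimilarity(record.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This scans every record per query, which is reasonable for a few thousand embeddings; larger collections would usually call for an approximate nearest-neighbor index.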
Conclusion
Congratulations! You have successfully built a Motoku LLM retrieval system using Internet Computer Protocol, Motoko, and Node.js. The system stores text embeddings in a canister on the Internet Computer and retrieves them through a REST API. Writes are gated by a shared secret key, while reads through getEmbeddings are open to any caller. As a next step, consider adding similarity search, stronger authentication mechanisms, and integration with a frontend application.