Update README.md
README.md
CHANGED
@@ -48,87 +48,10 @@ The primary purpose of this model is to provide varied and persona-specific utterances
## How to Use (as part of the `interactive_chat.py` script)

This model is primarily intended to be used within the provided `interactive_chat.py` script. Here's how it's used:

**1. Model Definition (Python):**
```python
import torch
import torch.nn as nn

# (Ensure VOCAB_SIZE, NUM_PERSONS, etc. are defined based on your training)

class ConditionalCharLSTM(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, condition_dim, num_layers=1, dropout=0.1):
        super().__init__()
        self.vocab_size = vocab_size
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim
        self.condition_dim = condition_dim
        self.num_layers = num_layers

        # Assuming CHAR_TO_IDX[PAD_TOKEN] is defined
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=CHAR_TO_IDX[PAD_TOKEN])
        self.lstm = nn.LSTM(embedding_dim + condition_dim, hidden_dim, num_layers,
                            batch_first=True, dropout=dropout if num_layers > 1 else 0)
        self.fc_out = nn.Linear(hidden_dim, vocab_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, input_chars, condition_vector, hidden_state=None):
        embedded_chars = self.dropout(self.embedding(input_chars))
        # Expand condition_vector to match the sequence length for concatenation
        condition_expanded = condition_vector.unsqueeze(1).repeat(1, embedded_chars.size(1), 1)
        lstm_input = torch.cat((embedded_chars, condition_expanded), dim=2)

        if hidden_state is None:  # Initialize hidden state if not provided
            batch_size = input_chars.size(0)
            dev = input_chars.device
            h0 = torch.zeros(self.num_layers, batch_size, self.hidden_dim, device=dev)
            c0 = torch.zeros(self.num_layers, batch_size, self.hidden_dim, device=dev)
            hidden_state = (h0, c0)

        lstm_out, hidden_state = self.lstm(lstm_input, hidden_state)
        lstm_out_dropped = self.dropout(lstm_out)
        output_logits = self.fc_out(lstm_out_dropped)
        return output_logits, hidden_state
```
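The class leaves the vocabulary and persona constants to the caller. As a minimal sketch of how they might be built (the token strings, character inventory, and persona count below are illustrative assumptions, not the model's actual values; they must match whatever was used during training):

```python
# Illustrative placeholders only: the real values must match training exactly.
PAD_TOKEN, SOS_TOKEN, EOS_TOKEN = "<pad>", "<sos>", "<eos>"

# Special tokens first, then the character inventory seen during training.
_charset = [PAD_TOKEN, SOS_TOKEN, EOS_TOKEN] + list("abcdefghijklmnopqrstuvwxyz .,!?'")
CHAR_TO_IDX = {ch: i for i, ch in enumerate(_charset)}
IDX_TO_CHAR = {i: ch for ch, i in CHAR_TO_IDX.items()}

VOCAB_SIZE = len(CHAR_TO_IDX)
NUM_PERSONS = 8  # hypothetical persona count; use the value from training
```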
**2. Loading the Pretrained Weights:**

```python
# In interactive_chat.py
# (Define VOCAB_SIZE, embedding_dim, hidden_dim, NUM_PERSONS, etc. to match training)
model_lstm = ConditionalCharLSTM(
    vocab_size=VOCAB_SIZE,
    embedding_dim=args.lstm_embedding_dim,  # from command line
    hidden_dim=args.lstm_hidden_dim,        # from command line
    condition_dim=NUM_PERSONS,
    # ... other params
).to(device)

model_lstm.load_state_dict(torch.load(args.lstm_model_path, map_location=device))
model_lstm.eval()
```
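This snippet assumes an `args` namespace and a `device` already exist. How `interactive_chat.py` actually declares them is not shown here, so the following argparse wiring is a hypothetical sketch, with flag names inferred from the attributes used above:

```python
import argparse

import torch

# Hypothetical flag definitions, inferred from the args.* attributes above;
# the real script may name or default these differently.
parser = argparse.ArgumentParser()
parser.add_argument("--lstm_model_path", type=str, default="rcj_lstm.pth")
parser.add_argument("--lstm_embedding_dim", type=int, default=128)
parser.add_argument("--lstm_hidden_dim", type=int, default=256)
args = parser.parse_args()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```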
**3. Generating Text (Streaming):**

```python
# In interactive_chat.py (simplified)
def generate_text_lstm_stream(model_lstm, person_id_int, device, temperature=0.7, max_len=100):
    model_lstm.eval()
    condition_vector = torch.zeros(1, NUM_PERSONS, dtype=torch.float, device=device)
    condition_vector[0, person_id_int] = 1.0  # One-hot encode the person_id

    current_char_idx = torch.tensor([[CHAR_TO_IDX[SOS_TOKEN]]], dtype=torch.long, device=device)
    hidden_state = None

    for _ in range(max_len - 1):
        output_logits, hidden_state = model_lstm(current_char_idx, condition_vector, hidden_state)
        # Apply temperature and sample the next character
        probabilities = torch.softmax(output_logits.squeeze(0).squeeze(0) / temperature, dim=0)
        next_char_idx = torch.multinomial(probabilities, 1).item()

        if next_char_idx == CHAR_TO_IDX[EOS_TOKEN]:
            break
        char = IDX_TO_CHAR.get(next_char_idx, "")
        yield char
        current_char_idx = torch.tensor([[next_char_idx]], dtype=torch.long, device=device)
```
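Because the function is a generator, callers can print each character as it is sampled. A usage sketch (assuming the model and constants above; persona index 0 is arbitrary):

```python
# Stream one sampled utterance for persona 0 to the terminal.
for ch in generate_text_lstm_stream(model_lstm, person_id_int=0, device=device, temperature=0.7):
    print(ch, end="", flush=True)
print()
```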
## How to Use (as part of the `interactive_chat.py` script)

This model is primarily intended to be used within the provided `rcj_inference.py` script. Here's how it's used:

```bash
python rcj_inference.py --classifier_model_path /path/to/personify-67m/ --lstm_model_path rcj_lstm.pth
```

## Training Data 📚