Commit · ba1a951 (1 parent: aa08197)
Enhanced app functionality by adding new UI elements for summary file management, adding a Bedrock model toggle, and refining logging messages. Updated the Dockerfile and requirements files for better compatibility, and added an installation guide to the README. Removed deprecated code and unnecessary comments.
Files changed:
- Dockerfile +5 -4
- README.md +118 -4
- app.py +30 -25
- requirements.txt +5 -6
- requirements_aws.txt +2 -3
- requirements_cpu.txt +23 -0
- requirements_gpu.txt +4 -2
- tools/aws_functions.py +0 -62
- tools/combine_sheets_into_xlsx.py +3 -7
- tools/config.py +2 -1
- tools/dedup_summaries.py +16 -8
- tools/llm_api_call.py +16 -6
- tools/llm_funcs.py +1 -3
- windows_install_llama-cpp-python.txt +34 -44
Dockerfile
CHANGED
```diff
@@ -1,3 +1,4 @@
+# This Dockerfile is optimised for AWS ECS using Python 3.11, and assumes CPU inference with OpenBLAS for local models.
 # Stage 1: Build dependencies and download models
 FROM public.ecr.aws/docker/library/python:3.11.13-slim-bookworm AS builder
 
@@ -7,7 +8,7 @@ RUN apt-get update && apt-get install -y \
     gcc \
     g++ \
     cmake \
-    libopenblas-dev \
+    #libopenblas-dev \
     pkg-config \
     python3-dev \
     libffi-dev \
@@ -18,9 +19,9 @@ WORKDIR /src
 
 COPY requirements_aws.txt .
 
-# Set environment variables for OpenBLAS
-ENV OPENBLAS_VERBOSE=1
-ENV CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS"
+# Set environment variables for OpenBLAS - not necessary if not building from source
+# ENV OPENBLAS_VERBOSE=1
+# ENV CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS"
 
 RUN pip install --no-cache-dir --target=/install torch==2.7.1+cpu --extra-index-url https://download.pytorch.org/whl/cpu \
     && pip install --no-cache-dir --target=/install https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/download/v0.1.0/llama_cpp_python-0.3.16-cp311-cp311-linux_x86_64.whl \
```
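With the builder stage now pulling prebuilt CPU wheels rather than compiling against OpenBLAS, a local build is quick to test. A minimal sketch (the image name and host port are illustrative, not from the repo; the container port depends on GRADIO_SERVER_PORT in tools/config.py):

```bash
# Build the CPU-only image from the repository root
docker build -t llm-topic-modelling .

# Run it and expose the Gradio port (7860 is Gradio's usual default)
docker run --rm -p 7860:7860 llm-topic-modelling
```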
README.md
CHANGED
```diff
@@ -11,7 +11,7 @@ license: agpl-3.0
 
 # Large language model topic modelling
 
-Extract topics and summarise outputs using Large Language Models (LLMs: Gemma 3 4b/GPT-OSS 20b if local (see config.py), Gemini 2.5, or Bedrock models, e.g. Claude 3 Haiku/Claude 3.7 Sonnet). The app will query the LLM with batches of responses to produce summary tables, which are then compared iteratively to output a table with the general topics, subtopics, topic sentiment, and relevant text rows related to them. The prompts are designed for topic modelling public consultations, but they can be adapted to different contexts (see the LLM settings tab to modify).
+Extract topics and summarise outputs using Large Language Models (LLMs: Gemma 3 4b/GPT-OSS 20b if local (see tools/config.py to modify), Gemini 2.5, or Bedrock models, e.g. Claude 3 Haiku/Claude 3.7 Sonnet). The app will query the LLM with batches of responses to produce summary tables, which are then compared iteratively to output a table with the general topics, subtopics, topic sentiment, and relevant text rows related to them. The prompts are designed for topic modelling public consultations, but they can be adapted to different contexts (see the LLM settings tab to modify).
 
 Instructions on use can be found in the README.md file. Try it out with this [dummy development consultation dataset](https://huggingface.co/datasets/seanpedrickcase/dummy_development_consultation/tree/main), which you can also try with [zero-shot topics](https://huggingface.co/datasets/seanpedrickcase/dummy_development_consultation/tree/main). Try also this [dummy case notes dataset](https://huggingface.co/datasets/seanpedrickcase/dummy_case_notes/tree/main).
 
@@ -25,6 +25,120 @@ Basic use:
 2. Select the relevant open text column from the dropdown.
 3. If you have your own suggested (zero shot) topics, upload this (see examples folder for an example file)
 4. Write a one sentence description of the consultation/context of the open text.
-5. Extract topics.
-6.
-
+5. Click 'All in one - Extract topics, deduplicate, and summarise'. This will run through the whole analysis process from topic extraction, to topic deduplication, to topic-level and overall summaries.
+6. A summary xlsx workbook will be created on the front page in the box 'Overall summary xlsx file'. This will combine all the results from the different processes into one workbook.
+
```

(The remaining added lines are the new installation guide, rendered in full below.)
# Installation guide

Here is a step-by-step guide to clone the repository, create a virtual environment, and install dependencies from the relevant `requirements` file. This guide assumes you have **Git** and **Python 3.11** installed.

-----

### Step 1: Clone the Git Repository

First, copy the project files to your local machine. Navigate to the directory where you want to store the project using the `cd` (change directory) command, then use `git clone` with the repository's URL.

1. **Clone the repo:**

   ```bash
   git clone https://github.com/example-user/example-repo.git
   ```

   *Replace the URL with your repository's URL.*

2. **Navigate into the new project folder:**

   ```bash
   cd example-repo
   ```

-----

### Step 2: Create and Activate a Virtual Environment

A virtual environment is a self-contained directory that holds a specific Python interpreter and its own set of installed packages. This is crucial for isolating your project's dependencies.

1. **Create the virtual environment:** We'll use Python's built-in `venv` module. It's common practice to name the environment folder `.venv`.

   ```bash
   python -m venv .venv
   ```

   *This command tells Python to create a new virtual environment in a folder named `.venv`.*

2. **Activate the environment:** You must "activate" the environment to start using it. The command differs based on your operating system and shell.

   * **On macOS / Linux (bash/zsh):**

     ```bash
     source .venv/bin/activate
     ```

   * **On Windows (Command Prompt):**

     ```bash
     .\.venv\Scripts\activate
     ```

   * **On Windows (PowerShell):**

     ```powershell
     .\.venv\Scripts\Activate.ps1
     ```

   You'll know it's active because your command prompt will be prefixed with `(.venv)`.

-----

### Step 3: Install Dependencies

Now that your virtual environment is active, you can install all the required packages listed in the relevant project `requirements` file using `pip`.

1. **Choose the relevant requirements file**

   llama-cpp-python version 0.3.16 is compatible with Gemma 3 and GPT-OSS models, but at the time of writing does not have official wheels for CPU inference or for Windows. A sister repository contains [llama-cpp-python 0.3.16 wheels for Python 3.11/3.10](https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/tag/v0.1.0) so that users can avoid having to build the package from source (a direct wheel-install example is shown after this step). If you prefer to build from source, please refer to the llama-cpp-python documentation [here](https://github.com/abetlen/llama-cpp-python). I also have a guide to building the package on a Windows system [here](https://github.com/seanpedrick-case/llm_topic_modelling/blob/main/windows_install_llama-cpp-python.txt).

   The repo provides several requirements files for different situations. I would advise using requirements_gpu.txt for GPU environments, and requirements_cpu.txt for CPU environments:

   - **requirements_gpu.txt**: For Python 3.11 GPU-enabled environments. Uncomment the last requirement under 'Windows' for Windows compatibility (CUDA 12.4).
   - **requirements_cpu.txt**: For Python 3.11 CPU-only environments. Uncomment the last requirement under 'Windows' for Windows compatibility.
   - **requirements.txt**: For the Python 3.10 GPU-enabled environment on Hugging Face spaces (CUDA 12.4).
   - **requirements_aws.txt**: Used in conjunction with the Dockerfile for Python 3.11, CPU-only environments.

2. **Install packages from the requirements file:**

   ```bash
   pip install -r requirements_gpu.txt
   ```

   *This command reads every package name listed in the file and installs it into your `.venv` environment.*
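To use one of the prebuilt wheels directly rather than uncommenting it in a requirements file, pip can install straight from the release URL. For example, with the Linux Python 3.11 wheel from the sister repository (swap in the Windows wheel URL on a Windows machine):

```bash
# Install the prebuilt llama-cpp-python wheel, avoiding a source build
pip install https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/download/v0.1.0/llama_cpp_python-0.3.16-cp311-cp311-linux_x86_64.whl
```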
You're all set! Your project is cloned, and all dependencies are installed in an isolated environment.

When you are finished working, you can leave the virtual environment by simply typing:

```bash
deactivate
```

### Step 4: Verify CUDA compatibility (if using a GPU environment)

Install the relevant toolkit for CUDA 12.4 from here: https://developer.nvidia.com/cuda-12-4-0-download-archive

Restart your computer.

Ensure you have the latest drivers for your NVIDIA GPU; check your current driver version and memory availability by running `nvidia-smi`. On the command line, the installed CUDA toolkit version can be checked by running `nvcc --version`.
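Both checks together look like this (standard NVIDIA tools; if `nvcc` is not found, the CUDA toolkit's `bin` directory is likely missing from your PATH):

```bash
# Confirm the CUDA toolkit version nvcc was built against
# (should report 12.4 if the Step 4 installer was used)
nvcc --version

# Confirm the driver version and available GPU memory
nvidia-smi
```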
### Step 5: Ensure you have compatible NVIDIA drivers

Make sure you have the latest NVIDIA drivers installed on your system for your GPU (be particularly careful, if using WSL, that your drivers are compatible with it). Official drivers can be found here: https://www.nvidia.com/en-us/drivers

Current driver details can be found by running `nvidia-smi` on the command line.

### Step 6: Run the app

Go to the app project directory and run `python app.py`.
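A minimal sketch, assuming the repository folder from Step 1 (`example-repo`) and an activated virtual environment:

```bash
cd example-repo
python app.py
```

The Gradio interface should then be served on the port set by GRADIO_SERVER_PORT in tools/config.py.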
### Step 7: (optional) change default configuration

A number of configuration options can be seen in the tools/config.py file. You can either pass these variables in as environment variables, or create a file at config/app_config.env that is read into the app on initialisation.
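As a minimal sketch, such an env file might look like the following. The variable names come from the tools/config.py import list in app.py, but every value here is purely illustrative:

```bash
# config/app_config.env - illustrative values only; see tools/config.py for the full set
GEMINI_API_KEY=your-gemini-api-key
BATCH_SIZE_DEFAULT=5
OUTPUT_FOLDER=output/
RUN_LOCAL_MODEL=1
```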
app.py
CHANGED
```diff
@@ -12,7 +12,7 @@ from tools.custom_csvlogger import CSVLogger_custom
 from tools.auth import authenticate_user
 from tools.prompts import initial_table_prompt, prompt2, prompt3, system_prompt, add_existing_topics_system_prompt, add_existing_topics_prompt, verify_titles_prompt, verify_titles_system_prompt, two_para_summary_format_prompt, single_para_summary_format_prompt
 from tools.verify_titles import verify_titles
-from tools.config import RUN_AWS_FUNCTIONS, HOST_NAME, ACCESS_LOGS_FOLDER, FEEDBACK_LOGS_FOLDER, USAGE_LOGS_FOLDER, RUN_LOCAL_MODEL, FILE_INPUT_HEIGHT, GEMINI_API_KEY, model_full_names, BATCH_SIZE_DEFAULT, CHOSEN_LOCAL_MODEL_TYPE, LLM_SEED, COGNITO_AUTH, MAX_QUEUE_SIZE, MAX_FILE_SIZE, GRADIO_SERVER_PORT, ROOT_PATH, INPUT_FOLDER, OUTPUT_FOLDER, S3_LOG_BUCKET, CONFIG_FOLDER, GRADIO_TEMP_DIR, MPLCONFIGDIR, model_name_map, GET_COST_CODES, ENFORCE_COST_CODES, DEFAULT_COST_CODE, COST_CODES_PATH, S3_COST_CODES_PATH, OUTPUT_COST_CODES_PATH, SHOW_COSTS, SAVE_LOGS_TO_CSV, SAVE_LOGS_TO_DYNAMODB, ACCESS_LOG_DYNAMODB_TABLE_NAME, USAGE_LOG_DYNAMODB_TABLE_NAME, FEEDBACK_LOG_DYNAMODB_TABLE_NAME, LOG_FILE_NAME, FEEDBACK_LOG_FILE_NAME, USAGE_LOG_FILE_NAME, CSV_ACCESS_LOG_HEADERS, CSV_FEEDBACK_LOG_HEADERS, CSV_USAGE_LOG_HEADERS, DYNAMODB_ACCESS_LOG_HEADERS, DYNAMODB_FEEDBACK_LOG_HEADERS, DYNAMODB_USAGE_LOG_HEADERS, S3_ACCESS_LOGS_FOLDER, S3_FEEDBACK_LOGS_FOLDER, S3_USAGE_LOGS_FOLDER
+from tools.config import RUN_AWS_FUNCTIONS, HOST_NAME, ACCESS_LOGS_FOLDER, FEEDBACK_LOGS_FOLDER, USAGE_LOGS_FOLDER, RUN_LOCAL_MODEL, FILE_INPUT_HEIGHT, GEMINI_API_KEY, model_full_names, BATCH_SIZE_DEFAULT, CHOSEN_LOCAL_MODEL_TYPE, LLM_SEED, COGNITO_AUTH, MAX_QUEUE_SIZE, MAX_FILE_SIZE, GRADIO_SERVER_PORT, ROOT_PATH, INPUT_FOLDER, OUTPUT_FOLDER, S3_LOG_BUCKET, CONFIG_FOLDER, GRADIO_TEMP_DIR, MPLCONFIGDIR, model_name_map, GET_COST_CODES, ENFORCE_COST_CODES, DEFAULT_COST_CODE, COST_CODES_PATH, S3_COST_CODES_PATH, OUTPUT_COST_CODES_PATH, SHOW_COSTS, SAVE_LOGS_TO_CSV, SAVE_LOGS_TO_DYNAMODB, ACCESS_LOG_DYNAMODB_TABLE_NAME, USAGE_LOG_DYNAMODB_TABLE_NAME, FEEDBACK_LOG_DYNAMODB_TABLE_NAME, LOG_FILE_NAME, FEEDBACK_LOG_FILE_NAME, USAGE_LOG_FILE_NAME, CSV_ACCESS_LOG_HEADERS, CSV_FEEDBACK_LOG_HEADERS, CSV_USAGE_LOG_HEADERS, DYNAMODB_ACCESS_LOG_HEADERS, DYNAMODB_FEEDBACK_LOG_HEADERS, DYNAMODB_USAGE_LOG_HEADERS, S3_ACCESS_LOGS_FOLDER, S3_FEEDBACK_LOGS_FOLDER, S3_USAGE_LOGS_FOLDER, AWS_ACCESS_KEY, AWS_SECRET_KEY
 
 def ensure_folder_exists(output_folder:str):
     """Checks if the specified folder exists, creates it if not."""
@@ -115,6 +115,8 @@ with app:
     summarised_outputs_list = gr.Dropdown(value= list(), choices= list(), visible=False, label="List of summarised outputs", allow_custom_value=True)
     latest_summary_completed_num = gr.Number(0, visible=False)
 
+    summary_xlsx_output_files_list = gr.Dropdown(value= list(), choices= list(), visible=False, label="List of xlsx summary output files", allow_custom_value=True)
+
     original_data_file_name_textbox = gr.Textbox(label = "Reference data file name", value="", visible=False)
     working_data_file_name_textbox = gr.Textbox(label = "Working data file name", value="", visible=False)
     unique_topics_table_file_name_textbox = gr.Textbox(label="Unique topics data file name textbox", visible=False)
@@ -132,13 +134,17 @@ with app:
     cost_code_dataframe = gr.Dataframe(value=pd.DataFrame(), type="pandas", visible=False, wrap=True)
     cost_code_choice_drop = gr.Dropdown(value=DEFAULT_COST_CODE, label="Choose cost code for analysis. Please contact Finance if you can't find your cost code in the given list.", choices=[DEFAULT_COST_CODE], allow_custom_value=False, visible=False)
 
+    latest_batch_completed = gr.Number(value=0, label="Number of files prepared", interactive=False, visible=False)
+    # Duplicate version of the above variable for when you don't want to initiate the summarisation loop
+    latest_batch_completed_no_loop = gr.Number(value=0, label="Number of files prepared", interactive=False, visible=False)
+
     ###
     # UI LAYOUT
     ###
 
     gr.Markdown("""# Large language model topic modelling
 
-    Extract topics and summarise outputs using Large Language Models (LLMs: Gemma 3 4b/GPT-OSS 20b if local (see config.py), Gemini 2.5, or Bedrock models, e.g. Claude 3 Haiku/Claude 3.7 Sonnet). The app will query the LLM with batches of responses to produce summary tables, which are then compared iteratively to output a table with the general topics, subtopics, topic sentiment, and relevant text rows related to them. The prompts are designed for topic modelling public consultations, but they can be adapted to different contexts (see the LLM settings tab to modify).
+    Extract topics and summarise outputs using Large Language Models (LLMs: Gemma 3 4b/GPT-OSS 20b if local (see tools/config.py to modify), Gemini 2.5, or Bedrock models, e.g. Claude 3 Haiku/Claude 3.7 Sonnet). The app will query the LLM with batches of responses to produce summary tables, which are then compared iteratively to output a table with the general topics, subtopics, topic sentiment, and relevant text rows related to them. The prompts are designed for topic modelling public consultations, but they can be adapted to different contexts (see the LLM settings tab to modify).
 
     Instructions on use can be found in the README.md file. Try it out with this [dummy development consultation dataset](https://huggingface.co/datasets/seanpedrickcase/dummy_development_consultation/tree/main), which you can also try with [zero-shot topics](https://huggingface.co/datasets/seanpedrickcase/dummy_development_consultation/tree/main). Try also this [dummy case notes dataset](https://huggingface.co/datasets/seanpedrickcase/dummy_case_notes/tree/main).
 
@@ -183,14 +189,12 @@ with app:
     all_in_one_btn = gr.Button("All in one - Extract topics, deduplicate, and summarise", variant="primary")
     extract_topics_btn = gr.Button("1. Extract topics", variant="secondary")
 
-    with gr.Row():
+    with gr.Row(equal_height=True):
+        output_messages_textbox = gr.Textbox(value="", label="Output messages", scale=1, interactive=False)
+        topic_extraction_output_files = gr.File(label="Extract topics output files", scale=1, interactive=False)
+        topic_extraction_output_files_xlsx = gr.File(label="Overall summary xlsx file", scale=1, interactive=False)
 
     display_topic_table_markdown = gr.Markdown(value="### Language model response will appear here", show_copy_button=True)
-    latest_batch_completed = gr.Number(value=0, label="Number of files prepared", interactive=False, visible=False)
-    # Duplicate version of the above variable for when you don't want to initiate the summarisation loop
-    latest_batch_completed_no_loop = gr.Number(value=0, label="Number of files prepared", interactive=False, visible=False)
 
     data_feedback_title = gr.Markdown(value="## Please give feedback", visible=False)
     data_feedback_radio = gr.Radio(label="Please give some feedback about the results of the topic extraction.",
@@ -287,14 +291,15 @@ with app:
     with gr.Tab(label="LLM and topic extraction settings"):
         gr.Markdown("""Define settings that affect large language model output.""")
         with gr.Accordion("Settings for LLM generation", open = True):
-            temperature_slide = gr.Slider(minimum=0.1, maximum=1.0, value=0.1, label="Choose LLM temperature setting")
+            temperature_slide = gr.Slider(minimum=0.1, maximum=1.0, value=0.1, label="Choose LLM temperature setting", precision=1)
             batch_size_number = gr.Number(label = "Number of responses to submit in a single LLM query", value = BATCH_SIZE_DEFAULT, precision=0, minimum=1, maximum=100)
             random_seed = gr.Number(value=LLM_SEED, label="Random seed for LLM generation", visible=False)
 
         with gr.Accordion("AWS API keys", open = False):
+            gr.Markdown("""Querying Bedrock models with API keys requires a role with IAM permissions for the bedrock:InvokeModel action.""")
             with gr.Row():
-                aws_access_key_textbox = gr.Textbox(label="AWS access key",
-                aws_secret_key_textbox = gr.Textbox(label="AWS secret key",
+                aws_access_key_textbox = gr.Textbox(value=AWS_ACCESS_KEY, label="AWS access key", lines=1, type="password")
+                aws_secret_key_textbox = gr.Textbox(value=AWS_SECRET_KEY, label="AWS secret key", lines=1, type="password")
 
         with gr.Accordion("Gemini API keys", open = False):
             google_api_key_textbox = gr.Textbox(value = GEMINI_API_KEY, label="Enter Gemini API key (only if using Google API models)", lines=1, type="password")
@@ -413,10 +418,11 @@ with app:
     missing_df_state,
     input_tokens_num,
     output_tokens_num,
-    number_of_calls_num
-
+    number_of_calls_num,
+    output_messages_textbox],
+    api_name="extract_topics", show_progress_on=output_messages_textbox).\
     success(lambda *args: usage_callback.flag(list(args), save_to_csv=SAVE_LOGS_TO_CSV, save_to_dynamodb=SAVE_LOGS_TO_DYNAMODB, dynamodb_table_name=USAGE_LOG_DYNAMODB_TABLE_NAME, dynamodb_headers=DYNAMODB_USAGE_LOG_HEADERS, replacement_headers=CSV_USAGE_LOG_HEADERS), [session_hash_textbox, original_data_file_name_textbox, in_colnames, model_choice, conversation_metadata_textbox_placeholder, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, cost_code_choice_drop], None, preprocess=False, api_name="usage_logs").\
-    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_state, master_unique_topics_df_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[topic_extraction_output_files_xlsx])
+    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_state, master_unique_topics_df_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[topic_extraction_output_files_xlsx, summary_xlsx_output_files_list])
 
     ###
     # DEDUPLICATION AND SUMMARISATION FUNCTIONS
@@ -436,16 +442,16 @@ with app:
     success(fn= enforce_cost_codes, inputs=[enforce_cost_code_textbox, cost_code_choice_drop, cost_code_dataframe_base]).\
     success(load_in_previous_data_files, inputs=[summarisation_input_files], outputs=[master_reference_df_state, master_unique_topics_df_state, latest_batch_completed_no_loop, deduplication_input_files_status, working_data_file_name_textbox, unique_topics_table_file_name_textbox]).\
     success(sample_reference_table_summaries, inputs=[master_reference_df_state, random_seed], outputs=[summary_reference_table_sample_state, summarised_references_markdown], api_name="sample_summaries").\
-    success(summarise_output_topics, inputs=[summary_reference_table_sample_state, master_unique_topics_df_state, master_reference_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox, in_data_files, in_excel_sheets, in_colnames, log_files_output_list_state, summarise_format_radio, output_folder_state, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[summary_reference_table_sample_state, master_unique_topics_df_revised_summaries_state, master_reference_df_revised_summaries_state, summary_output_files, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox, summarised_output_markdown, log_files_output, overall_summarisation_input_files, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number], api_name="summarise_topics").\
+    success(summarise_output_topics, inputs=[summary_reference_table_sample_state, master_unique_topics_df_state, master_reference_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox, in_data_files, in_excel_sheets, in_colnames, log_files_output_list_state, summarise_format_radio, output_folder_state, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[summary_reference_table_sample_state, master_unique_topics_df_revised_summaries_state, master_reference_df_revised_summaries_state, summary_output_files, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox, summarised_output_markdown, log_files_output, overall_summarisation_input_files, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, output_messages_textbox], api_name="summarise_topics", show_progress_on=[output_messages_textbox, summary_output_files]).\
     success(lambda *args: usage_callback.flag(list(args), save_to_csv=SAVE_LOGS_TO_CSV, save_to_dynamodb=SAVE_LOGS_TO_DYNAMODB, dynamodb_table_name=USAGE_LOG_DYNAMODB_TABLE_NAME, dynamodb_headers=DYNAMODB_USAGE_LOG_HEADERS, replacement_headers=CSV_USAGE_LOG_HEADERS), [session_hash_textbox, original_data_file_name_textbox, in_colnames, model_choice, conversation_metadata_textbox_placeholder, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, cost_code_choice_drop], None, preprocess=False).\
-    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_revised_summaries_state, master_unique_topics_df_revised_summaries_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[summary_output_files_xlsx])
+    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_revised_summaries_state, master_unique_topics_df_revised_summaries_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[summary_output_files_xlsx, summary_xlsx_output_files_list])
 
     # SUMMARISE WHOLE TABLE PAGE
     overall_summarise_previous_data_btn.click(fn= enforce_cost_codes, inputs=[enforce_cost_code_textbox, cost_code_choice_drop, cost_code_dataframe_base]).\
     success(load_in_previous_data_files, inputs=[overall_summarisation_input_files], outputs=[master_reference_df_state, master_unique_topics_df_state, latest_batch_completed_no_loop, deduplication_input_files_status, working_data_file_name_textbox, unique_topics_table_file_name_textbox]).\
-    success(overall_summary, inputs=[master_unique_topics_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, output_folder_state, in_colnames, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[overall_summary_output_files, overall_summarised_output_markdown, summarised_output_df, conversation_metadata_textbox, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number], scroll_to_output=True, api_name="overall_summary").\
+    success(overall_summary, inputs=[master_unique_topics_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, output_folder_state, in_colnames, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[overall_summary_output_files, overall_summarised_output_markdown, summarised_output_df, conversation_metadata_textbox, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, output_messages_textbox], scroll_to_output=True, api_name="overall_summary", show_progress_on=[output_messages_textbox, overall_summary_output_files]).\
     success(lambda *args: usage_callback.flag(list(args), save_to_csv=SAVE_LOGS_TO_CSV, save_to_dynamodb=SAVE_LOGS_TO_DYNAMODB, dynamodb_table_name=USAGE_LOG_DYNAMODB_TABLE_NAME, dynamodb_headers=DYNAMODB_USAGE_LOG_HEADERS, replacement_headers=CSV_USAGE_LOG_HEADERS), [session_hash_textbox, original_data_file_name_textbox, in_colnames, model_choice, conversation_metadata_textbox_placeholder, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, cost_code_choice_drop], None, preprocess=False).\
-    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_state, master_unique_topics_df_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[overall_summary_output_files_xlsx])
+    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_state, master_unique_topics_df_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[overall_summary_output_files_xlsx, summary_xlsx_output_files_list])
 
 
     # All in one button
@@ -509,21 +515,20 @@ with app:
     missing_df_state,
    input_tokens_num,
     output_tokens_num,
-    number_of_calls_num
+    number_of_calls_num,
+    output_messages_textbox], show_progress_on=output_messages_textbox).\
     success(lambda *args: usage_callback.flag(list(args), save_to_csv=SAVE_LOGS_TO_CSV, save_to_dynamodb=SAVE_LOGS_TO_DYNAMODB, dynamodb_table_name=USAGE_LOG_DYNAMODB_TABLE_NAME, dynamodb_headers=DYNAMODB_USAGE_LOG_HEADERS, replacement_headers=CSV_USAGE_LOG_HEADERS), [session_hash_textbox, original_data_file_name_textbox, in_colnames, model_choice, conversation_metadata_textbox_placeholder, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, cost_code_choice_drop], None, preprocess=False).\
     success(load_in_previous_data_files, inputs=[deduplication_input_files], outputs=[master_reference_df_state, master_unique_topics_df_state, latest_batch_completed_no_loop, deduplication_input_files_status, working_data_file_name_textbox, unique_topics_table_file_name_textbox]).\
     success(deduplicate_topics, inputs=[master_reference_df_state, master_unique_topics_df_state, working_data_file_name_textbox, unique_topics_table_file_name_textbox, in_excel_sheets, merge_sentiment_drop, merge_general_topics_drop, deduplicate_score_threshold, in_data_files, in_colnames, output_folder_state], outputs=[master_reference_df_state, master_unique_topics_df_state, summarisation_input_files, log_files_output, summarised_output_markdown]).\
     success(load_in_previous_data_files, inputs=[summarisation_input_files], outputs=[master_reference_df_state, master_unique_topics_df_state, latest_batch_completed_no_loop, deduplication_input_files_status, working_data_file_name_textbox, unique_topics_table_file_name_textbox]).\
     success(sample_reference_table_summaries, inputs=[master_reference_df_state, random_seed], outputs=[summary_reference_table_sample_state, summarised_references_markdown]).\
-    success(summarise_output_topics, inputs=[summary_reference_table_sample_state, master_unique_topics_df_state, master_reference_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox, in_data_files, in_excel_sheets, in_colnames, log_files_output_list_state, summarise_format_radio, output_folder_state, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[summary_reference_table_sample_state, master_unique_topics_df_revised_summaries_state, master_reference_df_revised_summaries_state, summary_output_files, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox,
+    success(summarise_output_topics, inputs=[summary_reference_table_sample_state, master_unique_topics_df_state, master_reference_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox, in_data_files, in_excel_sheets, in_colnames, log_files_output_list_state, summarise_format_radio, output_folder_state, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[summary_reference_table_sample_state, master_unique_topics_df_revised_summaries_state, master_reference_df_revised_summaries_state, summary_output_files, summarised_outputs_list, latest_summary_completed_num, conversation_metadata_textbox, display_topic_table_markdown, log_files_output, overall_summarisation_input_files, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, output_messages_textbox], show_progress_on=[output_messages_textbox, summary_output_files]).\
     success(lambda *args: usage_callback.flag(list(args), save_to_csv=SAVE_LOGS_TO_CSV, save_to_dynamodb=SAVE_LOGS_TO_DYNAMODB, dynamodb_table_name=USAGE_LOG_DYNAMODB_TABLE_NAME, dynamodb_headers=DYNAMODB_USAGE_LOG_HEADERS, replacement_headers=CSV_USAGE_LOG_HEADERS), [session_hash_textbox, original_data_file_name_textbox, in_colnames, model_choice, conversation_metadata_textbox_placeholder, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, cost_code_choice_drop], None, preprocess=False).\
     success(load_in_previous_data_files, inputs=[overall_summarisation_input_files], outputs=[master_reference_df_state, master_unique_topics_df_state, latest_batch_completed_no_loop, deduplication_input_files_status, working_data_file_name_textbox, unique_topics_table_file_name_textbox]).\
-    success(overall_summary, inputs=[master_unique_topics_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, output_folder_state, in_colnames, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[overall_summary_output_files, overall_summarised_output_markdown, summarised_output_df, conversation_metadata_textbox, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number]).\
+    success(overall_summary, inputs=[master_unique_topics_df_state, model_choice, google_api_key_textbox, temperature_slide, working_data_file_name_textbox, output_folder_state, in_colnames, context_textbox, aws_access_key_textbox, aws_secret_key_textbox, model_name_map_state], outputs=[overall_summary_output_files, overall_summarised_output_markdown, summarised_output_df, conversation_metadata_textbox, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, output_messages_textbox], show_progress_on=[output_messages_textbox, overall_summary_output_files]).\
     success(lambda *args: usage_callback.flag(list(args), save_to_csv=SAVE_LOGS_TO_CSV, save_to_dynamodb=SAVE_LOGS_TO_DYNAMODB, dynamodb_table_name=USAGE_LOG_DYNAMODB_TABLE_NAME, dynamodb_headers=DYNAMODB_USAGE_LOG_HEADERS, replacement_headers=CSV_USAGE_LOG_HEADERS), [session_hash_textbox, original_data_file_name_textbox, in_colnames, model_choice, conversation_metadata_textbox_placeholder, input_tokens_num, output_tokens_num, number_of_calls_num, estimated_time_taken_number, cost_code_choice_drop], None, preprocess=False).\
-    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_state, master_unique_topics_df_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[overall_summary_output_files_xlsx]).\
-    success(move_overall_summary_output_files_to_front_page, inputs=[
-
-
+    success(collect_output_csvs_and_create_excel_output, inputs=[in_data_files, in_colnames, original_data_file_name_textbox, in_group_col, model_choice, master_reference_df_state, master_unique_topics_df_state, summarised_output_df, missing_df_state, in_excel_sheets, usage_logs_state, model_name_map_state, output_folder_state], outputs=[overall_summary_output_files_xlsx, summary_xlsx_output_files_list]).\
+    success(move_overall_summary_output_files_to_front_page, inputs=[summary_xlsx_output_files_list], outputs=[topic_extraction_output_files_xlsx])
 
     ###
     # CONTINUE PREVIOUS TOPIC EXTRACTION PAGE
```
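The long `.success()` chains above are how the app sequences its pipeline: each step runs only if the previous one finished without raising, and the new `show_progress_on` argument pins the progress indicator to the output messages textbox instead of the whole page. A minimal runnable sketch of the pattern, assuming Gradio 5.x (the component and function names here are illustrative, not the app's):

```python
import gradio as gr

def extract(text):
    # Stand-in for the app's topic-extraction step
    return f"Topics extracted from: {text!r}"

def summarise(status):
    # Stand-in for the summarisation step; only runs if extract() succeeded
    return status + " | summarised"

with gr.Blocks() as demo:
    inp = gr.Textbox(label="Input")
    output_messages = gr.Textbox(label="Output messages", interactive=False)
    run_btn = gr.Button("Run")

    # .success() chains the next step only after the previous one completes
    # without error; show_progress_on shows the spinner on the messages box.
    run_btn.click(extract, inputs=inp, outputs=output_messages,
                  show_progress_on=output_messages).success(
        summarise, inputs=output_messages, outputs=output_messages)

demo.launch()
```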
requirements.txt
CHANGED
```diff
@@ -1,3 +1,4 @@
+# Note that this requirements file is optimised for Hugging Face spaces / Python 3.10. Please use requirements_cpu.txt for CPU instances and requirements_gpu.txt for GPU instances using Python 3.11
 pandas==2.3.2
 gradio==5.44.1
 transformers==4.56.0
@@ -13,15 +14,13 @@ html5lib==1.1
 beautifulsoup4==4.12.3
 rapidfuzz==3.13.0
 python-dotenv==1.1.0
-# Torch and
+# Torch and llama-cpp-python
 # GPU
 torch==2.6.0 --extra-index-url https://download.pytorch.org/whl/cu124 # Latest compatible with CUDA 12.4
-https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.16-cu124/llama_cpp_python-0.3.16-cp310-cp310-linux_x86_64.whl
-#
-# CPU only (for e.g. Hugging Face CPU instances):
+https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.16-cu124/llama_cpp_python-0.3.16-cp310-cp310-linux_x86_64.whl
+# CPU only (for e.g. Hugging Face CPU instances)
 #torch==2.7.1 --extra-index-url https://download.pytorch.org/whl/cpu
-# For Hugging Face, need a python 3.10 compatible wheel for llama-cpp-python to avoid build timeouts
+# For Hugging Face, need a python 3.10 compatible wheel for llama-cpp-python to avoid build timeouts
 #https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/download/v0.1.0/llama_cpp_python-0.3.16-cp310-cp310-linux_x86_64.whl
-#https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/download/v0.1.0/llama_cpp_python-0.3.16-cp311-cp311-linux_x86_64.whl
 
 
```
requirements_aws.txt
CHANGED
```diff
@@ -1,3 +1,4 @@
+# This requirements file is optimised for AWS ECS using Python 3.11 alongside the Dockerfile, and assumes a python 3.11 compatible llama-cpp-python wheel is available (see Dockerfile). torch and llama-cpp-python are not present here, as they are installed in the main Dockerfile
 pandas==2.3.2
 gradio==5.44.1
 transformers==4.56.0
@@ -12,6 +13,4 @@ google-genai==1.32.0
 html5lib==1.1
 beautifulsoup4==4.12.3
 rapidfuzz==3.13.0
-python-dotenv==1.1.0
-# torch==2.7.1 --extra-index-url https://download.pytorch.org/whl/cpu # Commented out as Dockerfile should install torch
-# llama-cpp-python==0.3.16 # Commented out as Dockerfile should install llama-cpp-python
+python-dotenv==1.1.0
```
|
requirements_cpu.txt
ADDED
```diff
@@ -0,0 +1,23 @@
+pandas==2.3.2
+gradio==5.44.1
+transformers==4.56.0
+spaces==0.40.1
+boto3==1.40.22
+pyarrow==21.0.0
+openpyxl==3.1.5
+markdown==3.7
+tabulate==0.9.0
+lxml==5.3.0
+google-genai==1.32.0
+html5lib==1.1
+beautifulsoup4==4.12.3
+rapidfuzz==3.13.0
+python-dotenv==1.1.0
+torch==2.7.1 --extra-index-url https://download.pytorch.org/whl/cpu
+# Linux, Python 3.11 compatible wheel available:
+#https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/download/v0.1.0/llama_cpp_python-0.3.16-cp311-cp311-linux_x86_64.whl
+# Windows, Python 3.11 compatible wheel available:
+#https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/download/v0.1.0/llama_cpp_python-0.3.16-cp311-cp311-win_amd64_cpu_openblas.whl
+# If the above doesn't work for Windows, try looking at 'windows_install_llama-cpp-python.txt' for instructions on how to build from source
+# Alternatively, try installing the package from source below
+# llama-cpp-python==0.3.16
```
|
requirements_gpu.txt
CHANGED
```diff
@@ -19,6 +19,8 @@ torch==2.6.0 --extra-index-url https://download.pytorch.org/whl/cu124 # Latest c
 # For Linux:
 https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.16-cu124/llama_cpp_python-0.3.16-cp311-cp311-linux_x86_64.whl
 # For Windows:
-#llama-cpp-python
-# If above doesn't work for Windows, try looking at 'windows_install_llama-cpp-python.txt'
+#https://github.com/seanpedrick-case/llama-cpp-python-whl-builder/releases/download/v0.1.0/llama_cpp_python-0.3.16-cp311-cp311-win_amd64.whl
+# If the above doesn't work for Windows, try looking at 'windows_install_llama-cpp-python.txt' for instructions on how to build from source
+# If none of the above work for you, try the following:
+# llama-cpp-python==0.3.16 -C cmake.args="-DGGML_CUDA=on -DGGML_CUBLAS=on"
 
```
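If the commented source-build line is needed, the environment-variable form documented by llama-cpp-python is a hedged equivalent of the pip `-C cmake.args` form above; it requires the CUDA toolkit (`nvcc`) and a C++ compiler to be on the PATH:

```bash
# Build llama-cpp-python 0.3.16 from source with CUDA acceleration
CMAKE_ARGS="-DGGML_CUDA=on" pip install --no-cache-dir llama-cpp-python==0.3.16
```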
tools/aws_functions.py
CHANGED
@@ -62,68 +62,6 @@ def connect_to_s3_client(aws_access_key_textbox:str="", aws_secret_key_textbox:s
 
     return s3_client
 
-# def connect_to_sts_client(aws_access_key_textbox:str="", aws_secret_key_textbox:str="", sts_endpoint:str=""):
-#     # If running an anthropic model, assume that running an AWS sts model, load in sts
-#     sts_client = []
-
-#     if aws_access_key_textbox and aws_secret_key_textbox:
-#         print("Connecting to sts using AWS access key and secret keys from user input.")
-#         sts_client = boto3.client('sts',
-#                         aws_access_key_id=aws_access_key_textbox,
-#                         aws_secret_access_key=aws_secret_key_textbox, region_name=AWS_REGION)
-#     elif RUN_AWS_FUNCTIONS == "1" and PRIORITISE_SSO_OVER_AWS_ENV_ACCESS_KEYS == "1":
-#         print("Connecting to sts via existing SSO connection")
-#         sts_client = boto3.client('sts', region_name=AWS_REGION)
-#     elif AWS_ACCESS_KEY and AWS_SECRET_KEY:
-#         print("Getting sts credentials from environment variables")
-#         sts_client = boto3.client('sts',
-#                         aws_access_key_id=AWS_ACCESS_KEY,
-#                         aws_secret_access_key=AWS_SECRET_KEY,
-#                         region_name=AWS_REGION)
-#     else:
-#         sts_client = ""
-#         out_message = "Cannot connect to sts service. Please provide access keys under LLM settings, or choose another model type."
-#         print(out_message)
-#         raise Exception(out_message)
-
-#     return sts_client
-
-# def get_assumed_role_info(aws_access_key_textbox, aws_secret_key_textbox, sts_endpoint):
-#     sts_endpoint = 'https://sts.' + AWS_REGION + '.amazonaws.com'
-#     sts = connect_to_sts_client(aws_access_key_textbox, aws_secret_key_textbox, endpoint_url=sts_endpoint)
-
-#     #boto3.client('sts', region_name=AWS_REGION, endpoint_url=sts_endpoint)
-#     response = sts.get_caller_identity()
-
-#     # Extract ARN of the assumed role
-#     assumed_role_arn = response['Arn']
-
-#     # Extract the name of the assumed role from the ARN
-#     assumed_role_name = assumed_role_arn.split('/')[-1]
-
-#     return assumed_role_arn, assumed_role_name
-
-# if RUN_AWS_FUNCTIONS == "1":
-#     try:
-#         bucket_name = S3_LOG_BUCKET
-#         #session = boto3.Session() # profile_name="default"
-#     except Exception as e:
-#         print(e)
-
-#     try:
-#         assumed_role_arn, assumed_role_name = get_assumed_role_info(aws_access_key_textbox, aws_secret_key_textbox, sts_endpoint)
-
-#         #print("Assumed Role ARN:", assumed_role_arn)
-#         #print("Assumed Role Name:", assumed_role_name)
-
-#         print("Successfully assumed role with AWS STS")
-
-#     except Exception as e:
-#         print("Could not connect to AWS STS due to:", e)
-
 # Download direct from S3 - requires login credentials
 def download_file_from_s3(bucket_name:str, key:str, local_file_path:str, aws_access_key_textbox:str="", aws_secret_key_textbox:str="", RUN_AWS_FUNCTIONS=RUN_AWS_FUNCTIONS):
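The deleted block was commented-out scaffolding for inspecting an assumed role via AWS STS. For reference, what it boiled down to is a couple of boto3 calls that remain available directly; a minimal sketch (the region value is illustrative):

import boto3

# Ask STS who the current caller is; works with SSO, env keys, or a profile.
sts = boto3.client("sts", region_name="eu-west-2")  # region is illustrative
identity = sts.get_caller_identity()
print("Caller ARN:", identity["Arn"])
print("Assumed role name:", identity["Arn"].split("/")[-1])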
tools/combine_sheets_into_xlsx.py
CHANGED
@@ -102,7 +102,6 @@ def csvs_to_excel(
     for idx, csv_path in enumerate(csv_files):
         # Use provided sheet name or derive from file name
        sheet_name = sheet_names[idx] if sheet_names and idx < len(sheet_names) else os.path.splitext(os.path.basename(csv_path))[0]
-        print("csv_path:", csv_path)
         df = pd.read_csv(csv_path)
 
         if sheet_name == "Original data":
@@ -160,7 +159,7 @@ def csvs_to_excel(
 
     wb.save(output_filename)
 
-    print(f"
+    print(f"Output xlsx summary saved as '{output_filename}'")
 
     return output_filename
 
@@ -243,10 +242,9 @@ def collect_output_csvs_and_create_excel_output(in_data_files:List, chosen_cols:
     else:
         raise Exception("Could not find unique topic files to put into Excel format")
     if reference_table_csv_path:
-        #reference_table_csv_path = reference_table_csv_path[0]
         csv_files.append(reference_table_csv_path)
         sheet_names.append("Response level data")
-        column_widths["Response level data"] = {"A": 15, "B": 30, "C": 40, "
+        column_widths["Response level data"] = {"A": 15, "B": 30, "C": 40, "H": 100}
         wrap_text_columns["Response level data"] = ["C", "G"]
     else:
         raise Exception("Could not find any reference files to put into Excel format")
@@ -308,8 +306,6 @@ def collect_output_csvs_and_create_excel_output(in_data_files:List, chosen_cols:
     sheet_names.append("Original data")
     column_widths["Original data"] = {"A": 20, "B": 20, "C": 20}
     wrap_text_columns["Original data"] = ["C"]
-
-    print("Creating intro page and text")
 
     # Intro page text
     intro_text = [
@@ -381,7 +377,7 @@ def collect_output_csvs_and_create_excel_output(in_data_files:List, chosen_cols:
 
     xlsx_output_filenames = [xlsx_output_filename]
 
-    return xlsx_output_filenames
+    return xlsx_output_filenames, xlsx_output_filenames
 
 
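A note on the doubled return value at the end: returning the same list twice is the usual way to feed one result into two Gradio output slots, for example a visible file component plus a state variable that later steps can read, which fits the commit's new summary-file-management UI elements. A hypothetical wiring sketch (component names are illustrative, not the app's actual ones):

import gradio as gr

xlsx_files = gr.File(label="Output xlsx files", file_count="multiple")
xlsx_files_state = gr.State()

# One callback, two outputs fed by the two returned copies:
# create_xlsx_btn.click(collect_output_csvs_and_create_excel_output,
#                       inputs=[...], outputs=[xlsx_files, xlsx_files_state])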
tools/config.py
CHANGED
@@ -203,6 +203,7 @@ MAX_COMMENT_CHARS = int(get_or_create_env_var('MAX_COMMENT_CHARS', '14000'))
 
 RUN_LOCAL_MODEL = get_or_create_env_var("RUN_LOCAL_MODEL", "1")
 RUN_GEMINI_MODELS = get_or_create_env_var("RUN_GEMINI_MODELS", "1")
+RUN_AWS_BEDROCK_MODELS = get_or_create_env_var("RUN_AWS_BEDROCK_MODELS", "1")
 GEMINI_API_KEY = get_or_create_env_var('GEMINI_API_KEY', '')
 
 # Build up options for models
@@ -218,7 +219,7 @@ if RUN_LOCAL_MODEL == "1" and CHOSEN_LOCAL_MODEL_TYPE:
     model_short_names.append(CHOSEN_LOCAL_MODEL_TYPE)
     model_source.append("Local")
 
-if
+if RUN_AWS_BEDROCK_MODELS == "1":
     model_full_names.extend(["anthropic.claude-3-haiku-20240307-v1:0", "anthropic.claude-3-7-sonnet-20250219-v1:0"])
     model_short_names.extend(["haiku", "sonnet"])
     model_source.extend(["AWS", "AWS"])
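Since tools/config.py resolves each flag through get_or_create_env_var at import time, the new Bedrock toggle can be flipped before the app starts. A minimal sketch, assuming the variable is set before tools.config is first imported:

import os

# "1" (the default) shows the Bedrock Claude options; "0" hides them.
os.environ["RUN_AWS_BEDROCK_MODELS"] = "0"

from tools import config  # must come after the env var is set
print(config.model_full_names)  # the anthropic.claude-* entries should now be absent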
tools/dedup_summaries.py
CHANGED
@@ -529,6 +529,7 @@ def summarise_output_topics(sampled_reference_table_df:pd.DataFrame,
     acc_number_of_calls = 0
     time_taken = 0
     out_metadata_str = "" # Output metadata is currently replaced on starting a summarisation task
+    out_message = list()
 
     tic = time.perf_counter()
 
@@ -573,8 +574,8 @@ def summarise_output_topics(sampled_reference_table_df:pd.DataFrame,
     progress(0.1, f"Loading in local model: {CHOSEN_LOCAL_MODEL_TYPE}")
     local_model, tokenizer = load_model(local_model_type=CHOSEN_LOCAL_MODEL_TYPE, repo_id=LOCAL_REPO_ID, model_filename=LOCAL_MODEL_FILE, model_dir=LOCAL_MODEL_FOLDER)
 
-    summary_loop_description = "
-    summary_loop = tqdm(range(latest_summary_completed, length_all_summaries), desc="
+    summary_loop_description = "Revising topic-level summaries. " + str(latest_summary_completed) + " summaries completed so far."
+    summary_loop = tqdm(range(latest_summary_completed, length_all_summaries), desc="Revising topic-level summaries", unit="summaries")
 
     if do_summaries == "Yes":
 
@@ -675,9 +676,13 @@ def summarise_output_topics(sampled_reference_table_df:pd.DataFrame,
     acc_input_tokens, acc_output_tokens, acc_number_of_calls = calculate_tokens_from_metadata(out_metadata_str, model_choice, model_name_map)
 
     toc = time.perf_counter()
-    time_taken = toc - tic
+    time_taken = toc - tic
+
+    out_message = '\n'.join(out_message)
+    out_message = out_message + " " + f"Topic summarisation finished processing. Total time: {time_taken:.2f}s"
+    print(out_message)
 
-    return sampled_reference_table_df, topic_summary_df_revised, reference_table_df_revised, output_files, summarised_outputs, latest_summary_completed, out_metadata_str, summarised_output_markdown, log_output_files, output_files, acc_input_tokens, acc_output_tokens, acc_number_of_calls, time_taken
+    return sampled_reference_table_df, topic_summary_df_revised, reference_table_df_revised, output_files, summarised_outputs, latest_summary_completed, out_metadata_str, summarised_output_markdown, log_output_files, output_files, acc_input_tokens, acc_output_tokens, acc_number_of_calls, time_taken, out_message
 
 @spaces.GPU(duration=120)
 def overall_summary(topic_summary_df:pd.DataFrame,
@@ -747,6 +752,7 @@ def overall_summary(topic_summary_df:pd.DataFrame,
     output_tokens_num = 0
     number_of_calls_num = 0
     time_taken = 0
+    out_message = list()
 
     tic = time.perf_counter()
 
@@ -792,7 +798,7 @@ def overall_summary(topic_summary_df:pd.DataFrame,
     local_model, tokenizer = load_model(local_model_type=CHOSEN_LOCAL_MODEL_TYPE, repo_id=LOCAL_REPO_ID, model_filename=LOCAL_MODEL_FILE, model_dir=LOCAL_MODEL_FOLDER)
     #print("Local model loaded:", local_model)
 
-    summary_loop = tqdm(unique_groups, desc="Creating
+    summary_loop = tqdm(unique_groups, desc="Creating overall summary for groups", unit="groups")
 
     if do_summaries == "Yes":
         model_source = model_name_map[model_choice]["source"]
@@ -800,7 +806,7 @@ def overall_summary(topic_summary_df:pd.DataFrame,
 
     for summary_group in summary_loop:
 
-        print("Creating
+        print("Creating overall summary for group:", summary_group)
 
         summary_text = topic_summary_df.loc[topic_summary_df["Group"]==summary_group].to_markdown(index=False)
 
@@ -879,6 +885,8 @@ def overall_summary(topic_summary_df:pd.DataFrame,
     toc = time.perf_counter()
     time_taken = toc - tic
 
+    out_message = '\n'.join(out_message)
+    out_message = out_message + " " + f"Overall summary finished processing. Total time: {time_taken:.2f}s"
+    print(out_message)
 
-    return output_files, html_output_table, summarised_outputs_df, out_metadata_str, input_tokens_num, output_tokens_num, number_of_calls_num, time_taken
+    return output_files, html_output_table, summarised_outputs_df, out_metadata_str, input_tokens_num, output_tokens_num, number_of_calls_num, time_taken, out_message
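Both functions now build their status text the same way: out_message starts as a list, per-step strings get appended elsewhere in the function bodies, and a single join at the end produces one string for the new UI message output. A stripped-down sketch of the pattern (group names and messages are illustrative):

out_message = list()

for group in ["Group A", "Group B"]:              # illustrative groups
    # ... summarise the group ...
    out_message.append(f"Finished summarising: {group}")

out_message = '\n'.join(out_message)
out_message = out_message + " " + "All summaries finished."
print(out_message)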
tools/llm_api_call.py
CHANGED
@@ -17,7 +17,7 @@ GradioFileData = gr.FileData
 
 from tools.prompts import initial_table_prompt, prompt2, prompt3, initial_table_system_prompt, add_existing_topics_system_prompt, add_existing_topics_prompt, force_existing_topics_prompt, allow_new_topics_prompt, force_single_topic_prompt, add_existing_topics_assistant_prefill, initial_table_assistant_prefill, structured_summary_prompt
 from tools.helper_functions import read_file, put_columns_in_df, wrap_text, initial_clean, load_in_data_file, load_in_file, create_topic_summary_df_from_reference_table, convert_reference_table_to_pivot_table, get_basic_response_data, clean_column_name, load_in_previous_data_files, create_batch_file_path_details
 from tools.llm_funcs import ResponseObject, construct_gemini_generative_model, call_llm_with_markdown_table_checks, create_missing_references_df, calculate_tokens_from_metadata
-from tools.config import RUN_LOCAL_MODEL, AWS_REGION, MAX_COMMENT_CHARS, MAX_OUTPUT_VALIDATION_ATTEMPTS, MAX_TOKENS, TIMEOUT_WAIT, NUMBER_OF_RETRY_ATTEMPTS, MAX_TIME_FOR_LOOP, BATCH_SIZE_DEFAULT, DEDUPLICATION_THRESHOLD,
+from tools.config import RUN_LOCAL_MODEL, AWS_REGION, MAX_COMMENT_CHARS, MAX_OUTPUT_VALIDATION_ATTEMPTS, MAX_TOKENS, TIMEOUT_WAIT, NUMBER_OF_RETRY_ATTEMPTS, MAX_TIME_FOR_LOOP, BATCH_SIZE_DEFAULT, DEDUPLICATION_THRESHOLD, model_name_map, OUTPUT_FOLDER, CHOSEN_LOCAL_MODEL_TYPE, LOCAL_REPO_ID, LOCAL_MODEL_FILE, LOCAL_MODEL_FOLDER, LLM_SEED, MAX_GROUPS, REASONING_SUFFIX
 from tools.aws_functions import connect_to_bedrock_runtime
 
 if RUN_LOCAL_MODEL == "1":
@@ -1283,6 +1283,7 @@ def wrapper_extract_topics_per_column_value(
     acc_input_tokens = 0
     acc_output_tokens = 0
     acc_number_of_calls = 0
+    out_message = list()
 
     if grouping_col is None:
         print("No grouping column found")
@@ -1321,8 +1322,14 @@ def wrapper_extract_topics_per_column_value(
 
     wrapper_first_loop = initial_first_loop_state
 
-
-
+    if len(unique_values) == 1:
+        loop_object = enumerate(unique_values)
+    else:
+        loop_object = tqdm(enumerate(unique_values), desc=f"Analysing group", total=len(unique_values), unit="groups")
+
+
+    for i, group_value in loop_object:
+        print(f"\nProcessing group: {grouping_col} = {group_value} ({i+1}/{len(unique_values)})")
 
         filtered_file_data = file_data.copy()
 
@@ -1440,7 +1447,7 @@ def wrapper_extract_topics_per_column_value(
             acc_total_time_taken += float(seg_time_taken)
             acc_gradio_df = seg_gradio_df # Keep the latest Gradio DF
 
-            print(f"
+            print(f"Group {grouping_col} = {group_value} processed. Time: {seg_time_taken:.2f}s")
 
         except Exception as e:
             print(f"Error processing segment {grouping_col} = {group_value}: {e}")
@@ -1481,7 +1488,9 @@ def wrapper_extract_topics_per_column_value(
 
     acc_input_tokens, acc_output_tokens, acc_number_of_calls = calculate_tokens_from_metadata(acc_whole_conversation_metadata, model_choice, model_name_map)
 
-
+    out_message = '\n'.join(out_message)
+    out_message = out_message + " " + f"Topic extraction finished processing all groups. Total time: {acc_total_time_taken:.2f}s"
+    print(out_message)
 
     # The return signature should match extract_topics.
     # The aggregated lists will be returned in the multiple slots.
@@ -1505,7 +1514,8 @@ def wrapper_extract_topics_per_column_value(
         acc_missing_df,
         acc_input_tokens,
         acc_output_tokens,
-        acc_number_of_calls
+        acc_number_of_calls,
+        out_message
     )
 
 
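The group loop above conditionally wraps its iterator so that a single-group run does not render a pointless progress bar. The same idea as a standalone sketch (the function name is illustrative):

from tqdm import tqdm

def iter_with_optional_progress(values):
    # Only show a progress bar when there is more than one item to loop over.
    if len(values) == 1:
        return enumerate(values)
    return tqdm(enumerate(values), desc="Analysing group", total=len(values), unit="groups")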
tools/llm_funcs.py
CHANGED
@@ -18,7 +18,7 @@ full_text = "" # Define dummy source text (full text) just to enable highlight f
 
 model = list() # Define empty list for model functions to run
 tokenizer = list() #[] # Define empty list for model functions to run
 
-from tools.config import
+from tools.config import AWS_REGION, LLM_TEMPERATURE, LLM_TOP_K, LLM_MIN_P, LLM_TOP_P, LLM_REPETITION_PENALTY, LLM_LAST_N_TOKENS, LLM_MAX_NEW_TOKENS, LLM_SEED, LLM_RESET, LLM_STREAM, LLM_THREADS, LLM_BATCH_SIZE, LLM_CONTEXT_LENGTH, LLM_SAMPLE, MAX_TOKENS, TIMEOUT_WAIT, NUMBER_OF_RETRY_ATTEMPTS, MAX_TIME_FOR_LOOP, BATCH_SIZE_DEFAULT, DEDUPLICATION_THRESHOLD, MAX_COMMENT_CHARS, CHOSEN_LOCAL_MODEL_TYPE, LOCAL_REPO_ID, LOCAL_MODEL_FILE, LOCAL_MODEL_FOLDER, HF_TOKEN, LLM_SEED, LLM_MAX_GPU_LAYERS, SPECULATIVE_DECODING, NUM_PRED_TOKENS
 from tools.prompts import initial_table_assistant_prefill
 
 if SPECULATIVE_DECODING == "True": SPECULATIVE_DECODING = True
@@ -220,8 +220,6 @@ def load_model(local_model_type:str=CHOSEN_LOCAL_MODEL_TYPE,
     # Verify the device and cuda settings
     # Check if CUDA is enabled
     import torch
-    #if RUN_LOCAL_MODEL == "1":
-    #print("Running local model - importing llama-cpp-python")
     from llama_cpp import Llama
     from llama_cpp.llama_speculative import LlamaPromptLookupDecoding
 
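For readers unfamiliar with the speculative-decoding import at the end of that hunk, this is roughly how LlamaPromptLookupDecoding plugs into a Llama instance in llama-cpp-python. The path and parameter values below are illustrative; the app reads its real equivalents (such as NUM_PRED_TOKENS and LLM_MAX_GPU_LAYERS) from tools.config:

from llama_cpp import Llama
from llama_cpp.llama_speculative import LlamaPromptLookupDecoding

llm = Llama(
    model_path="models/gemma-3-4b.gguf",   # hypothetical path
    n_gpu_layers=-1,                        # offload all layers if the build has GPU support
    draft_model=LlamaPromptLookupDecoding(num_pred_tokens=2),  # prompt-lookup draft model
)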
windows_install_llama-cpp-python.txt
CHANGED
@@ -1,26 +1,11 @@
 ---
 
-
-
 
-
----
-
-#
-
-pip install llama-cpp-python==0.3.16 --force-reinstall --no-cache-dir --verbose -C cmake.args="-DGGML_CUDA=on"
-
-
----
-
-
-How to Make it Work: Step-by-Step Guide
-To successfully run your command, you need to set up a proper C++ development environment.
-
-Step 1: Install the C++ Compiler
-Go to the Visual Studio downloads page.
-
-Scroll down to "Tools for Visual Studio" and download the "Build Tools for Visual Studio". This is a standalone installer that gives you the C++ compiler and libraries without installing the full Visual Studio IDE.
+# How to build llama-cpp-python on Windows: Step-by-Step Guide
+
+First, you need to set up a proper C++ development environment.
+
+# Step 1: Install the C++ Compiler
+Go to the Visual Studio downloads page, scroll past the main programs to "Tools for Visual Studio", and download the "Build Tools for Visual Studio". This is a standalone installer that gives you the C++ compiler and libraries without installing the full Visual Studio IDE.
 
 Run the installer. In the "Workloads" tab, check the box for "Desktop development with C++".
 
@@ -34,18 +19,17 @@ Windows 10 SDK (10.0.20348.0)
 
 Proceed with the installation.
 
-
-
-Step 2: Install CMake
-Go to the CMake download page.
+You will need to use the 'x64 Native Tools Command Prompt for VS 2022' (run as administrator) for the install commands below.
+
+# Step 2: Install CMake
+Go to the CMake download page: https://cmake.org/download
 
 Download the latest Windows installer (e.g., cmake-x.xx.x-windows-x86_64.msi).
 
 Run the installer. Crucially, when prompted, select the option to "Add CMake to the system PATH for all users" or "for the current user." This allows you to run cmake from any command prompt.
 
 
-Step 3: Download and Place OpenBLAS
+# Step 3: (FOR CPU INFERENCE ONLY) Download and Place OpenBLAS
 This is often the trickiest part.
 
 Go to the OpenBLAS releases on GitHub.
@@ -56,14 +40,12 @@ Create a folder somewhere easily accessible, for example, C:\libs\.
 
 Extract the contents of the OpenBLAS zip file into that folder. Your final directory structure should look something like this:
 
-Generated code
 C:\libs\OpenBLAS\
 ├── bin\
 ├── include\
 └── lib\
-Use code with caution.
 
-3.b. Install Chocolatey
+## 3.b. Install Chocolatey
 https://chocolatey.org/install
 
 Step 1: Install Chocolatey (if you don't already have it)
@@ -71,25 +53,20 @@ Open PowerShell as an Administrator. (Right-click the Start Menu -> "Windows Pow
 
 Run the following command to install Chocolatey. It's a single, long line:
 
-Generated powershell
 Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))
-
-
-Wait for it to finish. Once it's done, close the Administrator PowerShell window.
+
+Once it's done, close the Administrator PowerShell window.
 
 Step 2: Install pkg-config-lite using Chocolatey
-IMPORTANT: Open a NEW command prompt or PowerShell window (as a regular user is fine). This is necessary so it
+IMPORTANT: Open a NEW command prompt or PowerShell window (as a regular user is fine). This is necessary so it recognises the new choco command.
 
-Run the following command to install a lightweight version of pkg-config:
+Run the following command in the console to install a lightweight version of pkg-config:
 
-Generated cmd
 choco install pkgconfiglite
-Use code with caution.
-Cmd
-Approve the installation by typing Y or A if prompted.
+
+Approve the installation by typing Y or A if prompted.
 
-Step 4: Run the Installation Command
+# Step 4: Run the Installation Command
 Now you have all the pieces. The final step is to run the command in a terminal that is aware of your new build environment.
 
 Open the "Developer Command Prompt for VS" from your Start Menu. This is important! This special command prompt automatically configures all the necessary paths for the C++ compiler.
@@ -98,22 +75,35 @@ Open the "Developer Command Prompt for VS" from your Start Menu. This is importa
 
 set PKG_CONFIG_PATH=C:\<path-to-openblas>\OpenBLAS\lib\pkgconfig # Set this in environment variables
 
-
 pip install llama-cpp-python==0.3.16 --force-reinstall --verbose --no-cache-dir -Ccmake.args="-DGGML_BLAS=ON;-DGGML_BLAS_VENDOR=OpenBLAS;-DBLAS_INCLUDE_DIRS=C:/<path-to-openblas>/OpenBLAS/include;-DBLAS_LIBRARIES=C:/<path-to-openblas>/OpenBLAS/lib/libopenblas.lib"
 
+or, to build a wheel instead (note that it is pip wheel, not pip install, that accepts --wheel-dir):
+
+pip wheel llama-cpp-python==0.3.16 --wheel-dir dist --verbose --no-cache-dir -Ccmake.args="-DGGML_BLAS=ON;-DGGML_BLAS_VENDOR=OpenBLAS;-DBLAS_INCLUDE_DIRS=C:/<path-to-openblas>/OpenBLAS/include;-DBLAS_LIBRARIES=C:/<path-to-openblas>/OpenBLAS/lib/libopenblas.lib"
+
+## With CUDA (NVIDIA GPUs only)
+
+Make sure that you have the CUDA 12.4 toolkit for Windows installed: https://developer.nvidia.com/cuda-12-4-0-download-archive
+
+### Make sure you are using the x64 version of the Developer command tools for the below, e.g. 'x64 Native Tools Command Prompt for VS 2022' ###
 
 Use NVIDIA GPU (cuBLAS): If you have an NVIDIA GPU, using cuBLAS is often easier because the CUDA Toolkit installer handles most of the setup.
 
 Install the NVIDIA CUDA Toolkit.
 
-Run the install command specifying cuBLAS:
+Run the install command specifying cuBLAS (for faster inference):
 
-pip install llama-cpp-python==0.3.16 --force-reinstall --no-cache-dir --verbose -C cmake.args="-DGGML_CUDA=on"
+pip install llama-cpp-python==0.3.16 --force-reinstall --verbose -C cmake.args="-DGGML_CUDA=on -DGGML_CUBLAS=on"
+
+If you want to create a new wheel to help with future installs, first cd to a folder that you have write access to, then run:
+
+pip wheel llama-cpp-python==0.3.16 --wheel-dir dist --verbose -C cmake.args="-DGGML_CUDA=on -DGGML_CUBLAS=on"
 
 
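After a successful build, a quick smoke test along these lines confirms that the wheel imports and can run a model; the model path is hypothetical and any small GGUF file will do:

from llama_cpp import Llama

# Load a local GGUF model; n_gpu_layers=-1 offloads all layers if the
# build has CUDA support, and is ignored by CPU-only builds.
llm = Llama(model_path=r"C:\models\gemma-3-4b.gguf", n_gpu_layers=-1)
out = llm("Say hello in one short sentence.", max_tokens=32)
print(out["choices"][0]["text"])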