Team Name
Kepler
Members
- Ashton Teng
- Quinn Leng
Project
Title
Kepler: Natural Language AI Agent for Tahoe-100M Exploration
Overview
Kepler lets biologists query the Tahoe-100M dataset in plain English, automating data access, analysis, and visualization without coding.
Motivation
High-dimensional datasets like Tahoe-100M require heavy compute setup, tool expertise, and programming skill—barriers that slow scientific insight.
We demonstrate that an AI agent can let users run simple analyses on this data using natural language alone.
Methods
- Extracted a pseudobulked subset with Vision differential expression scores.
- Loaded metadata tables for cell lines, drugs, and gene sets.
- Built an AI agent to translate natural-language queries into analysis code and visual outputs.
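The query-translation step can be illustrated with a minimal sketch. It is not the project's actual implementation: where the agent described above emits analysis code, this sketch has the model return a small JSON plan that is dispatched to a registry of pre-written analysis functions. The names `AnalysisPlan`, `plan_query`, `run_agent`, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns text) are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class AnalysisPlan:
    """Hypothetical plan schema: which analysis to run and which filters to apply."""
    analysis: str
    filters: dict


# Hypothetical prompt; a real agent would also describe the available metadata.
PROMPT_TEMPLATE = (
    "Translate the question into JSON with keys 'analysis' and 'filters'.\n"
    "Allowed analyses: {analyses}\n"
    "Question: {question}"
)


def plan_query(question: str, llm: Callable[[str], str], allowed: Set[str]) -> AnalysisPlan:
    """Ask the model for a JSON plan and validate it before anything is executed."""
    prompt = PROMPT_TEMPLATE.format(analyses=", ".join(sorted(allowed)), question=question)
    data = json.loads(llm(prompt))
    if data.get("analysis") not in allowed:
        raise ValueError(f"unsupported analysis: {data.get('analysis')!r}")
    return AnalysisPlan(analysis=data["analysis"], filters=dict(data.get("filters", {})))


def run_agent(question: str, llm: Callable[[str], str],
              registry: Dict[str, Callable[[dict], object]]):
    """Plan the query, then dispatch to the matching registered analysis function."""
    plan = plan_query(question, llm, set(registry))
    return registry[plan.analysis](plan.filters)


if __name__ == "__main__":
    # Stand-in for a real model call, so the sketch runs offline.
    def fake_llm(prompt: str) -> str:
        return json.dumps({"analysis": "top_pathways",
                           "filters": {"mutation": "BRAF.V600E"}})

    registry = {"top_pathways": lambda filters: f"ranking pathways for {filters}"}
    print(run_agent("Which pathways are upregulated in BRAF.V600E models?",
                    fake_llm, registry))
```

Validating the plan against a fixed registry is one way to keep model output from running arbitrary code; an agent that generates code directly would need sandboxing instead.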
Results
Demo query: “Which pathways are upregulated in BRAF.V600E mutant models after inhibitor treatment?”
The agent automatically filtered the data, ran the analysis, and generated plots with accompanying interpretations.
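The filter-then-rank logic behind a query like this can be sketched on toy data. Everything here is illustrative: the rows, cell line IDs, drug names, and pathway names are made up, the record layout is an assumption rather than the real Tahoe-100M schema, and "upregulated" is taken to mean a positive mean score across the selected samples.

```python
from collections import defaultdict
from statistics import mean

# Toy pseudobulk rows (made-up values): one score per (cell line, drug, gene set).
SCORES = [
    {"cell_line": "A", "drug": "drug_x", "gene_set": "PATHWAY_1", "score": 1.0},
    {"cell_line": "B", "drug": "drug_x", "gene_set": "PATHWAY_1", "score": 0.5},
    {"cell_line": "A", "drug": "drug_x", "gene_set": "PATHWAY_2", "score": -0.5},
    {"cell_line": "B", "drug": "drug_x", "gene_set": "PATHWAY_2", "score": -0.25},
    {"cell_line": "A", "drug": "drug_x", "gene_set": "PATHWAY_3", "score": 0.25},
    {"cell_line": "B", "drug": "drug_x", "gene_set": "PATHWAY_3", "score": 0.25},
    {"cell_line": "C", "drug": "drug_x", "gene_set": "PATHWAY_1", "score": 2.0},
    {"cell_line": "A", "drug": "drug_y", "gene_set": "PATHWAY_1", "score": 3.0},
]

# Toy cell line metadata (hypothetical): mutations carried by each line.
CELL_LINES = {
    "A": {"mutations": {"BRAF.V600E"}},
    "B": {"mutations": {"BRAF.V600E"}},
    "C": {"mutations": set()},
}


def rank_upregulated(scores, cell_lines, mutation, drugs):
    """Keep rows from mutant lines treated with the given drugs, then rank
    gene sets by mean score, retaining only those with a positive mean."""
    by_gene_set = defaultdict(list)
    for row in scores:
        is_mutant = mutation in cell_lines[row["cell_line"]]["mutations"]
        if is_mutant and row["drug"] in drugs:
            by_gene_set[row["gene_set"]].append(row["score"])
    means = {gene_set: mean(vals) for gene_set, vals in by_gene_set.items()}
    ranked = [(gene_set, m) for gene_set, m in means.items() if m > 0]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for gene_set, avg in rank_upregulated(SCORES, CELL_LINES, "BRAF.V600E", {"drug_x"}):
        print(gene_set, avg)
```

A real analysis would compare treated samples against matched controls and test for significance; the mean-score threshold here only shows the shape of the computation.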
Discussion
- Scalability: Move initial subsetting to DuckDB or Databricks for larger subsets.
- Knowledge alignment: Enhance the agent’s scientific context for broader, valid analyses.
- Next steps: Expand to full Tahoe-100M and optimize compute pipeline.