Team Name

Kepler

Members

  • Ashton Teng
  • Quinn Leng

Project

Title

Kepler: Natural Language AI Agent for Tahoe-100M Exploration

Overview

Kepler lets biologists query the Tahoe-100M dataset in plain English, automating data access, analysis, and visualization without coding.

Motivation

High-dimensional datasets like Tahoe-100M require heavy compute setup, tool expertise, and programming skill. These barriers slow scientific insight.

We demonstrate that an agent can let users run simple analyses using natural language alone.

Methods

  • Extracted a pseudobulked subset with Vision differential expression scores.
  • Loaded metadata tables for cell lines, drugs, and gene sets.
  • Built an AI agent to translate natural-language queries into analysis code and visual outputs.
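The last step above can be sketched as follows. This is a minimal illustration under stated assumptions, not Kepler's actual implementation: `AnalysisPlan`, `translate_query`, `render_code`, the mutation list, and the record fields are all hypothetical, and a keyword-based stub stands in for the language model so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnalysisPlan:
    """Structured description of an analysis (hypothetical schema)."""
    mutation: Optional[str]  # e.g. "BRAF.V600E"; None means no filter
    direction: str           # "up" or "down"


# Hypothetical mutation labels the stub recognises.
KNOWN_MUTATIONS = ["BRAF.V600E", "KRAS.G12C"]


def translate_query(query: str) -> AnalysisPlan:
    """Stand-in for the LLM call: map a question to a plan via keywords."""
    text = query.lower()
    mutation = next((m for m in KNOWN_MUTATIONS if m.lower() in text), None)
    direction = "down" if "downregulated" in text else "up"
    return AnalysisPlan(mutation=mutation, direction=direction)


def render_code(plan: AnalysisPlan) -> str:
    """Turn the plan into analysis code as text (illustrative only)."""
    lines = ["subset = scores"]
    if plan.mutation is not None:
        lines.append(
            f"subset = [r for r in subset if r['mutation'] == {plan.mutation!r}]"
        )
    descending = plan.direction == "up"
    lines.append(
        f"ranked = sorted(subset, key=lambda r: r['score'], reverse={descending})"
    )
    return "\n".join(lines)


# Toy records with made-up values, used only to exercise the sketch.
scores = [
    {"mutation": "BRAF.V600E", "pathway": "PATHWAY_A", "score": 0.2},
    {"mutation": "BRAF.V600E", "pathway": "PATHWAY_B", "score": 1.1},
    {"mutation": "WT", "pathway": "PATHWAY_C", "score": 2.0},
]

plan = translate_query("Which pathways are upregulated in BRAF.V600E mutant models?")
code = render_code(plan)

# Run the generated code in its own namespace; a real agent would sandbox this.
namespace = {"scores": scores}
exec(code, namespace)
ranked = namespace["ranked"]
```

Routing the query through a structured plan before generating code keeps the translation step inspectable: the plan can be validated or shown to the user before anything is executed.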

Results

Demo query: “Which pathways are upregulated in BRAF.V600E mutant models after inhibitor treatment?”
The agent automatically filtered the data, ran the analysis, and generated plots with interpretations.
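The filter-and-rank step behind a query like this might look roughly like the sketch below. It is an assumption-laden illustration: the field names (`mutation`, `treated`, `pathway`, `score`), the function `upregulated_pathways`, and all values are hypothetical placeholders, not Tahoe-100M data or Kepler output.

```python
from collections import defaultdict
from statistics import mean

# Toy rows with made-up values; field names are hypothetical.
rows = [
    {"mutation": "BRAF.V600E", "treated": True, "pathway": "PATHWAY_A", "score": 1.5},
    {"mutation": "BRAF.V600E", "treated": True, "pathway": "PATHWAY_A", "score": 0.5},
    {"mutation": "BRAF.V600E", "treated": True, "pathway": "PATHWAY_B", "score": -1.0},
    {"mutation": "BRAF.V600E", "treated": False, "pathway": "PATHWAY_B", "score": 2.0},
    {"mutation": "WT", "treated": True, "pathway": "PATHWAY_C", "score": 3.0},
]


def upregulated_pathways(rows, mutation):
    """Return (pathway, mean score) pairs with positive mean, highest first."""
    by_pathway = defaultdict(list)
    for row in rows:
        # Keep only treated samples carrying the requested mutation.
        if row["mutation"] == mutation and row["treated"]:
            by_pathway[row["pathway"]].append(row["score"])
    means = {pathway: mean(vals) for pathway, vals in by_pathway.items()}
    positive = [(p, m) for p, m in means.items() if m > 0]
    return sorted(positive, key=lambda pm: pm[1], reverse=True)


result = upregulated_pathways(rows, "BRAF.V600E")
```

Here a positive mean score is treated as "upregulated"; a real analysis would likely apply a statistical threshold instead.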

Discussion

  • Scalability: Move initial subsetting to DuckDB or Databricks for larger subsets.
  • Knowledge alignment: Enhance the agent’s scientific context for broader, valid analyses.
  • Next steps: Expand to full Tahoe-100M and optimize compute pipeline.
