This is a roberta-base model trained on plant DNA promoter sequences and fine-tuned on gene expression values (normalized to TPM) measured in 8 tissues of maize cultivars, with each expression profile paired to its corresponding promoter sequence. The model is currently trained on 11.7 million plant DNA promoter sequences and has 47 million parameters.

References:

To get predictions for plant DNA promoter sequences directly from the console or command line, add a text file containing the sequences (one sequence per line) to the data folder, then call the main() function in prediction.py with your file name. For example:

  • In prediction.py, update main("test.txt") with your file name
  • Run python prediction.py
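The input file described above can be prepared with a short script. The following is a minimal sketch, not part of the repository: the helper name write_sequences, the accepted alphabet (A, C, G, T, N), and the upper-casing step are assumptions made for illustration, so check them against what prediction.py actually expects.

```python
from pathlib import Path

# Assumption: sequences use the standard DNA alphabet plus N for unknown bases.
VALID_BASES = set("ACGTN")


def write_sequences(sequences, path):
    """Write one cleaned, upper-cased sequence per line; return the count written."""
    cleaned = []
    for seq in sequences:
        seq = seq.strip().upper()
        if not seq:
            continue  # skip blank entries
        invalid = set(seq) - VALID_BASES
        if invalid:
            raise ValueError(
                f"unexpected characters {sorted(invalid)} in sequence {seq[:20]!r}"
            )
        cleaned.append(seq)
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create the data folder if missing
    path.write_text("\n".join(cleaned) + "\n")
    return len(cleaned)
```

For instance, write_sequences(["ATGCGTACGTTAGC", "ttgacatataat"], "data/test.txt") would create data/test.txt with two upper-cased lines (both sequences here are made-up placeholders), after which the file name can be passed to main() as described.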

The results are printed to the console as a table. For example:

tassel  base   anther  middle  ear    shoot  tip     root
8.65    7.901  2.004   8.4001  7.523  6.23   9.0112  8.221

The values in the table are the TPM (transcripts per million) values for each tissue of the plant. TPM is a normalized measure of gene expression.
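To work with these values programmatically, a printed row can be mapped back to its tissue names. This is a minimal sketch that assumes the console output is whitespace-separated and that the columns appear in the order shown in the example above; the helper name parse_prediction_row is hypothetical and not part of prediction.py.

```python
# Column order taken from the example output; assumed to be fixed.
TISSUES = ["tassel", "base", "anther", "middle", "ear", "shoot", "tip", "root"]


def parse_prediction_row(row, tissues=TISSUES):
    """Map one whitespace-separated row of TPM values to its tissue names."""
    values = [float(v) for v in row.split()]
    if len(values) != len(tissues):
        raise ValueError(f"expected {len(tissues)} values, got {len(values)}")
    return dict(zip(tissues, values))


# The example row from the table above.
row = "8.65 7.901 2.004 8.4001 7.523 6.23 9.0112 8.221"
tpm = parse_prediction_row(row)

# Tissues with the highest and lowest predicted expression for this sequence.
highest = max(tpm, key=tpm.get)
lowest = min(tpm, key=tpm.get)
print(highest, tpm[highest])
print(lowest, tpm[lowest])
```

For the example row, the highest predicted value is in the tip column (9.0112) and the lowest is in the anther column (2.004).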

Both models can also be used for further pretraining and fine-tuning (see the references for more information).
