---
library_name: sklearn
tags:
- sklearn
- skops
- tabular-classification
model_format: skops
model_file: classifier.skops
widget:
- structuredData:
    credibleSetConfidence:
    - 0.75
    - 0.75
    - 0.25
    distanceFootprintMean:
    - 1.0
    - 1.0
    - 0.9948455095291138
    distanceFootprintMeanNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    distanceSentinelFootprint:
    - 1.0
    - 1.0
    - 0.9999213218688965
    distanceSentinelFootprintNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    distanceSentinelTss:
    - 0.9982281923294067
    - 0.9999350309371948
    - 0.9999213218688965
    distanceSentinelTssNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    distanceTssMean:
    - 0.9982281923294067
    - 0.9999350309371948
    - 0.9947366714477539
    distanceTssMeanNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    eQtlColocClppMaximum:
    - 0.949999988079071
    - 0.0
    - 0.06608512997627258
    eQtlColocClppMaximumNeighbourhood:
    - 1.0
    - 0.0
    - 1.0
    eQtlColocH4Maximum:
    - 1.0
    - 0.0
    - 0.0
    eQtlColocH4MaximumNeighbourhood:
    - 1.0
    - 0.0
    - 0.0
    geneCount500kb:
    - 20.0
    - 15.0
    - 8.0
    geneId:
    - ENSG00000087237
    - ENSG00000169174
    - ENSG00000084674
    goldStandardSet:
    - 1
    - 1
    - 1
    pQtlColocClppMaximum:
    - 0.0
    - 1.0
    - 0.0
    pQtlColocClppMaximumNeighbourhood:
    - 0.0
    - 1.0
    - 0.0
    pQtlColocH4Maximum:
    - 0.0
    - 1.0
    - 0.0
    pQtlColocH4MaximumNeighbourhood:
    - 0.0
    - 1.0
    - 0.0
    proteinGeneCount500kb:
    - 8.0
    - 7.0
    - 3.0
    sQtlColocClppMaximum:
    - 0.949999988079071
    - 0.0
    - 0.21970131993293762
    sQtlColocClppMaximumNeighbourhood:
    - 1.0
    - 0.0
    - 1.0
    sQtlColocH4Maximum:
    - 1.0
    - 0.0
    - 0.0
    sQtlColocH4MaximumNeighbourhood:
    - 1.0
    - 0.0
    - 0.0
    studyLocusId:
    - 005bc8624f8dd7f7c7bc63e651e9e59d
    - 02c442ea4fa5ab80586a6d1ff6afa843
    - 235e8ce166619f33e27582fff5bc0c94
    vepMaximum:
    - 0.33000001311302185
    - 0.6600000262260437
    - 0.6600000262260437
    vepMaximumNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    vepMean:
    - 0.33000001311302185
    - 0.6600000262260437
    - 0.0039977929554879665
    vepMeanNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
---
# Model description
The locus-to-gene (L2G) model prioritises likely causal genes at each GWAS locus using features derived from genetic and functional genomics data. The main categories of predictive features are:
- Distance (from credible set variants to the gene)
- Molecular QTL colocalization
- Variant pathogenicity (from VEP)
More information at: https://opentargets.github.io/gentropy/python_api/methods/l2g/_l2g/
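
To make the categories above concrete, the following minimal sketch groups the feature names that appear in this card's example rows. The grouping rule (matching on the `distance` prefix, the `QtlColoc` substring, and the `vep` prefix) is an assumption based on the naming pattern, not part of the gentropy API, and the identifier and label columns (`studyLocusId`, `geneId`, `goldStandardSet`) are left out on the assumption that they are not predictive features.

```python
# Feature names copied from the example rows in this card's metadata.
FEATURES = [
    "credibleSetConfidence",
    "distanceFootprintMean", "distanceFootprintMeanNeighbourhood",
    "distanceSentinelFootprint", "distanceSentinelFootprintNeighbourhood",
    "distanceSentinelTss", "distanceSentinelTssNeighbourhood",
    "distanceTssMean", "distanceTssMeanNeighbourhood",
    "eQtlColocClppMaximum", "eQtlColocClppMaximumNeighbourhood",
    "eQtlColocH4Maximum", "eQtlColocH4MaximumNeighbourhood",
    "geneCount500kb",
    "pQtlColocClppMaximum", "pQtlColocClppMaximumNeighbourhood",
    "pQtlColocH4Maximum", "pQtlColocH4MaximumNeighbourhood",
    "proteinGeneCount500kb",
    "sQtlColocClppMaximum", "sQtlColocClppMaximumNeighbourhood",
    "sQtlColocH4Maximum", "sQtlColocH4MaximumNeighbourhood",
    "vepMaximum", "vepMaximumNeighbourhood",
    "vepMean", "vepMeanNeighbourhood",
]


def categorise(name: str) -> str:
    """Map a feature name to a category (hypothetical rule based on naming)."""
    if name.startswith("distance"):
        return "distance"
    if "QtlColoc" in name:
        return "colocalization"
    if name.startswith("vep"):
        return "variant pathogenicity"
    return "other"


def group_features(names):
    """Group feature names by category, preserving their input order."""
    groups = {}
    for name in names:
        groups.setdefault(categorise(name), []).append(name)
    return groups


groups = group_features(FEATURES)
for category, members in groups.items():
    print(f"{category}: {len(members)} features")
```

Features that fall outside the three named categories, such as `geneCount500kb`, end up under `other` in this sketch.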
## Intended uses & limitations
[More Information Needed]
## Training Procedure
The model is a gradient boosting classifier (XGBoost) configured with the hyperparameters listed below.
### Hyperparameters
<details>
<summary> Click to expand </summary>

| Hyperparameter | Value |
|-------------------------|-----------------|
| objective | binary:logistic |
| base_score | |
| booster | |
| callbacks | |
| colsample_bylevel | |
| colsample_bynode | |
| colsample_bytree | 0.8 |
| device | |
| early_stopping_rounds | |
| enable_categorical | False |
| eval_metric | aucpr |
| feature_types | |
| feature_weights | |
| gamma | |
| grow_policy | |
| importance_type | |
| interaction_constraints | |
| learning_rate | |
| max_bin | |
| max_cat_threshold | |
| max_cat_to_onehot | |
| max_delta_step | |
| max_depth | 5 |
| max_leaves | |
| min_child_weight | 10 |
| missing | nan |
| monotone_constraints | |
| multi_strategy | |
| n_estimators | |
| n_jobs | |
| num_parallel_tree | |
| random_state | 777 |
| reg_alpha | 1 |
| reg_lambda | 1.0 |
| sampling_method | |
| scale_pos_weight | 0.8 |
| subsample | 0.8 |
| tree_method | |
| validate_parameters | |
| verbosity | |
| eta | 0.05 |
</details>
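
As a rough illustration, the non-empty rows of the table can be collected into a dictionary and passed to an XGBoost classifier. This is a sketch, not the training code used for this model: `HYPERPARAMETERS` and `build_classifier` are hypothetical names, it assumes the `xgboost` package is installed, and it passes the table's `eta` value as `learning_rate`, which XGBoost documents as an alias for the same parameter.

```python
import math

# Values copied from the non-empty rows of the hyperparameter table.
HYPERPARAMETERS = {
    "objective": "binary:logistic",
    "colsample_bytree": 0.8,
    "enable_categorical": False,
    "eval_metric": "aucpr",
    "max_depth": 5,
    "min_child_weight": 10,
    "missing": math.nan,
    "random_state": 777,
    "reg_alpha": 1,
    "reg_lambda": 1.0,
    "scale_pos_weight": 0.8,
    "subsample": 0.8,
    # Listed as `eta` in the table; `learning_rate` is the equivalent name.
    "learning_rate": 0.05,
}


def build_classifier():
    """Create an untrained classifier with the settings above (hypothetical helper)."""
    # Imported lazily so the dictionary can be inspected without xgboost installed.
    from xgboost import XGBClassifier

    return XGBClassifier(**HYPERPARAMETERS)
```

Parameters left blank in the table are omitted here, so the library's defaults apply to them.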
# How to Get Started with the Model
Load the model with the `LocusToGeneModel.load_from_hub` method, which returns a `LocusToGeneModel` object.
Call the object's `predict` method on a feature matrix to generate predictions.
More information can be found at: https://opentargets.github.io/gentropy/python_api/methods/l2g/model/
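
The arguments of `LocusToGeneModel.load_from_hub` are not shown in this card, so the sketch below takes a different route and loads the serialised estimator directly. It assumes the `huggingface_hub` and `skops` packages are installed and relies on the `classifier.skops` file name declared in this card's metadata. `load_classifier`, `feature_columns`, and the `repo_id` argument are hypothetical names, and treating `studyLocusId`, `geneId`, and `goldStandardSet` as non-feature columns is an assumption.

```python
# Columns in the example rows that are assumed to be identifiers or labels.
NON_FEATURE_COLUMNS = {"studyLocusId", "geneId", "goldStandardSet"}


def feature_columns(columns):
    """Return the columns assumed to be model inputs, preserving their order."""
    return [c for c in columns if c not in NON_FEATURE_COLUMNS]


def load_classifier(repo_id, filename="classifier.skops"):
    """Download the serialised estimator from the Hub and load it with skops."""
    # Imported lazily so the helper above works without these packages.
    from huggingface_hub import hf_hub_download
    from skops.io import get_untrusted_types, load

    path = hf_hub_download(repo_id=repo_id, filename=filename)
    # skops refuses to load unknown types unless they are explicitly trusted.
    # Review this list before trusting a file from a source you do not control.
    untrusted = get_untrusted_types(file=path)
    print("Types to review before trusting:", untrusted)
    return load(path, trusted=untrusted)
```

The order and selection of input columns expected by the trained model are not stated in this card, so check them against the gentropy documentation before calling the loaded estimator.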
# Citation
https://doi.org/10.1038/s41588-021-00945-5
# License
MIT