---
library_name: sklearn
tags:
- sklearn
- skops
- tabular-classification
model_format: skops
model_file: classifier.skops
widget:
- structuredData:
    credibleSetConfidence:
    - 0.75
    - 0.75
    - 0.25
    distanceFootprintMean:
    - 1.0
    - 1.0
    - 0.9948455095291138
    distanceFootprintMeanNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    distanceSentinelFootprint:
    - 1.0
    - 1.0
    - 0.9999213218688965
    distanceSentinelFootprintNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    distanceSentinelTss:
    - 0.9982281923294067
    - 0.9999350309371948
    - 0.9999213218688965
    distanceSentinelTssNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    distanceTssMean:
    - 0.9982281923294067
    - 0.9999350309371948
    - 0.9947366714477539
    distanceTssMeanNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    eQtlColocClppMaximum:
    - 0.949999988079071
    - 0.0
    - 0.06608512997627258
    eQtlColocClppMaximumNeighbourhood:
    - 1.0
    - 0.0
    - 1.0
    eQtlColocH4Maximum:
    - 1.0
    - 0.0
    - 0.0
    eQtlColocH4MaximumNeighbourhood:
    - 1.0
    - 0.0
    - 0.0
    geneCount500kb:
    - 20.0
    - 15.0
    - 8.0
    geneId:
    - ENSG00000087237
    - ENSG00000169174
    - ENSG00000084674
    goldStandardSet:
    - 1
    - 1
    - 1
    pQtlColocClppMaximum:
    - 0.0
    - 1.0
    - 0.0
    pQtlColocClppMaximumNeighbourhood:
    - 0.0
    - 1.0
    - 0.0
    pQtlColocH4Maximum:
    - 0.0
    - 1.0
    - 0.0
    pQtlColocH4MaximumNeighbourhood:
    - 0.0
    - 1.0
    - 0.0
    proteinGeneCount500kb:
    - 8.0
    - 7.0
    - 3.0
    sQtlColocClppMaximum:
    - 0.949999988079071
    - 0.0
    - 0.21970131993293762
    sQtlColocClppMaximumNeighbourhood:
    - 1.0
    - 0.0
    - 1.0
    sQtlColocH4Maximum:
    - 1.0
    - 0.0
    - 0.0
    sQtlColocH4MaximumNeighbourhood:
    - 1.0
    - 0.0
    - 0.0
    studyLocusId:
    - 005bc8624f8dd7f7c7bc63e651e9e59d
    - 02c442ea4fa5ab80586a6d1ff6afa843
    - 235e8ce166619f33e27582fff5bc0c94
    vepMaximum:
    - 0.33000001311302185
    - 0.6600000262260437
    - 0.6600000262260437
    vepMaximumNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
    vepMean:
    - 0.33000001311302185
    - 0.6600000262260437
    - 0.0039977929554879665
    vepMeanNeighbourhood:
    - 1.0
    - 1.0
    - 1.0
---

# Model description

The locus-to-gene (L2G) model prioritises likely causal genes at each GWAS locus using features derived from genetic and functional genomics data. The main categories of predictive features are:

- Distance (from credible set variants to gene)
- Molecular QTL colocalization
- Variant pathogenicity (from VEP)

More information at: https://opentargets.github.io/gentropy/python_api/methods/l2g/_l2g/
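
As a rough illustration of how the feature columns in the example data relate to these categories, the sketch below groups column names by prefix. The helper `feature_category` and the "Other" bucket are hypothetical and not part of gentropy; the prefixes are taken from the column names in the widget example of this card.

```python
def feature_category(name: str) -> str:
    """Map a feature column name to one of the categories listed above.

    Hypothetical helper: the mapping is inferred from column-name prefixes
    only and is not taken from the gentropy code base.
    """
    if name.startswith("distance"):
        return "Distance"
    if name.startswith(("eQtlColoc", "pQtlColoc", "sQtlColoc")):
        return "Molecular QTL colocalization"
    if name.startswith("vep"):
        return "Variant pathogenicity"
    # Columns such as geneCount500kb do not fall under the three categories.
    return "Other"


# A few column names copied from the example data in this card.
example_columns = [
    "distanceTssMean",
    "eQtlColocH4Maximum",
    "pQtlColocClppMaximumNeighbourhood",
    "vepMaximum",
    "geneCount500kb",
]

for column in example_columns:
    print(f"{column}: {feature_category(column)}")
```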
        

## Intended uses & limitations

[More Information Needed]

## Training Procedure

The model is a gradient boosting classifier trained with the hyperparameters listed below.

### Hyperparameters

<details>
<summary> Click to expand </summary>

| Hyperparameter          | Value           |
|-------------------------|-----------------|
| objective               | binary:logistic |
| base_score              |                 |
| booster                 |                 |
| callbacks               |                 |
| colsample_bylevel       |                 |
| colsample_bynode        |                 |
| colsample_bytree        | 0.8             |
| device                  |                 |
| early_stopping_rounds   |                 |
| enable_categorical      | False           |
| eval_metric             | aucpr           |
| feature_types           |                 |
| feature_weights         |                 |
| gamma                   |                 |
| grow_policy             |                 |
| importance_type         |                 |
| interaction_constraints |                 |
| learning_rate           |                 |
| max_bin                 |                 |
| max_cat_threshold       |                 |
| max_cat_to_onehot       |                 |
| max_delta_step          |                 |
| max_depth               | 5               |
| max_leaves              |                 |
| min_child_weight        | 10              |
| missing                 | nan             |
| monotone_constraints    |                 |
| multi_strategy          |                 |
| n_estimators            |                 |
| n_jobs                  |                 |
| num_parallel_tree       |                 |
| random_state            | 777             |
| reg_alpha               | 1               |
| reg_lambda              | 1.0             |
| sampling_method         |                 |
| scale_pos_weight        | 0.8             |
| subsample               | 0.8             |
| tree_method             |                 |
| validate_parameters     |                 |
| verbosity               |                 |
| eta                     | 0.05            |

</details>
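
For convenience, the non-empty rows of the table can be collected into a Python dictionary, as in the sketch below. The name `L2G_HYPERPARAMETERS` is made up for this example. The commented-out constructor call assumes the estimator is `xgboost.XGBClassifier`, which the parameter names suggest but this card does not state; rows left empty in the table are omitted so that the estimator's defaults apply.

```python
import math

# Non-empty rows of the hyperparameter table above.
# L2G_HYPERPARAMETERS is a hypothetical name used only in this sketch.
L2G_HYPERPARAMETERS = {
    "objective": "binary:logistic",
    "colsample_bytree": 0.8,
    "enable_categorical": False,
    "eval_metric": "aucpr",
    "max_depth": 5,
    "min_child_weight": 10,
    "missing": math.nan,
    "random_state": 777,
    "reg_alpha": 1,
    "reg_lambda": 1.0,
    "scale_pos_weight": 0.8,
    "subsample": 0.8,
    "eta": 0.05,
}

# Assumption: the estimator is xgboost.XGBClassifier. If so, an unfitted
# estimator with the same settings could be created like this:
# from xgboost import XGBClassifier
# estimator = XGBClassifier(**L2G_HYPERPARAMETERS)

for name, value in L2G_HYPERPARAMETERS.items():
    print(f"{name} = {value}")
```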

# How to Get Started with the Model

To use the model, load it with the `LocusToGeneModel.load_from_hub` method, which returns a `LocusToGeneModel` object. Call its `predict` method on a feature matrix to make predictions.

More information can be found at: https://opentargets.github.io/gentropy/python_api/methods/l2g/model/
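
If only the underlying classifier is needed, the `classifier.skops` file named in this card's metadata can also be loaded directly, without gentropy. The sketch below is one way to do that with `huggingface_hub` and `skops`, both of which it assumes are installed. The function name `load_classifier` and the repository ID in the usage comment are placeholders, and this route is an alternative to, not a description of, `LocusToGeneModel.load_from_hub`.

```python
def load_classifier(repo_id: str, filename: str = "classifier.skops"):
    """Download a skops file from the Hugging Face Hub and load it.

    Hypothetical helper for illustration. Third-party imports are kept
    inside the function so that it can be defined without the optional
    dependencies installed.
    """
    from huggingface_hub import hf_hub_download
    from skops.io import get_untrusted_types, load

    # Download the serialised model (or reuse the locally cached copy).
    path = hf_hub_download(repo_id=repo_id, filename=filename)

    # skops refuses to load types it does not trust by default. List them
    # so they can be reviewed before they are passed to `load`.
    untrusted = get_untrusted_types(file=path)
    print("Types to review before trusting:", untrusted)

    # Only pass types as trusted after checking that they are expected.
    return load(path, trusted=untrusted)


# Usage (placeholder repository ID, replace with this model's repository):
# classifier = load_classifier("<namespace>/<model-name>")
# probabilities = classifier.predict_proba(feature_matrix)
```

The feature matrix passed to the loaded classifier needs the same feature columns, in the same order, as were used for training.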
        

# Citation

https://doi.org/10.1038/s41588-021-00945-5

# License

MIT