import gradio as gr
import evaluate

suite = evaluate.EvaluationSuite.load("Vipitis/ShaderEval")  # downloads the suite from the Hub
# TODO: can you import it locally instead?
# from ShaderEval import Suite
# suite = Suite("Vipitis/ShaderEval")
# TODO: save results to a file?
| text = """# Welcome to the ShaderEval Suite. | |
| This space hosts the ShaderEval Suite. more to follow soon. | |
| For an interactive Demo and more information see the demo space [ShaderCoder](https://huggingface.co/spaces/Vipitis/ShaderCoder) | |
| # Task1: Return Completion | |
| ## Explanation | |
| Modelled after the [CodeXGLUE code_completion_line](https://huggingface.co/datasets/code_x_glue_cc_code_completion_line) task. | |
| Using the "return_completion" subset of the [Shadertoys-fine dataset](https://huggingface.co/datasets/Vipitis/Shadertoys-fine). | |
| All preprocessing and post proessing is done by the custom evaluator for this suite. It should be as easy as just giving it a model checkpoint that can do the "text-generation" task. | |
| Evaluated is currently with just [exact_match](https://huggingface.co/metrics/exact_match). | |
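To give a rough idea of what is being scored (a minimal sketch, not the suite's internal code; the example return lines are made up), `exact_match` simply checks whether a generated line is character-for-character identical to the reference:
```python
import evaluate

# load the exact_match metric and score two made-up return-statement completions
exact_match = evaluate.load("exact_match")
predictions = ["return fract(sin(x) * 43758.5453);", "return vec3(0.0);"]
references = ["return fract(sin(x) * 43758.5453);", "return vec3(1.0);"]
print(exact_match.compute(predictions=predictions, references=references))
# -> {'exact_match': 0.5}  (one of the two predictions matches exactly)
```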
## Notice
Should you find any model that throws an error, please let me know in the issues tab. Several parts of this suite are still missing.
## Instructions
### Run the code yourself:
```python
import evaluate
suite = evaluate.EvaluationSuite.load("Vipitis/ShaderEval")
model_cp = "gpt2"
suite.run(model_cp, snippet=300)
```
### Try the demo below
- Enter a **model checkpoint** in the textbox
- Select how many **samples** to run (there are up to 300 in the test set)
- Click **Run** to run the suite
- The results will be displayed in the **Output** box
## Results

Additionally, you can report results on your model cards, and they should show up on this [leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=Vipitis%2FShadertoys-fine).
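One way to self-report a score (a sketch under the assumption that the `evaluate` library's `push_to_hub` helper fits this workflow; the model id and score below are placeholders) is to write the result into your model card's metadata:
```python
import evaluate

# hedged sketch: record a ShaderEval score in a model card's model-index
# so the autoevaluate leaderboard can pick it up
evaluate.push_to_hub(
    model_id="your-username/your-model",  # hypothetical model repo
    task_type="text-generation",
    dataset_type="Vipitis/Shadertoys-fine",
    dataset_name="Shadertoys-fine",
    metric_type="exact_match",
    metric_name="exact_match",
    metric_value=0.25,  # made-up score
)
```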
## Todo (feel free to contribute in a [Pull Request](https://huggingface.co/spaces/Vipitis/ShaderEval/discussions?status=open&type=pull_request))
- [~] leaderboard (via autoevaluate and self reporting)
- [?] supporting batches to speed up inference
- [ ] CER metric (via a custom metric perhaps?)
- [x] removing the pad_token warning
- [ ] adding OpenVINO pipelines for inference, pending on OpenVINO release
- [ ] task1b for "better", featuring an improved test set as well as better metrics. Will allow more generation parameters.
- [ ] semantic match by comparing the rendered frames (depending on WGPU implementation?)
- [ ] CLIP match to evaluate whether rendered images fit the title/description
"""
def run_suite(model_cp, snippet):
    """Run the evaluation suite on a model checkpoint over the first `snippet` test samples."""
    results = suite.run(model_cp, snippet)
    print(results)  # so the results show up in the Space logs
    return results[0]
with gr.Blocks() as site:
    text_md = gr.Markdown(text)
    model_cp = gr.Textbox(value="gpt2", label="Model Checkpoint", interactive=True)
    first_n = gr.Slider(minimum=1, maximum=300, value=5, step=1, label="num_samples")
    output = gr.Textbox(label="Output")
    run_button = gr.Button("Run")  # gr.Button takes its text as the value, not a label kwarg
    run_button.click(fn=run_suite, inputs=[model_cp, first_n], outputs=output)

site.launch()