Update README.md
README.md
CHANGED
@@ -29,8 +29,8 @@ optimum-cli export openvino -m ""input-model"" --task image-text-to-text --weigh
 ### What does the test code do?
 
 Well, it demonstrates how to inference in python *and* what parts of that code are important for benchmarking performance.
-Text generation offers different challenges than text-generation with images; for examples, vision encoders often use different strategies for handling properties an image can have
-In practice this
+Text generation offers different challenges than text-generation with images; for examples, vision encoders often use different strategies for handling properties an image can have; to get good performance **be mindful of image resolution**.
+In practice this can translate to higher memory usage, reduced throughput and greater variety in results. Gemma-3 uses SigLIP 2 which has many SOTA optimizations; even so, effort in the preprocessing stage of a pipeline makes a world of difference.
 
 To run the test code:
 
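
The added lines' advice about image resolution can be made concrete with a small preprocessing helper. The sketch below is an assumption and not code from this repository: `capped_size` and its `max_side` default of 896 are hypothetical, chosen only for illustration. It computes target dimensions that cap the longest side while keeping the aspect ratio; the resize itself would happen in an image library before the image reaches the model.

```python
def capped_size(width: int, height: int, max_side: int = 896) -> tuple[int, int]:
    """Return (width, height) scaled so the longest side is at most max_side.

    Hypothetical helper: the name and the 896 default are illustrative
    assumptions, not values taken from the README or the test code.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")

    longest = max(width, height)
    if longest <= max_side:
        # Already small enough: never upscale.
        return width, height

    # Scale both sides by the same factor to preserve the aspect ratio.
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    # A 4000x3000 photo is reduced; a 640x480 image is left untouched.
    print(capped_size(4000, 3000))
    print(capped_size(640, 480))
```

With Pillow, for instance, the result could be passed to `Image.resize` before handing the image to the processor. Whether a cap like this helps, and at what value, depends on the model and is worth measuring in the benchmark.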