Echo9Zulu committed
Commit 477f5bc · verified · 1 Parent(s): 88cc288

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -29,8 +29,8 @@ optimum-cli export openvino -m "input-model" --task image-text-to-text --weigh
  ### What does the test code do?
  
  Well, it demonstrates how to inference in python *and* what parts of that code are important for benchmarking performance.
- Text generation offers different challenges than text-generation with images; for examples, vision encoders often use different strategies for handling properties an image can have.
- In practice this translates to higher memory usage, reduced throughput or bad results.
+ Text generation offers different challenges than text-generation with images; for examples, vision encoders often use different strategies for handling properties an image can have; to get good performance **be mindful of image resolution**.
+ In practice this can translate to higher memory usage, reduced throughput and greater variety in results. Gemma-3 uses SigLIP 2 which has many SOTA optimizations; even so, effort in the preprocessing stage of a pipeline makes a world of difference.
  
  To run the test code:
  
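The added advice to be mindful of image resolution could be acted on with a small preprocessing step that caps image size before inference. The sketch below is not part of the commit or the test code. `fit_within` is a hypothetical helper, and the default `max_side` of 896 is an assumption chosen for illustration, so check the model's processor configuration for the resolution it actually expects.

```python
from __future__ import annotations


def fit_within(width: int, height: int, max_side: int = 896) -> tuple[int, int]:
    """Return (width, height) scaled down so the longer side is at most max_side.

    The aspect ratio is preserved. Images that already fit are returned unchanged.
    max_side=896 is an illustrative default, not a value taken from the README.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")

    longest = max(width, height)
    if longest <= max_side:
        # Never upscale: a small image gains nothing from being enlarged.
        return width, height

    scale = max_side / longest
    # Clamp to 1 so extreme aspect ratios never produce a zero-sized dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The returned size can then be passed to an image library's resize call (for example Pillow's `Image.resize`) before the image is handed to the model's processor. Downscaling once up front keeps memory usage and throughput more predictable across inputs of very different sizes.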