Is there a plan to support multimodality, such as image and video input for understanding and interaction?
No plans at the moment, but it's something I may consider in the future.