Handle truncated image boundaries in `_convert` to avoid tensor size mismatch
#54, opened by maikezu
Summary
This PR proposes a change in `_convert` to handle cases where truncation (`max_inp_length`) could leave an unmatched `<im_start>` (or `<slice_start>`) token without its closing `<im_end>` / `<slice_end>`.
When this happens, `image_start_idx` and `image_end_idx` have different lengths, causing a runtime error at line 274:
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size x but got size x-1 for tensor number 1 in the list.
Changes
- Changed `valid_image_nums` from `max(len(start), len(end))` to `min(len(start), len(end))` → only keep valid start–end pairs
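The effect of the change can be illustrated with a minimal sketch. This is not the actual `_convert` code: the helper `pair_image_bounds`, the token ids `IM_START` / `IM_END`, and the use of plain lists instead of tensors are all assumptions made for illustration. It only shows why `min` keeps the two index lists the same length once truncation has cut off a closing token.

```python
# Hypothetical token ids, chosen for illustration only.
IM_START, IM_END = 101, 102


def pair_image_bounds(token_ids, start_ids, end_ids):
    """Return (start, end) index pairs, dropping unmatched boundaries."""
    starts = [i for i, t in enumerate(token_ids) if t in start_ids]
    ends = [i for i, t in enumerate(token_ids) if t in end_ids]
    # min() keeps only as many boundaries as have both a start and an end;
    # max() would keep the dangling start and leave the lists uneven.
    valid_image_nums = min(len(starts), len(ends))
    return list(zip(starts[:valid_image_nums], ends[:valid_image_nums]))


full = [1, IM_START, 7, 7, IM_END, 2, IM_START, 7, 7, IM_END]
truncated = full[:8]  # simulates max_inp_length cutting off the last IM_END

full_pairs = pair_image_bounds(full, {IM_START}, {IM_END})
truncated_pairs = pair_image_bounds(truncated, {IM_START}, {IM_END})
```

In the truncated sequence the second image has a start token but no end token, so only the first pair survives. With `max`, the unmatched start would be carried forward, which is the length mismatch described in the summary.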