Repeated text leads to ~infinite processing
Hi there,
Sometimes an input is not controlled and might contain repetition, which leads to a near-infinite processing loop. It ends with CUDA out of memory on my 12GB of VRAM.
Input example: "RAS, SUS, ALO, RAS, RAS, RAS, RAS, RAS, RAS, CHAS"
Yes, it doesn't make sense, but anyway. Is there any workaround?
Weird! I have not read the whole dataset I made, but I can tell you that things like that were definitely not present in the training data haha. Functionally, what is happening is this: because generation uses beam search by default, it searches the logit space for something "acceptable" as an output, never finds one, and keeps searching until you run OOM.
I can think of two potential solutions:
- Decrease `num_beams` when running the inference. This makes the search space smaller, so generation is more likely to finish before running OOM. Downside: all your other results may have worse quality.
- Pass `max_time` with some reasonable value (30, 60?) so that generation stops early in these cases (docs), before it searches so much that you run OOM. Downside: you have to test empirically and find a value that does not cut off generation for standard cases.
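A minimal sketch of what passing both options looks like; the checkpoint name and surrounding loading code are placeholders, not from this thread. `num_beams` and `max_time` (wall-clock seconds) are standard `generate()` parameters:

```python
# Cap beam search so pathological inputs fail fast instead of running OOM.
gen_kwargs = {
    "num_beams": 2,    # smaller search space than the checkpoint's default
    "max_time": 30.0,  # hard wall-clock cutoff (seconds) for generation
}

# Placeholder usage with a Hugging Face seq2seq checkpoint:
# from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# tokenizer = AutoTokenizer.from_pretrained("your-checkpoint")
# model = AutoModelForSeq2SeqLM.from_pretrained("your-checkpoint")
# inputs = tokenizer(text, return_tensors="pt")
# output_ids = model.generate(**inputs, **gen_kwargs)
```

`max_time` only stops the search early; it does not fix the quality hit from a smaller `num_beams`, so you can also pass `max_time` alone and leave `num_beams` untouched.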
IMO option 2 is better.
@pszemraj
thanks for the response. I think neither option is suitable for me :(
My input actually comes from transcribed speech (speech-to-text). Sometimes there are hallucinations with repeated words or sentences, but it can also be a user who genuinely repeated a word or phrase 4 times. So I would rather remove the repetition than skip correcting that speech.
Ah gotcha ok.
I don't want to over-prescribe your problem, but if they really are repeats of the same word or phrase, you should be able to handle that by pre-processing your data: split it with sentence-splitter and use some basic regex/n-gram logic (which ChatGPT will write for you) to check and filter sentences/chunked text for repeats.
Then you can run the filtered text through the model.
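As a rough illustration of that n-gram idea (a heuristic sketch, not code from this thread): collapse consecutive repeats of short phrases before sending the text to the model. The function name and `max_ngram` cutoff are my own choices.

```python
def collapse_repeats(text: str, max_ngram: int = 4) -> str:
    """Collapse consecutive repeats of 1..max_ngram-token phrases to one copy.

    Whitespace-tokenized heuristic; repeats the pass until nothing changes,
    so nested patterns like "a b a b a b" also collapse.
    """
    tokens = text.split()
    changed = True
    while changed:
        changed = False
        out = []
        i = 0
        while i < len(tokens):
            for n in range(1, max_ngram + 1):
                phrase = tokens[i:i + n]
                # count how far this n-gram repeats back-to-back
                j = i + n
                while tokens[j:j + n] == phrase:
                    j += n
                if j > i + n:  # at least one adjacent repeat found
                    out.extend(phrase)  # keep a single copy
                    i = j
                    changed = True
                    break
            else:
                out.append(tokens[i])
                i += 1
        tokens = out
    return " ".join(tokens)


print(collapse_repeats("RAS, SUS, ALO, RAS, RAS, RAS, RAS, RAS, RAS, CHAS"))
# → RAS, SUS, ALO, RAS, CHAS
```

Note it only removes *adjacent* duplicates, so a legitimate re-mention later in the utterance survives; whether 2+ consecutive copies should always be collapsed to one is something to tune on your own transcripts.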