About the environment

by Strand2013 - opened 7 days ago

Discussion

Strand2013

7 days ago

Could you provide a requirements list? Thanks.

Mihaiii

Owner 7 days ago

If you mean requirements to run the model, they are here: https://huggingface.co/spaces/Mihaiii/Ovis2-4B-RL-VQA-1/blob/main/requirements.txt

If you mean how this was fine-tuned, it was done using ms-swift with the configuration provided in the readme.md.

Please let me know if this doesn't answer your question. It's not very clear to me what you mean.

Strand2013

7 days ago

Thank you for your reply.

I tried to use the script you provided, but I ran into a problem.

The first step (completion) works fine,
but in the second step (training) the labels are missing: the text_labels variable is None.
Could you give me some advice on how to solve it?

Mihaiii

Owner 7 days ago

edited 7 days ago

This is a bug in the Ovis code that I also encountered. I made a PR here: https://github.com/AIDC-AI/Ovis/pull/47 .

The PR got merged, but the HF repos were not updated.

To work around this, I made my own repo and referenced that one in my training config:

https://huggingface.co/Mihaiii/Ovis2-4B

This is the reason why I have "--model 'Mihaiii/Ovis2-4B'" in my training config instead of "AIDC-AI/Ovis2-4B".

Alternatively, you could wait until the AIDC team updates the modeling code in the Hugging Face repo to include the fix from the above PR.

//cc @runninglsy

Strand2013

7 days ago

Thank you very, very much!

I fixed the error, but now the problem is that the loss is always zero. Did you encounter this too?

If all text_labels are set to a placeholder, do we still have any ground truth?

Strand2013

7 days ago

The script you previously provided on GitHub is invalid. Could you give me a new one? Thanks.
“Here is the notebook: https://gist.github.com/Mihaiii/9ce15d9d82875528b84a86c3dda885bc”

Mihaiii

Owner 7 days ago

It's expected to be 0 in the beginning. Give it some time.

See here some details: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/GRPO.md

There is a section that talks about this:
Note: It is normal for the loss to approach zero during training. Refer to this issue for more details.
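
A minimal sketch of why a zero loss at the start is expected, assuming the usual GRPO formulation: advantages are normalized within each group of completions, and before the first update the policy ratio is 1 and the KL term is 0. The function names, the `eps` value, and the use of the population standard deviation are illustrative assumptions, not ms-swift APIs.

```python
import statistics


def group_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def initial_loss(rewards):
    """Loss value under the assumed step-0 state: ratio == 1 and KL == 0."""
    adv = group_advantages(rewards)
    ratio = 1.0  # assumption: the policy has not been updated yet
    return -sum(ratio * a for a in adv) / len(adv)


mixed = [1.0, 0.0, 0.5, 0.0]      # completions scored differently
all_wrong = [0.0, 0.0, 0.0, 0.0]  # the reward function never gives credit

print(initial_loss(mixed))          # approximately 0, although advantages differ
print(group_advantages(mixed))      # non-zero values: there is a learning signal
print(group_advantages(all_wrong))  # [0.0, 0.0, 0.0, 0.0]: nothing to learn from
```

Under these assumptions, a logged loss near zero does not by itself mean training is stuck, because the group-normalized advantages cancel in the mean while still producing gradients. The second case is different: if every completion in a group gets the same reward, all advantages are zero, so it is worth checking the printed rewards rather than the loss.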

Strand2013

6 days ago

I trained for 8 hours, but the loss is always zero. I don't know how to solve it. Could you give me some help?
My task is an OCR task, and I use only --reward_funcs accuracy.

By the way, could you give me your email or another way to contact you? I am Chinese and I work on multimodal LLMs and RL.

Strand2013

5 days ago

Hello, could you provide the plugin.py? Thanks.

Mihaiii

Owner 5 days ago

Hey!

I looked at your image and you're still on epoch 0, which is strange. I'm attaching the content of a plugin.py file I have (I won't check now whether it's for that db or another; it's just an example), which also prints information. Check whether you get that info printed in your case (given it's stuck at epoch 0).

You can reach out to me by email. I'm on gmail.com with username apropodemine.

import ast
import math
import re
from typing import List

from swift.plugin.orm import ORM, orms
from swift.utils import get_logger

logger = get_logger()


class DBAccuracy(ORM):
    """Reward: fraction of the solution's key/value pairs reproduced by each completion."""

    def extract_between_braces(self, s):
        # Greedy match from the first '{' to the last '}' (may span several lines).
        match = re.search(r'\{(.*)\}', s, re.DOTALL)
        return match.group(1) if match else None

    def extract_first_or_self(self, value):
        # Unwrap a non-empty list to its first element; return anything else unchanged.
        return value[0] if isinstance(value, list) and value else value

    def __call__(self, completions, **kwargs) -> List[float]:
        # Debug output: the first solution and the (truncated) completions.
        str_comple = '\n'.join([s[:512] for s in completions])
        print('-' * 20, f"\nSOLUTION:\n{kwargs['solution'][0]}", f"\nEXTRACTED:\n{str_comple}")
        # Only the first solution of the batch is used as the reference.
        json_sol = ast.literal_eval(self.extract_first_or_self(kwargs['solution'][0]))
        result = []
        for completion in completions:
            json_completion = self.extract_between_braces(completion)
            if not json_completion:
                # No {...} block (or an empty one) in the completion.
                result.append(0.0)
                continue
            json_completion = "{ " + json_completion + " }"
            try:
                local_res = 0.0
                inc = 1.0 / len(json_sol)
                json_cmpl = ast.literal_eval(self.extract_first_or_self(json_completion))
                # Each matching key/value pair adds an equal share of the reward.
                for k, v in json_sol.items():
                    if k in json_cmpl and self.extract_first_or_self(json_cmpl[k]) == self.extract_first_or_self(v):
                        local_res += inc
                # Snap near-complete scores to exactly 1.0 to absorb float error.
                result.append(1.0 if math.isclose(local_res, 1, abs_tol=0.01) else local_res)
            except Exception:
                # Unparseable completion (or empty solution): no reward.
                result.append(0.0)
        print(result)
        return result


orms['external_db_accuracy'] = DBAccuracy
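
To see what this reward produces without starting a training run, here is a standalone sketch of the same scoring idea with no ms-swift dependency. The `score` function and the sample data are hypothetical, and it simplifies the plugin above by skipping the list-unwrapping step.

```python
import ast
import math
import re


def score(completion: str, solution: dict) -> float:
    """Fraction of solution key/value pairs found in the completion's {...} block."""
    match = re.search(r'\{(.*)\}', completion, re.DOTALL)
    if not match:
        return 0.0
    try:
        parsed = ast.literal_eval("{ " + match.group(1) + " }")
    except (ValueError, SyntaxError, TypeError):
        return 0.0  # the block is not a valid Python literal
    if not isinstance(parsed, dict) or not solution:
        return 0.0
    hits = sum(1 for k, v in solution.items() if k in parsed and parsed[k] == v)
    frac = hits / len(solution)
    return 1.0 if math.isclose(frac, 1, abs_tol=0.01) else frac


solution = {"name": "Ada", "year": 1843}  # made-up sample data

print(score('Answer: {"name": "Ada", "year": 1843}', solution))  # 1.0
print(score('{"name": "Ada", "year": 1900}', solution))          # 0.5
print(score('no braces here', solution))                         # 0.0
```

Because parsing goes through `ast.literal_eval`, a completion that uses JSON-only literals such as `true` or `null` scores 0.0, which is one way a reward can stay at zero for every completion. The plugin above registers its class under the key `external_db_accuracy`; presumably that name is what gets passed to `--reward_funcs`, but the exact flags for loading an external plugin file should be checked against the ms-swift GRPO documentation.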
