data_process: support plain-text lists #31

@linengcs

While using the model, I found that a call like the following does not seem to be supported and raises an error:

inp = model.data_process(
    text=batch,
    q_or_c="q",
    task_instruction="Retrieve the target image that best meets these criteria: ",
)

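In other words, a plain-text list is not supported: the original data_process only creates the placeholder for the image-only case, where text is None. After I changed data_process to the version below, text-only lists are processed correctly. If the bge-vl-mllm model really does not support this, I hope it can be updated, since handling input this way would be more flexible.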

def data_process(self, images=None, text=None, q_or_c=None, task_instruction=None):
    # Requires `from PIL import Image` at module level.
    if images is not None:
        _is_list = isinstance(images, list)
    elif text is not None:
        _is_list = isinstance(text, list)
    else:
        raise ValueError("images and text cannot be both None.")

    assert q_or_c in ["query", "candidate", "q", "c"]

    if not _is_list:
        # Single example: wrap the prompt (and the image, if any) in a one-element batch
        text_input = [self.prepare_text_input(images, text, q_or_c, task_instruction)]

        processed_images = None
        if images is not None:
            processed_images = [Image.open(images).resize((512, 512)).convert("RGB")]
    else:
        # If only one of the lists is provided, create a placeholder list for the other
        if text is None and images is not None:
            text = [None] * len(images)
        elif images is None and text is not None:
            images = [None] * len(text)

        text_input = [
            self.prepare_text_input(_image, _text, q_or_c, task_instruction)
            for _image, _text in zip(images, text)
        ]

        # Filter out None values before trying to open images;
        # processed_images stays None for a text-only batch
        processed_images = None
        valid_images = [_image for _image in images if _image is not None]
        if valid_images:
            processed_images = [
                Image.open(_image).resize((512, 512)).convert("RGB")
                for _image in valid_images
            ]

    inputs = self.processor(images=processed_images, text=text_input, return_tensors="pt", padding=True)
    return inputs
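
To make the placeholder step easier to check without loading the model, here is a minimal standalone sketch of the same idea. The helpers `pair_modalities` and `images_to_open` are hypothetical names used only for illustration and are not part of the repository. The length check is my own assumption, added because `zip` would otherwise silently truncate to the shorter list.

```python
from typing import List, Optional, Sequence, Tuple


def pair_modalities(
    images: Optional[Sequence[Optional[str]]] = None,
    text: Optional[Sequence[Optional[str]]] = None,
) -> Tuple[List[Optional[str]], List[Optional[str]]]:
    """Hypothetical helper: pad the missing modality with None placeholders."""
    if images is None and text is None:
        raise ValueError("images and text cannot be both None.")
    if text is None:
        # Image-only batch: one None text entry per image
        text = [None] * len(images)
    elif images is None:
        # Text-only batch: one None image entry per text
        images = [None] * len(text)
    if len(images) != len(text):
        # Assumption: fail loudly rather than let zip() drop trailing items
        raise ValueError("images and text must have the same length.")
    return list(images), list(text)


def images_to_open(images: Sequence[Optional[str]]) -> Optional[List[str]]:
    """Hypothetical helper: keep only real paths; None for a text-only batch."""
    valid = [path for path in images if path is not None]
    return valid or None


images, text = pair_modalities(text=["a red dress", "a blue car"])
print(images)                  # [None, None]
print(images_to_open(images))  # None
```

With a text-only list, every image slot becomes None and nothing is passed to `Image.open`, which is the case the call at the top of this issue needs. Whether the processor accepts a batch that mixes entries with and without images is something I have not verified.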
