---
title: WHOOPS! Explorer
emoji: 🔥
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 3.21.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Dataset Card for WHOOPS!

- [Dataset Description](#dataset-description)
- [Contribute Images to Extend WHOOPS!](#contribute-images-to-extend-whoops)
- [Languages](#languages)
- [Dataset](#dataset)
- [Data Fields](#data-fields)
- [Data Splits](#data-splits)
- [Data Loading](#data-loading)
- [Licensing Information](#licensing-information)
- [Annotations](#annotations)
- [Considerations for Using the Data](#considerations-for-using-the-data)
- [Citation Information](#citation-information)

## Dataset Description

WHOOPS! is a dataset and benchmark for visual commonsense. The dataset consists of purposefully commonsense-defying images created by designers using publicly available image generation tools such as Midjourney. The images defy commonsense for a wide range of reasons, including deviations from expected social norms and everyday knowledge.

The WHOOPS! benchmark includes four tasks:

1. A novel task of explanation-of-violation: generating a detailed explanation for what makes the image weird.
2. Generating a literal caption.
3. Distinguishing between detailed and underspecified captions.
4. Answering questions that test compositional understanding.

The results show that state-of-the-art models such as GPT3 and BLIP2 still lag behind human performance on WHOOPS!.

* Homepage: https://whoops-benchmark.github.io/
* Paper: https://arxiv.org/pdf/2303.07274.pdf
* WHOOPS! Explorer: https://huggingface.co/spaces/nlphuji/whoops-explorer-full
* Normal vs. Weird Explorer: https://huggingface.co/spaces/nlphuji/whoops-explorer-analysis
* Point of Contact: yonatanbitton1@gmail.com

[//]: # (Colab notebook code for WHOOPS evaluation )

## Contribute Images to Extend WHOOPS!

Would you like to add a commonsense-defying image to our database?
Please send candidate images to yonatanbitton1@gmail.com. Thanks!

### Languages

English.

## Dataset

### Data Fields

- `image` (image) - The weird image.
- `designer_explanation` (string) - Detailed single-sentence explanation given by the designer, explaining why the image is weird.
- `selected_caption` (string) - The caption that was selected from the crowd-collected captions.
- `crowd_captions` (list) - Crowd-collected captions, depicting what is seen in the image.
- `crowd_explanations` (list) - Crowd-collected single-sentence explanations, explaining why the image is weird.
- `crowd_underspecified_captions` (list) - Crowd-collected under-specified captions, depicting what is seen in the image without depicting the commonsense violation.
- `question_answering_pairs` (list) - Automatically generated Q-A pairs. FlanT5 XL was used to answer the questions and filter out instances where the BEM metric is above 0.1.
- `commonsense_category` (string) - The commonsense category the image relates to (the full list of categories can be found in the [paper](https://arxiv.org/pdf/2303.07274.pdf)).
- `image_id` (string) - The unique id of the image in the dataset.
- `image_designer` (string) - The name of the image designer.

### Data Splits

There is a single TEST split. Although WHOOPS! is primarily intended as a challenging test set, we also trained on it to demonstrate the value of the data and to create a better model. We will provide the splits in the future.
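
The fields listed under Data Fields can be sketched as a typed record. This is only an illustration: `WhoopsRecord` and `record_from_example` are hypothetical helpers, not part of the dataset or of the `datasets` library, and the element type of `question_answering_pairs` is left open because it is not specified above. The sample values are made-up placeholders, not real dataset entries.

```python
from dataclasses import dataclass, fields
from typing import Any, List


@dataclass
class WhoopsRecord:
    """Hypothetical typed view of one WHOOPS! example (illustration only)."""

    image: Any  # image object; its concrete type depends on how the data is loaded
    designer_explanation: str
    selected_caption: str
    crowd_captions: List[str]
    crowd_explanations: List[str]
    crowd_underspecified_captions: List[str]
    question_answering_pairs: List[Any]  # element structure is an open assumption
    commonsense_category: str
    image_id: str
    image_designer: str


def record_from_example(example: dict) -> WhoopsRecord:
    """Build a WhoopsRecord from a dict keyed by the documented field names."""
    names = [f.name for f in fields(WhoopsRecord)]
    missing = [name for name in names if name not in example]
    if missing:
        raise KeyError(f"example is missing fields: {missing}")
    # Keys that are not documented fields are ignored.
    return WhoopsRecord(**{name: example[name] for name in names})


# Placeholder example with made-up values, used only to exercise the helper.
placeholder = {
    "image": None,
    "designer_explanation": "placeholder explanation",
    "selected_caption": "placeholder caption",
    "crowd_captions": ["placeholder caption", "another placeholder caption"],
    "crowd_explanations": ["placeholder explanation"],
    "crowd_underspecified_captions": ["placeholder under-specified caption"],
    "question_answering_pairs": [],
    "commonsense_category": "placeholder category",
    "image_id": "placeholder-id",
    "image_designer": "placeholder designer",
}

record = record_from_example(placeholder)
print(record.image_id, len(record.crowd_captions))
```

A record like this makes it easier to catch a missing or misspelled field name early, before it reaches evaluation code.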
### Data Loading

You can load the data as follows (credit to [Winoground](https://huggingface.co/datasets/facebook/winoground)):

```python
from datasets import load_dataset

# Replace the placeholder string with your own Hugging Face access token.
# Newer releases of `datasets` name this argument `token` instead of `use_auth_token`.
examples = load_dataset('nlphuji/whoops', use_auth_token='<YOUR USER ACCESS TOKEN>')
```

You can get `<YOUR USER ACCESS TOKEN>` by following these steps:

1) Log into your Hugging Face account.
2) Click on your profile picture.
3) Click "Settings".
4) Click "Access Tokens".
5) Generate an access token.

## Licensing Information

[CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/)

Additional license information: [license_agreement.txt](https://huggingface.co/datasets/nlphuji/whoops/blob/main/license_agreement.txt)

By clicking “Access repository”, you affirm that your intent is solely to use the dataset for research purposes, explicitly excluding the development of commercial chatbots, and you acknowledge acceptance of the terms in the [WHOOPS! license agreement](https://whoops-benchmark.github.io/static/pdfs/whoops_license_agreement.txt).

- The dataset is intended to facilitate academic research for the purpose of publication.
- Participants will not incorporate the Dataset into any other program, dataset, or product.
- Participants may report results on the dataset as a test set.

[//]: # (To evaluate WHOOPS! with a fine-tuned BLIP2, we split the images in WHOOPS! into 5 cross-validation splits. For these 5 splits independently, we train supervised models using 60% of the data as training, 20% as validation, and 20% for test.)

## Annotations

We paid designers to create images and to supply explanations of what makes each image weird.
We paid Amazon Mechanical Turk workers to supply explanations, captions, and under-specified captions for each image in our dataset.

## Considerations for Using the Data

We took measures to filter out potentially harmful or offensive images and texts in WHOOPS!, but it is still possible that some individuals may find certain content objectionable.
If you come across any instances of harm, please report them to our point of contact. We will review the reports and remove from the dataset any images that are deemed harmful.

[//]: # (All images, explanations, captions and under-specified captions were obtained with human annotators.)

### Citation Information

@article{bitton2023breaking,
  title={Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images},
  author={Bitton-Guetta, Nitzan and Bitton, Yonatan and Hessel, Jack and Schmidt, Ludwig and Elovici, Yuval and Stanovsky, Gabriel and Schwartz, Roy},
  journal={arXiv preprint arXiv:2303.07274},
  year={2023}
}