🤗 Welcome to the MLLM-Bench auto-evaluation platform! We offer a limited number of FREE evaluations using GPT-4V (once per day per email). 🤗
Please follow the instructions below to submit your answer file.
Submission Format
import json
# fp is the path to your answer file
with open(fp) as f:
    samples = json.load(f)
assert isinstance(samples, list)
assert len(samples) == 420, 'the length of the answer file should be 420.'
assert all(isinstance(s, dict) for s in samples)
# "id", "gen_model_id" and "answer" are required keys for each sample. Redundant keys have no effect on evaluation.
assert all(['id' in s for s in samples]), "'id' must be a key of every sample"
assert all(['gen_model_id' in s for s in samples]), "'gen_model_id' (the name of your model) must be a key of every sample"
assert all(['answer' in s for s in samples]), "'answer' must be a key of every sample"
assert sorted([s['id'] for s in samples]) == list(range(420)), 'ids must start from 0 and end at 419'
print('good to go!')
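The checks above can be satisfied by building the answer file as a list of 420 dicts, one per question id. Below is a minimal sketch; the model name ("my-model-v1"), the answer strings, and the output filename ("answers.json") are placeholders for illustration.

```python
import json

# Build 420 samples with the three required keys: "id", "gen_model_id",
# and "answer". Extra keys are allowed but ignored by the evaluator.
samples = [
    {
        "id": i,                         # ids must cover 0..419 exactly
        "gen_model_id": "my-model-v1",   # placeholder: your model's name
        "answer": f"answer for question {i}",  # placeholder answer text
    }
    for i in range(420)
]

# Write the answer file to upload.
with open("answers.json", "w") as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```

The resulting `answers.json` passes every assertion in the format check above.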
Then submit your .json file by clicking the "Choose File" button.