Diverse Beam Search, our proposed method for decoding significantly different sequences from neural sequence models, is used here to generate captions for images.
The captioning model is trained on COCO using the neuraltalk2 repository. Code to set up this demo is available here.
Browsers currently supported by the demo: Google Chrome, Mozilla Firefox.
Try Diverse Beam Search: Sample Images
Click on one of these images to send it to our servers (or upload your own images below).
Note: nothing is pre-computed for these images. They are treated as a fresh upload with every click.
Try Diverse Beam Search On Your Images
Terminal:
How it works
You upload an image.
Your request is sent to our servers, which run on GPUs courtesy of NVIDIA.
Our servers run our deep-learning-based algorithm.
Results and updates are shown in real-time.
Result of Diverse Beam Search
Note: More diverse lists can be obtained by increasing the number of groups (G) and the diversity strength λ.
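To make the roles of G and λ concrete, here is a minimal sketch of group-wise decoding with a Hamming-style diversity penalty. It is not the demo's code: the function names, the toy next-token distribution, and the choice to carry forward the unpenalized log-probability are all assumptions made for illustration. The beam budget is split evenly into G groups, and at each time step a group is penalized by λ for every earlier group beam that already chose the same token at that step.

```python
import math
from collections import Counter

def toy_log_probs(prefix):
    # Hypothetical stand-in for a captioning model: a fixed next-token
    # distribution over 3 tokens that ignores the prefix.
    return [math.log(0.5), math.log(0.3), math.log(0.2)]

def diverse_beam_search(log_prob_fn, vocab_size, steps,
                        beam_width, num_groups, diversity_strength):
    # Split the beam budget evenly across the groups.
    assert beam_width % num_groups == 0
    per_group = beam_width // num_groups
    # Each group holds (sequence, accumulated log-probability) pairs.
    groups = [[((), 0.0)] for _ in range(num_groups)]
    for _ in range(steps):
        # Tokens already chosen at this time step by earlier groups.
        chosen_now = Counter()
        for g in range(num_groups):
            candidates = []
            for seq, score in groups[g]:
                log_probs = log_prob_fn(seq)
                for tok in range(vocab_size):
                    raw = score + log_probs[tok]
                    # Hamming-style penalty: lambda times the number of
                    # earlier-group beams that picked this token now.
                    penalized = raw - diversity_strength * chosen_now[tok]
                    candidates.append((penalized, raw, seq + (tok,)))
            # Rank by the penalized score; keep the raw score (assumption).
            candidates.sort(key=lambda c: c[0], reverse=True)
            groups[g] = [(seq, raw) for _, raw, seq in candidates[:per_group]]
            for seq, _ in groups[g]:
                chosen_now[seq[-1]] += 1
    return [seq for group in groups for seq, _ in group]

no_diversity = diverse_beam_search(toy_log_probs, 3, 2, 2, 2, 0.0)
with_diversity = diverse_beam_search(toy_log_probs, 3, 2, 2, 2, 1.0)
print(no_diversity)
print(with_diversity)
```

With the diversity strength set to 0, every group sees the same scores and returns the same sequence. Raising it pushes later groups toward tokens the earlier groups did not pick, which is the effect the note above describes.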