{"model":"salesforce/blip","classification":{"summary":"Image captioning and visual question answering","inputTypes":["text","image"],"outputTypes":["text"],"task":"visual-question-answering","useCases":["Generate captions for images","Answer questions about image content","Match text descriptions to images","Describe visual scenes in words","Create alt text for web images","Identify objects in photographs","Help visually impaired users understand images","Analyze content in visual media","Create image metadata automatically","Extract information from visual content"],"taskSummary":"Visual Question Answering is the task of answering open-ended questions based on an image. Models for this task take an image and a natural language question as input and output a natural language answer."}}