nealabq on March 14, 2023 | on: GPT-4

> Image inputs are still a research preview and not publicly available.

Will input images also be tokenized? Multi-modal input is an area of research, but an image could be converted into a text description (?) before being inserted into the input stream.

teruakohatu on March 14, 2023

My understanding is that the image embedding is included directly, rather than the image being converted to text.
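To make the "embedding rather than text" idea concrete, here is a minimal sketch, assuming a ViT-style patch embedding. GPT-4's actual architecture is not public, so this is not its method; `patchify`, `project`, `build_input_sequence`, and all sizes are hypothetical names and values chosen for illustration. The image is cut into patches, each patch is linearly projected to the same width as the text token embeddings, and the two lists are concatenated into one input sequence.

```python
import random

def patchify(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patches.

    Assumes H and W are exact multiples of `patch`.
    """
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def project(vec, weights):
    """Linear projection: `weights` has d_model rows, each len(vec) long."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def build_input_sequence(image, token_ids, patch=2, d_model=4, vocab=10, seed=0):
    """Return image patch embeddings followed by text token embeddings."""
    rng = random.Random(seed)
    # Hypothetical, randomly initialised parameters; a real model learns these.
    proj = [[rng.uniform(-1, 1) for _ in range(patch * patch)]
            for _ in range(d_model)]
    table = [[rng.uniform(-1, 1) for _ in range(d_model)]
             for _ in range(vocab)]
    image_embs = [project(p, proj) for p in patchify(image, patch)]
    text_embs = [table[t] for t in token_ids]
    # Both kinds of embedding share width d_model, so they form one sequence.
    return image_embs + text_embs
```

With a 4x4 image and 2x2 patches this gives four image positions ahead of the text tokens. No text description of the image is ever produced; the model only sees vectors.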

2sk21 on March 14, 2023

My understanding is that image embeddings are a rather abstract representation of the image. What if the image itself contains text, such as street signs?
