OpenAI releases image generation in the API

481 points by themanmaran 24 April 2025 | 292 comments

Comments

cuuupid 24 April 2025

When this was up yesterday I complained that the refusal rate was super high especially on government and military shaped tasks, and that this would only push contractors to use CN-developed open source models for work that could then be compromised.

Today I'm discovering there is a tier of API access with virtually no content moderation available to companies working in that space. I have no idea how to go about requesting that tier of access, but have spoken to 4 different defense contractors in the last day who seem to already be using it.

johnyzee 22 hours ago

I wanted to try this in the image playground, but I was told I have to add a payment method. When adding this, I was told I would also have to pay a minimum of $5. Did this. Then when trying to generate an image, I was told I would have to do "verification" of my organization (?). OK, I chose 'personal'. I was then told I have to complete the verification through some third-party partner of OpenAI, which included giving permission to process my biometric information. Yeah, I don't want to try this that badly, but now I've already paid and have to struggle to figure out how to get my money back. Horrible UX.

tezza 24 April 2025

For the curious I generated the same prompt for each of the quality types. ‘Auto’, ‘low’, ‘medium’, ‘high’.

Prompt: “a cute dog hugs a cute cat”

https://x.com/terrylurie/status/1915161141489136095

I also then showed a couple of DALL-E 3 images for comparison in a comment.

alasano 25 April 2025

I built a local playground for it if anyone is interested (your openai org needs to be verified btw..)

https://github.com/Alasano/gpt-image-1-playground

OpenAI's Playground doesn't expose all the API options.

Mine covers all options, has built in mask creation and cost tracking as well.

film42 24 April 2025

I generated 5 images in the playground. One using a text-only prompt and 4 using images from my phone. I spent $0.85 which isn't bad for a fun round of Studio Ghibli portraits for the family group chat, but too expensive to be used in a customer facing product.

Imnimo 24 April 2025

I'm curious what the applications are where people need to generate hundreds or thousands of these images. I like making Ghibli-esque versions of family photos as much as the next person, but I don't need to make them in volume. As far as I can recall, every time I've used image generation, it's been one-off things that I'm happy to do in the ChatGPT UI.

badmonster 24 April 2025

Usage of gpt-image-1 is priced per token, with separate pricing for text and image tokens:

- Text input tokens (prompt text): $5 per 1M tokens
- Image input tokens (input images): $10 per 1M tokens
- Image output tokens (generated images): $40 per 1M tokens

In practice, this translates to roughly $0.02, $0.07, and $0.19 per generated image for low, medium, and high-quality square images, respectively.

that's a bit pricy for a startup.
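
A minimal sketch of the arithmetic behind those figures, using only the per-token prices and per-image prices quoted above. The function names are hypothetical, and the per-image token counts are backed out from the quoted prices rather than taken from any documentation, so treat them as rough estimates.

```python
# Per-million-token prices quoted in the comment above (USD).
PRICE_PER_MILLION = {
    "text_input": 5.00,
    "image_input": 10.00,
    "image_output": 40.00,
}


def cost_usd(text_input=0, image_input=0, image_output=0):
    """Return the USD cost for the given token counts."""
    counts = {
        "text_input": text_input,
        "image_input": image_input,
        "image_output": image_output,
    }
    return sum(counts[k] * PRICE_PER_MILLION[k] / 1_000_000 for k in counts)


def implied_output_tokens(price_per_image):
    """Back out the output-token count implied by a per-image price.

    Ignores prompt tokens, so this is only a rough estimate.
    """
    return round(price_per_image * 1_000_000 / PRICE_PER_MILLION["image_output"])


# Quoted per-image prices for square images at each quality level.
for quality, price in [("low", 0.02), ("medium", 0.07), ("high", 0.19)]:
    print(quality, implied_output_tokens(price))
```

Under these assumptions the quoted prices correspond to about 500, 1,750, and 4,750 output tokens per image for low, medium, and high quality.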

minimaxir 24 April 2025

Pricing-wise, this API is going to be hard to justify unless you really can get value out of providing references. A generated `medium` 1024x1024 is $0.04/image, which is in the same cost class as Imagen 3 and Flux 1.1 Pro. Testing from their new playground (https://platform.openai.com/playground/images), the medium images are indeed lower quality than either of the two competitor models and still take 15+ seconds to generate: https://x.com/minimaxir/status/1915114021466017830

Prompting the model is also substantially different from, and more difficult than, prompting traditional models, unsurprisingly given the way the model works. The traditional image tricks don't work out-of-the-box and I'm struggling to get something that works without significant prompt augmentation (which is what I suspect was used for the ChatGPT image generations).

jumploops 24 April 2025

This new model is autoregression-based (similar to LLMs, token by token) rather than diffusion based, meaning that it adheres to text prompts with much higher accuracy.

As an example, some users (myself included) of a generative image app were trying to make a picture of a person in the pouch of a kangaroo.

No matter what we prompted, we couldn’t get it to work.

GPT-4o did it in one shot!

hombre_fatal 23 hours ago

I would have expected an API like:

    let imageId = api.generateImage(prompt)
    let {url, isFinished} = api.imageInfo(imageId)

But instead it's:

    let bytes = api.generateImage(prompt)

It's interesting to me how AI APIs let you hold such a persistent, active connection. I'm so used to anything that takes more than a second becoming an async background process where you notify the recipient when it's ready.

With Netflix, it makes sense that you can open a connection to some static content and receive gigabytes over it.

But streaming tokens from a GPU is a much more active process. Especially in this case where you're waiting tens of seconds for an image to generate.
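
A minimal sketch of the job-and-poll shape described in the first snippet above. `FakeImageJobs`, `generate_image`, `image_info`, and `wait_for_image` are hypothetical names, and the in-memory class only simulates a backend that finishes after a few polls; nothing here reflects the actual OpenAI API.

```python
import itertools
import time


class FakeImageJobs:
    """In-memory stand-in for a hypothetical job-based image API."""

    def __init__(self, polls_until_done=3):
        self._ids = itertools.count(1)
        self._remaining = {}
        self._polls_until_done = polls_until_done

    def generate_image(self, prompt):
        """Start a job and return its id immediately."""
        job_id = f"img_{next(self._ids)}"
        self._remaining[job_id] = self._polls_until_done
        return job_id

    def image_info(self, job_id):
        """Return (url, is_finished); url is None until the job finishes."""
        if self._remaining[job_id] > 0:
            self._remaining[job_id] -= 1
            return None, False
        return f"https://example.invalid/{job_id}.png", True


def wait_for_image(api, job_id, interval=0.01, timeout=5.0):
    """Poll image_info until the job finishes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        url, is_finished = api.image_info(job_id)
        if is_finished:
            return url
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")


api = FakeImageJobs()
job_id = api.generate_image("a cute dog hugs a cute cat")
print(wait_for_image(api, job_id))
```

The trade-off is the one the comment points at: the caller holds no long-lived connection, but has to manage job ids, polling intervals, and timeouts itself.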

gervwyk 24 April 2025

Great SVG generation would be far more useful! For example, being able to edit SVG images after they are generated by AI would make it quick to modify the last mile. For our new website https://resonancy.io the simple SVG workflow images were still very much created by hand, and trying various AI tools to make such images yielded shockingly bad off-brand results even when provided multiple examples. By far the best tool for this is still Canva for us.

Anyone know of an AI model for generating SVG images? Please share.

gitroom 22 hours ago

Man, what a pain in the ass just to try an image API, and then all these hoops for payments, ID, even biometrics? Stuff like this always makes me wonder: does anyone up top even try their own product? You'd figure all this extra friction just ends up pushing users somewhere else.

qhwudbebd 25 April 2025

I hope the images support in the responses API is more competently executed than the mess piling up in the v1/images/generations endpoint.

To pick an example, we have a model parameter and a response_format parameter. The response_format parameter selects whether image data should be returned as a URL (old method) or directly, base64-encoded. The new model only supports base64, whereas the old models default to a URL return, which is fine and understandable.

But the endpoint refuses to accept any value for response_format including b64_json with the new model, so you can't set-and-forget the new behaviour and allow the model to be parameterised without worrying about it. Instead, you have to request the new behaviour with the older models, and not request it (but still get it) with the new one. sigh
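
A minimal sketch of the workaround this implies, assuming the behaviour is exactly as described above: the new model rejects `response_format` and always returns base64, while older models need `b64_json` requested explicitly. `build_image_request` and `B64_ONLY_MODELS` are hypothetical names, and the function only builds the request parameters without calling any API.

```python
# Assumption: models listed here reject the response_format parameter
# and always return base64, as the comment above describes.
B64_ONLY_MODELS = {"gpt-image-1"}


def build_image_request(model, prompt, **extra):
    """Build request parameters that yield base64 data for any model."""
    params = {"model": model, "prompt": prompt, **extra}
    if model not in B64_ONLY_MODELS:
        # Older models default to a URL, so base64 must be requested.
        params["response_format"] = "b64_json"
    else:
        # The new model returns base64 already and refuses the parameter.
        params.pop("response_format", None)
    return params


print(build_image_request("dall-e-3", "a cute dog hugs a cute cat"))
print(build_image_request("gpt-image-1", "a cute dog hugs a cute cat"))
```

This keeps the special case in one place, so the model name can stay a plain parameter everywhere else.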

_pdp_ 25 April 2025

We have integrated it into our platform and we already have use-cases for it to help create ads and other marketing material.

However, while it is better than my other models, it is not perfect. The image edit API will make a similar-looking picture (even with masking), but not exactly the same picture with some modifications.

sebastiennight 24 April 2025

Hmm seems pricey.

What's the current state of the art for API generation of an image from a reference plus modifier prompt?

Say, in the 1c per HD (1920*1080) image range?

PeterStuer 25 April 2025

My number one ask as an almost 2-year OpenAI-in-production user: Enable Tool Use in the API so I can evaluate OpenAI models in agentic environments without jumping through hoops.

JPKab 23 hours ago

As a paying customer, you get completely hosed every time they add a new feature for the non-paying users.

The website is barely responding today, and the Desktop client always has massively degraded performance. Really annoying having their desire for user growth killing the experience for those of us who are financing it.

claiir 24 April 2025

> GoDaddy is actively experimenting to integrate image generation so customers can easily create logos that are editable [..]

I remember meeting someone on Discord 1-2 years ago (?) working on a GoDaddy effort to have customer-generated icons using bespoke foundation image gen models? Suppose that kind of bespoke model at that scale is ripe for replacement by gpt-image-1, given the instruction-following ability / steerability?

greatgib 24 April 2025

Does anyone have an idea of what an "image token" represents for the pricing? Is it a block of the image of a given fixed size?

ChaitanyaSai 25 April 2025

Almost every image has a yellow tint. Any discussion of why and when that's being fixed?

verelo 24 April 2025

“Editing videos: invideo enables millions of users to transform their ideas into videos using AI. With the integration of gpt-image-1, the platform now offers improved text generation, fine-grain editing controls, and advanced style guidance.”

Does this mean this also does video in some manner?

pknerd 25 April 2025

I would like to know some resources about prompt engineering to use the Image gen module by OpenAI, especially for products related to images or Ads.

PS: Does anyone know a good LLM/service to turn images into Videos?

MisterBiggs 24 April 2025

Lots of comments on the price being too high, what are the odds this is a subsidized bare metal cost?

hnthrowaway0315 25 April 2025

I wonder which model is the best to output standard 2d game resources:

- N by N sprite sheets

- Isometric sprite sheets

Basically anything that I can directly drop into my little game engine.

jeevships 25 April 2025

Genuinely curious, why would someone buy from your gpt image wrapper when they can just create it in gpt themselves?

scyzoryk_xyz 24 April 2025

Intelligence is fast approaching utility status.

jonplackett 24 April 2025

Does anyone know if you can give this endpoint an image as input along with text? Not just an image to mask, but an image as part of a text input description.

I can’t see a way to do this currently, you just get a prompt.

This, I think, is the most powerful way to use the new image model since it actually understands the input image and can make a new one based on it.

E.g. you can give it a person sitting at a desk and it can make one of them standing up. Or from another angle. Or on the moon.

drakenot 24 April 2025

Does the AI have the same content restrictions that the chat service does?

gcrfelix 24 April 2025

lesson: never build your moat around optimizing the existing AI capability

p1dda 25 April 2025

For how long can OpenAI beat the dead horse that is LLMs?

smrt 24 April 2025

I don't understand why this api needs organization verification. More paperwork ahead. Facepalm

PermissionDeniedError: Error code: 403 - {'error': {'message': 'To access gpt-image-1, please complete organization verification

GaggiX 24 April 2025

Far too expensive, I think I will wait for an equivalent Gemini model.

topaz0 25 April 2025

Criminally wasteful.

system2 20 hours ago

Jesus, $0.19 for an image you may or may not use. I think it is still super expensive to be useful. I go through 10 AI images until I find a useful one. This might not work for everyone.

animanoir 24 April 2025

Wow more AI slop

1oooqooq 24 April 2025

aren't you all embarrassed seeing lame press releases of the most uninteresting things on the top of HN front page? i kinda feel bad.

hexo 24 April 2025

Thank you for a great contribution to global warming.

pkulak 24 April 2025

I don't get it. I've been using `dall-e-3` over the public API for a couple years now. Is this just a new model?

EDIT: Oh, yes, that's what it appears to be. Is it better? Why would I switch?

rahulg 25 April 2025

Been waiting for this to implement Ghibli, Muppets etc. in my WhatsApp bot that converts your photos into AI generated art. Check it out at https://artstudiobot.com. 80% vibe-coded, 20% engineer friend.