Creating a GraphQL API for OpenAI Text and Image Processing


You can see the full source code of this project in the with-openai folder in the StepZen examples repo on GitHub.
Recent developments in neural networks and language models have opened the door to new levels of automation, enabling the programmatic generation of text and images that look human-made. Companies like OpenAI have shipped impressive stand-alone services built on this technology, such as DALL·E 2 and ChatGPT.
StepZen takes this a step further by letting us integrate this technology with existing APIs and services. This article shows how to add GPT-3 text generation and DALL·E 2 image generation to an existing GraphQL API with StepZen.
We start from a GraphQL API for the feed of blog posts on our own StepZen blog. This API fetches the last N posts, including their title, URL, full text, and metadata. Then, we extend it by adding a pair of OpenAI-powered properties to each blog post entry: `summary` and `image`. The `summary` property contains a GPT-3-generated short summary of the post, and the `image` property links to a DALL·E 2-generated cover image.
Ultimately, we will have a GraphQL API that fetches posts from the StepZen blog and augments them on the fly using a GPT-3-based language model and the DALL·E 2 image generation network.
To add a bit more fun to this example, we tasked the language model with writing each summary as a limerick. For example:
```bash
stepzen request '{ feed(limit: 1) { title guid summary { choices { text } } } }'
```

```json
{
  "data": {
    "feed": [
      {
        "title": "How to Build a Headless CMS using Notion and StepZen",
        "guid": "https://stepzen.com/blog/how-to-build-a-headless-cms-using-notion-and-stepzen",
        "summary": {
          "choices": [
            {
              "text": "A Notion project was in sight\nSo he decided to give it a try\nHe needed an API to make it work\nAnd he found StepZen to be a perk\nWith a GraphQL API, now his blog site will fly"
            }
          ]
        }
      }
    ]
  }
}
```
The first step is to import the OpenAI text completion API into our GraphQL schema, where we can combine it with the other APIs.
Creating a GraphQL API for OpenAI text completion
OpenAI provides a REST API for running text completion tasks on its GPT-3-based language models. To use this API, you first need to sign up with OpenAI and get a personal API key. Once you have an OpenAI API key, you can use the `stepzen import curl` command from the StepZen CLI to automatically create a GraphQL schema for this API.
```bash
stepzen import curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "text-davinci-003",
    "prompt": "What is a language model and how is it related to AI?",
    "temperature": 0.7,
    "max_tokens": 256,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0
  }' \
  --name openai \
  --query-name textCompletion \
  --query-type TextCompletion
```
You can manually adjust the auto-generated `textCompletion` query field in the generated schema file to tailor it to your needs. For example, you may want to reduce the number of required arguments by adding default values.
```graphql
type Query {
  textCompletion(
    prompt: String
    frequency_penalty: Float = 0
    max_tokens: Int = 256
    model: String = "text-davinci-003"
    presence_penalty: Float = 0
    temperature: Float = 0.7
    top_p: Float = 1
  ): TextCompletion
    @rest(
      method: POST
      endpoint: "https://api.openai.com/v1/completions"
      headers: [{ name: "authorization", value: "$authorization_8bfcd33539;" }]
      configuration: "curl_import_config"
    )
}
```
At this point, we have added a `textCompletion` query field to our GraphQL API. We can check that it works by deploying our GraphQL schema with `stepzen deploy` and then sending requests. For example:
```bash
stepzen request '{
  textCompletion(prompt: "What is a language model?") {
    choices { text }
  }
}'
```
The next step is to import the OpenAI image generation API into our GraphQL schema.
Creating a GraphQL API for DALL·E image generation
Like OpenAI's text completion features, DALL·E image generation is available through a REST API. The command to import it into a StepZen workspace looks like this:
```bash
stepzen import curl https://api.openai.com/v1/images/generations \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "prompt": "A cute baby sea otter"
  }' \
  --name openai \
  --query-name generations \
  --query-type GenerationResult
```
This command creates a GraphQL schema that you can then modify by hand to fit the GraphQL schema you already have.
In this example, we rename the auto-generated files to keep the image generation and text completion queries organized inside the same `openai` folder. You can see the exact result in the with-openai folder in the StepZen examples repo on GitHub.
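For instance, the resulting project layout might look roughly like this (the file names below are illustrative; the repo shows the exact structure):

```
index.graphql            # ties the schema together with @sdl(files: [...])
feed/
  feed.graphql           # the existing blog feed API
openai/
  completions.graphql    # the textCompletion query and its generated types
  generations.graphql    # the generations query and its generated types
```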
With minimal effort, we now have a GraphQL API that can be used to access OpenAI's `images/generations` API. You can try it out with:
```bash
stepzen request '{
  generations(prompt: "a sail flying over the horizon into the sunset") {
    created
    data { url }
  }
}'
```
The next step is to combine the new `textCompletion` and `generations` queries with the existing GraphQL API for the blog posts feed.
Adding OpenAI-based properties to an existing GraphQL type
The `feed` query field in the blog GraphQL API returns a list of `BlogPost` objects. Each `BlogPost` includes the full text of the blog entry (as HTML) in the `encoded` property.
```graphql
type BlogPost {
  guid: ID
  title: String
  description: String
  encoded: String
  link: String
  pubDate: String
}
```
Our goal is to extend this type with `summary` and `image` properties. With StepZen, this can be done by creating an extension type with the new properties and using StepZen's `@materializer` directive to populate them with the results from other queries.
```graphql
extend type BlogPost {
  summary: TextCompletion
    @materializer(
      query: "summary"
      arguments: [{ name: "fulltext", field: "plaintext" }]
    )
  image: GenerationResult
    @materializer(
      query: "image"
      arguments: [{ name: "fulltext", field: "plaintext" }]
    )
}
```
Now, the remaining part is to define the `summary(fulltext)` and `image(fulltext)` query fields.
Adding a `summary(fulltext)` query field
The `summary` field is calculated in three steps:
- Convert the HTML-encoded blog post content into plain text.
- Prepare the exact text prompt for the OpenAI text completion API.
- Call the OpenAI text completion API with the prepared text prompt.
Steps 1 and 2 are done with small JavaScript snippets powered by the `ecmascript` property of StepZen's `@rest` directive, and the result of step 1 is materialized into a `plaintext` property on the extended `BlogPost` type for convenience. Steps 2 and 3 are then executed in a sequence using StepZen's `@sequence` directive.
Check the example source code for details.
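To give a feel for how the pieces fit together, here is a minimal sketch of steps 2 and 3. It assumes StepZen's `stepzen:empty` endpoint for queries that run only JavaScript, and that the field's arguments are in scope inside `transformREST`; the names `summaryPrompt` and `Prompt`, the truncation length, and the prompt wording are all illustrative, so refer to the example repo for the actual code:

```graphql
# Illustrative helper type: @sequence feeds the fields of one step's result
# into the matching arguments of the next step, so this step returns a
# `prompt` field to match textCompletion's `prompt` argument.
type Prompt {
  prompt: String
}

type Query {
  # Step 2 (sketch): build the limerick prompt; no backend call is made.
  summaryPrompt(fulltext: String!): Prompt
    @rest(
      endpoint: "stepzen:empty"
      ecmascript: """
      function transformREST() {
        // Truncate the post so the prompt stays within the model's
        // context window (the cutoff length here is an assumption).
        var text = fulltext.substring(0, 2000)
        return { prompt: "Summarize this blog post as a limerick:\n" + text }
      }
      """
    )

  # Steps 2 and 3 chained: summaryPrompt builds the prompt, and the
  # textCompletion query imported earlier generates the limerick.
  summary(fulltext: String!): TextCompletion
    @sequence(steps: [{ query: "summaryPrompt" }, { query: "textCompletion" }])
}
```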
You can verify that this step works by deploying the updated schema to StepZen and then sending a request to the `summary` query field:
```bash
stepzen request '{
  summary(fulltext: "This is a very long and insightful text.") {
    choices {
      text
    }
  }
}'
```
Adding an `image(fulltext)` query field
The `image` field is defined similarly to the `summary` field but invokes the OpenAI APIs twice: first it uses the text completion API to generate a text prompt for the image generation service, and then it calls the image generation service with the generated prompt. The full sequence of steps looks like this:
- Convert the HTML-encoded blog post content into plain text.
- Prepare the exact text prompt for the OpenAI text completion API to generate an image description based on the blog post's full text.
- Call the OpenAI text completion API with the prepared text prompt and get an image description.
- Prepare the exact text prompt for the OpenAI image generation API.
- Call the OpenAI image generation API with the prepared prompt and get the generated image's URL.
Step 1 reuses the plain text materialized into the `plaintext` property. Steps 2 and 4 are done with small JavaScript snippets powered by the `ecmascript` property of StepZen's `@rest` directive. All the steps are executed in a sequence using StepZen's `@sequence` directive.
Check the example source code for details.
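As a rough sketch, the resulting field is again a single `@sequence` chain. The intermediate step names below are illustrative, and the argument plumbing between the steps (extracting the generated description from the text completion result, for instance) is in the example repo:

```graphql
type Query {
  image(fulltext: String!): GenerationResult
    @sequence(
      steps: [
        { query: "descriptionPrompt" } # step 2: build the text completion prompt
        { query: "textCompletion" }    # step 3: generate an image description
        { query: "imagePrompt" }       # step 4: wrap it into a DALL·E prompt
        { query: "generations" }       # step 5: generate the image
      ]
    )
}
```

In this sketch, step 1 is assumed to happen upstream: when `image` is materialized on `BlogPost`, the `plaintext` value is passed in as the `fulltext` argument. The actual example may wire this differently.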
You can verify that this step works by deploying the updated schema to StepZen and then sending a request to the `image` query field:
```bash
stepzen request '{
  image(fulltext: "This is a very long and insightful text.") {
    data {
      url
    }
  }
}'
```
Putting it all together
We’ve added two new properties to the `BlogPost` type, and now the clients of our GraphQL API can also fetch a summary and a cover image for every blog post with a query like this:
```bash
stepzen request '{
  feed(limit: 2) {
    title
    guid
    pubDate
    summary {
      choices {
        text
      }
    }
    image {
      data {
        url
      }
    }
  }
}'
```
StepZen allows federating and combining different APIs in a single GraphQL schema. This is a powerful concept: it lets you extend APIs you have no control over with additional capabilities, such as GPT-3-based text completion, DALL·E 2-based image generation, and many others.
The users of the GraphQL API layer see a single combined API without having to orchestrate multiple backend calls, combine and convert data, or think about the operational aspects of API hosting, such as deployments, releases, and maintenance.
This can be very useful for data-ingestion clients, because they can use the API to augment and enrich the data they fetch on the fly.
However, this is not a universal solution. Exposing this API to read-intensive clients (such as web pages) is a bad idea: generating a new, unique summary and cover image on every read operation is unnecessarily expensive and slow. This approach is a better fit for APIs where repeated reads of the same data are infrequent (such as data pipelines and ETL operations).
Want to try this example for yourself? Clone the StepZen examples repo and follow the README in the with-openai folder.
We'd love to hear your feedback on this blog, on GraphQL, on REST, or on StepZen. Questions? Comments? Ping us on Twitter or join our Discord community to get in touch.
Cover image credits: DALL·E (openai.com)