Every Jamstack site needs data – whether that data comes from external sources like a headless CMS or a database, or from local sources like Markdown and data files. That's the fundamental premise of a static site generator (SSG): to take this data and convert it into pre-rendered HTML, CSS and JavaScript.

However, accessing that data isn't always easy. In some cases, the SSG doesn't have a built-in way to call external data at build time. This is the case for many "traditional" SSGs like Hugo and Jekyll. In other cases, you may be able to make API calls relatively easily, but combining data from multiple sources can prove complex. A final case is that you have data in external sources, but you prefer to store your content in local Markdown and data files.

Previously, I wrote about an experimental solution to this lack of easy API access: manually writing a build script in Node.js that called an API generated by StepZen and converted the data to local files. Because it used StepZen, it could connect to data from anywhere – MySQL or Postgres databases, REST APIs and more. Today I want to share a tool I am working on called stepzen-content-pull that eliminates the need to write that script by hand. With it, you supply queries to your StepZen GraphQL API, which can be connected to any backend StepZen supports, and the tool converts the results to Markdown, YAML or JSON data files for your SSG (and it works with any SSG).

Please note that this tool should be considered beta. I have successfully tested it against multiple schemas of my own, but I would appreciate any feedback on how to improve it to make it work for your schemas.

How It Works

To convert the data into files, the tool requires a configuration file. This file supplies the queries that should be run and describes how each should be converted to files. You can supply multiple queries, but each query should output to a single type (Markdown, YAML or JSON). Let's take a look at a quick example that replicates the one from my original article.

In the configuration below, I supply my StepZen account and endpoint details, and my array of queries contains only a single query. This query returns content for a list of pages that I want to include as posts in my Jekyll blog. First, I supply the query, which I can just copy and paste from the GraphiQL editor provided by stepzen start.

Next, I tell it that I want to convert the results to Markdown with a .md extension (the tool also supports outputting to .markdown). In this case, I am not supplying a file name, just the file extension. This is because the query returns multiple results, and each needs its own file. To give each a unique file name, we supply a slug_field, which should be a string field that can be converted to a slug (e.g. changing "My First Blog Post" to my-first-blog-post). We also need to indicate which field contains the Markdown body. All fields other than the body_field will be included in the frontmatter of the post.
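To make the slug idea concrete, here is a minimal sketch of one common way to turn a title into a slug. The slugify function below is hypothetical and written only for illustration; the tool's own conversion may handle edge cases (accents, punctuation) differently.

```javascript
// Hypothetical helper for illustration; not the tool's actual implementation.
function slugify(text) {
  return String(text)
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-') // collapse each run of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, ''); // strip any leading or trailing hyphens
}

console.log(slugify('My First Blog Post')); // my-first-blog-post
```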

Finally, we need to supply the folder that the output files should be written to (for Jekyll, _posts makes sense) and any additional_frontmatter that we want included. This allows you to specify things like layout that your SSG might require. It is worth noting that all of these fields, with the exception of additional_frontmatter, are required to convert multiple query results into unique Markdown file outputs.

module.exports = {
  account_name: 'biggs',
  endpoint: '/netlify/pets-blog',
  queries: [
    {
      query: `{
            getPosts {
              title
              body
              published
              id
              categories {
                name
              }
            }
          }`,
      convert_to: '.md',
      slug_field: 'title',
      body_field: 'body',
      folder: '_posts',
      additional_frontmatter: {
        layout: 'post',
      },
    },
  ],
};
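To show how those settings relate to the files that end up on disk, here is a rough sketch of the conversion for a single result. It is an illustration only, not the tool's source: renderMarkdown and slugify are hypothetical names, the sample post is made up, and writing frontmatter values with JSON.stringify is a shortcut I am assuming here (JSON values are valid YAML) rather than what the tool necessarily does.

```javascript
const path = require('path');

// Hypothetical slug helper; the tool's real slug rules may differ.
function slugify(text) {
  return String(text)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Hypothetical: turn one query result into a file path plus Markdown contents,
// driven by a single entry from the `queries` array in the config.
function renderMarkdown(result, config) {
  // Pull the body out; every remaining field becomes frontmatter.
  const { [config.body_field]: body, ...fields } = result;
  const frontmatter = { ...fields, ...config.additional_frontmatter };
  const lines = Object.entries(frontmatter).map(
    ([key, value]) => `${key}: ${JSON.stringify(value)}`
  );
  const name = slugify(result[config.slug_field]) + config.convert_to;
  return {
    file: path.posix.join(config.folder, name),
    contents: `---\n${lines.join('\n')}\n---\n\n${body}\n`,
  };
}

// Made-up sample data shaped like one getPosts result.
const post = { title: 'My First Blog Post', body: 'Hello!', id: 1, categories: [{ name: 'dogs' }] };
const query = {
  convert_to: '.md',
  slug_field: 'title',
  body_field: 'body',
  folder: '_posts',
  additional_frontmatter: { layout: 'post' },
};

const output = renderMarkdown(post, query);
console.log(output.file);
console.log(output.contents);
```

Note how the body never appears in the frontmatter, while layout from additional_frontmatter sits alongside the queried fields.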

The only other thing that the stepzen-content-pull tool requires is your STEPZEN_API_KEY from your StepZen dashboard. You can supply this via a .env file or via the --apikey flag.

With the configuration and API key set, just run...

npx @remotesynth/stepzen-content-pull

And the tool will output each result as its own Markdown file within the _posts directory.

For details on how to configure the tool to include multiple queries, output JSON/YAML or output single Markdown files from a query that produces a single result, check the documentation.

Integrating Into Your Build

Once you have a configuration and STEPZEN_API_KEY environment variable set, all you need to do to get this integrated into your build is to add it to your build command. For example, for a Jekyll site, you might run:

npx @remotesynth/stepzen-content-pull && jekyll build

This will pull the content to local files before the site is deployed. If you're on Netlify, you could even integrate the StepZen Netlify Build Plugin to have a build process that:

  1. Builds and deploys your StepZen GraphQL API
  2. Pulls content from the API as local files
  3. Builds and deploys the site to Netlify's CDN

The best part is that all of this would live within a single project structure and be deployed simply by checking in changes to the git repository tied to your Netlify site.
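If you would rather drive the plain Jekyll build from a Node script than chain commands with &&, a minimal sketch might look like the following. The two commands are the ones shown earlier; the runBuild function and its injectable run parameter are hypothetical names used only for this illustration.

```javascript
const { execSync } = require('child_process');

// The same two steps as the chained build command, in order.
const steps = ['npx @remotesynth/stepzen-content-pull', 'jekyll build'];

// Hypothetical helper: runs each step in sequence. execSync throws when a
// command exits with a non-zero status, so a failed content pull stops the
// build before Jekyll runs, much like && would.
function runBuild(run = (cmd) => execSync(cmd, { stdio: 'inherit' })) {
  for (const step of steps) {
    run(step);
  }
}

// runBuild(); // uncomment to run the build for real
```

Accepting the runner as a parameter keeps the sequencing easy to check without actually invoking npx or Jekyll.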

Try It Out (Keep In Mind It's Beta)

I'd love for you to give this a try and let me know what you think. This is just an initial release, and the tool could definitely use better error messaging and better help debugging issues with your config file (it is open source, so you can contribute too if you like). If you run into issues or have features you'd like to see added, let me know. Hopefully it helps you pull any data from anywhere into your Jamstack site!