“Software is eating the world” and “APIs are eating software” have become familiar proclamations across the modern-day software industry. Given the resounding hype, interest, and substantial monetary investments happening in Artificial Intelligence (AI), a likely follow-on statement is, “AI is eating APIs, software, and everything else!”
AI is not a new phenomenon, yet following OpenAI’s launch of its now infamous app ChatGPT in November 2022, our minds have been consumed with the art of what’s possible. The ChatGPT launch was quickly followed by Microsoft’s $10 billion+ partnership announcement with OpenAI, sending a very clear statement of intent regarding big tech’s belief in the future of AI-enabled experiences. And the whirlwind is only picking up speed with the launch of new AI-powered search by Microsoft and Google’s bumpy launch of its competing offering, Bard. AI is the big tech news story of 2023 thus far, crossing the chasm to mainstream media.
What’s ChatGPT and why such hype about similar AI offerings?
ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot type of AI generally referred to as generative AI. The ChatGPT model has been trained on a varied and extensive text-based dataset, predominantly internet-sourced text written by humans, including blogs, articles, websites, and social media content. (Human conversations are part of the dataset, which is one of the reasons responses sound human-like.) Furthermore, the model was optimized for dialogue using what is called Reinforcement Learning from Human Feedback (RLHF). The goal of the model for ChatGPT is to understand and generate output on a diverse range of topics with consistency and high levels of clarity.
In general, with generative AI apps you provide a brief prompt in natural human language (think human conversational style) and the app generates outputs for you (it converses with you). With ChatGPT, the output provided is text, so it’s a text-to-text (or text-to-essay) subtype of generative AI (there are several forms of generative AI, like text-to-speech, text-to-image, text-to-video, and so on).
What’s an API?
API stands for Application Programming Interface. An API is an interface to a service, capability, or data. The programmable interface facilitates connections and communication between software applications and components, while hiding the implementation details. APIs are the language of software and power web, mobile, network, device, and other types of applications.
Why more APIs?
The frenzied interest in and curiosity about ChatGPT led it to achieve record-breaking numbers of signups within days of launch. That’s no surprise, given it offers arguably the best conversational experience yet for generative AI. However, for the technology to be successful at scale and adopted the world over, it needs to be seamlessly integrated into the experiences where people already are, and most importantly, bring value to those experiences. Having to context switch over to a standalone chatbot (or any standalone proprietary offering) is an approach with limited longevity.
The full disruptive power of these new AI advances will come when they are seamlessly integrated into millions of other software programs across the planet. For integration to be possible, we need APIs! There are already numerous generative AI models available through APIs today (including existing models from OpenAI such as GPT-3, Codex, and content filter), and OpenAI have stated that an API for ChatGPT will be released shortly. Google’s Bard is also launching an API, and it’s a sensible expectation that Microsoft will also expose an API for their new Prometheus model which combines the best of existing OpenAI models with more relevant, timely, and targeted results with improved safety.
API entry points to these capabilities are provided for many reasons. First, and let us call a spade a spade, it’s big business. Offering paid tier access to the power on offer is fundamentally part of the revenue model of the providing companies for the years ahead. It’s no surprise and completely expected given the monetary investments thus far and planned.
Capital gain is not the sole driver. Governance and oversight around the ethical use of AI is also a driver: controlling access through an API gives the providers of such technology the ability to revoke access from consumers deemed to be misusing the capabilities in violation of provider principles. OpenAI states they will “terminate API access for use cases that are found to cause (or are intended to cause) physical, emotional, or psychological harm to people, including but not limited to harassment, intentional deception, radicalization, astroturfing, or spam, as well as applications that have insufficient guardrails to limit misuse by end users.”
The ethics around AI is a huge topic and outside the scope of this piece, but for reference, it’s worth noting the principles drafted by Microsoft and OpenAI.
Accuracy, Verification, and Structured AI Assistance
Although ChatGPT has been tested on a broad spectrum of situations and has passed professional examinations, it’s not an authority on information. It has limited knowledge of the world and events after 2021. It’s a fact that it can produce incorrect answers, and it may provide harmful instructions or biased content. Large language models, like ChatGPT, are probabilistic (the freedom to be random is configurable to an extent – see API calls below) and will occasionally make up facts or hallucinate outputs. It should also be noted that because responses are human-like, they can be quite convincing regardless of accuracy. We rely on such probability algorithms all the time and being perpetually deterministic limits the capability of AI. We just need to be aware.
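To make the idea of configurable randomness a little more concrete, here is a minimal sketch of temperature-scaled sampling probabilities. It assumes the common softmax-with-temperature formulation; the scores are made up for illustration, and nothing here is taken from how any particular provider implements its models.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat 0 as fully deterministic: all probability goes to the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]
for t in (0, 0.4, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring candidate, while higher temperatures spread it more evenly, which is why the same prompt can produce different answers from one call to the next.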
Verification of the responses is one of the reasons why leveraging the API is the preferred (and likely only mainstream) usage approach moving forward. The generative AI capabilities will greatly enhance human productivity across several fields. Software applications that design for the consumption of APIs while ensuring that humans are in the loop, end-user restrictions are applied, post-processing of outputs (including verification and content filtering) is executed, and active monitoring is enabled will be able to offer incredible experiences across their domains.
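As a rough illustration of what such a consuming application might do around the model call, the sketch below applies an end-user restriction to the prompt and then flags model output for human review. Every name in it (`check_prompt`, `post_process`, `ReviewedOutput`, the blocklist, and the length limit) is hypothetical, and a simple keyword check is only a stand-in for a real moderation or verification step.

```python
from dataclasses import dataclass, field

# Hypothetical blocklist; a real system would use a proper moderation service.
BLOCKED_TERMS = {"password", "ssn"}
MAX_PROMPT_CHARS = 500  # illustrative end-user restriction


@dataclass
class ReviewedOutput:
    text: str
    flags: list = field(default_factory=list)

    @property
    def needs_human_review(self):
        # Any flag means a person should look before the text is used.
        return bool(self.flags)


def check_prompt(prompt):
    """Apply end-user restrictions before anything is sent to the model."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds the allowed length")
    return prompt.strip()


def post_process(model_text):
    """Filter and flag model output so a human can verify it before use."""
    flags = []
    lowered = model_text.lower()
    for term in sorted(BLOCKED_TERMS):
        if term in lowered:
            flags.append(f"blocked term: {term}")
    if not model_text.strip():
        flags.append("empty output")
    return ReviewedOutput(text=model_text.strip(), flags=flags)


result = post_process("Your password is 1234")
print(result.needs_human_review, result.flags)
```

Keeping these checks in the consuming application, rather than relying on the model alone, is what lets the application log, monitor, and escalate to a person when something looks wrong.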
A simplistic guide to using AI assisted information integrated into consuming systems
Improved End Customer Experience
Offering the power of generative AI models through APIs simplifies the experience for the rest of the developer world. Putting the complexity behind the API enables consumers to speed up the delivery of the latest wave of AI innovation directly into end customer experiences (CX) across all industry verticals. After all, APIs are a simple conduit to the value offering, but interestingly, by the very nature of APIs (being easily consumable), a cyclical effect is likely. Innovators outside of the core AI space will create new ways of mashing up the API-enabled AI power with other capabilities to create new innovative products, and in turn, make those new innovations available for other developers to consume (and so on).
2-layered API economy
The assembling of APIs into new API product offerings is also not new, and it’s evident in the 2-layered API economy of today. Over the last decade, many API-first companies started simply by combining existing offerings and packaging them in a more consumable way, tailored to meeting the expectations of consuming developers. In other cases, they offered new capabilities composed of several independent offerings, but again focused on a specific problem and prioritized the usability of the API for its consumers.
Layered API economy. Source: The AI Journal
One layer of today’s API economy consists of those who provide services and infrastructure to address large-scale technical (or industry) challenges, and then wrap those capabilities in APIs. Examples are large cloud technology providers, telco providers, and financial institutions or banks, with the latter often being mandated to expose capabilities through APIs by market or regulatory pressures. Now AI providers, which often overlap with cloud providers, fall into this category.
The core service and infrastructure offerings of these providers fuel a more business-to-consumer (B2C) focused segment that specializes in providing excellent immersive experiences. Open Banking was an enabler for innovation and improved consumer competition within the banking sector, and now with the exposure of APIs for powerful AI capabilities, we stand at the dawn of accelerated innovation possibilities agnostic of industry or sector.
Interaction with Generative AI via APIs is easy
Getting up and running with most of the AI APIs provided by the major players is simple. You don’t need to be a developer to try it out!
Let’s walk through getting up and running with the Completions API offered by OpenAI. The Completions API offers the ability to programmatically interact with the available GPT-3 models which can understand and generate natural language.
Register for the OpenAI Platform
First, register with OpenAI to obtain your API key, which you will need to authorize API calls.
- Go to https://beta.openai.com/signup, and sign up using your preferred method
- Once set up, click on your username (top right), and choose “View API Keys”
- Click “Generate new secret key” and make sure to copy the generated secret so you can use it for future API calls
Making a call to the Completions API
Now that you have your API key, you are almost ready to make a request to the API endpoint. The next step is to head over to the API docs to understand the shape and schemas used by the API.
To speed up the process, below you’ll find a minimalist example to meaningfully interact with the endpoint using curl.
In the simple request above, I’m sending the text-davinci-003 GPT-3 model the prompt “What’s your favorite thing about the OpenAPI Specification?” I set max_tokens to 150, which limits the length of the answer the model will provide back. (The GPT family of models processes text using tokens, which are common sequences of characters found in text. One token generally corresponds to about 4 characters of common English text, so 150 tokens is roughly 600 characters.) Finally, I set the temperature to 0.4. The temperature range is between 0 and 2, where lower values make the output more deterministic and higher values make it more random.
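For readers who prefer code over the command line, the same request can be sketched in Python using only the standard library. The model, prompt, max_tokens, and temperature come from the description above; the endpoint URL, the header names, the `OPENAI_API_KEY` environment variable, and the helper name `build_completion_request` are assumptions on my part, so check the current OpenAI API reference before relying on them (models and endpoints change over time).

```python
import json
import os
import urllib.request

# Assumed endpoint for the Completions API; verify against the current docs.
API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(api_key, prompt, model="text-davinci-003",
                             max_tokens=150, temperature=0.4):
    """Build (but do not send) a POST request for a text completion."""
    body = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,    # caps the length of the answer
        "temperature": temperature,  # 0 is most deterministic, 2 most random
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_completion_request(
    os.environ.get("OPENAI_API_KEY", "YOUR_API_KEY"),
    "What's your favorite thing about the OpenAPI Specification?",
)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect or log exactly what will go over the wire before spending any tokens.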
The following API response is returned based on the request above:
The answer to the prompt provided is contained within the choices[*].text property:
“My favorite thing about the OpenAPI Specification is its ability to provide a consistent and standardized way to document and describe APIs. This makes it easier to understand how an API works and how to interact with it which is essential for developers and users alike.”
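Pulling that text out of a response programmatically takes only a few lines. The sketch below assumes the response is JSON with a `choices` array whose items carry a `text` field, as described above; the sample payload is a trimmed, hand-written stand-in rather than a real API response, and `extract_completions` is a hypothetical helper name.

```python
import json


def extract_completions(response_body):
    """Return the text of every choice in a completions response body."""
    payload = json.loads(response_body)
    # Model output often starts with blank lines, so strip the whitespace.
    return [choice["text"].strip() for choice in payload.get("choices", [])]


# Hand-written stand-in containing only the fields this sketch reads.
sample_response = json.dumps({
    "choices": [
        {
            "index": 0,
            "text": "\n\nMy favorite thing about the OpenAPI Specification is "
                    "its ability to provide a consistent and standardized way "
                    "to document and describe APIs.",
        }
    ]
})

for answer in extract_completions(sample_response):
    print(answer)
```

Returning a list rather than a single string keeps the helper usable if a request asks for more than one completion.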
There is of course a whole range of more complex use cases that can be approached with almost as much simplicity, but the above is an attempt to dispel any misconceptions about how difficult it is to get up and running with these APIs. The image at the top of this blog was also generated with a simple API call!
It’s even easier to start exploring and playing with the raw APIs using one of your preferred API exploration tools rather than issuing command-line or code requests.
Exploring OpenAI APIs using SwaggerHub Explore (free)
It’s an exciting time within the API and AI spaces. Embracing the potential seems like the most advantageous approach, and that has already happened within sectors like technology. Progression in AI is seen as a natural evolution of how we interface with (and programmatically direct) computers. This is already evident in the curriculum for computer science programs, which has a different focus compared to when I studied it in the early part of this century.
The general education sector is being challenged by the disruptive force of ChatGPT and other offerings. Moving forward with AI at our side, the emphasis will shift from quoting knowledge to proving one’s understanding. Perhaps that will be validated by the AI tutor or at least the software systems being used by our educators!
Complexity never disappears in the art of delivering software. It just shifts around, and as consumer expectations continue to increase, we still must focus on the quality attributes of software. With the predicted acceleration in API growth, more AI contributing to even more API production and consumption, and the general trend for more companies depending on third-party APIs to deliver core capabilities, it’s never been more important to be critical of API quality.