If you want to build AI on AWS, you might want to consider Bedrock. It is packed to the rafters with clever features that work seamlessly with the rest of the AWS ecosystem, saving you time and effort while reducing risk.
In this blog post, I will teach you the basics to get started, including:
- What it does.
- How it fits in with other AWS tools.
- The key concepts and features it is built on.
- How to use the API.
- How the pricing works.
A quick look back at the previous blog post in this series
In my first blog post of this series on Building AI on AWS services, I explored critical mistakes to avoid when building AI solutions on AWS. My key recommendations to successfully deploy AI solutions were:
- Define a clear AI strategy.
- Implement a comprehensive data management plan.
- Optimize costs effectively.
- Focus on metrics related to the benefits of AI.
I also briefly defined three different layers of AWS’s main AI services: Amazon Q, AWS Bedrock, and Amazon SageMaker. Even though we will concentrate on AWS Bedrock in this blog post, it could be well worth your time returning to that previous blog post to brush up on the two other tools that work in unison with it.
What AWS Bedrock does
AWS Bedrock gives you the building blocks to create amazing AI tools. You automatically have access to a whole range of AI models—some for language and some for images and graphics. This is great because you're not stuck with a single model as you can choose the one that best suits your needs and budget.
While it's mostly used through APIs, Bedrock also has a handy sandbox GUI to help you get started. We will have a closer look at the concepts and features that make this possible later, including Bedrock Studio, which makes it easier for developers and data analysts to visualize, test, and tweak AI applications without having to dive into the code or APIs directly.
How AWS Bedrock stands out in the AWS AI landscape
As far as AWS AI services go, part of the idea behind Bedrock is that there is "No one model to rule them all."
If you are a developer, rather than having a single model, Bedrock gives you access to a wide range of foundation models through its API-first approach. This gives you the flexibility to meet diverse needs.
You can find the perfect, most efficient fit for your use case without the overhead of server maintenance since Bedrock is serverless. Furthermore, AWS just released a preview of a new feature that lets you import your own models into Bedrock.
The key concepts you need to master to work with Bedrock
Foundation models (FMs)
Bedrock has a large collection of pre-trained foundation models (FMs) for various natural language processing (NLP) and machine learning (ML) tasks. These models include large language models (LLMs) and multimodal models—all designed for different applications.
These FMs can perform a wide range of tasks, such as:
- Writing blog posts.
- Summarizing documents.
- Solving reasoning problems.
- Chatting.
- Answering questions.
- And even composing music and poetry.
Some LLMs, such as Llama 3 by Meta and Titan Text G1 by AWS, specialize in text-based tasks. They can understand and generate human-like text, making them suitable for language translation, text summarization, and content generation. The FM “families” often have different versions optimized for specific tasks, like summarization, chat, or quick response.
Multimodal models, like Claude 3.5 by Anthropic, handle multiple input types, such as text and images. This means they can tackle tasks involving multiple data types simultaneously.
Since the FMs vary so much in Bedrock, as a developer, you can choose the most appropriate model for your specific use case, whether it's purely text-based or needs multimodal capabilities.
You can use FMs across a wide range of applications, from chatbots and virtual assistants to content creation and data analysis, without training complex models from scratch or being stuck with just one model family. You can concentrate on the main task and just cherry-pick the best model for it.
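If you want to see programmatically which models are on offer, the ListFoundationModels API lets you filter by provider, modality, and more. Here is a minimal sketch using Boto3; the region and the byOutputModality filter value are just examples, so check the API reference for the current options:

import boto3

# List the text-output foundation models available in your region.
bedrock = boto3.client("bedrock", region_name="us-east-1")
models = bedrock.list_foundation_models(byOutputModality="TEXT")

for summary in models["modelSummaries"]:
    print(f'{summary["modelId"]} ({summary["providerName"]})')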
Retrieval Augmented Generation (RAG)
Imagine that you have picked the best model to go forward with, but now you need a model that can answer very specific questions related to your field of business or your company.
This is where RAG comes to the rescue. Here's how it works:
- The model itself contains general knowledge and skills to have human-like discussions with coherent sentences.
- RAG adds specific information on top of that knowledge based on extra data that you provide.
For a deeper dive into RAG, I recommend my colleague’s excellent blog post: Considerations for RAG systems in product and service development.
In Bedrock, the RAG features are located in knowledge bases. “RAGging” then enriches the model’s responses by retrieving relevant information from those knowledge bases.
A knowledge base can be used not only to answer user queries and analyze documents but also to augment prompts provided to foundation models by adding context. Responses include citations so that users can verify the information and its accuracy.
To use a knowledge base, you need to preprocess your data so it can be queried. This involves splitting your data into chunks, creating embeddings for each chunk with your selected embedding model, and indexing the results so the relevant chunks can be retrieved at query time.
You will have to experiment with your chunking strategy to get the best performance out of your knowledge bases. Chunking is crucial: a good strategy not only creates more accurate semantic connections and better answers but also improves general performance and reduces costs.
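Once your knowledge base is set up, querying it with RAG takes only a few lines. Here is a sketch using the RetrieveAndGenerate API from the Bedrock Agent Runtime; the knowledge base ID is a placeholder you would replace with your own, and the model ARN is just an example:

import boto3

# Retrieve relevant chunks from the knowledge base and generate a grounded answer.
agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "What is our company's travel reimbursement policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR-KB-ID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
        },
    },
)
print(response["output"]["text"])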
Embedding
Embedding means your data—words, images, or other inputs—is converted into numerical vectors that capture semantic relationships and then stored in a vector database.
The database helps the AI model understand relationships between words and perform user-given tasks more effectively, such as:
- Text classification.
- Summarization.
- Translation.
- Sentiment analysis.
Essentially, embeddings transform complex data into a format that can be used to make more informed decisions.
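To make this concrete, here is a minimal sketch that creates an embedding with the Amazon Titan Embeddings model through the InvokeModel API. The model ID is an example and may change over time:

import json
import boto3

# Convert a piece of text into a numerical vector with Titan Embeddings.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({"inputText": "Bedrock gives you access to many foundation models."})
response = bedrock_runtime.invoke_model(modelId="amazon.titan-embed-text-v1", body=body)

embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # Titan Embeddings G1 - Text returns a 1536-dimensional vector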
Knowledge base metadata
Do you want to allow for filtering of data during a knowledge base query? Then use a separate metadata file containing attributes about the provided knowledge base data file. The file must reside in the same S3 bucket as the data file and have the same name as the associated file but with the .metadata.json extension.
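For example, a data file called policies.pdf would get a companion file policies.pdf.metadata.json. A minimal sketch of its contents might look like this (the attribute names here are made up for illustration):

{
    "metadataAttributes": {
        "department": "finance",
        "year": 2023
    }
}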
Security and guardrails
As always with AWS, Bedrock uses a shared responsibility model, and AWS has the bases covered regarding hardware and encryption of data in transit and at rest.
Customer data is not shared with model providers and is not used to improve the base models.
As you would expect, Bedrock also lives up to all common compliance standards, such as ISO, HIPAA, and GDPR.
To help with governance and audits, Bedrock also provides extensive monitoring and logging features.
- Use Amazon CloudWatch to track usage metrics and create custom dashboards for auditing purposes.
- AWS CloudTrail tracks all API activity and helps troubleshoot issues when needed.
- Storing metadata, requests, and responses in an S3 bucket and Amazon CloudWatch Logs is also possible (and recommended); see the sketch below.
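Model invocation logging is disabled by default. Below is a sketch of enabling it from the CLI; the bucket name, log group, and role ARN are placeholders, and since the exact field names may evolve, verify them with aws bedrock put-model-invocation-logging-configuration help before relying on this:

aws bedrock put-model-invocation-logging-configuration \
    --logging-config '{
        "cloudWatchConfig": {
            "logGroupName": "bedrock-invocation-logs",
            "roleArn": "arn:aws:iam::123456789012:role/BedrockLoggingRole"
        },
        "s3Config": {
            "bucketName": "my-bedrock-logs",
            "keyPrefix": "invocations"
        },
        "textDataDeliveryEnabled": true
    }' \
    --profile YOUR-PROFILE-NAME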
Guardrails
Besides Bedrock’s compliance and secure-by-default setup, there is a feature called Guardrails, where you can set certain boundaries to instruct the AI to give only safe and responsible answers. This is applied to both input and output. At the moment, there are four different categories of guardrails, and you can mix and match them:
Content filters
Modify the filter levels to prevent input prompts or model responses that contain harmful content.
Denied topics
Specify a list of topics that are not acceptable for your application. These topics will be blocked if they appear in user queries or model responses. You can select up to 30 topics per guardrail.
Word filters
Set up filters to block unwanted words, phrases, and profanities, including offensive terms and competitor names. The easiest way to use this filter is to compile these words or phrases into a file and then upload it to the guardrail.
Sensitive information filters
Block or obscure sensitive information, such as personally identifiable information (PII), or other custom patterns you want to exclude from user inputs and model responses. Standard PII types can be selected from a predefined list, while custom patterns are defined as regular expressions (regex).
You can also set up guardrails so that messages are returned to the user if the input or a model response violates your defined guardrail policies.
You can create multiple versions of your guardrail. When you start, a working draft is automatically provided for you to modify iteratively. You can experiment with various configurations and use the built-in test window to determine if they suit your use case.
Once you are satisfied with a defined set of configurations, you can finalize a version of the guardrail and deploy it with supported foundation models.
You can apply guardrails directly to FMs during the inference API invocation by specifying the guardrail ID and version. Then, when a guardrail is used, it will assess the input prompts and the FM completions against the established policies.
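As a sketch, with the Converse API used in the examples later in this post, attaching a guardrail is a matter of one extra parameter (the guardrail ID and version are placeholders; bedrock_runtime, MODEL_ID, and messages come from the SDK example below):

response = bedrock_runtime.converse(
    modelId=MODEL_ID,
    messages=messages,
    guardrailConfig={
        "guardrailIdentifier": "YOUR-GUARDRAIL-ID",  # placeholder
        "guardrailVersion": "1",
    },
)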
One important thing to note:
Before going into production with your Bedrock implementation, you must thoroughly test these guardrails. For example, guardrails currently only support English. You may need to test and fine-tune the guardrail configurations or perhaps evaluate other means to validate responses before they land on the end user’s screen.
How to use the Bedrock API
There are two ways to use the Bedrock API:
- With the AWS Command Line Interface (AWS CLI).
- With AWS SDKs.
Let us briefly look at each of them.
Using the AWS CLI
First, you need to configure the credentials. Here’s how.
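For example, you can set up a named profile with the aws configure command, which prompts for your access key ID, secret access key, default region, and output format:

aws configure --profile YOUR-PROFILE-NAME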
Once your credentials are in place, you can access the Bedrock operations from the command line.
- To see all available commands, use: aws bedrock help
- To list all available models, use: aws bedrock list-foundation-models --profile YOUR-PROFILE-NAME
An example of how to use AWS CLI
At the time of writing, Anthropic has just released the newest version of the Claude 3.5 model, so let's use that in our example. And since the UEFA Euro 2024 football championship is taking place right now, let’s have a bit of fun and ask Claude 3.5 which nations are its favorites to win the tournament.
aws bedrock-runtime converse \
--model-id anthropic.claude-3-5-sonnet-20240620-v1:0 \
--messages '{"role": "user", "content": [{"text": "Based on historical data and success in big tournaments what countries are the top-3 favorites to win Euro2024 Football championship?"}]}' \
--region us-east-1 \
--query output.message.content \
--profile YOUR-PROFILE-NAME
Answer:
[
{
"text": "Based on historical data, recent performances, and success in major tournaments, the top three favorites to win Euro 2024 are likely to be:\n\n1. France: As the current World Cup runners-up (2022) and Euro 2020 (played in 2021) round of 16 participants, France has consistently been one of the strongest teams in international football. They won the 2018 World Cup and have a deep pool of talent.\n\n2. England: Finalists in Euro 2020 and semi-finalists in the 2018 World Cup, England has shown significant improvement in recent years. They have a young, talented squad and have been performing well in major tournaments.\n\n3. Germany: Although they've had some disappointing performances in recent tournaments, Germany is always a strong contender, especially when playing at home. Euro 2024 will be held in Germany, which could give them an advantage. They have a history of success in major tournaments and are known for their ability to perform well in important competitions.\n\nIt's worth noting that other strong contenders could include Spain, Italy, Belgium, and Portugal. However, the actual favorites may change based on team performances, injuries, and other factors closer to the tournament."
}
]
So… I guess congrats, Germany!
Using the AWS SDK
Next, let us look at a short example built with the AWS SDK for Python (Boto3). (I recommend using venv to keep your system in order.)
An example of how to use AWS SDK
In this example, I want to analyze the graph below about total energy consumption and carbon dioxide emissions in Finland from 1990–2020.
Code:
import boto3
from botocore.exceptions import ClientError

MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # Note: exact model IDs and available regions change over time
IMAGE_NAME = "energy_FI.png"
profile_name = "YOUR-PROFILE-NAME"  # Set to None to use your default credentials instead

# Create the Bedrock runtime client, with a named profile if one is given.
if profile_name:
    session = boto3.Session(profile_name=profile_name)
    bedrock_runtime = session.client("bedrock-runtime", region_name="us-east-1")
else:
    bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Read the graph image as raw bytes.
with open(IMAGE_NAME, "rb") as f:
    image = f.read()

user_message = (
    "Analyze this picture and explain what is happening, when and why? "
    "Think step by step and explain trends and events which explain changes "
    "in the graph. Output in JSON."
)

# A single user turn containing both the image and the text prompt.
messages = [
    {
        "role": "user",
        "content": [
            {"image": {"format": "png", "source": {"bytes": image}}},
            {"text": user_message},
        ],
    }
]

try:
    response = bedrock_runtime.converse(
        modelId=MODEL_ID,
        messages=messages,
    )
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)
except ClientError as e:
    print(f"Bedrock request failed: {e}")
Answer:
{
"analysis": {
"type_of_graph": "Stacked bar chart with an overlaid line graph",
"time_period": "1990 to 2020",
"main_metrics": [
"Energy sources (TJ)",
"Carbon dioxide emissions (milj. t CO2)"
],
"energy_sources": [
"Fossiliset (Fossil fuels)",
"Turve (Peat)",
"Uusiutuvat (Renewables)",
"Ydinenergia (Nuclear energy)",
"Muut (Others)"
],
"trends": [
{
"trend": "Decreasing fossil fuel use",
"description": "The blue bar representing fossil fuels shows a general decline, especially pronounced from 2010 onwards.",
"possible_reason": "Increased focus on reducing carbon emissions and transitioning to cleaner energy sources."
},
{
"trend": "Increasing renewable energy use",
"description": "The light blue bar for renewables shows steady growth throughout the period.",
"possible_reason": "Government policies promoting renewable energy, technological advancements, and growing environmental awareness."
},
{
"trend": "Fluctuating nuclear energy use",
"description": "The pink bar for nuclear energy shows some variations but remains a significant part of the energy mix.",
"possible_reason": "Changes in nuclear policy, plant operations, or public sentiment towards nuclear energy."
},
{
"trend": "Declining CO2 emissions",
"description": "The black line representing CO2 emissions shows a peak around 2003-2004 and then a steady decline.",
"possible_reason": "Shift towards cleaner energy sources, improved energy efficiency, and implementation of climate policies."
}
],
"notable_events": [
{
"period": "Early 2000s",
"event": "Peak in total energy consumption and CO2 emissions",
"possible_explanation": "Economic growth leading to increased energy demand before efficiency measures and cleaner technologies were widely adopted."
},
{
"period": "2010 onwards",
"event": "Accelerated decline in fossil fuel use and CO2 emissions",
"possible_explanation": "Stricter environmental regulations, Paris Agreement (2015), and faster adoption of renewable technologies."
}
],
"overall_conclusion": "The graph shows a clear transition in Finland's energy mix from 1990 to 2020, with a shift away from fossil fuels towards renewable energy sources, resulting in a significant reduction in CO2 emissions, especially in the latter part of the period."
}
}
How the Bedrock pricing model works
The cost of using Bedrock is based on model inference and customization. In short, using the API accrues a cost calculated per token, and the price varies between models. Both input and output tokens are charged.
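As a back-of-the-envelope sketch, here is how a per-request estimate works. The prices below are made up for illustration; always check the Bedrock pricing page for current rates:

# Hypothetical on-demand cost estimate for a single request.
input_tokens = 2_000
output_tokens = 500
price_per_1k_input = 0.003   # USD per 1,000 input tokens (assumed)
price_per_1k_output = 0.015  # USD per 1,000 output tokens (assumed)

cost = (input_tokens / 1000) * price_per_1k_input + (output_tokens / 1000) * price_per_1k_output
print(f"~${cost:.4f} per request")  # ~$0.0135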
When it comes to pricing plans, you have two options: On-demand and provisioned throughput.
On-demand
I strongly recommend you use on-demand pricing until you have a clear picture of your usage. It’s a good option for testing, tuning, and POC setups.
Provisioned throughput
Consider using this pricing plan when you have a strong business case and know that provisioned throughput is a financially viable option. Use this option only if there is consistent and high demand because it is quite pricey.
Other notable pricing factors are the knowledge bases and guardrails. Unfortunately, AWS doesn’t yet have an official, easy-to-use price calculator, but you can read more about this on the AWS Bedrock pricing page.
Final words of advice
Boldly go where your company has not gone before, but remember to set cost alarms.
I strongly recommend logging in to the AWS Console and finding your way around the Bedrock playground section.
- Test different models in the playground.
- Test knowledge base performance and benefits with your uncategorized data.
- Think about and test concrete guardrails and combinations to provide safe responses.
- Get your hands dirty and start experimenting with AWS CLI and SDKs.
But when you do this, keep in mind that in cloud environments, almost everything accrues at least a small cost. So, unless cost is handled at an organizational level, remember to set your own billing alarms. Here’s how.
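As a sketch, a basic billing alarm can be created from the CLI. The threshold and the SNS topic ARN are placeholders, and note that AWS billing metrics live in us-east-1:

aws cloudwatch put-metric-alarm \
    --alarm-name "bedrock-budget-alarm" \
    --namespace "AWS/Billing" \
    --metric-name EstimatedCharges \
    --dimensions Name=Currency,Value=USD \
    --statistic Maximum \
    --period 21600 \
    --evaluation-periods 1 \
    --threshold 100 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:billing-alerts \
    --region us-east-1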
Try to come up with new and concrete solutions
Now that the “new era of computing” is here, try to think about existing challenges from a different angle. During your normal work day, keep your eyes and ears open for opportunities where AI solutions could solve everyday problems.
Also, go through your idea backlog or recently discarded ideas because there could well be an AI-assisted solution to make them happen now.
Bonus tip: Use Bedrock Studio
Amazon Bedrock Studio is a recently released preview feature of Bedrock. It makes it easy to build apps without needing a separate developer environment. Think of it as a user-friendly web interface where you can experiment with different models and features. For example, you could try out different prompts using the Anthropic Claude model without writing any code.
Bedrock Studio also makes it simple to add features like knowledge bases (for context-aware responses), guardrails (for responsible AI), and function calling (to access specific capabilities)—all without needing any coding expertise.
Published: Jul 1, 2024