Preparing for the AWS Certified Solutions Architect – Professional exam isn’t about memorising services—it’s about understanding when to use them and, more importantly, when not to. This guide focuses on:
- core serverless patterns,
- frequently tested streaming and IoT services,
- and the complex scenarios that appear in scenario-heavy questions.
The goal is to make the high-level architectural decisions stick.
1. Event-Driven Core (Lambda + API Gateway)
AWS Lambda: The Center of Serverless
Lambda is the “glue” of AWS. For the SAP-C02, you must understand how it behaves under pressure.
- Provisioned Concurrency: Essential for eliminating “cold starts” in latency-sensitive applications.
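- VPC Integration: A VPC-attached Lambda function has no internet access by default. To reach the public internet, run it in private subnets that route through a NAT Gateway in a public subnet. To reach public AWS APIs, use either that NAT Gateway or VPC endpoints.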
- Canary Deployments: Use Alias Traffic Shifting to send a small percentage of traffic (e.g., 5%) to a new version to ensure health before a full rollout.
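
The canary bullet above can be sketched as the request payload for Lambda's UpdateAlias operation. This is a minimal sketch: the function name `orders-api`, the alias `live`, and the version numbers are hypothetical, and the helper only builds the arguments rather than calling AWS.

```python
def canary_alias_update(function_name, alias, stable_version, canary_version, canary_weight):
    """Build keyword arguments for the Lambda UpdateAlias operation.

    The alias keeps pointing at stable_version, while canary_weight
    (a fraction between 0.0 and 1.0) of invocations goes to canary_version.
    """
    if not 0.0 <= canary_weight <= 1.0:
        raise ValueError("canary_weight must be between 0.0 and 1.0")
    return {
        "FunctionName": function_name,
        "Name": alias,
        "FunctionVersion": stable_version,
        "RoutingConfig": {
            "AdditionalVersionWeights": {canary_version: canary_weight}
        },
    }


# Hypothetical names: send 5% of traffic on alias "live" to version 2.
kwargs = canary_alias_update("orders-api", "live", "1", "2", 0.05)

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("lambda").update_alias(**kwargs)
```

Once the canary looks healthy, a full rollout would point `FunctionVersion` at the new version and drop the additional weight.
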
API Gateway: Safe Entry Points
- Endpoint Types: Choose Edge-Optimized for geographically distributed clients (routes via CloudFront) or Regional for clients in the same region.
- Integration: REST APIs allow direct AWS Service integration (e.g., to DynamoDB) without needing an intermediate Lambda function, reducing latency and cost.
| Feature | Description | Key Takeaway |
|---|---|---|
| API Types | HTTP APIs (lower latency and cost, fewer features) or REST APIs (more features, such as API keys and usage plans, request validation, AWS WAF, and caching). Both support Lambda authorizers. | Choose based on feature and latency needs. |
| Edge-Optimized | API Gateway endpoint deployed in a specified AWS Region but using CloudFront for faster access from geographically dispersed clients. | Best for global applications. |
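| Regional | Public API Gateway endpoint served directly from the AWS Region it is deployed in, without the API Gateway-managed CloudFront distribution. | Best for clients in the same AWS Region, or when you front the API with your own CloudFront distribution. |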
| Private | API Gateway endpoint accessible only from within a VPC using a VPC Endpoint. | Best for internal/private enterprise applications. |
| Caching | Reduces the number of calls to your backend and improves latency by caching responses. | Built-in, stage-level cache on REST APIs with a configurable TTL (300 seconds by default). |
| Security | Integrates with AWS WAF (Web Application Firewall) and supports various authentication methods (IAM, Cognito User Pools, Lambda Custom Authorizers). | Robust security features are built-in. |
| Throttling | Protects your backend from sudden traffic spikes by limiting the request rate. | Essential for backend stability. |
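
API Gateway's documentation describes its throttling as a token bucket, with a steady-state rate and a burst capacity. The sketch below is a simplified local model of that idea, not API Gateway's implementation; the rate and burst values are arbitrary, and time is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Simplified token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate, burst):
        self.rate = rate            # steady-state requests per second
        self.burst = burst          # maximum tokens the bucket can hold
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous request, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a throttled request would receive 429 Too Many Requests


bucket = TokenBucket(rate=2, burst=3)
burst_results = [bucket.allow(0.0) for _ in range(5)]  # five requests at t=0
later_result = bucket.allow(1.0)                       # one request a second later
```

In this model the first three simultaneous requests pass, the next two are throttled, and capacity returns as tokens refill.
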
2. Streaming & Real-Time Data
The exam frequently tests your ability to choose between control and ease of use.
Kinesis Data Streams
- Routes related records to the same record processor (as in streaming MapReduce), because records that share a partition key land on the same shard.
- Maintains record order within a shard, for example the order of log statements.
- Lets multiple applications consume the same stream concurrently.
- Lets consumers read records in the same order a few hours later.
- By default, the 2MB/second/shard output is shared between all of the applications consuming data from the stream.
- Use enhanced fan-out if you have multiple consumers retrieving data from a stream in parallel. Each consumer registered for enhanced fan-out receives its own 2MB/second pipe of read throughput per shard, and this throughput scales automatically with the number of shards in the stream. https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/
- Kinesis Data Streams cannot write its output directly to S3. In addition, KDS does not offer the plug-and-play Lambda transformation step that Firehose does: you would need custom code for a Lambda function to process the incoming stream and then reliably write the transformed output to S3.
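
The difference between shared and enhanced fan-out read throughput is easiest to see as arithmetic. This is a rough sketch that assumes shared consumers split the per-shard limit evenly, which is an idealisation; the shard and consumer counts are arbitrary.

```python
SHARD_READ_MB_PER_SEC = 2  # per-shard read limit quoted above


def read_mb_per_consumer(shards, consumers, enhanced_fan_out):
    """Approximate read throughput (MB/s) available to each consumer."""
    total = shards * SHARD_READ_MB_PER_SEC
    if enhanced_fan_out:
        # Each registered consumer gets its own 2 MB/s per shard.
        return float(total)
    # Shared consumers divide the same 2 MB/s per shard between them.
    return total / consumers


shared = read_mb_per_consumer(shards=4, consumers=4, enhanced_fan_out=False)
dedicated = read_mb_per_consumer(shards=4, consumers=4, enhanced_fan_out=True)
```

With four shards and four consumers, each shared consumer gets about 2 MB/s, while each enhanced fan-out consumer gets the full 8 MB/s.
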
Amazon Data Firehose
- Amazon Data Firehose (formerly Amazon Kinesis Data Firehose, abbreviated KDF here) is the easiest way to load streaming data into data stores and analytics tools.
- KDF is an extract, transform, and load (ETL) service that can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, Amazon OpenSearch Service (formerly Amazon Elasticsearch Service), and Splunk.
- When a Kinesis data stream is configured as the source of a Firehose delivery stream, Firehose's PutRecord and PutRecordBatch operations are disabled, and Kinesis Agent cannot write to the Firehose delivery stream directly. Data must instead be added to the Kinesis data stream through the Kinesis Data Streams PutRecord and PutRecords operations.
- Amazon Data Firehose can convert the format of your input data from JSON to Apache Parquet or Apache ORC before storing the data in Amazon S3.
- With format conversion enabled, Amazon S3 is the only destination that you can use for your Firehose stream.
- Dynamic partitioning enables you to continuously partition streaming data in Firehose by using keys within data (for example, customer_id or transaction_id) and then deliver the data grouped by these keys into corresponding Amazon Simple Storage Service (Amazon S3) prefixes.
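
The grouping behaviour of dynamic partitioning can be illustrated locally. This sketch only simulates the idea of keying records into S3 prefixes; it is not Firehose configuration, and the prefix layout `data/customer_id=<value>/` is a hypothetical choice.

```python
import json
from collections import defaultdict


def partition_records(raw_records, key, prefix_template="data/{key}={value}/"):
    """Group JSON records under an S3-style prefix derived from one field."""
    groups = defaultdict(list)
    for raw in raw_records:
        record = json.loads(raw)
        prefix = prefix_template.format(key=key, value=record[key])
        groups[prefix].append(record)
    return dict(groups)


# Three sample records, two of which share a customer_id.
stream = [
    '{"customer_id": "c1", "amount": 10}',
    '{"customer_id": "c2", "amount": 7}',
    '{"customer_id": "c1", "amount": 3}',
]
grouped = partition_records(stream, "customer_id")
```

Both `c1` records end up under the same prefix, which is what lets query engines prune by partition later.
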
| Service | Purpose | Key Feature / Use Case |
|---|---|---|
| Kinesis Data Streams (KDS) | Captures, processes, and stores data streams. | In provisioned mode you size the number of shards for throughput yourself; on-demand mode scales capacity automatically. Use Enhanced Fan-Out for multiple, low-latency consumers. |
| Amazon Data Firehose (KDF, formerly Kinesis Data Firehose) | Loads streaming data into data stores like S3, Redshift, and OpenSearch Service. | Fully managed, auto-scaling, and requires no shard management. Supports data transformation via Lambda before delivery. |
| Kinesis Data Analytics (KDA) | Processes and analyzes streaming data using standard SQL or Apache Flink (the Flink offering is now named Amazon Managed Service for Apache Flink). | Real-time dashboards and anomaly detection on data streams. |
| Kinesis Video Streams (KVS) | Securely stream video from devices to AWS. | Used for video feeds, like surveillance or mobile cameras. |
3. IoT Systems: High-Volume Ingestion
IoT questions focus on handling massive scale and maintaining state.
| Component | Description | Key Feature to Remember |
|---|---|---|
| AWS IoT Core | Connects devices to the cloud, allowing them to send and receive data securely. | Supports billions of devices and trillions of messages. |
| Device Gateway | Manages device connection and disconnection, supporting protocols like MQTT, WebSockets, and HTTPS. | Handles connectivity for massive scale. |
| Authentication and Authorization | Uses X.509 certificates and AWS IAM for authentication and authorization. | Secure by design. |
| Registry | Organizes and tracks devices; uses metadata (name, description, etc.). | Device management and metadata storage. |
| Device Shadow (Digital Twin) | A persistent, virtual version (JSON document) of a device that stores the device’s latest reported state and desired future state. | Allows apps and services to interact with a device even when it’s offline. |
| Rules Engine | Processes messages, transforms them, and routes them to other AWS services (e.g., Lambda, Kinesis, S3). | Event processing and routing. |
| AWS IoT Device SDK | Simplifies the connection of devices to AWS IoT Core. | Simplifies device-side integration. |
| AWS IoT Greengrass | An open-source Internet of Things (IoT) edge runtime and cloud service that helps you build, deploy, and manage IoT applications on your devices. | Lets you run ML inference on edge devices, on locally generated data, using cloud-trained models. |
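
The Device Shadow row is worth a concrete picture: the shadow document holds `desired` and `reported` state, and the difference between them is what a device still has to apply. The sketch below computes that difference for flat, top-level keys only, which is a simplifying assumption; the `power` and `temp` fields are hypothetical.

```python
def shadow_delta(desired, reported):
    """Return the desired values the device has not yet reported (flat keys only)."""
    return {
        key: value
        for key, value in desired.items()
        if key not in reported or reported[key] != value
    }


# A shadow document for an offline device: the app wants the power on.
shadow = {
    "state": {
        "desired": {"power": "on", "temp": 21},
        "reported": {"power": "off", "temp": 21},
    }
}
delta = shadow_delta(shadow["state"]["desired"], shadow["state"]["reported"])
```

When the device reconnects, it can apply the delta and then report the new state, which brings the two sections back in sync.
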

4. Serverless Containers (Fargate)
Fargate is the middle ground between the simplicity of Lambda and the control of EC2.
- Secrets Management: Never hardcode credentials. Instruct Fargate tasks to pull secrets from AWS Secrets Manager. This requires an ECS Task Execution Role with permissions to fetch the secret.
- Networking: Fargate only supports the awsvpc network mode, giving every task its own Elastic Network Interface (ENI) and private IP.
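
Both Fargate points show up in the task definition itself. The following sketch builds a task definition in the shape accepted by the ECS RegisterTaskDefinition operation; the family name, image, and ARNs are hypothetical placeholders, and the CPU and memory sizes are just one valid Fargate combination.

```python
def fargate_task_definition(family, image, execution_role_arn, env_name, secret_arn):
    """Build a minimal Fargate task definition that injects one secret."""
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # the only mode Fargate supports
        "cpu": "256",
        "memory": "512",
        # The execution role is what ECS uses to fetch the secret at start-up.
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                # Exposed to the container as an environment variable.
                "secrets": [{"name": env_name, "valueFrom": secret_arn}],
            }
        ],
    }


# Hypothetical account ID, role, and secret names for illustration only.
task_def = fargate_task_definition(
    family="orders-worker",
    image="example.com/orders-worker:latest",
    execution_role_arn="arn:aws:iam::123456789012:role/ordersExecutionRole",
    env_name="DB_PASSWORD",
    secret_arn="arn:aws:secretsmanager:eu-west-1:123456789012:secret:orders/db",
)
```

The credential never appears in the definition, only a reference to it, so rotating the secret does not require a new image.
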
5. SQS
- SQS FIFO only supports a maximum of 10 messages per operation in batch mode. https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html
- When you batch the maximum of 10 messages per operation, FIFO queues can support up to 3,000 messages per second (300 operations per second × 10). A workload that needs 2,400 messages per second therefore needs at least 8 messages per operation (300 × 8 = 2,400).
- Amazon S3 can send event notification messages to Amazon Simple Notification Service (Amazon SNS) topics, Amazon Simple Queue Service (Amazon SQS) queues, AWS Lambda functions, and Amazon EventBridge. For SNS, SQS, and Lambda, you specify the Amazon Resource Name (ARN) of the destination in the notification configuration; EventBridge delivery is simply enabled on the bucket.
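
The FIFO batching arithmetic in this section can be checked with a couple of helper functions. This is a minimal sketch that uses the 300 operations per second and 10 messages per batch figures quoted above, and ignores any other queue modes.

```python
FIFO_OPERATIONS_PER_SEC = 300  # API operations per second, as quoted above
MAX_BATCH_SIZE = 10            # messages per batch operation


def fifo_throughput(batch_size):
    """Messages per second a FIFO queue supports at a given batch size."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 10")
    return FIFO_OPERATIONS_PER_SEC * batch_size


def min_batch_size(target_messages_per_sec):
    """Smallest batch size that reaches the target rate."""
    # Integer ceiling division avoids floating-point rounding.
    needed = -(-target_messages_per_sec // FIFO_OPERATIONS_PER_SEC)
    if needed > MAX_BATCH_SIZE:
        raise ValueError("target exceeds what batching alone can deliver")
    return max(needed, 1)


max_rate = fifo_throughput(10)       # 3000
needed_batch = min_batch_size(2400)  # 8
```

Exam scenarios often hide this calculation behind a target rate, so it helps to read "2,400 messages per second" as "batches of 8".
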
6. Advanced & “Gotcha” Services
Multi-Account Governance
- CloudFormation StackSets: Use to provision a common set of resources across multiple accounts and regions with a single template.
- AWS Service Catalog: Allows organizations to create a “self-service” portal of approved IT services. Administrators can use CloudFormation templates to ensure users only deploy pre-configured, compliant resources (like SageMaker notebooks with mandatory KMS encryption).
Performance & Troubleshooting
- AWS X-Ray: Use this to visualise application components and identify bottlenecks. Enable Active Tracing in Lambda to trace requests that lack a tracing header from upstream services.
- API Gateway Errors:
- 429 (Too Many Requests): The request was throttled; usually resolved by retrying with backoff.
- 504 (Gateway Timeout): Indicates the backend (like Lambda) took too long; confirm the operation is idempotent before retrying.
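
The retry guidance for 429 and 504 can be expressed as a small client-side policy. This sketch uses exponential backoff with full jitter, which is one common choice rather than anything API Gateway mandates; the base delay and cap are arbitrary assumptions.

```python
import random


def should_retry(status_code, idempotent):
    """Decide whether a failed API call is safe to retry."""
    if status_code == 429:
        return True        # throttled: the request was never processed
    if status_code == 504:
        return idempotent  # backend may still have done the work
    return False


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random):
    """Delay in seconds before a zero-based retry attempt (full jitter)."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


retry_throttled = should_retry(429, idempotent=False)
retry_timeout_unsafe = should_retry(504, idempotent=False)
retry_timeout_safe = should_retry(504, idempotent=True)
delay = backoff_delay(3)  # somewhere between 0 and 0.8 seconds
```

The jitter spreads retries from many clients over time, so a throttled API is not hit by a synchronised second wave.
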
7. Exam Cheat Sheet
- Lambda + DLQ: Configure a dead-letter queue (or an on-failure destination) to catch asynchronous events that still fail after Lambda's retries.
- Firehose + S3: Use a 300-second buffer for cost-effective batching.
- Simple AD: Often an “incorrect” distractor; it lacks MFA and trust relationships.
- CloudFront + ACM: Certificates for CloudFront must be requested in us-east-1.
- SQS FIFO: Max 3,000 messages/sec with batching (10 messages per operation).
What’s Next?
We have now covered the core architectural pillars of the SAP-C02, moving from foundational networking and security into the high-velocity world of Serverless, IoT, and Governance. You should now be comfortable choosing between the operational simplicity of Firehose and the control of Kinesis Data Streams, or leveraging Service Catalog to enforce compliance across a multi-account organization.
The final post in this series will be the full Exam Cheat Sheet, covering material not touched on so far. We will shift away from deep architectural dives and into a rapid-fire summary of “Must-Know” technical details and “Gotchas”: the specific configurations, limits, and service behaviours you need to memorise 24 hours before you walk into the testing centre.
Catch up on the series:
- Last-Minute Revision for the Solutions Architect – Professional exam – Introduction
- AWS Organizations — Centralised management and SCP guardrails.
- Security Policies and Encryption — ACM, KMS, IAM boundaries, and CloudHSM.
- Data Storage — High-performance S3 patterns and RDS Multi-AZ strategies.
- Networking — Hybrid connectivity, VPC-sharing with RAM, and WAF protection.
- Serverless, IoT, and Governance — This Post
- The Final Sprint — Miscellaneous cheat sheet.
