Confusing topics in the AWS Solutions Architect Associate exam

SAA-C02

Shi
CI/CD/DevOps
15 min read · Jan 30, 2022


Standard vs Convertible RI (Reserved Instance)

* Standard RIs are cheaper.
* Convertible RIs can be exchanged for a different configuration, but can't be sold on the Reserved Instance Marketplace.
* Standard RIs can't be exchanged, but can be sold on the Reserved Instance Marketplace.
* An On-Demand Capacity Reservation requires no commitment and offers no discount.
* Regional Reserved Instances don't reserve capacity.

EC2

* An ENI stays attached even if you stop your EC2 instance.
* An Elastic IP address remains associated with your instance even after you stop it.
* The underlying host may change after a stop and start (a reboot keeps the instance on the same host).
* An instance store-backed instance can only be rebooted or terminated; so if an instance can be stopped and started, it is EBS-backed.

Egress-only Internet gateway vs NAT Gateway

A NAT instance/gateway applies only to IPv4 traffic. An egress-only internet gateway plays the equivalent outbound-only role for IPv6 traffic.

ENA vs EFA

OS-bypass capabilities of the Elastic Fabric Adapter (EFA) are not supported on Windows instances.

AWS DataSync vs AWS storage Gateway

DataSync:
* automation, acceleration, and support for a variety of AWS storage services
* commonly used for migration/replication
Storage Gateway:
* supports a few AWS storage services
* for integration of on-premises and hybrid cloud storage; uses a local cache for acceleration

Volume gateway vs file gateway

* Storage Gateway volume gateway is used to replace block storage; it uses the iSCSI protocol.
* Storage Gateway file gateway is used to replace NFS storage; it uses the NFS and SMB protocols.

Interface endpoint vs gateway endpoint


- Interface endpoints
An interface endpoint is an elastic network interface with a private IP address in your subnet that connects VPC resources to a number of AWS services, such as CloudFormation, Elastic Load Balancing (ELB), SNS, and more.
- Gateway endpoints
In contrast, a gateway endpoint is a target for a route in a route table to connect VPC resources to S3 or DynamoDB.

VPC endpoint vs VPC peering

Can Amazon EC2 instances within a VPC in one region communicate with Amazon EC2 instances within a VPC in another region?

Yes. Instances in one region can communicate with instances in another using inter-Region VPC peering, public IP addresses, NAT gateways, NAT instances, VPN connections, or Direct Connect connections.

A VPC endpoint, by contrast, is region-specific by default.

S3 IA, One Zone IA, S3 Glacier

IA
* S3 IA has the same low latency and high throughput as S3 Standard; however, it charges a retrieval fee per GB.
One Zone IA
* One Zone IA is cheaper than S3 IA;
* it is cheaper because the data is stored in only one AZ (possible data loss if that zone is lost);
* if the data is easily reproducible (so data loss is acceptable), choose One Zone IA.
Glacier
* if data is accessed every week, Glacier is not a suitable place to archive it.
* if you need something once or twice a year, think about Glacier.
Storage duration and transitions:
- S3 Standard and S3 Intelligent-Tiering don't have a minimum storage duration; so for very short-term storage, choose them.
- An object has to stay in S3 Standard for at least 30 days before a lifecycle transition to S3 IA or S3 One Zone IA.
- S3 CRR (cross-region replication) requires versioning to be enabled on both the source and destination buckets.
Expedited retrieval:
- Glacier Deep Archive doesn't support expedited retrieval, so it takes at least a few hours to retrieve your data.
- If a retrieval time of up to a week is acceptable, the cheapest storage option is Glacier Deep Archive.
- For Glacier, retrieval takes up to 5 minutes in expedited mode (provisioned capacity is recommended), up to 5 hours for standard retrieval, and up to 12 hours for bulk.
- Provisioned capacity ensures that your retrieval capacity for expedited retrievals is available when you need it.
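The 30-day rule for transitions to the IA classes is easy to get wrong in a lifecycle rule. The sketch below is a minimal, offline check under stated assumptions: the dict mirrors the shape accepted by boto3's `put_bucket_lifecycle_configuration`, while the rule ID, the prefix, and the helper `lifecycle_problems` are hypothetical names made up for this example.

```python
# Hypothetical lifecycle configuration (rule ID and prefix are invented).
LIFECYCLE = {
    "Rules": [
        {
            "ID": "archive-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}

# Classes that require the object to spend 30 days in S3 Standard first.
MIN_30_DAY_CLASSES = {"STANDARD_IA", "ONEZONE_IA"}


def lifecycle_problems(config):
    """Return a list of messages for transitions that happen too early."""
    problems = []
    for rule in config["Rules"]:
        for transition in rule.get("Transitions", []):
            storage_class = transition["StorageClass"]
            days = transition["Days"]
            if storage_class in MIN_30_DAY_CLASSES and days < 30:
                problems.append(
                    f'{rule["ID"]}: {storage_class} after {days} days is too early'
                )
    return problems
```

Calling `lifecycle_problems(LIFECYCLE)` on the configuration above returns an empty list; a rule that moves objects to `ONEZONE_IA` after 7 days would be reported. Transitions to Glacier are deliberately not checked, since the 30-day rule does not apply to them.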

RDS multi-AZ vs read replica

* Multi-AZ is for HA and DR, not for performance;
* Read replica is for performance;
* Multi-AZ fails over automatically by repointing the DB endpoint's CNAME to the standby.
* Promoting a read replica requires a manual switch.
* Multi-AZ is within a single region;
* Multi-AZ replication is synchronous.
* Read replica replication is asynchronous.
* Aurora Global Database provides HA against a regional failure.
Maintenance for Multi-AZ deployments: running a DB instance as a Multi-AZ deployment can further reduce the impact of a maintenance event, because Amazon RDS applies operating system updates by following these steps:
- Perform maintenance on the standby.
- Promote the standby to primary.
- Perform maintenance on the old primary, which becomes the new standby.
When you upgrade the database engine version for your DB instance in a Multi-AZ deployment, Amazon RDS upgrades both the primary and secondary DB instances at the same time. In this case, the database engine for the entire Multi-AZ deployment is shut down during the upgrade.
* An Aurora Replica is a read-only replica; it does not accept writes.
* Aurora multi-master scales out write performance across multiple AZs and provides read-after-write consistency.

Redshift vs the rest (RDS/DynamoDB)

* Redshift is a Columnar Database.
* RedShift = analytics.
- Traditional databases are row-oriented databases that store data by row. The fields for each record are sequentially stored in a long row. For example, “Customer 1: name, address, date of birth, etc.” Then all the information for Customer 2 appears in a new row.
- In a columnar, or column-oriented, database, the data is stored such that each value of a column sits next to the other values of that same column.
* Amazon RDS is just a "managed" service and not "fully managed".
* Amazon DynamoDB is a fully managed non-relational database service; you simply create a database table, set your target utilization for Auto Scaling, and let the service handle the rest.
* Even with an IAM role for EC2, you still need to configure IAM DB authentication for RDS.
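To make the row-oriented versus column-oriented distinction concrete, here is a toy illustration in plain Python. It is only a sketch of the two layouts, not how Redshift or RDS store data internally; the sample customers and the function names are invented for the example.

```python
# Three made-up customer records.
rows = [
    {"name": "Ann", "city": "Oslo", "spend": 120},
    {"name": "Bob", "city": "Lima", "spend": 80},
    {"name": "Cy", "city": "Oslo", "spend": 200},
]

# Row-oriented layout: all fields of one record sit together.
row_store = [(r["name"], r["city"], r["spend"]) for r in rows]

# Column-oriented layout: all values of one column sit together.
column_store = {key: [r[key] for r in rows] for key in rows[0]}


def total_spend_row(store):
    """Analytics on a row store walks every full record to read one field."""
    return sum(record[2] for record in store)


def total_spend_column(store):
    """Analytics on a column store reads only the column it needs."""
    return sum(store["spend"])
```

Both functions return the same total, but the columnar version touches only the `spend` values, which is why column stores suit analytics that aggregate a few columns over many rows.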

EBS vs FSx

EBS is block storage used primarily with EC2.
FSx is a managed file system.
When you create an encrypted EBS volume and attach it to a supported instance type, the following types of data are encrypted:
- Data at rest inside the volume
- All data moving between the volume and the instance
- All snapshots created from the volume
- All volumes created from those snapshots

AWS backup vs AWS DLM (data lifecycle management)

AWS Backup is a centralized backup service that covers many AWS services (for example EBS, RDS, DynamoDB, and EFS).
DLM was created for EBS snapshots, and was later extended to create EBS-backed AMIs.

S3, EBS, EFS, instance store

* S3 is for storage only.
* EBS is usually attached to EC2 as persistent storage.
* EFS is shared storage across multiple EC2.
* Instance store is fast and ephemeral storage for EC2.
* EBS doesn't support SMB protocol.
* EFS supports only Linux workloads.
* Amazon FSx for Lustre is a high-performance, parallel file system for fast processing of workloads.
* Amazon FSx for Lustre doesn't support Windows-based applications or Windows servers.
* Only SSD-backed volumes can be used as a boot volume; HDD-backed st1 and sc1 volumes can't.

SSD vs HDD

HDD
- HDD is good for infrequent, large, sequential I/O operations and is cost effective.
- Cold HDD (sc1) volumes provide low-cost magnetic storage that defines performance in terms of throughput rather than IOPS, which makes them a good fit for large, sequential cold-data workloads.
- Throughput Optimized HDD (st1) is optimized for frequently accessed workloads involving large, sequential I/O, such as MapReduce, Kafka, log processing, data warehouse, and ETL workloads.
SSD
- SSD is good for frequent, small, random I/O operations.
- General Purpose SSD (gp2) suits workloads that perform small, random I/O, including databases such as MongoDB, Oracle, MySQL, and many others; General Purpose SSD (gp2/gp3) has a limit of 16,000 IOPS.
- Provisioned IOPS SSD (io1) volumes are designed to meet the needs of I/O-intensive workloads, particularly database workloads, that are sensitive to storage performance and consistency. The maximum ratio of provisioned IOPS to requested volume size (in GiB) is 50:1.

- Transaction-intensive applications are sensitive to increased I/O latency and are well-suited for SSD-backed io1 and gp2 volumes. You can maintain high IOPS while keeping latency down by maintaining a low queue length and a high number of IOPS available to the volume. Consistently driving more IOPS to a volume than it has available can cause increased I/O latency.
- Throughput-intensive applications are less sensitive to increased I/O latency, and are well-suited for HDD-backed st1 and sc1 volumes. You can maintain high throughput to HDD-backed volumes by maintaining a high queue length when performing large, sequential I/O.
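The 50:1 ratio for io1 is worth working through once, since exam questions often give a volume size and ask how many IOPS can be provisioned. This is a small arithmetic sketch; the 64,000 IOPS ceiling is the io1 figure quoted in the "Default values" section of this article, and the function names are made up for illustration.

```python
import math

IO1_IOPS_PER_GIB = 50    # the 50:1 ratio of provisioned IOPS to volume size
IO1_MAX_IOPS = 64_000    # per-volume ceiling for io1 quoted in this article


def max_io1_iops(size_gib: int) -> int:
    """Largest IOPS value that can be provisioned on an io1 volume of this size."""
    return min(size_gib * IO1_IOPS_PER_GIB, IO1_MAX_IOPS)


def min_io1_size_gib(target_iops: int) -> int:
    """Smallest io1 volume size (GiB) that allows the requested IOPS."""
    if target_iops > IO1_MAX_IOPS:
        raise ValueError("io1 cannot exceed 64,000 IOPS per volume")
    return math.ceil(target_iops / IO1_IOPS_PER_GIB)
```

For example, a 100 GiB io1 volume can be provisioned with at most 5,000 IOPS, and reaching 10,000 IOPS needs a volume of at least 200 GiB.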

NLB vs Global Accelerator

We can use Global Accelerator + endpoint group, or NLB + Lambda, to give an ALB a fixed entry point (static IP).
Global Accelerator can do this for multiple ALBs across multiple AWS regions, while NLB + Lambda does not support multiple regions.

ALB vs NLB vs CLB

* ALB supports only HTTP/HTTPS; NLB supports TCP/UDP/TLS.
* ALB doesn't support an Elastic IP (EIP).
* WAF doesn't work with CLB.
* AWS WAF also allows you to create a rate-based rule to stop brute-force HTTP flood attacks.

What is SNI (Server Name Indication)

SNI is somewhat like mailing a package to an apartment building instead of to a house. When mailing something to someone's house, the street address alone is enough to get the package to the right person. But when a package goes to an apartment building, it needs the apartment number in addition to the street address; otherwise, the package might not go to the right person or might not be delivered at all.

When multiple websites are hosted on one server and share a single IP address, and each website has its own SSL certificate, the server may not know which SSL certificate to show when a client device tries to securely connect to one of the websites. This is because the SSL/TLS handshake occurs before the client device indicates over HTTP which website it's connecting to.

Server Name Indication (SNI) is designed to solve this problem. SNI is an extension for the TLS protocol (formerly known as the SSL protocol), which is used in HTTPS. It's included in the TLS/SSL handshake process in order to ensure that client devices are able to see the correct SSL certificate for the website they are trying to reach. The extension makes it possible to specify the hostname, or domain name, of the website during the TLS handshake, instead of when the HTTP connection opens after the handshake.
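The certificate choice that SNI enables can be sketched as a simple lookup keyed on the hostname the client sends in the handshake. This is a toy model only: it does not perform TLS, it is not how any particular load balancer selects certificates, and the certificate names and the function `pick_certificate` are hypothetical.

```python
def pick_certificate(sni_hostname, certificates, default_cert):
    """Pick a certificate for the hostname sent in the TLS ClientHello."""
    host = sni_hostname.lower().rstrip(".")
    if host in certificates:                  # an exact match wins
        return certificates[host]
    _, _, parent = host.partition(".")
    wildcard = "*." + parent
    if parent and wildcard in certificates:   # one-label wildcard match
        return certificates[wildcard]
    return default_cert                       # no match: fall back to a default


# Hypothetical certificates for several sites sharing one IP address.
CERTS = {
    "shop.example.com": "cert-shop",
    "*.example.org": "cert-org-wildcard",
}
```

Without SNI the server sees only the IP address (the "street address"), so it could only ever return `default_cert`; with SNI it gets the hostname (the "apartment number") early enough to choose.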

serving multiple domain SSL traffic behind load balancer

* Classic Load Balancer does not support Server Name Indication (SNI).
* SNI Custom SSL relies on the SNI extension of the Transport Layer Security protocol, which allows multiple domains to serve SSL traffic over the same IP address by including the hostname viewers are trying to connect to.
* Application Load Balancers support multiple TLS certificates with smart selection using SNI.
To add multiple certificates for different domains to a load balancer, do one of the following:
- Use a Subject Alternative Name (SAN) certificate to validate multiple domains behind the load balancer, including wildcard domains, with AWS Certificate Manager (ACM).
- Use either an Application Load Balancer (ALB) or Network Load Balancer (NLB), which supports multiple certificates and smart certificate selection using Server Name Indication (SNI).

RDS vs DynamoDB

* RDS is a relational database for structured data.
* DynamoDB is a NoSQL database for semi-structured, unstructured, and key-value data.

RDS event vs DynamoDB stream

* RDS events only cover operational events such as DB instance events, DB parameter group events, DB security group events, and DB snapshot events. They do not cover data-modifying events (INSERT, DELETE, UPDATE); to capture those, use native functions or stored procedures.
* DynamoDB Streams captures item-level changes (inserts, updates, deletes) in a table.
* You have to enable DynamoDB Streams on a table before you can use it.

Aurora vs RDS

- RDS is an AWS managed database, while Aurora is a fully managed database.
- Aurora supports autoscaling to handle database growth, and RDS doesn't.
- Aurora has a replication latency of milliseconds, while RDS typically has a latency of seconds.

Aurora vs Aurora Serverless

Aurora is for predictable and stable workloads.
Aurora Serverless is for intermittent, sporadic and unpredictable demand.

S3 TA and Multipart

* Multipart upload is for uploading a large single file (>100 MB);
* TA (Transfer Acceleration) uses AWS edge locations to speed up long-distance uploads.
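Multipart upload works by cutting one large object into numbered parts that are uploaded independently. The sketch below only plans the byte ranges and does not call S3. The 5 MiB minimum part size and the 10,000-part limit are S3's multipart limits; the 100 MiB default part size is an assumption chosen to match the threshold above, and `plan_parts` is a hypothetical helper.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum part size (the last part may be smaller)
MAX_PARTS = 10_000        # S3 maximum number of parts in one upload


def plan_parts(file_size, part_size=100 * MIB):
    """Return (part_number, offset, length) tuples covering the whole file."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size must be at least 5 MiB")
    parts = []
    offset = 0
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((len(parts) + 1, offset, length))  # part numbers start at 1
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part size")
    return parts
```

A 250 MiB file with the default part size is planned as three parts: two of 100 MiB and a final one of 50 MiB. Each part can then be uploaded (and retried) on its own, which is where the speed and resilience benefits come from.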

Status check

- System status checks (StatusCheckFailed_System) detect problems with your instance that require AWS involvement to repair.
- Instance status checks (StatusCheckFailed_Instance) detect problems that require your involvement to repair.

Cloudwatch and Cloudwatch agent

* By default (without the agent), CloudWatch monitors CPU, network, and disk.
* The CloudWatch agent can also monitor memory utilisation.

Cloudwatch agent vs enhanced monitoring

Enhanced monitoring is for RDS only, not for EC2s.

Cloudwatch enhanced monitoring of DB

Regular metrics monitoring:
* CPU
* Database connections
* Freeable memory
Enhanced monitoring provides metrics about
* RDS child process
* RDS process
* OS process

Cloudfront vs Global Accelerator

* CloudFront uses Edge Locations to cache content while Global Accelerator uses Edge Locations to find an optimal pathway to the nearest regional endpoint.
* CloudFront is designed to handle the HTTP protocol, while Global Accelerator works well for both HTTP and non-HTTP protocols such as TCP and UDP.
Global Accelerator
* intelligent routing to achieve lowest latency
* 2 IP addresses to whitelist
* fast cross-regional failover
* static anycast IP addresses

Cloudfront vs Lambda@Edge

Lambda@Edge lets you run Node.js and Python Lambda functions to customize content that CloudFront delivers, executing the functions in AWS locations closer to the viewer.

Cloudfront vs Route53

* CloudFront geo-restriction feature is primarily used to prevent users in specific geographic locations from accessing content that you’re distributing through a CloudFront web distribution.
* It does not let you choose the resources that serve your traffic based on the geographic location of your users, unlike the Geolocation routing policy in Route 53.

Route53 A record vs CNAME record

An A record is the actual record. The name is resolved to the corresponding IP address.

Amazon Route 53 alias records are mapped internally to the DNS name of alias targets such as AWS resources. Route 53 supports the alias resource record set, which lets you map your zone apex (e.g. tutorialsdojo.com) DNS name to your load balancer DNS name. IP addresses associated with Elastic Load Balancing can change at any time due to scaling or software updates. Route 53 responds to each request for an alias resource record set with one IP address for the load balancer.

CNAME records (short for Canonical Name) map your hostname to another hostname.
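The practical consequence of the zone apex point is that a CNAME cannot be created at the apex of a zone, whereas an A record or a Route 53 alias can. A minimal sketch of that rule follows; the function name is hypothetical, and "ALIAS" here is shorthand for a Route 53 alias A record rather than a separate DNS record type.

```python
def allowed_record_types(record_name: str, zone_name: str) -> set:
    """Which of 'A', 'ALIAS', 'CNAME' may be created for record_name in the zone."""
    name = record_name.rstrip(".").lower()
    zone = zone_name.rstrip(".").lower()
    if name == zone:
        # Zone apex: DNS does not allow a CNAME alongside the apex records.
        return {"A", "ALIAS"}
    if name.endswith("." + zone):
        # Any subdomain can use all three options.
        return {"A", "ALIAS", "CNAME"}
    raise ValueError(f"{record_name} is not inside zone {zone_name}")
```

So `tutorialsdojo.com` itself can point at a load balancer only through an alias (or a plain A record with fixed IPs), while `www.tutorialsdojo.com` could also use a CNAME.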

signed URL vs signed cookie

* signed URL is for individual file;
* signed cookie can be used for multiple files and it doesn't require changes to the URL string.

S3 SELECT vs Athena

* S3 Select allows you to analyze and process data within a single object in an S3 bucket.
* S3 Select addresses that object by bucket name and object key.
* Athena is a serverless service that analyzes massive amounts of data, across any number of objects; behind the scenes, Athena is built on Presto.
* S3 Select and Athena are query services, not storage services (S3, DynamoDB, Redshift).

Kinesis Data Stream (KDS) vs SQS (FIFO)

* KDS is for real-time streaming while SQS is a messaging service, so KDS suits high-throughput, real-time processing.
* SQS is fully managed, while KDS requires manual configuration of shards.
* SQS retains messages for a maximum of 14 days; KDS retains records for 24 hours by default, which can be extended (up to 365 days).
* Both preserve ordering (KDS within a shard, SQS FIFO within a message group) and deliver each record at least once.

SQS long polling vs short polling

While the regular short polling returns immediately, even if the message queue being polled is empty, long polling doesn’t return a response until a message arrives in the message queue, or the long poll times out.
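The difference can be simulated locally with Python's standard `queue` module, without touching SQS. This is only an analogy: in real SQS, long polling is enabled through the `WaitTimeSeconds` setting of a receive request (up to 20 seconds), and the function names below are invented for the example.

```python
import queue


def short_poll(q):
    """Return a message if one is already there, otherwise None right away."""
    try:
        return q.get_nowait()
    except queue.Empty:
        return None


def long_poll(q, wait_seconds):
    """Wait up to wait_seconds for a message; return None if the wait times out."""
    try:
        return q.get(timeout=wait_seconds)
    except queue.Empty:
        return None
```

If a producer puts a message on the queue 0.2 seconds from now, `short_poll` called immediately returns `None` (an empty response that still costs a request), while `long_poll(q, 2)` blocks briefly and returns the message. That is why long polling reduces empty responses and cost.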

Kinesis Data Stream (KDS) vs Kinesis Firehose

- Both are fully managed services;
- KDS requires the user to set up the shards to process the data;
- KDS covers both the producer and the consumer end;
- Firehose is about the producer only;
- Firehose feeds to S3/ElasticSearch/RedShift/Splunk (mnemonic: SERS), not to Lambda as a destination.
- Kinesis Data Firehose can invoke your Lambda function to transform incoming source data and deliver the transformed data to destinations.
- KDS can use Lambda to process data streams.
- KDS is real-time.
- Firehose is near real-time.
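Because KDS leaves shard sizing to you, it helps to see the arithmetic. The sketch below assumes the per-shard limits for provisioned streams as documented by AWS at the time of writing (writes: 1 MB/s or 1,000 records/s; reads: 2 MB/s); these numbers are not from this article, and `shards_needed` is a hypothetical helper.

```python
import math

WRITE_MB_PER_SHARD = 1.0        # assumed ingest limit per shard, MB/s
WRITE_RECORDS_PER_SHARD = 1000  # assumed ingest limit per shard, records/s
READ_MB_PER_SHARD = 2.0         # assumed read limit per shard, MB/s


def shards_needed(write_mb_s, write_records_s, read_mb_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_s / WRITE_MB_PER_SHARD),
        math.ceil(write_records_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_s / READ_MB_PER_SHARD),
    )
```

For a workload writing 3.5 MB/s as 2,500 records/s and reading 10 MB/s, the three limits ask for 4, 3, and 5 shards respectively, so 5 shards are needed. Firehose has no equivalent step, which is the point of the comparison above.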

IAM, policy

- AWS ECS doesn't support resource-based policies.
- AWS Config can be used to audit current passwords; however, it can't enforce a policy at the time of password creation.
- The IAM password policy enforces rules at the time of password creation; however, it can't be used to audit existing passwords.

AWS Config vs AWS CloudTrail

Config reports on what has changed, whereas CloudTrail reports on who made the change, when, and from which location. Config is focused on the configuration of your AWS resources and reports with detailed snapshots on how your resources have changed. CloudTrail focuses on the events, or API calls, that drive those changes. It focuses on the user, application, and activity performed on the system.

security group vs network ACL

Stateful or stateless:
A security group is stateful and a network ACL is stateless, which means that for a network ACL you usually need to define both inbound and outbound rules, while for a security group one direction is sufficient.
Rules:
A security group only supports allow rules; a network ACL has both allow and deny rules.
Rule processing sequence:
All rules in a security group are evaluated and applied, whereas network ACL rules are applied in number order (starting with the lowest number) until a match is found.
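The difference in rule processing can be modelled in a few lines. This is a deliberately simplified sketch that matches on port only (real rules also match protocol, CIDR range, and direction); the class and function names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class AclRule:
    number: int   # rule number; lower numbers are evaluated first
    port: int
    action: str   # "allow" or "deny"


def nacl_decision(rules, port):
    """Network ACL: the first matching rule in number order decides."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port:
            return rule.action
    return "deny"  # the implicit final rule denies anything unmatched


def sg_decision(allowed_ports, port):
    """Security group: allow-only rules, all considered; any match allows."""
    return "allow" if port in allowed_ports else "deny"


RULES = [
    AclRule(200, 22, "allow"),
    AclRule(100, 22, "deny"),   # lower number, so this one wins for port 22
    AclRule(300, 443, "allow"),
]
```

With `RULES`, port 22 is denied by the network ACL because rule 100 matches before rule 200 is ever looked at, whereas a security group that lists port 22 simply allows it, since it has no deny rules and no ordering.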

AWS Macie vs Rekognition vs GuardDuty

* Macie in French means 'weapon';
- AWS Macie is an ML-based service that monitors and detects usage patterns in S3.
- Macie uses ML for PII detection in S3.
* Rekognition analyzes and searches images and videos stored in S3.
* GuardDuty monitors for malicious activity across your AWS accounts and workloads, including S3.
* Inspector performs automated security assessments of workloads such as EC2 instances, not of S3.

AWS Trusted Advisor vs AWS Budgets

AWS Trusted Advisor is an online tool that provides you real-time guidance to help you provision your resources following AWS best practices. It inspects your AWS environment and makes recommendations for saving money, improving system performance and reliability, or closing security gaps.

AWS Budgets gives you the ability to set custom budgets that alert you when your costs or usage exceed (or are forecasted to exceed) your budgeted amount.

auto scaling

* With a target tracking scaling policy, you can increase or decrease the current capacity of the group to maintain a target value for a specific metric. This policy helps resolve over-provisioning of your resources.
* With simple scaling, you need to wait for the cooldown period to complete before initiating additional scaling activities.
* Scheduled scaling is mainly used for predictable traffic patterns.
* The main difference between simple scaling and step scaling is the cooldown period: simple scaling waits for it, while step scaling can keep responding to alarms during a scaling activity.
* Both simple scaling and step scaling use CloudWatch alarms.
The main issue with simple scaling is that after a scaling activity is started, the policy must wait for the scaling activity or health check replacement to complete and the cooldown period to end before responding to additional alarms. Cooldown periods help to prevent the initiation of additional scaling activities before the effects of previous activities are visible.
* AWS recommends step scaling over simple scaling.
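The effect of the cooldown on simple scaling can be seen with a tiny simulation. It is a simplified model under stated assumptions: alarm times are in seconds, the scaling activity itself is treated as instantaneous, and the 300-second default cooldown is the figure given in the "Default values" section; the function name is hypothetical.

```python
def simple_scaling_actions(alarm_times, cooldown=300):
    """Return the alarm times that actually trigger a scaling activity."""
    acted_on = []
    next_allowed = 0
    for t in sorted(alarm_times):
        if t >= next_allowed:
            acted_on.append(t)           # this alarm starts a scaling activity
            next_allowed = t + cooldown  # later alarms wait out the cooldown
        # alarms that fire during the cooldown are ignored
    return acted_on
```

With alarms at 0, 60, 200, 310, 400, and 700 seconds, only the ones at 0, 310, and 700 lead to scaling; the others arrive during a cooldown. A step scaling policy would not drop those intermediate alarms in the same way, which is the reason it is preferred for spiky load.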

Decoupled architecture

SQS (simple queue service)
SNS (simple notification service)
SWF (simple workflow service)
* SQS works by polling, i.e. store and then forward.
* SNS works by pushing, i.e. notify (multiple) subscribers immediately.

Default values

- Default autoscaling cooldown period: 300 seconds (5 mins).
- Message retention period for SQS is between 1 minute and 14 days; default is 4 days.
- Max retention period for automated RDS snapshot backups is 35 days.
- The maximum number of days for the EFS lifecycle policy is only 90 days.
- S3 lifecycle policy: the object has to stay in S3 Standard for at least 30 days before a transition to S3 IA or S3 One Zone IA; this doesn't apply to transitions to Glacier.
- By default, CloudTrail event log files are encrypted using Amazon S3 server-side encryption (SSE-S3); SSE-S3 only uses the AES-256 encryption algorithm and not AES-128.
- By default, autoscaling and DAX are not enabled for DynamoDB.
- By default, KDS data records are accessible for only 24 hours; retention can be extended (up to 365 days).
IOPS
- Cold HDD: max of 250
- Throughput Optimized HDD: max of 500
- General Purpose SSD (gp2/gp3): max of 16,000
- Provisioned IOPS: max of 64,000 for io1 and io2; max of 256,000 for io2 Block Express

Geolocation vs Geoproximity

Geolocation routing 
- lets you choose the resources that serve your traffic based on the geographic location of your users.
- Geolocation can be used for localizing content and presenting some or all of your website in the language of your users.
- Can also protect distribution rights.
Geoproximity routing
- lets Amazon Route 53 route traffic to your resources based on the geographic location of your users and your resources. You can also optionally choose to route more traffic or less to a given resource by specifying a value, known as a bias.

Kinesis Data Stream and Shards

If the shard iterator expires immediately, before you can use it, this might indicate that the DynamoDB table used by Kinesis does not have enough capacity to store the lease data. This is more likely to happen when the stream has a large number of shards; try increasing the write capacity assigned to the shard table.

Snowball Edge, SnowMobile

- Snowball Edge: up to 100 TB (80 TB usable)
- Snowmobile: up to 100 PB per Snowmobile (a container on a truck)

Secrets Manager vs Systems Manager Parameter Store

- Systems Manager Parameter Store doesn't rotate its parameters by default, whereas Secrets Manager has built-in support for automatic rotation.

VPN vs Direct Connect

- A VPN connection still goes through the public internet;
- AWS Direct Connect provides a private, dedicated connection that bypasses the public internet.

AWS Transit Gateway vs Direct connect gateway

- AWS Transit Gateway provides a hub-and-spoke design for connecting VPCs and on-premises networks.
- By attaching a transit gateway to a Direct Connect gateway using a transit virtual interface, you can manage a single connection for multiple VPCs or VPNs that are in the same AWS Region.
- Direct Connect gateway supports connections to VPCs in multiple regions.

MISC

* Associate the CreationPolicy attribute with a resource to prevent its status from reaching create complete until AWS CloudFormation receives a specified number of success signals or the timeout period is exceeded. To signal a resource, you can use the cfn-signal helper script or the SignalResource API.
* Control Tower lets you easily and securely set up a multi-account AWS environment.
* You cannot integrate a DynamoDB table with CloudFront, as these two are incompatible.
* Each subnet maps to a single Availability Zone.
