AWS IAM Best Practices

January 1, 2023 ~ Last updated on : January 2, 2023 ~ jayendrapatil ~ 10 Comments

AWS IAM Best Practices

AWS recommends the following AWS Identity and Access Management service – IAM Best Practices to secure AWS resources

Root Account – Don’t use & Lock away access keys

Do not use the AWS Root account which has full access to all the AWS resources and services including the Billing information.
Permissions associated with the AWS Root account cannot be restricted.
Do not generate the access keys, if not required
If already generated and not needed, delete the access keys.
If access keys are needed, rotate (change) the access key regularly
Never share the Root account credentials or access keys, instead create IAM users or Roles to grant granular access
Enable AWS multifactor authentication (MFA) on the AWS account

User – Create individual IAM users

Don’t use the AWS root account credentials to access AWS, and don’t share the credentials with anyone else.
Start by creating an IAM User with an Administrator role that has access to all resources as the Root except the account’s security credentials.
Create individual users for anyone who needs access to your AWS account and gives each user unique credentials and grant different permissions.

Groups – Use groups to assign permissions to IAM users

Instead of defining permissions for individual IAM users, create groups and define the relevant permissions for each group as per the job function, and then associate IAM users to those groups.
Users in an IAM group inherit the permissions assigned to the group and a User can belong to multiple groups
It is much easier to add new users, remove users and modify the permissions of a group of users.

Permission – Grant least privilege

IAM user, by default, is created with no permissions
Users should be granted LEAST PRIVILEGE as required to perform a task.
Starting with minimal permissions and adding to the permissions as required to perform the job function is far better than granting all access and trying to then tighten it down.

Passwords – Enforce strong password policy for users

Enforce users to create strong passwords and enforce them to rotate their passwords periodically.
Enable a strong password policy to define password requirements forcing users to create passwords with requirements like at least one capital letter, one number, and how frequently it should be rotated.

MFA – Enable MFA for privileged users

For extra security, Enable MultiFactor Authentication (MFA) for privileged IAM users, who are allowed access to sensitive resources or APIs.

Role – Use temporary credentials with IAM roles

Use roles for workloads instead of creating IAM user and hardcoding the credentials which can compromise the access and are also hard to rotate.
Roles have specific permissions and do not have a permanent set of credentials.
Roles provide a way to access AWS by relying on dynamically generated & automatically rotated temporary security credentials.
Roles associated with it but dynamically provide temporary credentials that are automatically rotated

Sharing – Delegate using roles

Allow users from same AWS account, another AWS account, or externally authenticated users (either through any corporate authentication service or through Google, Facebook etc) to use IAM roles to specify the permissions which can then be assumed by them
A role can be defined that specifies what permissions the IAM users in the other account are allowed, and from which AWS accounts the IAM users are allowed to assume the role

Rotation – Rotate credentials regularly

Change your own passwords and access keys regularly and enforce it through a strong password policy. So even if a password or access key is compromised without your knowledge, you limit how long the credentials can be used to access your resources
Access keys allows creation of 2 active keys at the same time for an user. These can be used to rotate the keys.

Track & Review – Remove unnecessary credentials

Remove IAM user and credentials (that is, passwords and access keys) that are not needed.
Use the IAM Credential report that lists all IAM users in the account and the status of their various credentials, including passwords, access keys, and MFA devices and usage patterns to figure out what can be removed
Passwords and access keys that have not been used recently might be good candidates for removal.

Conditions – Use policy conditions for extra security

Define conditions under which IAM policies allow access to a resource.
Conditions would help provide finer access control to the AWS services and resources for e.g. access limited to a specific IP range or allowing only encrypted requests for uploads to S3 buckets etc.

Auditing – Monitor activity in the AWS account

Enable logging features provided through CloudTrail, S3, CloudFront in AWS to determine the actions users have taken in the account and the resources that were used.
Log files show the time and date of actions, the source IP for an action, which actions failed due to inadequate permissions, and more.

Use IAM Access Analyzer

IAM Access Analyzer analyzes the services and actions that the IAM roles use, and then generates a least-privilege policy that you can use.
Access Analyzer helps preview and analyze public and cross-account access for supported resource types by reviewing the generated findings.
IAM Access Analyzer helps to validate the policies created to ensure that they adhere to the IAM policy language (JSON) and IAM best practices.

Use Permissions Boundaries

Use IAM Permissions Boundaries to delegate permissions management within an account
IAM permissions boundaries help set the maximum permissions that you delegate and that an identity-based policy can grant to an IAM role.
A permissions boundary does not grant permissions on its own.

AWS Certification Exam Practice Questions

Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).

AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.

AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated

Open to further feedback, discussion and correction.

Your organization is preparing for a security assessment of your use of AWS. In preparation for this assessment, which two IAM best practices should you consider implementing? Choose 2 answers
1. Create individual IAM users for everyone in your organization (May not be needed as can use Roles as well)
2. Configure MFA on the root account and for privileged IAM users
3. Assign IAM users and groups configured with policies granting least privilege access
4. Ensure all users have been assigned and are frequently rotating a password, access ID/secret key, and X.509 certificate (Must be assigned only if using console or through command line)
What are the recommended best practices for IAM? (Choose 3 answers)
1. Grant least privilege
2. User the AWS account(root) for regular user
3. Use Mutli-Factor Authentication (MFA)
4. Store access key/private key in git
5. Rotate credentials regularly
Which of the below mentioned options is not a best practice to securely manage the AWS access credentials?
1. Enable MFA for privileged users
2. Create individual IAM users
3. Keep rotating your secure access credentials at regular intervals
4. Create strong access key and secret access key and attach to the root account
Your CTO is very worried about the security of your AWS account. How best can you prevent hackers from completely hijacking your account?
1. Use short but complex password on the root account and any administrators.
2. Use AWS IAM Geo-Lock and disallow anyone from logging in except for in your city.
3. Use MFA on all users and accounts, especially on the root account. (For increased security, it is recommend to configure MFA to help protect AWS resources)
4. Don’t write down or remember the root account password after creating the AWS account.
Fill the blanks: ____ helps us track AWS API calls and transitions, ____ helps to understand what resources we have now, and ____ allows auditing credentials and logins.
1. AWS Config, CloudTrail, IAM Credential Reports
2. CloudTrail, IAM Credential Reports, AWS Config
3. CloudTrail, AWS Config, IAM Credential Reports
4. AWS Config, IAM Credential Reports, CloudTrail

References

Google Cloud Building Containers Best Practices

May 30, 2021 ~ Last updated on : June 3, 2021 ~ jayendrapatil

Google Cloud Building Containers Best Practices

Package a single app per container

An “app” is considered to be a single piece of software, with a unique parent process, and potentially several child processes.
A container is designed to have the same lifecycle as the app it hosts, so each of the containers should contain only one app. When a container starts, so should the app, and when the app stops, so should the container. for e.g. in the case of the classic Apache/MySQL/PHP stack, each component must be hosted in a separate container.

Properly handle PID 1, signal handling, and zombie processes

Linux signals are the main way to control the lifecycle of processes inside a container.
The app within the container should handle the Linux signals, as well as the best practice of a single app per container should be implemented.
Process identifiers (PIDs) are unique identifiers that the Linux kernel gives to each process.
PIDs are namespace, i.e. the containers PIDs are different from the host and are mapped to the PIDs on the host system.
Docker and Kubernetes use signals to communicate with the processes inside containers, most notably to terminate them.
Both Docker and Kubernetes can only send signals to the process that has PID 1 inside a container.
For Signal handling and Zombie processes following can be followed
- Run as PID 1 and register signal handlers
  - Launch the process with the CMD and/or ENTRYPOINT instructions in the Dockerfile, which would give the PID 1 to the process
  - Use the built-in exec command to launch the process from the shell script. The exec command replaces the script with the program and the process then inherits PID 1.
- Enable process namespace sharing in Kubernetes
  - Process namespace sharing for a Pod can be enabled where Kubernetes uses a single process namespace for all the containers in that Pod.
  - Kubernetes Pod infrastructure container becomes PID 1 and automatically reaps orphaned processes.
- Use a specialized init system
  - Init system such as tini created especially for containers that can be used to handle signals and reaps any zombie processes

Optimize for the Docker build cache

Images are built layer by layer, and in a Dockerfile, each instruction creates a layer in the resulting image.
Docker build cache can accelerate the building of container images.
During a build, when possible, Docker reuses a layer from a previous build and skips a potentially costly step.
Docker can use its build cache only if all previous build steps used it.

Remove unnecessary tools

Remove unnecessary tools helps reduce the attack surface of the app by removing any unnecessary tools.
Avoid running as root inside the container: this method offers the first layer of security and could prevent attackers from modifying files
Launch the container in read-only mode using the --read-only flag from the docker run or by using the readOnlyRootFilesystem option in Kubernetes.

Build the smallest image possible

Smaller image offers advantages such as faster upload and download times
To reduce the size of the image, install only what is strictly needed

Scan images for vulnerabilities

For vulnerabilities, as the containers are supposed to be immutable, the best practice is to rebuild the image, patches included, and redeploy it
As containers have a shorter lifecycle and a less well-defined identity than servers, a centralized inventory system would not work effectively
Container Analysis can scan the images for security vulnerabilities in publicly monitored packages

Using public image

Consider before using public images as you cannot control what’s inside them
Public image such as Debian or Alpine can be used as the base image and building everything on top of that image

Managed Base Images

Managed base images are base container images that are automatically patched by Google for security vulnerabilities, using the most recent patches available from the project upstream

GCP Certification Exam Practice Questions

Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).

GCP services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.

GCP exam questions are not updated to keep up the pace with GCP updates, so even if the underlying feature has changed the question might not be updated

Open to further feedback, discussion and correction.

When creating a secure container image, which two items should you incorporate into the build if possible?
1. Use public container images as a base image for the app.
2. Build the smallest image possible
3. Use many container image layers to hide sensitive information.
4. Package multiple applications in a container

References

Best_Practices_For_Building_Containers

Architecting for the Cloud – AWS Best Practices – Whitepaper – Certification

December 30, 2017 ~ Last updated on : February 11, 2020 ~ jayendrapatil ~ 5 Comments

Architecting for the Cloud – AWS Best Practices

Architecting for the Cloud – AWS Best Practices whitepaper provides architectural patterns and advice on how to design systems that are secure, reliable, high performing, and cost efficient

AWS Design Principles

Scalability

While AWS provides virtually unlimited on-demand capacity, the architecture should be designed to take advantage of those resources
There are two ways to scale an IT architecture
- Vertical Scaling
  - takes place through increasing specifications of an individual resource for e.g. updating EC2 instance type with increasing RAM, CPU, IOPS, or networking capabilities
  - will eventually hit a limit, and is not always a cost effective or highly available approach
- Horizontal Scaling
  - takes place through increasing number of resources for e.g. adding more EC2 instances or EBS volumes
  - can help leverage the elasticity of cloud computing
  - not all the architectures can be designed to distribute their workload to multiple resources
  - applications designed should be stateless,
    - that needs no knowledge of previous interactions and stores no session information
    - capacity can be increased and decreased, after running tasks have been drained
  - State, if needed, can be implemented using
    - Low latency external store, for e.g. DynamoDB, Redis, to maintain state information
    - Session affinity, for e.g. ELB sticky sessions, to bind all the transactions of a session to a specific compute resource. However, it cannot be guaranteed or take advantage of newly added resources for existing sessions
  - Load can be distributed across multiple resources using
    - Push model, for e.g. through ELB where it distributes the load across multiple EC2 instances
    - Pull model, for e.g. through SQS or Kinesis where multiple consumers subscribe and consume
  - Distributed processing, for e.g. using EMR or Kinesis, helps process large amounts of data by dividing task and its data into many small fragments of works

Disposable Resources Instead of Fixed Servers

Resources need to be treated as temporary disposable resources rather than fixed permanent on-premises resources before
AWS focuses on the concept of Immutable infrastructure
- servers once launched, is never updated throughout its lifetime.
- updates can be performed on a new server with latest configurations,
- this ensures resources are always in a consistent (and tested) state and easier rollbacks
AWS provides multiple ways to instantiate compute resources in an automated and repeatable way
- Bootstraping
  - scripts to configure and setup for e.g. using data scripts and cloud-init to install software or copy resources and code
- Golden Images
  - a snapshot of a particular state of that resource,
  - faster start times and removes dependencies to configuration services or third-party repositories
- Containers
  - AWS support for docker images through Elastic Beanstalk and ECS
  - Docker allows packaging a piece of software in a Docker Image, which is a standardized unit for software development, containing everything the software needs to run: code, runtime, system tools, system libraries, etc
Infrastructure as Code
- AWS assets are programmable, techniques, practices, and tools from software development can be applied to make the whole infrastructure reusable, maintainable, extensible, and testable.
- AWS provides services like CloudFormation, OpsWorks for deployment

Automation

AWS provides various automation tools and services which help improve system’s stability, efficiency and time to market.
- Elastic Beanstalk
  - a PaaS that allows quick application deployment while handling resource provisioning, load balancing, auto scaling, monitoring etc
- EC2 Auto Recovery
  - creates CloudWatch alarm that monitors an EC2 instance and automatically recovers it if it becomes impaired.
  - A recovered instance is identical to the original instance, including the instance ID, private & Elastic IP addresses, and all instance metadata.
  - Instance is migrated through reboot, in memory contents are lost.
- Auto Scaling
  - allows maintain application availability and scale the capacity up or down automatically as per defined conditions
- CloudWatch Alarms
  - allows SNS triggers to be configured when a particular metric goes beyond a specified threshold for a specified number of periods
- CloudWatch Events
  - allows real-time stream of system events that describe changes in AWS resources
- OpsWorks
  - allows continuous configuration through lifecycle events that automatically update the instances’ configuration to adapt to environment changes.
  - Events can be used to trigger Chef recipes on each instance to perform specific configuration tasks
- Lambda Scheduled Events
  - allows Lambda function creation and direct AWS Lambda to execute it on a regular schedule.

Loose Coupling

AWS helps loose coupled architecture that reduces interdependencies, a change or failure in a component does not cascade to other components
- Asynchronous Integration
  - does not involve direct point-to-point interaction but usually through an intermediate durable storage layer for e.g. SQS, Kinesis
  - decouples the components and introduces additional resiliency
  - suitable for any interaction that doesn’t need an immediate response and an ack that a request has been registered will suffice
- Service Discovery
  - allows new resources to be launched or terminated at any point in time and discovered as well for e.g. using ELB as a single point of contact with hiding the underlying instance details or Route 53 zones to abstract load balancer’s endpoint
- Well-Defined Interfaces
  - allows various components to interact with each other through specific, technology agnostic interfaces for e.g. RESTful apis with API Gateway

Services, Not Servers

Databases

AWS provides different categories of database technologies
- Relational Databases (RDS)
  - normalizes data into well-defined tabular structures known as tables, which consist of rows and columns
  - provide a powerful query language, flexible indexing capabilities, strong integrity controls, and the ability to combine data from multiple tables in a fast and efficient manner
  - allows vertical scalability by increasing resources and horizontal scalability using Read Replicas for read capacity and sharding or data partitioning for write capacity
  - provides High Availability using Multi-AZ deployment, where data is synchronously replicated
- NoSQL Databases (DynamoDB)
  - provides databases that trade some of the query and transaction capabilities of relational databases for a more flexible data model that seamlessly scales horizontally
  - perform data partitioning and replication to scale both the reads and writes in a horizontal fashion
  - DynamoDB service synchronously replicates data across three facilities in an AWS region to provide fault tolerance in the event of a server failure or Availability Zone disruption
- Data Warehouse (Redshift)
  - Specialized type of relational database, optimized for analysis and reporting of large amounts of data
  - Redshift achieves efficient storage and optimum query performance through a combination of massively parallel processing (MPP), columnar data storage, and targeted data compression encoding schemes
  - Redshift MPP architecture enables increasing performance by increasing the number of nodes in the data warehouse cluster
For more details refer to AWS Storage Options Whitepaper

Removing Single Points of Failure

AWS provides ways to implement redundancy, automate recovery and reduce disruption at every layer of the architecture
AWS supports redundancy in the following ways
- Standby Redundancy
  - When a resource fails, functionality is recovered on a secondary resource using a process called failover.
  - Failover will typically require some time before it completes, and during that period the resource remains unavailable.
  - Secondary resource can either be launched automatically only when needed (to reduce cost), or it can be already running idle (to accelerate failover and minimize disruption).
  - Standby redundancy is often used for stateful components such as relational databases.
- Active Redundancy
  - requests are distributed to multiple redundant compute resources, if one fails, the rest can simply absorb a larger share of the workload.
  - Compared to standby redundancy, it can achieve better utilization and affect a smaller population when there is a failure.
AWS supports replication
- Synchronous replication
  - acknowledges a transaction after it has been durably stored in both the primary location and its replicas.
  - protects data integrity from the event of a primary node failure
  - used to scale read capacity for queries that require the most up-to-date data (strong consistency).
  - compromises performance and availability
- Asynchronous replication
  - decouples the primary node from its replicas at the expense of introducing replication lag
  - used to horizontally scale the system’s read capacity for queries that can tolerate that replication lag.
- Quorum-based replication
  - combines synchronous and asynchronous replication to overcome the challenges of large-scale distributed database systems
  - Replication to multiple nodes can be managed by defining a minimum number of nodes that must participate in a successful write operation
AWS provide services to reduce or remove single point of failure
- Regions, Availability Zones with multiple data centers
- ELB or Route 53 to configure health checks and mask failure by routing traffic to healthy endpoints
- Auto Scaling to automatically replace unhealthy nodes
- EC2 auto-recovery to recover unhealthy impaired nodes
- S3, DynamoDB with data redundantly stored across multiple facilities
- Multi-AZ RDS and Read Replicas
- ElastiCache Redis engine supports replication with automatic failover
For more details refer to AWS Disaster Recovery Whitepaper

Optimize for Cost

AWS can help organizations reduce capital expenses and drive savings as a result of the AWS economies of scale
AWS provides different options which should be utilized as per use case –
- EC2 instance types – On Demand, Reserved and Spot
- Trusted Advisor or EC2 usage reports to identify the compute resources and their usage
- S3 storage class – Standard, Reduced Redundancy, and Standard-Infrequent Access
- EBS volumes – Magnetic, General Purpose SSD, Provisioned IOPS SSD
- Cost Allocation tags to identify costs based on tags
- Auto Scaling to horizontally scale the capacity up or down based on demand
- Lambda based architectures to never pay for idle or redundant resources
- Utilize managed services where scaling is handled by AWS for e.g. ELB, CloudFront, Kinesis, SQS, CloudSearch etc.

Caching

Caching improves application performance and increases the cost efficiency of an implementation
- Application Data Caching
  - provides services thats helps store and retrieve information from fast, managed, in-memory caches
  - ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud and supports two open-source in-memory caching engines: Memcached and Redis
- Edge Caching
  - allows content to be served by infrastructure that is closer to viewers, lowering latency and giving high, sustained data transfer rates needed to deliver large popular objects to end users at scale.
  - CloudFront is Content Delivery Network (CDN) consisting of multiple edge locations, that allows copies of static and dynamic content to be cached

Security

AWS works on shared security responsibility model
- AWS is responsible for the security of the underlying cloud infrastructure
- you are responsible for securing the workloads you deploy in AWS
AWS also provides ample security features
- IAM to define a granular set of policies and assign them to users, groups, and AWS resources
- IAM roles to assign short term credentials to resources, which are automatically distributed and rotated
- Amazon Cognito, for mobile applications, which allows client devices to get controlled access to AWS resources via temporary tokens.
- VPC to isolate parts of infrastructure through the use of subnets, security groups, and routing controls
- WAF to help protect web applications from SQL injection and other vulnerabilities in the application code
- CloudWatch logs to collect logs centrally as the servers are temporary
- CloudTrail for auditing AWS API calls, which delivers a log file to S3 bucket. Logs can then be stored in an immutable manner and automatically processed to either notify or even take action on your behalf, protecting your organization from non-compliance
- AWS Config, Amazon Inspector, and AWS Trusted Advisor to continually monitor for compliance or vulnerabilities giving a clear overview of which IT resources are in compliance, and which are not
For more details refer to AWS Security Whitepaper

References

Architecting for the Cloud: AWS Best Practices – Whitepaper