AWS – EC2 Troubleshooting Connecting to an Instance

EC2 Connection Methods

AWS provides multiple methods to connect to EC2 instances. Understanding these helps choose the right approach and troubleshoot connection issues effectively.

  • SSH/RDP (Traditional) – Requires open inbound ports (22/3389), key pairs, and a public IP or VPN connectivity.
  • EC2 Instance Connect – Browser-based SSH using temporary keys pushed via IAM. Requires the EC2 Instance Connect agent installed and port 22 open from EC2 Instance Connect service IP ranges.
  • EC2 Instance Connect Endpoint (EIC Endpoint) – Launched June 2023, allows SSH/RDP to instances in private subnets without a public IP, bastion host, or internet gateway. Creates a private tunnel through an endpoint in the VPC. No additional cost.
  • AWS Systems Manager Session Manager – Provides shell access without opening inbound ports, managing SSH keys, or requiring a public IP. Uses the SSM Agent for connectivity and IAM for authentication and authorization. Session activity can be logged and audited. AWS recommends this as the preferred method for EC2 access.
  • EC2 Serial Console – Provides low-level serial port access for troubleshooting boot, network, and OS configuration issues even when SSH/RDP is unavailable. Does not require network connectivity to the instance.

Common Causes for Connection Issues

  1. Security Group misconfiguration – Inbound rules must allow SSH (port 22) or RDP (port 3389) traffic from your IP address. A VPC’s default security group only allows inbound traffic from members of the same group, so it does not permit SSH from outside.
  2. Network ACL (NACL) misconfiguration – NACLs are stateless. Both inbound rules (allow traffic on port 22 from source IP) and outbound rules (allow response traffic on ephemeral ports 1024-65535) must be configured.
  3. Missing or incorrect key pair – Verify the private key (.pem) file corresponds to the key pair selected when the instance was launched.
  4. Incorrect username – The default username varies by AMI/OS:
    AMI            Default Username
    Amazon Linux   ec2-user
    Ubuntu         ubuntu
    Debian         admin
    CentOS         centos or ec2-user
    RHEL           ec2-user or root
    SUSE           ec2-user or root
    Fedora         fedora or ec2-user
    FreeBSD        ec2-user
    Oracle         ec2-user
    Bitnami        bitnami
    Rocky Linux    rocky
  5. No public IP address – Instance must have a public IPv4 address (or Elastic IP) to connect via SSH over the internet. Alternatively, use Session Manager or EC2 Instance Connect Endpoint for private instances.
  6. Missing route to Internet Gateway – The subnet’s route table must have a route for 0.0.0.0/0 pointing to an Internet Gateway for internet-facing instances.
  7. Instance not in running state or failed status checks – Verify the instance is running and passes both system and instance status checks.
  8. Key file permissions too open – The private key file must be accessible only by its owner (chmod 400 on Linux/macOS). SSH refuses to use a private key that group or other users can read, write, or execute.
  9. Corporate firewall blocking port 22 – Internal firewalls may block outbound SSH. Use Session Manager (HTTPS-based) as an alternative.
  10. CPU overload on instance – High CPU utilization can make the instance unresponsive. Check CloudWatch metrics and consider resizing or using Auto Scaling.
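Item 2 is easy to miss because security groups are stateful while NACLs are not. The following is a minimal sketch of that evaluation, assuming a simplified rule model (rule number, port range, allow/deny) with CIDR matching left out. `Rule`, `nacl_allows`, and `ssh_passes_nacl` are hypothetical names for illustration, not an AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int      # NACL rules are evaluated in ascending rule-number order
    from_port: int
    to_port: int
    allow: bool


def nacl_allows(rules, port):
    """Apply the lowest-numbered rule that matches the port; deny if none match."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.from_port <= port <= rule.to_port:
            return rule.allow
    return False  # the implicit final deny rule


def ssh_passes_nacl(inbound, outbound, client_port):
    """Stateless check: the request and the reply are evaluated separately."""
    request_ok = nacl_allows(inbound, 22)           # client -> instance, port 22
    reply_ok = nacl_allows(outbound, client_port)   # instance -> client, ephemeral port
    return request_ok and reply_ok
```

With only an inbound allow rule for port 22, `ssh_passes_nacl` returns False for a client port such as 50000. Adding an outbound allow rule for 1024-65535 makes it return True.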

Connection Error Messages and Solutions

“Connection timed out” Error

Indicates network-level connectivity issues. Troubleshoot:

  1. Verify security group allows inbound SSH from your current public IP (IP may change with dynamic addressing)
  2. Verify route table has a route to an Internet Gateway (0.0.0.0/0 → igw-xxx)
  3. Verify Network ACL allows inbound on port 22 AND outbound on ephemeral ports (1024-65535)
  4. Verify instance has a public IPv4 address or Elastic IP
  5. Check for corporate firewall blocking outbound port 22
  6. Use VPC Reachability Analyzer to diagnose the network path
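Before working through the checklist, it can help to confirm that the failure really is a timeout and not a refused connection. The sketch below uses only the Python standard library. `probe_tcp` is a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```python
import socket


def probe_tcp(host, port=22, timeout=5.0):
    """Classify a TCP connection attempt as 'open', 'timeout', 'refused', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        return "timeout"   # packets were silently dropped somewhere on the path
    except ConnectionRefusedError:
        return "refused"   # host was reached, but nothing is listening on the port
    except OSError:
        return "error"     # DNS failure, unreachable network, and similar
```

A result of "timeout" points at the items in this checklist (security group, NACL, route table, public IP, local firewall). "refused" suggests the network path is fine but the SSH daemon is not listening. "open" means the problem is more likely authentication.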

“Permission denied (publickey)” Error

Indicates authentication failure. Troubleshoot:

  1. Verify you are using the correct private key for the instance’s key pair
  2. Verify you are connecting with the correct username for the AMI
  3. Verify key file permissions are 0400 (Linux/macOS) or properly restricted (Windows)
  4. Check if permissions on ~/.ssh/authorized_keys or home directory were changed on the instance
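Check 2 can be sketched as a lookup from AMI family to default username, using only the unambiguous entries from the username table earlier in this article. `DEFAULT_USERS` and `ssh_command` are hypothetical names; the function only builds the `ssh` argument list and does not run it.

```python
# Single-valued entries from the default-username table; AMIs with more than
# one possible username (CentOS, RHEL, SUSE, Fedora) are deliberately left out.
DEFAULT_USERS = {
    "amazon linux": "ec2-user",
    "ubuntu": "ubuntu",
    "debian": "admin",
    "freebsd": "ec2-user",
    "oracle": "ec2-user",
    "bitnami": "bitnami",
    "rocky linux": "rocky",
}


def ssh_command(ami_family, host, key_path, verbose=False):
    """Build an ssh argument list with the default user for the given AMI family."""
    user = DEFAULT_USERS[ami_family.lower()]  # KeyError for unknown families
    cmd = ["ssh", "-i", key_path]
    if verbose:
        cmd.append("-vvv")  # detailed handshake debugging output
    cmd.append(f"{user}@{host}")
    return cmd
```

For example, `ssh_command("Ubuntu", "203.0.113.10", "my_private_key.pem")` yields the arguments for `ssh -i my_private_key.pem ubuntu@203.0.113.10`.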

“Unprotected Private Key File” Warning

SSH ignores keys with overly permissive file permissions.

  • Linux/macOS: chmod 0400 my_private_key.pem
  • Windows: Remove inherited permissions and grant Read access only to your user account via file Properties → Security → Advanced
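On Linux/macOS the same check and fix can be scripted. This is a minimal sketch using the standard library, assuming the script runs as the key file's owner. `key_is_too_open` and `restrict_key` are hypothetical helper names.

```python
import os
import stat


def key_is_too_open(path):
    """Return True if group or other users have any permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def restrict_key(path):
    """Equivalent of `chmod 0400`: read permission for the owner only."""
    os.chmod(path, stat.S_IRUSR)
```

A key file with mode 0644 is reported as too open; after `restrict_key` its mode is 0400 and the check passes.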

“Host key verification failed” Error

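Occurs when the host key stored in ~/.ssh/known_hosts for an address doesn’t match the key the instance presents. This typically happens when an IP address or DNS name you connected to earlier now points to a different instance, for example after a stop/start changes public IP assignments or after an Elastic IP is associated with or removed from an instance. Remove the old host key entry (for example with ssh-keygen -R <host>) and reconnect.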
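To show what removing the old entry means, here is a rough sketch that filters plain-text known_hosts lines for one host. It is an illustration only: `drop_known_host` is a hypothetical helper, it leaves hashed entries (those starting with `|1|`) and marker lines untouched, and in practice `ssh-keygen -R <host>` is the simpler tool.

```python
def drop_known_host(lines, host):
    """Return known_hosts lines with plain-text entries for `host` removed."""
    kept = []
    for line in lines:
        fields = line.split()
        # The first field of a plain entry is a comma-separated host list.
        # Comments (#), markers (@...), and hashed hosts (|1|...) are kept as-is.
        if fields and not fields[0].startswith(("#", "@", "|")):
            if host in fields[0].split(","):
                continue
        kept.append(line)
    return kept
```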

“Server refused our key” Error (PuTTY)

  • Verify the .pem file was converted to .ppk format using PuTTYgen
  • Verify correct username is entered in the PuTTY configuration
  • Verify the latest version of PuTTY is installed

Troubleshooting Tools

VPC Reachability Analyzer

A configuration analysis tool that checks network reachability between a source and destination resource in your VPC. For EC2 connectivity troubleshooting:

  • Set Source type to Internet Gateway and Destination to your EC2 instance
  • Analyzes security groups, NACLs, route tables, and other network components
  • Provides hop-by-hop path details when reachable or identifies the blocking component when not reachable
  • Amazon Q network troubleshooting (2024) integrates with Reachability Analyzer to help diagnose connectivity issues using natural language
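The same analysis can be started from the AWS CLI. The sketch below only builds the argument lists for `aws ec2 create-network-insights-path` and `aws ec2 start-network-insights-analysis`; it does not execute them. The function names are hypothetical, and the resource IDs in any call are placeholders you would replace with your own.

```python
def create_path_command(igw_id, instance_id, port=22):
    """Define a path from an internet gateway to an instance on a TCP port."""
    return [
        "aws", "ec2", "create-network-insights-path",
        "--source", igw_id,
        "--destination", instance_id,
        "--protocol", "tcp",
        "--destination-port", str(port),
    ]


def start_analysis_command(path_id):
    """Run the analysis for a path ID returned by the create call."""
    return [
        "aws", "ec2", "start-network-insights-analysis",
        "--network-insights-path-id", path_id,
    ]
```

The path ID passed to `start_analysis_command` comes from the output of the first command, so the two calls have to be run in sequence.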

AWSSupport-TroubleshootSSH Automation Runbook

An AWS Systems Manager Automation document that automatically diagnoses and repairs common SSH connection issues:

  • Installs EC2Rescue for Linux on the instance
  • Checks and attempts to fix SSH daemon configuration, file permissions, and firewall rules
  • Can be run with Action: FixAll to automatically repair identified issues
  • When offline remediation is allowed and the instance cannot be fixed online, stops the instance, creates a temporary VPC, and uses Lambda functions to perform the analysis
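A runbook execution can be started with `aws ssm start-automation-execution`. The sketch below builds that command as an argument list without running it. The parameter names `InstanceId` and `Action` and the `CheckAll` value are assumptions based on the runbook's published parameters, so confirm them against the document in your account before use. `troubleshoot_ssh_command` is a hypothetical helper name.

```python
def troubleshoot_ssh_command(instance_id, action="CheckAll"):
    """Build the CLI call that starts AWSSupport-TroubleshootSSH for one instance."""
    if action not in ("CheckAll", "FixAll"):  # assumed valid values
        raise ValueError(f"unsupported action: {action}")
    return [
        "aws", "ssm", "start-automation-execution",
        "--document-name", "AWSSupport-TroubleshootSSH",
        "--parameters", f"InstanceId={instance_id},Action={action}",
    ]
```

Running with `CheckAll` first and switching to `FixAll` only after reviewing the findings keeps the runbook from changing the instance unexpectedly.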

EC2 Serial Console

Provides serial port access for troubleshooting when SSH/RDP is unavailable:

  • Does not require network connectivity to the instance
  • Useful for troubleshooting boot failures, network misconfigurations, and OS-level issues
  • Must be enabled at the account level; requires IAM permissions and a password-based OS user
  • Supported on Nitro-based instance types

SSH Verbose Mode

Use ssh -vvv for detailed debugging output to identify where the connection fails in the SSH handshake process.

Modern Alternatives to Traditional SSH

AWS Systems Manager Session Manager

AWS recommends Session Manager as the preferred access method because it:

  • Eliminates the need to open inbound port 22
  • Removes the need to manage SSH keys
  • Does not require bastion hosts or public IP addresses
  • Provides centralized access control through IAM policies
  • Can log session activity to CloudWatch Logs and/or S3 for audit once logging is configured
  • Supports port forwarding for accessing remote services
  • Encrypts all traffic using TLS 1.2
  • Available at no additional charge for EC2 instances

Requirements: SSM Agent installed (pre-installed on Amazon Linux 2/2023, Ubuntu 16.04+), instance profile with AmazonSSMManagedInstanceCore policy, and outbound HTTPS connectivity (or VPC endpoints for private subnets).
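Once those requirements are met, a session is started with `aws ssm start-session`, which also needs the Session Manager plugin installed alongside the AWS CLI. The sketch below builds the commands for an interactive shell and for port forwarding as argument lists. The function names are hypothetical and nothing is executed.

```python
import json


def start_session_command(instance_id):
    """Interactive shell session on the instance."""
    return ["aws", "ssm", "start-session", "--target", instance_id]


def port_forward_command(instance_id, remote_port, local_port):
    """Forward a local port to a port on the instance through Session Manager."""
    params = {
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(params),
    ]
```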

EC2 Instance Connect Endpoint

For instances in private subnets without Session Manager configured:

  • Create an EIC Endpoint in your VPC (no additional cost)
  • Connect via AWS CLI: aws ec2-instance-connect ssh --instance-id i-xxx
  • No need for public IP, IGW, or bastion hosts
  • Uses IAM for authorization
  • Security group on the endpoint controls which instances can be accessed

Lost Private Key Recovery

If the private key for an EBS-backed instance is lost:

  1. Create a new key pair
  2. Stop the instance (not terminate)
  3. Detach the root EBS volume
  4. Attach the volume to a temporary instance
  5. Mount the volume and update ~/.ssh/authorized_keys with the new public key
  6. Detach the volume and reattach to the original instance as the root volume
  7. Start the instance and connect with the new key pair
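The AWS-side steps (1 to 4, 6, and 7) map onto AWS CLI calls, sketched below as an ordered list of argument lists that is built but not executed. Step 5 happens inside the temporary instance's operating system and is not shown. The device names are assumptions: the root device name depends on the AMI (commonly /dev/xvda or /dev/sda1), and the temporary instance must be in the same Availability Zone as the volume. `key_recovery_commands` is a hypothetical helper name.

```python
def key_recovery_commands(instance_id, volume_id, helper_instance_id, new_key_name,
                          root_device="/dev/xvda", helper_device="/dev/sdf"):
    """Return the AWS CLI calls for the recovery steps, in the order they must run."""
    ec2 = ["aws", "ec2"]
    return [
        # Step 1: create the new key pair and print its private key material.
        ec2 + ["create-key-pair", "--key-name", new_key_name,
               "--query", "KeyMaterial", "--output", "text"],
        # Step 2: stop (not terminate) the original instance.
        ec2 + ["stop-instances", "--instance-ids", instance_id],
        # Step 3: detach the root volume.
        ec2 + ["detach-volume", "--volume-id", volume_id],
        # Step 4: attach it to the temporary instance as a secondary volume.
        ec2 + ["attach-volume", "--volume-id", volume_id,
               "--instance-id", helper_instance_id, "--device", helper_device],
        # Step 5 (manual): mount the volume and update authorized_keys.
        # Step 6: move the volume back as the root device.
        ec2 + ["detach-volume", "--volume-id", volume_id],
        ec2 + ["attach-volume", "--volume-id", volume_id,
               "--instance-id", instance_id, "--device", root_device],
        # Step 7: start the original instance again.
        ec2 + ["start-instances", "--instance-ids", instance_id],
    ]
```

Each call has to complete before the next one runs; `aws ec2 wait instance-stopped` and `aws ec2 wait volume-available` can be placed between steps for that purpose.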

Note: This procedure only works for EBS-backed instances. Instance store-backed instances cannot be recovered without the original key. Alternatively, use Session Manager if SSM Agent is running, or use EC2 Serial Console if a password-based user is configured.

AWS Certification Exam Tips

  • “Connection timed out” typically indicates network-level issues (security groups, NACLs, route tables, no public IP)
  • “Permission denied” typically indicates authentication issues (wrong key, wrong username, key file permissions)
  • Session Manager is the recommended approach for secure, auditable access without open ports
  • EC2 Instance Connect Endpoint enables access to private instances without bastion hosts
  • EC2 Serial Console is the last-resort tool when all network-based access fails
  • VPC Reachability Analyzer is used to diagnose network path issues
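The first two tips amount to a small decision table, sketched here as a function that maps an error message to the layer to investigate. `classify_ssh_error` and its category labels are hypothetical, and the substrings are taken from the error messages covered earlier in this article.

```python
def classify_ssh_error(message):
    """Map an SSH client error message to the area to troubleshoot first."""
    text = message.lower()
    if "timed out" in text:
        return "network"            # security group, NACL, route table, public IP
    if "unprotected private key file" in text:
        return "key file permissions"
    if "host key verification failed" in text:
        return "host key mismatch"
    if "permission denied" in text or "server refused our key" in text:
        return "authentication"     # wrong key, wrong username
    return "unknown"
```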

Exam Scenario Questions

  1. You try to connect via SSH to a newly created Amazon EC2 instance and get one of the following error messages: “Network error: Connection timed out” or “Error connecting to [instance], reason: → Connection timed out: connect.” You have confirmed that the network and security group rules are configured correctly and the instance is passing status checks. What steps should you take to identify the source of the behavior? Choose 2 answers
    • Verify that the private key file corresponds to the Amazon EC2 key pair assigned at launch.
    • Verify that your IAM user policy has permission to launch Amazon EC2 instances.
    • Verify that you are connecting with the appropriate user name for your AMI.
    • Verify that the Amazon EC2 Instance was launched with the proper IAM role.
    • Verify that your federation trust to AWS has been established.
  2. A developer is unable to SSH into an EC2 instance in a private subnet. The instance has no public IP address and no internet gateway is attached to the VPC. The instance has the SSM Agent installed with an appropriate instance profile. What is the MOST operationally efficient way to connect?
    • Attach an Elastic IP address to the instance and connect via SSH.
    • Deploy a bastion host in a public subnet and use it to SSH into the private instance.
    • Use AWS Systems Manager Session Manager to establish a session to the instance.
    • Create a VPN connection to the VPC and connect via the private IP.
  3. An administrator receives “Permission denied (publickey)” when connecting via SSH to an EC2 instance running Amazon Linux. The administrator confirmed the correct key pair was used. What should be checked NEXT?
    • Verify the security group allows inbound traffic on port 22.
    • Verify the username is ec2-user (not root) and the key file permissions are chmod 400.
    • Verify the instance has an IAM role attached.
    • Verify the instance is in a public subnet.
  4. A security team wants to provide developers access to EC2 instances without opening any inbound ports and with full session logging. Which AWS service should they implement?
    • EC2 Instance Connect
    • AWS Systems Manager Session Manager
    • AWS Direct Connect
    • Amazon WorkSpaces
  5. An EC2 instance has become unresponsive and all network-based connection methods (SSH, Session Manager) are failing. The instance is running on a Nitro-based instance type. Which AWS feature can provide access for troubleshooting?
    • VPC Flow Logs
    • AWS CloudTrail
    • EC2 Serial Console
    • AWS X-Ray
  6. A solutions architect needs to diagnose why SSH connections to an EC2 instance are timing out. Which AWS tool can analyze the network path between an internet gateway and the instance to identify the blocking component?
    • AWS CloudTrail
    • VPC Reachability Analyzer
    • Amazon Inspector
    • AWS Trusted Advisor
