Troubleshooting AWS EC2 Instance Deployment with Terraform during Testing Phase
Hey everyone, I'm running into an issue that's driving me crazy. I'm currently developing an infrastructure-as-code setup that uses Terraform to deploy AWS EC2 instances for a web application, with scalability and security in mind. My main configuration file is structured as follows:

```hcl
provider "aws" {
  region = "us-west-2"
}

resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t2.micro"
  key_name      = var.key_name

  tags = {
    Name = "WebServer"
  }
}
```

In the variables file, I'm specifying the key name like this:

```hcl
variable "key_name" {
  description = "The name of the SSH key to use"
  type        = string
}
```

`terraform apply` provisions the EC2 instance successfully, but the instance always ends up in a 'stopped' state shortly after deployment. I have double-checked the AMI ID and instance type, and both look correct. The `terraform apply` output reports:

```
aws_instance.web: Creation complete after 30s [id=i-0abc123def456ghi]
```

To diagnose this, I checked the instance status in the AWS console. The system logs show a general error indicating a lack of permissions to run the instance, so I suspect the IAM role doesn't have the right policies attached. Here's the IAM role configuration I'm using:

```hcl
resource "aws_iam_role" "ec2_role" {
  name = "ec2_role"

  # Trust policy allowing EC2 to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}
```

I've attached a policy that allows basic EC2 actions, but perhaps it's too permissive or missing something crucial. Since I also need to connect to the instance for testing, I added this security group:

```hcl
resource "aws_security_group" "instance_sg" {
  name        = "allow_ssh"
  description = "Allow SSH inbound traffic"

  # SSH in from anywhere (testing only)
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

I've redeployed the infrastructure multiple times with different AMIs and get the same result every time. I'd appreciate insights or best practices for troubleshooting EC2 instance states in Terraform deployments, particularly around IAM roles and security groups. Any guidance would help steer me in the right direction. This is happening in both my development and production environments, on Ubuntu 20.04. Thanks, I really appreciate it!
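
Edit: adding the exact AWS CLI calls I've been using to inspect the instance state, in case that helps (the instance ID is the one from the apply output above):

```bash
# Show the instance state plus AWS's stated reason for the last transition
aws ec2 describe-instances \
  --instance-ids i-0abc123def456ghi \
  --query 'Reservations[].Instances[].[State.Name,StateReason.Message,StateTransitionReason]' \
  --output table

# Dump the serial console output to look for boot-time errors
aws ec2 get-console-output \
  --instance-id i-0abc123def456ghi \
  --output text
```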
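
Edit 2: a detail I left out above. The role is wired up through a policy attachment and an instance profile; it looks roughly like this. The resource names here are placeholders, and I'm using the AWS-managed `AmazonSSMManagedInstanceCore` ARN as a stand-in for the "basic EC2 actions" policy I mentioned:

```hcl
# Attach a managed policy to the role (stand-in ARN, see note above)
resource "aws_iam_role_policy_attachment" "ec2_basic" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# An EC2 instance can only use a role through an instance profile
resource "aws_iam_instance_profile" "ec2_profile" {
  name = "ec2_profile"
  role = aws_iam_role.ec2_role.name
}
```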
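
And for completeness, the security group and instance profile are referenced from the instance resource. I'm paraphrasing from memory, so treat this as a sketch rather than my exact file:

```hcl
resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t2.micro"
  key_name      = var.key_name

  # Wire in the SSH security group and the instance profile defined above
  vpc_security_group_ids = [aws_security_group.instance_sg.id]
  iam_instance_profile   = aws_iam_instance_profile.ec2_profile.name

  tags = {
    Name = "WebServer"
  }
}
```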