AWS Redshift Terraform Module

terraform-aws-arc-redshift

Overview

The ARC terraform-aws-arc-redshift module provides a unified solution for deploying Amazon Redshift data warehousing infrastructure on AWS. It supports both traditional Amazon Redshift provisioned clusters and the newer Redshift Serverless workgroups, letting you choose the deployment model that best fits your workload requirements and cost-optimization needs.

Prerequisites

Before using this module, ensure you have the following:

  • AWS credentials configured.
  • Terraform installed.
  • A working knowledge of Terraform.

Getting Started

  1. Define the Module

Start by defining a Terraform module: a distinct directory containing Terraform configuration files. Within this directory, declare input variables in variables.tf and output values in outputs.tf. The following illustrates an example directory structure:

redshift/
|-- main.tf
|-- variables.tf
|-- outputs.tf
  2. Define Input Variables

Declare the variables the module requires in variables.tf, and supply their values in a *.tfvars file.
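A minimal variables.tf sketch covering the inputs used in the module call below (descriptions mirror the Inputs table; the defaults shown are illustrative assumptions):

```hcl
variable "namespace" {
  description = "Namespace of the project, i.e. arc"
  type        = string
}

variable "environment" {
  description = "Name of the environment, i.e. dev, stage, prod"
  type        = string
}

variable "name" {
  description = "Name for the Redshift resources"
  type        = string
}

variable "database_name" {
  description = "The name of the database to create"
  type        = string
}

variable "master_username" {
  description = "Username for the master DB user"
  type        = string
}

variable "node_type" {
  description = "The node type to be provisioned for the cluster"
  type        = string
  default     = "dc2.large"
}

variable "node_count" {
  description = "Number of nodes in the cluster"
  type        = number
  default     = 1
}
```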

  3. Use the Module in Your Main Configuration

In your main Terraform configuration file (e.g., main.tf), call the module, specifying its source and version. For example:

module "redshift" {
  source                 = "sourcefuse/arc-redshift/aws"
  version                = "0.0.1"

  namespace   = var.namespace
  environment = var.environment
  name        = var.name

  # Network configuration - using the subnets we created
  vpc_id     = data.aws_vpc.vpc.id
  subnet_ids = data.aws_subnets.private.ids

  # Cluster configuration
  database_name        = var.database_name
  master_username      = var.master_username
  manage_user_password = var.manage_user_password
  security_group_data  = var.security_group_data
  security_group_name  = var.security_group_name
  node_type            = var.node_type
  number_of_nodes      = var.node_count
  cluster_type         = var.node_count > 1 ? "multi-node" : "single-node"

  # Other configuration
  skip_final_snapshot = true
  publicly_accessible = false
  encrypted           = true

  tags = module.tags.tags
}
  4. Output Values

Inside the outputs.tf file of the module, you can define output values that can be referenced in the main configuration. For example:

output "redshift_cluster_id" {
  description = "The ID of the Redshift cluster"
  value       = module.redshift.redshift_cluster_id
}

output "redshift_cluster_endpoint" {
  description = "The connection endpoint for the Redshift cluster"
  value       = module.redshift.redshift_cluster_endpoint
}

output "redshift_endpoint" {
  description = "The endpoint of the Redshift deployment (either cluster or serverless)"
  value       = module.redshift.redshift_endpoint
}
  5. .tfvars

In a .tfvars file, provide values for the module's input variables; these are passed in with -var-file when planning and applying.
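A dev.tfvars sketch matching the variables above (all values are illustrative placeholders, not required names):

```hcl
# dev.tfvars -- illustrative values only
namespace   = "arc"
environment = "dev"
name        = "analytics"

database_name        = "analytics"
master_username      = "admin"
manage_user_password = true

node_type  = "dc2.large"
node_count = 1
```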

First Time Usage

Uncomment the backend block in main.tf, then initialize Terraform with your backend configuration:

terraform init -backend-config=config.dev.hcl

If testing locally, a plain terraform init is fine.
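The file passed to -backend-config supplies partial backend settings at init time. A sketch of config.dev.hcl, assuming an S3 backend (bucket, key, and region are placeholders):

```hcl
# config.dev.hcl -- example partial S3 backend configuration (placeholder values)
bucket = "my-terraform-state-bucket"
key    = "redshift/dev/terraform.tfstate"
region = "us-east-1"
```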

Create a dev workspace

terraform workspace new dev

Plan Terraform

terraform plan -var-file dev.tfvars

Apply Terraform

terraform apply -var-file dev.tfvars

Production Setup

terraform init -backend-config=config.prod.hcl

Create a prod workspace

terraform workspace new prod

Plan Terraform

terraform plan -var-file prod.tfvars

Apply Terraform

terraform apply -var-file prod.tfvars  

Requirements

Name Version
terraform >= 1.5.0
aws ~> 5.0
random ~> 3.1

Providers

Name Version
aws 4.67.0

Modules

Name Source Version
redshift_cluster ./modules/redshift-cluster n/a
redshift_serverless ./modules/redshift-serverless n/a

Resources

Name Type
aws_caller_identity.current data source

Inputs

Name Description Type Default Required
additional_security_group_ids Additional security group IDs to be added to the Redshift Serverless workgroup. list(string) [] no
admin_password Admin password for the Redshift Serverless namespace string null no
admin_username Admin username for the Redshift Serverless namespace. string "admin" no
allow_version_upgrade If true, major version upgrades can be applied during maintenance windows bool true no
automated_snapshot_retention_period The number of days that automated snapshots are retained number 7 no
base_capacity The base data warehouse capacity in Redshift Processing Units (RPUs) number 32 no
cluster_identifier The Cluster Identifier string null no
cluster_parameter_group_name The name of the parameter group to be associated with this cluster string null no
cluster_subnet_group_name The name of a cluster subnet group to be associated with this cluster. If not specified, a new subnet group will be created string null no
cluster_type The cluster type to use. Either 'single-node' or 'multi-node' string "single-node" no
config_parameters A list of configuration parameters to apply to the Redshift Serverless namespace.
list(object({
parameter_key = string
parameter_value = string
}))
[] no
create_random_password Determines whether to create random password for cluster master_password bool true no
create_security_groups Whether to create security groups for Redshift Serverless resources bool true no
database_name The name of the database to create string n/a yes
egress_rules A list of egress rules for the security group.
list(object({
from_port = number
to_port = number
protocol = string
cidr_blocks = list(string)
}))
[] no
enable_serverless Enable Redshift Serverless. If true, creates the serverless module; if false, creates the standard cluster module. bool false no
encrypted If true, the data in the cluster is encrypted at rest bool true no
enhanced_vpc_routing If true, enhanced VPC routing is enabled bool false no
environment Name of the environment, i.e. dev, stage, prod string n/a yes
final_snapshot_identifier The identifier of the final snapshot that is to be created immediately before deleting the cluster string null no
ingress_rules A list of ingress rules for the security group.
list(object({
from_port = number
to_port = number
protocol = string
cidr_blocks = list(string)
}))
[] no
kms_key_id The ARN for the KMS encryption key string null no
manage_admin_password If true, Redshift will manage the admin password bool false no
manage_user_password Set to true to allow Redshift to manage the master user password in Secrets Manager bool null no
master_password Password for the master DB user. If null, a random password will be generated string null no
master_username Username for the master DB user string n/a yes
max_capacity The maximum data warehouse capacity in Redshift Processing Units (RPUs) number 512 no
name Name for the Redshift resources string n/a yes
namespace Namespace of the project, i.e. arc string n/a yes
namespace_name The name of the Redshift Serverless namespace string null no
node_type The node type to be provisioned for the cluster string "dc2.large" no
number_of_nodes Number of nodes in the cluster number 1 no
port The port number on which the cluster accepts incoming connections number 5439 no
publicly_accessible If true, the cluster can be accessed from a public network bool false no
security_group_data (optional) Security Group data
object({
security_group_ids_to_attach = optional(list(string), [])
create = optional(bool, true)
description = optional(string, null)
ingress_rules = optional(list(object({
description = optional(string, null)
cidr_block = optional(string, null)
source_security_group_id = optional(string, null)
from_port = number
ip_protocol = string
to_port = string
self = optional(bool, false)
})), [])
egress_rules = optional(list(object({
description = optional(string, null)
cidr_block = optional(string, null)
destination_security_group_id = optional(string, null)
from_port = number
ip_protocol = string
to_port = string
prefix_list_id = optional(string, null)
})), [])
})
{
"create": false
}
no
security_group_name Redshift Serverless resources' security group name string "Redshift-Serverless-sg" no
skip_final_snapshot Determines whether a final snapshot of the cluster is created before Redshift deletes it bool false no
snapshot_identifier The name of the snapshot from which to create the new cluster string null no
subnet_ids List of subnet IDs for the Redshift subnet group list(string) [] no
tags Tags to apply to resources map(string) {} no
track_name Optional track name for Redshift Serverless (used for versioning or preview tracks). string null no
vpc_id ID of the VPC for Redshift string null no
vpc_security_group_ids A list of Virtual Private Cloud (VPC) security groups to be associated with the cluster list(string) [] no
workgroup_name The name of the Redshift Serverless workgroup string null no
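Setting enable_serverless = true switches the module from a provisioned cluster to a Redshift Serverless namespace and workgroup. A hedged sketch using the inputs above (all values are illustrative; capacity is in Redshift Processing Units):

```hcl
module "redshift_serverless" {
  source  = "sourcefuse/arc-redshift/aws"
  version = "0.0.1"

  namespace   = "arc"
  environment = "dev"
  name        = "analytics"

  # Switch to the serverless deployment model
  enable_serverless = true
  namespace_name    = "arc-dev-analytics"
  workgroup_name    = "arc-dev-analytics"

  # Capacity in Redshift Processing Units (RPUs)
  base_capacity = 32
  max_capacity  = 512

  admin_username        = "admin"
  manage_admin_password = true

  database_name   = "analytics"
  master_username = "admin"

  vpc_id     = var.vpc_id
  subnet_ids = var.subnet_ids

  tags = var.tags
}
```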

Outputs

Name Description
redshift_cluster_arn The ARN of the Redshift cluster
redshift_cluster_database_name The name of the default database in the Redshift cluster
redshift_cluster_endpoint The connection endpoint for the Redshift cluster
redshift_cluster_hostname The hostname of the Redshift cluster
redshift_cluster_id The ID of the Redshift cluster
redshift_cluster_namespace_arn The namespace ARN of the Redshift cluster
redshift_cluster_port The port of the Redshift cluster
redshift_cluster_security_group_id The ID of the security group associated with the Redshift cluster
redshift_database_name The name of the database in the Redshift deployment
redshift_endpoint The endpoint of the Redshift deployment (either cluster or serverless)
redshift_serverless_endpoint The endpoint URL for the Redshift Serverless workgroup
redshift_serverless_namespace_arn The ARN of the Redshift Serverless namespace
redshift_serverless_namespace_id The ID of the Redshift Serverless namespace
redshift_serverless_workgroup_arn The ARN of the Redshift Serverless workgroup
redshift_serverless_workgroup_id The ID of the Redshift Serverless workgroup
redshift_subnet_group_id The ID of the Redshift subnet group

Versioning

This project uses a .version file at the root of the repo; the pipeline reads it and creates a git tag from it.

When you intend to commit to main, increment this version. Once the change is merged, the pipeline kicks off and tags the latest git commit.

Development

Prerequisites

Configurations

  • Configure pre-commit hooks
    pre-commit install
    

Versioning

When contributing, specify the kind of change in your commit message: major, minor, or patch.

For Example

git commit -m "your commit message #major"

By specifying this tag, the pipeline bumps the version accordingly; if you omit it, the change is treated as a patch by default.

Tests

  • Tests are available in test directory
  • Configure the dependencies
    cd test/
    go mod init github.com/sourcefuse/terraform-aws-refarch-<module_name>
    go get github.com/gruntwork-io/terratest/modules/terraform
    
  • Now execute the test
    go test -timeout  30m
    

Authors

This project is authored by:

  • SourceFuse ARC Team