Guide to connect Snowflake & AWS S3
Here is a step-by-step guide for beginners to connect Snowflake with an AWS S3 bucket using a storage integration. The Snowflake documentation is not yet updated with the new AWS console screens, so if you got confused there, you are in the right place.
1) Log in to your AWS account
2) Load the files in AWS S3 Bucket
Navigate to the S3 service
Create a bucket with a unique name
Note: AWS will not allow the creation if the bucket name already exists. You can leave the rest of the options at their defaults.
Scroll to the bottom of the page and click Create bucket.
Click on the bucket and create a folder with any name
Upload the files you want to load into Snowflake into that folder
Once the upload completes successfully, you will see the files in the bucket folder
Make a note of the S3 URL; here it is s3://snowflake-datademo/load/
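This S3 URL encodes the two values you will need in the next step: the bucket name and the folder prefix. A minimal Python sketch of splitting them out (the helper name is mine, purely for illustration):

```python
from urllib.parse import urlparse

def split_s3_url(url):
    """Split an s3:// URL into (bucket, prefix)."""
    parsed = urlparse(url)  # scheme='s3', netloc=bucket, path='/prefix/'
    return parsed.netloc, parsed.path.strip("/")

bucket, prefix = split_s3_url("s3://snowflake-datademo/load/")
print(bucket, prefix)  # snowflake-datademo load
```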
3) Create IAM policy to give access permissions to the S3 Bucket
Navigate to IAM service
Click on Policies, then Create policy, switch to the JSON tab, and paste the policy below:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::<bucket>",
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "<prefix>/*"
          ]
        }
      }
    }
  ]
}
Replace <bucket> with the bucket name created in step 2 (snowflake-datademo) and <prefix> with the folder created inside it (load).
On the review screen, name the policy snowflake_access and create it.
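If you prefer generating the policy over hand-editing the placeholders, here is a small Python sketch that renders the same JSON for a given bucket and prefix (the function name is mine; the policy body mirrors the one above):

```python
import json

def snowflake_s3_policy(bucket, prefix):
    """Render the step-3 IAM policy for a given bucket and folder prefix."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level permissions, scoped to the folder
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion",
                ],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Bucket-level permissions, limited to listing the folder
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }
    return json.dumps(policy, indent=2)

print(snowflake_s3_policy("snowflake-datademo", "load"))
```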
4) Create IAM role to grant privileges on the S3 bucket
Click on Roles and create a role.
Select the trusted entity type as AWS account.
Choose the existing AWS account, check Require external ID, and enter a placeholder value such as 0000.
Note: these are temporary values, which we will replace in a later step.
Next, add permissions by selecting the snowflake_access policy created in step 3.
Name the role mysnowflakerole and create it.
Click on the role and copy the Role ARN. Here it is arn:aws:iam::479359098877:role/mysnowflakerole
5) Create storage integration in Snowflake
Create a storage integration named s3_int in Snowflake using the S3 bucket created in step 2 and the role ARN copied in step 4:
create storage integration s3_int
type = external_stage
storage_provider = 'S3'
enabled = true
storage_aws_role_arn = 'arn:aws:iam::479359098877:role/mysnowflakerole'
storage_allowed_locations = ('s3://snowflake-datademo/load/')
;
Describe the integration created (desc integration s3_int;) and make a note of the below 2 properties -
STORAGE_AWS_IAM_USER_ARN: arn:aws:iam::687054681301:user/0qa40000-s
STORAGE_AWS_EXTERNAL_ID: EO73983_SFCRole=2_Eft8RJSyj/ECr0axXKWgJJnFxK4=
6) Update the AWS role with the Snowflake user details
Back in AWS, open the role created in step 4, edit its trust policy with the 2 properties noted in step 5, and update the policy.
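The updated trust policy should look like the sketch below: STORAGE_AWS_IAM_USER_ARN goes into the Principal and STORAGE_AWS_EXTERNAL_ID into the sts:ExternalId condition. The ARN and external ID shown are the example values noted in step 5; substitute the ones from your own desc integration output.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::687054681301:user/0qa40000-s"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "EO73983_SFCRole=2_Eft8RJSyj/ECr0axXKWgJJnFxK4="
        }
      }
    }
  ]
}
```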
7) Create stage in Snowflake
Create an external stage in Snowflake using the storage integration created in step 5 and the S3 bucket URL from step 2:
create stage my_s3_stage
storage_integration = s3_int
url = 's3://snowflake-datademo/load/'
;
List the stage (list @my_s3_stage;) to confirm the files are visible, which completes the integration between Snowflake & AWS.
8) Copy files from S3 to Snowflake Table
8a) Using the stage:
copy into mycsvtable
from @my_s3_stage
file_format = (type = csv skip_header = 1 field_delimiter = '|');
8b) Using the url:
copy into mycsvtable
from 's3://snowflake-datademo/load/'
storage_integration = s3_int
file_format = (type = csv skip_header = 1 field_delimiter = '|');
References: https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration.html
Follow and clap if you like the content, and feel free to ask any questions in the comments. I will be more than happy to assist and guide you.