Scenario 1: Deploy S3, DynamoDB & Data Pipeline in the same region (Ireland)
After deployment, the pipeline worked well with no region specified for the DynamoDB tables or the EMR cluster. Because the data pipeline runs in the same region as S3 and DynamoDB, the default region settings apply, so the pipeline runs smoothly.

Scenario 2: Deploy S3 & DynamoDB in N. Virginia and Data Pipeline in Ireland
In a cross-region transfer, S3 does not cause a problem even if no region is specified for the EC2 instance that is created, because S3 bucket names are globally unique. For DynamoDB, however, the region must be specified on the table, and it must also be specified on the EMR cluster.

Scenario 3: Deploy S3 & DynamoDB in Frankfurt and Data Pipeline in Ireland
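Whenever the data lives in a different region from the pipeline, as in Scenario 2, the region fields can be set explicitly on the pipeline objects. The following is a minimal sketch rather than the exact definition used in this article. It builds the two objects as plain Python dictionaries that could be serialized into a pipeline definition. The `id` values, the table name, and the helper `region_aware_objects` are hypothetical, and the sketch assumes the EMR cluster is launched in the same region as the table. The `tableName` and `region` fields follow the `DynamoDBDataNode` and `EmrCluster` object types as I understand them, so check them against the Data Pipeline object reference before use.

```python
import json


def region_aware_objects(data_region, table_name):
    """Build pipeline objects that name the data's region explicitly.

    The ids below are illustrative placeholders, not values from the article.
    """
    dynamodb_node = {
        "id": "DDBSourceTable",
        "type": "DynamoDBDataNode",
        "tableName": table_name,
        # Without this, the pipeline's own region would be assumed.
        "region": data_region,
    }
    emr_cluster = {
        "id": "EmrClusterForTransfer",
        "type": "EmrCluster",
        # Assumption: the cluster runs next to the table it reads.
        "region": data_region,
    }
    return [dynamodb_node, emr_cluster]


# Example: data in N. Virginia (us-east-1), pipeline defined elsewhere.
objects = region_aware_objects("us-east-1", "my-example-table")
print(json.dumps({"objects": objects}, indent=2))
```

Passing the region as a parameter keeps the same definition usable for N. Virginia, Frankfurt, or any other region where the data is stored.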
After deploying S3 & DynamoDB to the Frankfurt region, there are a few things to consider. Because the region was launched after 2014, it does not support some older configurations. The following settings need attention if you are moving to one of these newer regions:
1. Instance type for EMR
2. AMI version
3. Reading logs & data from S3
As for the instance type, previous-generation instances are not supported in the region, so the instance type and versions need to be updated. Of the instance types that EMR supports in the region, this article uses m4.large. The AMI version needs to be 5.13.0 or later.
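The instance type and version requirements above can be sketched as a small helper that builds an `EmrCluster` object and rejects versions older than 5.13.0. This is an illustration under assumptions, not the article's actual definition. The helper names and the `id` value are hypothetical, and the sketch assumes that the "AMI version" mentioned above is supplied through the `releaseLabel` field in the form `emr-5.13.0`.

```python
import json

MIN_RELEASE = (5, 13, 0)  # minimum version quoted in this article


def release_tuple(label):
    """Turn 'emr-5.13.0' (or '5.13.0') into (5, 13, 0) for comparison."""
    version = label[4:] if label.startswith("emr-") else label
    return tuple(int(part) for part in version.split("."))


def emr_cluster(region, instance_type, release_label):
    """Build an EmrCluster object, refusing releases older than MIN_RELEASE."""
    if release_tuple(release_label) < MIN_RELEASE:
        raise ValueError(f"{release_label} is older than 5.13.0")
    return {
        "id": "EmrClusterForFrankfurt",  # hypothetical id
        "type": "EmrCluster",
        "region": region,
        "masterInstanceType": instance_type,
        "coreInstanceType": instance_type,
        "coreInstanceCount": "1",
        # Assumption: the version is passed as an EMR release label.
        "releaseLabel": release_label,
    }


cluster = emr_cluster("eu-central-1", "m4.large", "emr-5.13.0")
print(json.dumps(cluster, indent=2))
```

Comparing the version as a tuple of integers avoids the string-comparison trap where "5.9.0" would sort after "5.13.0".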
Reading the logs from the data pipeline does not work, due to an AWS Signature Version 4 error (AWS4-HMAC-SHA256). To avoid the issue, create another S3 bucket in a different region, such as Ireland, and direct the logs there. Furthermore, if the region is not defined on the EMR cluster, the HiveActivity fails with the same AWS4-HMAC-SHA256 error.

If you run into any errors, please drop a comment. Thank you for reading.