This was a strategy to transform Amazon's internal plumbing by requiring all products and services to be consumable via a standardized API. In other words, Amazon's internal systems needed to be transformed into a Service-Oriented Architecture (SOA). Once an SOA is implemented, information flows much more freely throughout an organization. A robust SOA (for both internal and third-party data sources) should be an engineering priority for teams that want to foster stable, secure internal application development.
Like it did at Amazon, implementing an SOA across your products and data sources will unlock the power of internal tooling. An SOA allows developers to seamlessly and consistently build dashboards, workflow automation, asynchronous tasks, and more.
There are myriad articles and YouTube videos that detail SOAs, so I won't spend too much time here. For the purposes of this article, it is helpful to think of SOA as a standardized set of APIs for internal and third-party data sources used within an organization. For example, with SOA, an internal database has an API for reading/writing data to a particular table. So does a file storage system such as S3, Azure File Storage, or Google Drive. Instead of writing code that directly interfaces with these systems, you interface with internal APIs that handle the credentials and connections to those services.
import os
import boto3

@route('/upload_file_to_s3', methods=['POST'])
def upload_file_to_s3(bucket, file_name, file_bytes):
    # Centralized permission check
    is_authorized = user_has_auth(request.get('api_key'))
    if not is_authorized:
        return 'Unauthorized', 401

    # Consumer doesn't need to worry about keys/credentialing;
    # the service reads them from its own environment.
    s3_client = boto3.client(
        's3',
        aws_access_key_id=os.environ.get('s3_access_key'),
        aws_secret_access_key=os.environ.get('s3_secret_access_key'),
        aws_session_token=os.environ.get('s3_session_token'))
    s3_client.upload_file(file_name, bucket, file_name)
    return 'OK', 200
# Using SOA in my code:
customer_file_stored = requests.post(f'{api_url}/upload_file_to_s3', data=data)

# No SOA in my codebase. This code, plus setup, config, and credentials,
# would need to be repeated in every separate codebase.
s3_client = boto3.client('s3')
customer_file_stored = s3_client.upload_file(file_name, bucket, file_name)
With services set up for all data sources (both internal and external), developers in your organization are empowered to rapidly access services when building internal tools. All a developer needs is their internal API key. With SOA, there is less overhead, there are fewer security risks (such as hardcoded passwords or credentials), and you get better observability and general ease for the developer, which saves them time.
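In practice, this often takes the shape of a thin shared client that wraps the internal API. The sketch below is illustrative, not a prescribed implementation: the class name, base URL, and `api_key` header are all hypothetical, and assume every service authenticates the same way.

```python
import requests

class InternalAPI:
    """Minimal sketch of a shared client for the internal service layer."""

    def __init__(self, api_key, base_url='https://internal-api.example.com'):
        self.api_key = api_key
        self.base_url = base_url.rstrip('/')

    def _headers(self):
        # The single internal API key is the only credential the developer needs
        return {'api_key': self.api_key}

    def post(self, route, **data):
        # Every service route is reached the same way; credentials for the
        # underlying systems (S3, Slack, etc.) live inside the services.
        return requests.post(f'{self.base_url}/{route}',
                             data=data, headers=self._headers())
```

A developer would then call `InternalAPI(my_key).post('upload_file_to_s3', ...)` without ever touching AWS credentials.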
For an engineering organization, the goal should be to set up services for all internal and third-party data sources with a single permission management layer.
# Example SOA - Each item here is a service with an API to interface.
Internal
- MySQL
- Internal Processes (kick off a script, async task, etc.)
- DevOps / Deployments
Third-Party Data Sources
- Data Warehouse (Snowflake, S3, etc.)
- CRM
- Slack
- Datadog / Monitoring
- Google Drive
For your services, you need a way to identify whether the client requesting data from a service route has the proper permissions. Ideally, this permission layer is a single layer of abstraction that applies to all of your services. A single API key per person lets them work with every service instead of needing separate credentials for each data source they want to interface with.
When an api_key/credential is passed, a validation step verifies that this user has permission to access the requested endpoint. First, the service checks that the API key is valid. If the API key is invalid, a single, shared set of error codes is returned, so error handling doesn't need to be coded at the endpoint level. For larger orgs, API keys should be auto-generated and synced with your active directory.
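That validation step can be sketched as a small shared function that every service calls before running endpoint logic. The in-memory key store below is a stand-in for illustration; a real deployment would back it with the active directory sync mentioned above, and the names are assumptions.

```python
# Hypothetical key store; in production this would be backed by
# auto-generated keys synced from the active directory.
API_KEYS = {
    'key-123': {'first_name': 'Steve',
                'user_groups': ['devops', 'backend'],
                'is_admin': False},
}

def user_has_auth(api_key):
    """Return the user's metadata if the key is valid, else None."""
    return API_KEYS.get(api_key)

def validate_request(headers):
    # Every service calls this once; endpoints never re-implement auth.
    user_metadata = user_has_auth(headers.get('api_key'))
    if user_metadata is None:
        # Single, shared error shape for invalid keys
        return None, ('Invalid API key', 401)
    return user_metadata, None
```

On success, the validated user metadata is handed to the endpoint, which is what makes the group-based logic below possible.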
user_a = {
    'first_name': 'Steve',
    'last_name': 'Wozniak',
    'user_groups': ['devops', 'backend'],
    'is_admin': False,
    ...
}
user_b = {
    'first_name': 'Bob',
    'last_name': 'Dylan',
    'user_groups': ['marketing'],
    'is_admin': False,
    ...
}
With this user metadata provided, you can build custom logic into your services. The most useful feature here is built-in user groups, which let routes grant or deny access based on a user's team. As an example, let's say we are writing a service for a customer DB table, and we want team members in marketing to be able to read data but not write it (read-only access).
# Write Endpoint
@route('/update_customer', methods=['POST'])
def update_customer(customer_id):
    # Passed by the permission validation layer
    user_metadata = request.get('user_metadata')
    if 'marketing' in user_metadata.get('user_groups', []):
        return 'Unauthorized', 401
    ...

# Read Endpoint
@route('/get_customer', methods=['GET'])
def get_customer(customer_id):
    return sql.query.get(customer_id).json()  # DB query logic
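Rather than repeating the group check inside every write endpoint, it can be factored into a reusable decorator. This is a minimal sketch under stated assumptions: the `deny_groups` name is hypothetical, and user metadata is passed to the handler by the validation layer rather than pulled from a framework request object.

```python
from functools import wraps

def deny_groups(*blocked):
    """Reject the request if the user belongs to any blocked group."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(user_metadata, *args, **kwargs):
            if set(blocked) & set(user_metadata.get('user_groups', [])):
                return 'Unauthorized', 401
            return fn(user_metadata, *args, **kwargs)
        return wrapper
    return decorator

# Marketing stays read-only on the customer table without any
# auth logic appearing in the endpoint body itself.
@deny_groups('marketing')
def update_customer(user_metadata, customer_id):
    # ... DB write logic ...
    return 'OK', 200
```

The same decorator can then guard every write route across your services, keeping the permission rules in one place.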
2. Observability