Jun 3, 2024

Achieving Scalability and Security: Alcion’s Pooled Multi-Tenancy Model

Implementing multi-tenancy is more than just a technical requirement—it is a fundamental aspect of our commitment to providing a secure, reliable, and efficient service to our customers and partners.

As a leading SaaS backup provider, we treat secure tenant isolation as a foundational principle: the security of our systems and our customers' data is paramount. When architecting the platform, our foremost design goal was robust access isolation that prevents any cross-tenant data leaks, a critical safeguard for customer trust and data integrity.

Beyond access isolation, our secondary goals focus on operational efficiency and functionality isolation. This includes per-tenant data replication and per-tenant metric collection, as well as performance isolation. These measures are essential in maintaining high service quality and reliability while safeguarding each tenant's resources and operations.

To achieve these goals, Alcion has implemented a pooled multi-tenancy model with a cell-based architecture. This architecture uses shared resources, such as database tables and S3 buckets for storage, to maximize efficiency and scalability. We also improve multi-tenant performance and security by leveraging DynamoDB's capabilities, and we apply rigorous security measures to ensure data isolation and privacy across multiple layers, including storage, databases, and compute resources.

Alcion and The Importance of Multi-Tenancy

The Alcion platform utilizes a pooled multi-tenancy model (read more here). In this architecture, all tenants share the same underlying resources, such as database tables and S3 buckets in each region. We use a cell-based architecture to organize and partition data within these shared resources. Each "cell" represents a subset of the overall data dedicated to a specific group of tenants. This allows tenants within the same cell to share physical database resources while keeping their data logically isolated from other cells. Read more on how Alcion uses cell-based architecture for scale and resilience.

Leveraging shared resources allows us to achieve unparalleled efficiency and scalability, but it also demands rigorous security measures to ensure data integrity and privacy.
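To make the cell idea concrete, here is a minimal sketch of how a tenant might be resolved to the shared resources of its cell. It is an illustration only, not Alcion's implementation: the cell names, regions, bucket names, table names, and tenant IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Shared resources for one group of tenants (all values hypothetical)."""
    name: str
    region: str
    bucket: str  # consolidated S3 bucket shared by the cell's tenants
    table: str   # DynamoDB table shared by the cell's tenants


# Hypothetical registries: the available cells and each tenant's placement.
CELLS = {
    "cell-a": Cell("cell-a", "us-west-2", "backups-cell-a", "metadata-cell-a"),
    "cell-b": Cell("cell-b", "eu-west-1", "backups-cell-b", "metadata-cell-b"),
}
TENANT_TO_CELL = {
    "tenant-001": "cell-a",
    "tenant-002": "cell-a",
    "tenant-003": "cell-b",
}


def cell_for(tenant_id: str) -> Cell:
    """Return the cell a tenant lives in; unknown tenants are rejected."""
    try:
        return CELLS[TENANT_TO_CELL[tenant_id]]
    except KeyError:
        raise LookupError(f"unknown tenant: {tenant_id}") from None
```

In this sketch, tenant-001 and tenant-002 share the physical resources of cell-a, while tenant-003 is served entirely from cell-b.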

Alcion and pooled multi-tenancy model, conceptual diagram

Security Principles of our Multi-Tenancy Model

  1. Access Isolation: Our primary focus is on airtight access isolation. By implementing stringent access control mechanisms, we ensure that tenants cannot access each other's data under any circumstances. This isolation is critical for complying with various regulatory standards and maintaining overall system integrity.
  2. Data Encryption: All data, whether in transit or at rest, is encrypted using industry-leading encryption standards. All stored backup data is client-side encrypted using AES-256 encryption with unique keys assigned to each customer. This encryption is layered on top of the object store-level (S3) encryption. Encryption keys are securely managed through our Key Management Service (KMS), ensuring comprehensive data protection.
  3. Per-Tenant Functionality Isolation: Each tenant operates within a logically isolated environment. This includes separate data replication mechanisms and customized metric collection for each tenant, ensuring that performance issues or breaches affecting one tenant cannot impact others.
  4. Operational Load Management: By optimizing operational load and resource allocation, Alcion can deliver consistent high performance. Our pooled model allows dynamic resource scaling while maintaining strict isolation policies to ensure that no single tenant can monopolize shared resources.
  5. Compliance and Auditing: Alcion is committed to meeting stringent regulatory requirements such as GDPR compliance and SOC 2 Type II certifications. Regular audits and compliance checks ensure adherence to these regulations and underscore our commitment to data security and privacy.

With these security principles in mind, let's dive deeper into some of the architectural decisions made in the storage, database, and compute layers.

Optimizing Multi-Tenant Backup Storage with Alcion

We've adopted a measured approach to storing tenant backups: one consolidated S3 bucket per cell, with each tenant's unique identifier used as the prefix for all of that tenant's objects in the bucket. We chose this over allocating one bucket per tenant, which we ruled out because the added operational load and bucket quota limits would have restricted our ability to scale effectively.
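A prefix layout of this kind could be sketched as follows. The key format, the tenant ID character set, and the function name are assumptions made for illustration; the article does not specify them.

```python
import re

# Assumption: tenant IDs are limited to letters, digits, and hyphens, so an ID
# can never contain "/" and spill into another tenant's prefix.
_TENANT_ID = re.compile(r"[A-Za-z0-9-]+")


def object_key(tenant_id: str, backup_path: str) -> str:
    """Build an S3 object key of the form '<tenant_id>/<backup_path>'."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    # Drop empty segments so leading or doubled slashes cannot alter the prefix.
    parts = [p for p in backup_path.split("/") if p]
    if not parts:
        raise ValueError("backup path must not be empty")
    return f"{tenant_id}/" + "/".join(parts)
```

Every object a tenant owns then sits under a single prefix, which is what makes prefix-scoped access policies and per-tenant lifecycle rules possible.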

Optimizing Multi-Tenant Backup Storage with Alcion conceptual diagram

Utilizing a prefix model not only reduces operational complexity but also preserves stringent tenant-level isolation and our ability to enable encryption, replication, and tiering at the tenant level.

Key Benefits of Multi-Tenant Backup Storage

Access Control: By leveraging the resource and condition key elements of AWS IAM policies, we can restrict access to specific prefixes within the bucket. This ensures that each tenant can only access their own data, maintaining robust security and privacy.
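As a rough sketch of what prefix-scoped access can look like, the function below builds an IAM policy document that limits object access to one tenant's prefix and limits listing with the `s3:prefix` condition key. The bucket name, tenant ID, statement IDs, and chosen actions are hypothetical; the policies Alcion actually uses are not shown in this article.

```python
import json


def s3_prefix_policy(bucket: str, tenant_id: str) -> dict:
    """IAM policy document restricting S3 access to '<tenant_id>/' in a bucket."""
    prefix = f"{tenant_id}/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object reads and writes only under the tenant's prefix.
                "Sid": "TenantObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Listing is a bucket-level action, so it is narrowed with the
                # s3:prefix condition key instead of the resource ARN.
                "Sid": "TenantListing",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}*"}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_prefix_policy("backups-cell-a", "tenant-001"), indent=2))
```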

Encryption: Each tenant's data is encrypted using unique keys, which enhances data security and ensures that backups remain secure and isolated.

Performance: Amazon S3 applies request rate limits per prefix (at least 3,500 write and 5,500 read requests per second per partitioned prefix), which gives us tenant-level performance isolation by design. This helps ensure that one tenant's operations do not impact the others.

In summary

Our approach of using a single bucket with tenant-specific prefixes balances operational efficiency with stringent security and performance isolation, making it a robust and scalable solution for multi-tenant backup storage.

Multi-Tenant Performance and Security with DynamoDB

At the database layer, our approach focuses on leveraging DynamoDB's native capabilities to optimize both performance and security for each tenant while maintaining operational efficiency. To achieve this, we employ a multi-tenant table strategy that balances tenant isolation, performance, and scalability.

Multi-Tenant Performance and Security with DynamoDB conceptual diagram

Key Capabilities of Multi-Tenant Performance and Security with DynamoDB

Data Partition Strategy: We employ a multi-tenant DynamoDB table strategy in which each item's partition key includes the tenant's unique identifier. This method significantly reduces the operational load compared to maintaining individual DynamoDB tables for each tenant.

Managing numerous tables in a large-scale environment can complicate operations and introduce latency; our consolidated partitions streamline these processes without compromising performance.
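One common way to embed a tenant identifier in a partition key is to use it as a leading segment. The sketch below assumes a `<tenant_id>#<entity>` format, a `#` separator, and attribute names `pk` and `sk`; all of these are illustrative choices, not details taken from Alcion's schema.

```python
SEPARATOR = "#"  # assumption: this character never appears inside a tenant ID


def partition_key(tenant_id: str, entity: str) -> str:
    """Compose a partition key whose leading segment is the tenant ID."""
    if not tenant_id or SEPARATOR in tenant_id:
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"{tenant_id}{SEPARATOR}{entity}"


def tenant_of(pk: str) -> str:
    """Recover the tenant ID from a partition key built by partition_key()."""
    return pk.split(SEPARATOR, 1)[0]


def make_item(tenant_id: str, entity: str, sort_key: str, **attrs) -> dict:
    """Build an item for a shared table; 'pk' and 'sk' are hypothetical names."""
    return {"pk": partition_key(tenant_id, entity), "sk": sort_key, **attrs}
```

For example, `make_item("tenant-001", "backup", "2024-06-03", size=42)` yields an item whose `pk` is `tenant-001#backup`, so items from different tenants never share a partition key value even though they share a table.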

Access Control and Security: By using AWS Identity and Access Management (IAM) policies that specify partition keys, we can restrict access to DynamoDB table partitions. This ensures that tenants have access only to their data, thereby maintaining robust data isolation.
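In IAM, this kind of restriction is typically expressed with the `dynamodb:LeadingKeys` condition key, which constrains the partition key values a request may touch. The sketch below is a hypothetical policy builder, assuming partition keys that start with `<tenant_id>#`; the table ARN, account ID, and action list are placeholders.

```python
import json


def dynamodb_tenant_policy(table_arn: str, tenant_id: str) -> dict:
    """IAM policy document limiting item access to one tenant's partition keys."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Scan is deliberately left out: it reads across partition
                # keys and is not a good fit for per-tenant scoping.
                "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
                "Resource": table_arn,
                "Condition": {
                    # Every partition key in the request must match the
                    # tenant's prefix (assumed key format: '<tenant_id>#...').
                    "ForAllValues:StringLike": {
                        "dynamodb:LeadingKeys": [f"{tenant_id}#*"]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    arn = "arn:aws:dynamodb:us-west-2:123456789012:table/metadata-cell-a"
    print(json.dumps(dynamodb_tenant_policy(arn, "tenant-001"), indent=2))
```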

Performance Optimization: DynamoDB imposes its throughput limits per partition, so each tenant inherently benefits from performance isolation. Through adaptive capacity, unused read/write capacity from under-utilized partitions can be reallocated to hot partitions, and DynamoDB automatically splits hot partitions to redistribute their data. This allows us to maintain optimal performance even during peak usage and backup periods without compromising on tenant isolation.

Future Considerations: We are closely monitoring developments in DynamoDB and its capabilities for security and performance such as the adoption of different encryption techniques when they are available. Our goal is to continuously refine our architecture to strike the perfect balance between performance, operational efficiency, and stringent security for all our tenants.

In summary

By strategically using multi-tenant DynamoDB tables, employing strict IAM policies, and leveraging built-in performance management features, Alcion is setting a benchmark for secure and efficient multi-tenant architectures in the cloud computing landscape.

Enforcing Access Control from the Compute Layer

Ensuring secure access to services like DynamoDB and S3 necessitates the use of scoped IAM policies. Given the dynamic nature of our compute layer—including AWS Lambda functions and Fargate tasks—we must dynamically scope permissions before accessing any data or metadata from our storage layer.

Enforcing Access Control from the Compute Layer, conceptual diagram

Key Design Aspects

Dynamic Scoping via IAM Policies: To effectively manage roles and policies in our multi-tenant architecture, we leverage Attribute-Based Access Control (ABAC). We implement placeholders for attributes such as tenantID in roles and supply the values as key-value pairs upon role assumption. Although this reduces implementation complexity, it offers less flexibility than a fully modular dynamic role service.

We consider ABAC a robust choice that balances implementation effort against risk.
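A minimal sketch of this pattern, assuming AWS session tags are the mechanism for the key-value pairs: the role's policy refers to the tag through the `${aws:PrincipalTag/tenantID}` policy variable, and the tag value is supplied when the role is assumed. The role ARN, bucket, table, and session name below are hypothetical, and the code only builds the request parameters rather than calling AWS.

```python
import json

# Placeholder resolved by IAM at evaluation time from the session tag.
TENANT_VAR = "${aws:PrincipalTag/tenantID}"

# Hypothetical permissions policy attached to a single shared tenant-access role.
ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::backups-cell-a/" + TENANT_VAR + "/*",
        },
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-west-2:123456789012:table/metadata-cell-a",
            "Condition": {
                "ForAllValues:StringLike": {
                    "dynamodb:LeadingKeys": [TENANT_VAR + "#*"]
                }
            },
        },
    ],
}


def assume_role_params(role_arn: str, tenant_id: str) -> dict:
    """Keyword arguments for STS AssumeRole that tag the session with the tenant."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": f"tenant-{tenant_id}",
        "Tags": [{"Key": "tenantID", "Value": tenant_id}],
    }


if __name__ == "__main__":
    params = assume_role_params(
        "arn:aws:iam::123456789012:role/tenant-access", "tenant-001")
    # With boto3 installed: boto3.client("sts").assume_role(**params)
    print(json.dumps(params, indent=2))
```

For session tags to be accepted, the role's trust policy would also need to allow `sts:TagSession` for the calling principal. The appeal of this design is that one role and one policy serve every tenant; the trade-off, as noted above, is that permissions can only vary along the attributes the policy already references.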

Establishing Default Roles: Determining default roles for Lambda functions and Fargate tasks is crucial. Contrary to the common practice of granting broad default roles that are then scoped down per tenant, we advocate the reverse approach: Default Zero Access.

We assign no default access and scope up roles upon invocation to provide only the necessary permissions for the task. This precaution minimizes potential vulnerabilities from broadly defined roles.
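One way to picture the "scope up" step is a lookup from task type to the minimal set of actions it needs, where anything not listed gets nothing. The sketch below turns that lookup into a session policy document that could be passed when assuming a tenant-scoped role. The task names and action lists are hypothetical, and the use of session policies here is an assumption about mechanism, not a description of Alcion's code.

```python
import json

# Hypothetical mapping of task types to the only actions each one needs.
TASK_ACTIONS = {
    "backup": ["s3:PutObject", "dynamodb:PutItem"],
    "restore": ["s3:GetObject", "dynamodb:Query"],
}


def session_policy_for(task: str) -> str:
    """Return a JSON session policy for a known task; unknown tasks get no access."""
    actions = TASK_ACTIONS.get(task)
    if not actions:
        # Default Zero Access: no entry means no permissions are granted.
        raise PermissionError(f"no permissions defined for task {task!r}")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            # Resource is "*" because a session policy can only narrow what
            # the assumed role already allows; the role's own policy is
            # expected to hold the tenant-level resource restrictions.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }
    return json.dumps(policy)
```

The resulting string could be supplied as the `Policy` parameter of an STS AssumeRole call, so that a restore task, for example, ends up with read-only permissions even though the role it assumes permits more.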

In summary

Adopting Attribute-Based Access Control (ABAC) not only simplifies implementation but also fortifies our security posture across the dynamically scoped compute layer. Prioritizing these design aspects has strengthened our safeguards as we have scaled our cloud infrastructure.

Wrapping up

By focusing on these key areas, we have built a multi-tenancy architecture that not only meets the highest standards of security and compliance but also delivers exceptional value and service to our clients. We are continually evolving our strategies to adapt to new technologies and regulatory changes, reaffirming our dedication to tenant isolation and overall system integrity.

Author: Ben Young, Technology Evangelist

Ben Young is a Technology Evangelist at Alcion with over ten years of experience in the Managed Service Provider (MSP) and Cloud Service Provider (CSP) markets. He's an expert in using APIs to automate complex tasks and integrate different technologies. His skills are recognized internationally, and he shares his knowledge through writing and speaking engagements. His passion is showcasing the art of the possible and being a product champion.