Platform Concepts
This page explains key concepts and terminology used throughout the MAR platform.
Core Concepts
Cluster
A cluster is a logical unit of compute resources that hosts application instances. In Kubernetes terms, a cluster corresponds to a namespace.
- Each cluster has its own resource quota
- Clusters can be in different regions
- Clusters can have different statuses (ready, scaling, degraded, failed)
Instance
An instance is a single deployable unit of computation. In Kubernetes terms, an instance corresponds to a pod.
- Instances run within a cluster
- Instances have resource limits (CPU, memory)
- Instances can be started, stopped, or restarted
- Instances have health checks
Config Profile
A configuration profile defines resource limits, environment variables, and security settings that can be applied to clusters or instances.
- Profiles are versioned
- Profiles can target clusters, instances, or both
- Applying a profile updates all affected resources
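The targeting behavior described above can be sketched in Python. This is an illustrative model only: `ConfigProfile`, `Resource`, and `apply_profile` are hypothetical names rather than platform APIs, and tracking the applied version per resource is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigProfile:
    name: str
    version: int
    targets: frozenset[str]  # any of "cluster", "instance"


@dataclass
class Resource:
    kind: str  # "cluster" or "instance"
    name: str
    # Hypothetical bookkeeping: profile name -> version last applied.
    applied: dict[str, int] = field(default_factory=dict)


def apply_profile(profile: ConfigProfile, resources: list[Resource]) -> list[Resource]:
    """Apply the profile to every resource whose kind it targets."""
    affected = [r for r in resources if r.kind in profile.targets]
    for resource in affected:
        resource.applied[profile.name] = profile.version
    return affected
```

A profile that targets only instances leaves clusters untouched, while a profile that targets both updates every resource passed in.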
Alert Rule
An alert rule defines conditions on metrics, logs, or events that trigger notifications when they are met, for example when a metric exceeds a threshold.
- Rules have severity levels (critical, high, medium, low)
- Rules can be paused or active
- Rules can target multiple notification channels
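As a rough illustration of how these properties interact, the sketch below evaluates a single metric value against a rule. The `AlertRule` fields, the "greater than threshold" comparison, and the `evaluate` helper are assumptions for the example, not the platform's actual rule schema.

```python
from dataclasses import dataclass, field

# Severity levels as listed above.
SEVERITIES = ("critical", "high", "medium", "low")


@dataclass
class AlertRule:
    name: str
    metric: str
    threshold: float
    severity: str
    channels: list[str] = field(default_factory=list)
    paused: bool = False

    def __post_init__(self) -> None:
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


def evaluate(rule: AlertRule, value: float) -> list[str]:
    """Return the channels to notify; empty if the rule is paused or not exceeded."""
    if rule.paused or value <= rule.threshold:
        return []
    return list(rule.channels)
```

A paused rule never notifies, and an active rule notifies every channel it targets once the value exceeds the threshold.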
Webhook
A webhook is an HTTP endpoint that receives event notifications from the platform.
- Webhooks subscribe to specific event types
- Webhooks have signing secrets for verification
- Webhook delivery is tracked for success/failure
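A receiver can use the signing secret to check that a delivery really came from the platform. The sketch below assumes an HMAC-SHA256 signature over the raw request body, hex-encoded; the actual algorithm, encoding, and header name are not specified on this page, and both function names are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the body (assumed scheme, not confirmed)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare it with the one that was received."""
    expected = sign_payload(secret, body)
    # compare_digest runs in constant time, which avoids leaking timing information.
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

Verification should run against the raw request bytes, before any JSON parsing, because re-serializing the payload can change the bytes and invalidate the signature.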
Access Control Concepts
User
A user is an individual who can access the platform.
- Users have roles assigned
- Users can be active, inactive, or pending
Role
A role defines a set of permissions.
- Roles can be organization-scoped or project-scoped
- Roles contain multiple permissions
- Built-in roles are provided by the platform
Permission
A permission grants the ability to perform specific actions.
- Permissions follow the format: resource:action
- Examples: clusters:read, instances:delete
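The resource:action format is simple to parse and validate. A minimal sketch, assuming permissions may be supplied as a comma-separated string; `Permission`, `parse_permission`, and `parse_permissions` are hypothetical helpers, not part of the platform.

```python
from typing import NamedTuple


class Permission(NamedTuple):
    resource: str
    action: str


def parse_permission(text: str) -> Permission:
    """Parse a single 'resource:action' string."""
    resource, sep, action = text.strip().partition(":")
    if not sep or not resource or not action or ":" in action:
        raise ValueError(f"expected 'resource:action', got {text!r}")
    return Permission(resource, action)


def parse_permissions(text: str) -> list[Permission]:
    """Parse a comma-separated list such as 'clusters:read,instances:delete'."""
    return [parse_permission(part) for part in text.split(",") if part.strip()]
```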
Role Binding
A role binding assigns a role to a user at a specific scope.
- Bindings connect users to roles
- Bindings specify the scope (org or project)
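The relationship between users, roles, permissions, and bindings can be sketched as a small authorization check. The class and function names are hypothetical, and the rule that an org-scoped binding applies in every project is an assumption made for this example, not documented platform behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    name: str
    permissions: frozenset[str]  # entries in resource:action form


@dataclass(frozen=True)
class RoleBinding:
    user: str
    role: Role
    scope: str  # "org" or "project"
    project: Optional[str] = None  # set only for project-scoped bindings


def is_allowed(bindings: list[RoleBinding], user: str, permission: str, project: str) -> bool:
    """Return True if any of the user's bindings grants the permission in the project."""
    for binding in bindings:
        if binding.user != user or permission not in binding.role.permissions:
            continue
        # Assumption: an org-scoped binding applies in every project.
        if binding.scope == "org":
            return True
        if binding.scope == "project" and binding.project == project:
            return True
    return False
```

Under this model, a user can hold a narrow role across the organization and a broader role in a single project, which fits a least-privilege setup.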
Observability Concepts
Metric
A metric is a quantitative measurement of system behavior. Examples include:
- CPU utilization
- Memory usage
- Network throughput
- Error rate
Log
A log is a timestamped text entry describing an event. Log levels include:
- Info level - Normal operations
- Warning level - Potential issues
- Error level - Problems
Event
An event is a notable occurrence in the system. Examples include:
- Instance started/stopped
- Cluster scaled
- Configuration applied
Incident
An incident is a triggered alert that requires attention.
- Incidents have a severity level
- Incidents follow a lifecycle (triggered → acknowledged → resolved)
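The incident lifecycle can be modeled as a small state machine. This sketch assumes that only the forward steps shown above are allowed, with no skipping or reopening; the platform may permit other transitions. `IncidentState` and `advance` are hypothetical names.

```python
from enum import Enum


class IncidentState(Enum):
    TRIGGERED = "triggered"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Assumption: each state has at most one successor, in lifecycle order.
_NEXT_STATE = {
    IncidentState.TRIGGERED: IncidentState.ACKNOWLEDGED,
    IncidentState.ACKNOWLEDGED: IncidentState.RESOLVED,
}


def advance(state: IncidentState) -> IncidentState:
    """Move an incident to the next lifecycle state."""
    if state not in _NEXT_STATE:
        raise ValueError(f"{state.value} is a terminal state")
    return _NEXT_STATE[state]
```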
Operation Concepts
Operation
An operation is an action performed on platform resources.
- Operations are tracked in a queue
- Operations have statuses (pending, running, succeeded, failed)
- Operations can be triggered by users or the system
Request ID
A request ID uniquely identifies an operation for tracking and debugging.
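To show how statuses and request IDs fit together, the sketch below models an operation that receives an ID when it is created and then moves through its statuses. The UUID format for the request ID, the allowed transitions, and the `Operation` class itself are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class OperationStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Assumed transitions: pending -> running -> succeeded or failed.
ALLOWED_TRANSITIONS = {
    OperationStatus.PENDING: {OperationStatus.RUNNING},
    OperationStatus.RUNNING: {OperationStatus.SUCCEEDED, OperationStatus.FAILED},
    OperationStatus.SUCCEEDED: set(),
    OperationStatus.FAILED: set(),
}


@dataclass
class Operation:
    action: str
    triggered_by: str  # "user" or "system"
    status: OperationStatus = OperationStatus.PENDING
    # Hypothetical ID format; the platform's request IDs may look different.
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def transition(self, new_status: OperationStatus) -> None:
        """Move to a new status, rejecting transitions the model does not allow."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status.value} to {new_status.value}")
        self.status = new_status
```

Including the request ID in log lines and support tickets lets one operation be traced from the queue to its final status.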
Best Practices
Cluster Management
- Name clusters descriptively (e.g., prod-us-east, staging-us)
- Monitor cluster health regularly
- Apply appropriate config profiles
Instance Management
- Monitor restart counts
- Set appropriate resource limits
- Use health checks
Access Control
- Follow the principle of least privilege
- Regularly review user access
- Use organization roles sparingly
Monitoring
- Set up alerts for critical metrics
- Review metrics regularly
- Establish baseline performance
Configuration
- Keep config profiles under version control
- Test changes in non-production first
- Document the purpose of each profile