Platform Concepts

This page explains key concepts and terminology used throughout the MAR platform.

Core Concepts

Cluster

A cluster is a logical unit of compute resources that hosts application instances. In Kubernetes terms, a cluster corresponds to a namespace.

  • Each cluster has its own resource quota
  • Clusters can be in different regions
  • Clusters can have different statuses (ready, scaling, degraded, failed)

Instance

An instance is a single deployable unit of computation. In Kubernetes terms, an instance corresponds to a pod.

  • Instances run within a cluster
  • Instances have resource limits (CPU, memory)
  • Instances can be started, stopped, or restarted
  • Instances have health checks

Config Profile

A configuration profile defines resource limits, environment variables, and security settings that can be applied to clusters or instances.

  • Profiles are versioned
  • Profiles can target clusters, instances, or both
  • Applying a profile updates all affected resources

Alert Rule

An alert rule defines conditions that trigger notifications when metrics, logs, or events exceed thresholds.

  • Rules have severity levels (critical, high, medium, low)
  • Rules can be paused or active
  • Rules can target multiple notification channels
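
The properties above can be modeled roughly as follows. This is a minimal sketch, not the platform's schema: the `AlertRule` class, its field names, and the simple "value exceeds threshold" check are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class AlertRule:
    # All field names here are hypothetical, chosen for illustration only.
    name: str
    severity: Severity
    threshold: float                      # notify when the observed value exceeds this
    channels: list = field(default_factory=list)  # notification channel names
    paused: bool = False                  # a paused rule never notifies

    def should_notify(self, value: float) -> bool:
        """Return True when the rule is active and the value exceeds the threshold."""
        return not self.paused and value > self.threshold


rule = AlertRule("high-cpu", Severity.CRITICAL, threshold=0.9,
                 channels=["email", "pager"])
```

Pausing a rule keeps its definition and channels intact while suppressing notifications, which is why `paused` is a flag on the rule in this sketch.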

Webhook

A webhook is an HTTP endpoint that receives event notifications from the platform.

  • Webhooks subscribe to specific event types
  • Webhooks have signing secrets for verification
  • Webhook delivery is tracked for success/failure
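
A receiver typically uses the signing secret to check that a delivery really came from the platform. The sketch below assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body; this page does not specify the actual algorithm, encoding, or header name, so treat those details as placeholders.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature (assumed scheme, not confirmed by the platform)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, payload: bytes, received_signature: str) -> bool:
    """Check a received signature against the one computed from the raw body."""
    expected = sign_payload(secret, payload)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

Verification should run on the raw bytes of the request body, before any JSON parsing, because re-serializing the payload can change its byte representation.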

Access Control Concepts

User

A user is an individual who can access the platform.

  • Users have roles assigned
  • Users can be active, inactive, or pending

Role

A role defines a set of permissions.

  • Roles can be organization-scoped or project-scoped
  • Roles contain multiple permissions
  • Built-in roles are provided by the platform

Permission

A permission grants the ability to perform specific actions.

  • Permissions follow the format: resource:action
  • Example: clusters:read, instances:delete
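
The `resource:action` format is straightforward to parse and check. The helper names below (`parse_permission`, `is_allowed`) are hypothetical and only illustrate the format; they are not part of the platform's API.

```python
from typing import NamedTuple


class Permission(NamedTuple):
    resource: str
    action: str


def parse_permission(value: str) -> Permission:
    """Split a 'resource:action' string into its two parts."""
    resource, sep, action = value.partition(":")
    # Reject strings with no colon, an empty part, or more than one colon.
    if not sep or not resource or not action or ":" in action:
        raise ValueError(f"expected 'resource:action', got {value!r}")
    return Permission(resource, action)


def is_allowed(granted: set, resource: str, action: str) -> bool:
    """Return True when the exact permission is in the granted set."""
    return f"{resource}:{action}" in granted
```

For example, `parse_permission("clusters:read")` yields a `Permission` with resource `clusters` and action `read`.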

Role Binding

A role binding assigns a role to a user at a specific scope.

  • Bindings connect users to roles
  • Bindings specify the scope (org or project)
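
The relationship between users, roles, permissions, and bindings can be sketched as below. The class and field names, the sample role and project names, and the rule that an org-scoped binding applies to every project are all assumptions for illustration; the platform's actual evaluation rules may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    name: str
    permissions: frozenset          # permission strings such as "clusters:read"


@dataclass(frozen=True)
class RoleBinding:
    user: str
    role: Role
    scope: str                      # "org" or "project"
    project: Optional[str] = None   # set only for project-scoped bindings


def is_permitted(bindings, user: str, permission: str, project: str) -> bool:
    """Return True if any binding grants the permission to the user in this project."""
    for binding in bindings:
        if binding.user != user or permission not in binding.role.permissions:
            continue
        # Assumption: an org-scoped binding applies to every project.
        if binding.scope == "org":
            return True
        if binding.scope == "project" and binding.project == project:
            return True
    return False


# Hypothetical roles and projects, for illustration only.
viewer = Role("viewer", frozenset({"clusters:read", "instances:read"}))
operator = Role("operator", frozenset({"instances:read", "instances:delete"}))
bindings = [
    RoleBinding("alice", viewer, scope="org"),
    RoleBinding("alice", operator, scope="project", project="checkout"),
]
```

Here `alice` can read clusters everywhere but can delete instances only in the `checkout` project, which shows how a narrow project-scoped binding keeps broad permissions contained.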

Observability Concepts

Metric

A metric is a quantitative measurement of system behavior. Common examples include:

  • CPU utilization
  • Memory usage
  • Network throughput
  • Error rate

Log

A log is a timestamped text entry describing an event. Each entry has a level:

  • Info level: normal operations
  • Warning level: potential issues
  • Error level: problems that need investigation

Event

An event is a notable occurrence in the system, such as:

  • Instance started/stopped
  • Cluster scaled
  • Configuration applied

Incident

An incident is a triggered alert that requires attention.

  • Each incident has a severity level
  • Each incident follows a lifecycle (triggered → acknowledged → resolved)
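
The lifecycle can be read as a small state machine. The sketch below assumes transitions are strictly linear, so an incident cannot skip from triggered straight to resolved; this page does not say whether the platform allows that shortcut.

```python
from enum import Enum


class IncidentState(Enum):
    TRIGGERED = "triggered"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Assumed linear lifecycle: each state has at most one successor.
NEXT_STATE = {
    IncidentState.TRIGGERED: IncidentState.ACKNOWLEDGED,
    IncidentState.ACKNOWLEDGED: IncidentState.RESOLVED,
}


def advance(state: IncidentState) -> IncidentState:
    """Move an incident to the next lifecycle state, or raise if it is already resolved."""
    if state not in NEXT_STATE:
        raise ValueError(f"no transition from {state.value}")
    return NEXT_STATE[state]
```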

Operation Concepts

Operation

An operation is an action performed on platform resources.

  • Operations are tracked in a queue
  • Operations have statuses (pending, running, succeeded, failed)
  • Operations can be triggered by users or the system

Request ID

A request ID uniquely identifies an operation for tracking and debugging.
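
Operations and request IDs fit together as in the sketch below. It assumes a request ID is a UUID string and that the queue is processed first in, first out; neither detail is stated on this page, and the `Operation` class and `run_next` helper are hypothetical.

```python
import uuid
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class OperationStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class Operation:
    action: str                     # e.g. "restart-instance" (illustrative)
    triggered_by: str               # "user" or "system"
    status: OperationStatus = OperationStatus.PENDING
    # Assumption: request IDs are random UUIDs generated at creation time.
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def run_next(queue: deque, handler):
    """Pop the oldest operation, run it through handler, and record the outcome."""
    if not queue:
        return None
    op = queue.popleft()
    op.status = OperationStatus.RUNNING
    try:
        handler(op)
    except Exception:
        op.status = OperationStatus.FAILED
    else:
        op.status = OperationStatus.SUCCEEDED
    return op
```

Including the request ID in log lines and error messages makes it possible to trace a single operation across components when debugging.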

Best Practices

Cluster Management

  • Name clusters descriptively (e.g., prod-us-east, staging-us)
  • Monitor cluster health regularly
  • Apply appropriate config profiles

Instance Management

  • Monitor restart counts
  • Set appropriate resource limits
  • Use health checks

Access Control

  • Follow the principle of least privilege
  • Regularly review user access
  • Use organization roles sparingly

Monitoring

  • Set up alerts for critical metrics
  • Review metrics regularly
  • Establish baseline performance

Configuration

  • Keep config profiles under version control
  • Test changes in a non-production environment first
  • Document the purpose of each profile