Terms

This topic describes the common terms used in Auto Scaling.

Term	Description
Auto Scaling	Auto Scaling is a service that dynamically adjusts the number of instances based on your business requirements and scaling policies. You can use Auto Scaling to scale Elastic Compute Service (ECS) instances or elastic container instances. To ensure that sufficient computing resources are available, you can use Auto Scaling to add instances of the specified type during peak hours. To prevent waste of resources and reduce costs, you can use Auto Scaling to remove instances of the specified type during off-peak hours.
scaling group	A scaling group consists of instances of the same type that you can use for similar business scenarios. You can configure a scaling group to specify the minimum number and maximum number of instances in the scaling group, and associate Server Load Balancer (SLB) instances and ApsaraDB RDS instances with the scaling group.
ECS instance	An ECS instance is a virtual server that consists of basic computing components such as vCPU, memory, operating system, network configuration, and disk. ECS eliminates the need for upfront investments in IT hardware and allows you to scale computing resources on demand. This makes ECS instances more convenient and efficient than physical servers.
elastic container instance	Elastic Container Instance is a container service provided by Alibaba Cloud that combines container and serverless technologies.
SLB instance	SLB is a service that forwards network traffic to backend servers to increase the throughput of your applications. You can use SLB to prevent service interruptions that are caused by single points of failure (SPOFs) and improve the availability of applications.
ApsaraDB RDS instance	ApsaraDB RDS is a stable and reliable online database service that supports elastic scaling. ApsaraDB RDS supports mainstream database engines and provides a variety of database solutions, such as disaster recovery, backup, restoration, monitoring, and migration.
scaling mode	A scaling mode specifies when to add or remove a specific number of instances for a scaling group. The available scaling modes are the scheduled mode, dynamic mode, fixed-number mode, custom mode, and health mode. You can also combine multiple modes.
instance configuration source	Auto Scaling uses the instance configuration source that you select to create instances. The instance configuration source can be a scaling configuration or a launch template.
scaling configuration	A scaling configuration is a type of instance configuration source and contains the configuration information of instances.
scaling rule	• Step scaling rules, target tracking scaling rules, and simple scaling rules add or remove instances when scaling activities are triggered. • Predictive scaling rules predict future metric values based on historical monitoring data and automatically set the maximum and minimum numbers of instances for scaling groups.
scaling task	Scaling tasks are categorized into scheduled tasks and event-triggered tasks. A scheduled task can be used to scale instances at the specified time. An event-triggered task can be used to dynamically scale instances based on specified monitoring metrics.
scaling activity	A scaling activity records the changes in the number of instances within a scaling group, the maximum number and minimum number of instances in the scaling group, and the expected number of instances. Scaling activities are triggered when scaling rules are run, the maximum number and minimum number of instances in a scaling group are modified, or the expected number of instances is modified.
expected number of instances	After the Expected Number of Instances feature is enabled for a scaling group, Auto Scaling automatically maintains the number of instances at the expected value.
parallel scaling activity	A scaling activity triggered by using one of the following methods is a parallel scaling activity: • Run a scaling rule manually or by using a scheduled task. • Manually add or remove ECS instances. • Perform a check on the instance health, or the expected, minimum, or maximum number of instances. A parallel scaling activity can be triggered if the ongoing scaling activities are also parallel scaling activities.
non-parallel scaling activity	Scaling activities other than parallel scaling activities are non-parallel scaling activities. No other scaling activities can be triggered when a non-parallel scaling activity is in progress.
stable instance	A stable instance refers to an ECS instance that is in the In Service, Protected, or Standby state in a scaling group.
scaling process	A scaling process refers to a process that you can manually suspend and resume, such as a scale-out, a scale-in, a health check, a scheduled task, or an event-triggered task. Scaling processes help you control scaling groups at the process level.
lifecycle of an instance in a scaling group	The lifecycle of an ECS instance or elastic container instance in a scaling group refers to the process from the time when the instance is created to the time when it is released. The lifecycle management mode of an ECS instance or elastic container instance depends on how the instance is created: • If the instance is automatically created by Auto Scaling, the lifecycle of the instance is managed by the scaling group. • If the instance is manually created and you enable the scaling group to manage the lifecycle of the instance, the lifecycle of the instance is managed by the scaling group. If you do not enable the scaling group to manage the lifecycle of the instance, you must manually manage the lifecycle of the instance.
lifecycle hook	A lifecycle hook puts ECS instances or elastic container instances that are being added to or removed from a scaling group into the Pending state so that you can perform custom operations on them. For example, after an ECS instance or elastic container instance is created, a lifecycle hook can hold the instance in the Pending state while you test the instance to ensure that its services are available. Auto Scaling then adds the instance as a backend server to an associated SLB instance.
cooldown time	The cooldown time refers to a period during which Auto Scaling cannot trigger new scaling activities after a scaling activity is complete in a scaling group. During the cooldown time, Auto Scaling rejects all scaling activity requests of event-triggered tasks from CloudMonitor. This prevents scaling activities from being frequently triggered due to fluctuations in metric values.
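
The interaction between the cooldown time and the minimum and maximum numbers of instances can be illustrated with a small model. The following Python sketch is not the Auto Scaling API: the ScalingGroup class, its field names, and the handle_event_triggered_task method are hypothetical and exist only for illustration. The sketch assumes that a scaling activity completes instantly and that a request is clamped to the configured minimum and maximum numbers of instances.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingGroup:
    """Hypothetical in-memory model of a scaling group (not a real API)."""
    min_size: int             # minimum number of instances
    max_size: int             # maximum number of instances
    cooldown_seconds: float   # cooldown time after a scaling activity
    current: int              # current number of instances
    last_activity_end: Optional[float] = None  # completion time of the last activity

    def in_cooldown(self, now: float) -> bool:
        # No scaling activity has completed yet, so no cooldown applies.
        if self.last_activity_end is None:
            return False
        return now - self.last_activity_end < self.cooldown_seconds

    def handle_event_triggered_task(self, delta: int, now: float) -> bool:
        """Return True if a scaling activity ran, or False if it was rejected."""
        # Requests from event-triggered tasks are rejected during the cooldown time.
        if self.in_cooldown(now):
            return False
        # Keep the instance count within the minimum and maximum limits.
        target = max(self.min_size, min(self.max_size, self.current + delta))
        if target == self.current:
            return False
        self.current = target
        # Assumption: the activity completes instantly, so the cooldown starts now.
        self.last_activity_end = now
        return True


group = ScalingGroup(min_size=1, max_size=5, cooldown_seconds=300, current=2)
first = group.handle_event_triggered_task(+2, now=0.0)     # runs the activity
second = group.handle_event_triggered_task(+1, now=100.0)  # rejected: in cooldown
third = group.handle_event_triggered_task(+3, now=300.0)   # runs, capped at max_size
```

In this model, the second request is rejected because it arrives 100 seconds after the first activity, which is within the 300-second cooldown time. The third request arrives after the cooldown time has elapsed, and the group grows only to the maximum of 5 instances.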