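This post covers Amazon AWS EC2 T2 instances, a few Amazon CloudWatch metrics, and how to upgrade from t2.micro to t2.small.
I ran on the AWS Free Tier (t2.micro) for a year, renewed for a second year, and renewal is coming up again. As traffic to the blog has grown, t2.micro has clearly become too small, in both memory and CPU, and the site goes down at peak times. The figure below shows Google Analytics statistics for the past 30 days (2017/03/29 – 2017/04/27):
Fig. 1 Google Analytics on Spark & Shine from March 29 to April 27, 2017
I expect to stay on T2 for a long time, so I decided to spend some time understanding it, mainly to work out which metrics tell me whether a T2 instance needs upgrading.
T2 instances are introduced in Amazon EC2 Instance Types; an excerpt: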
T2 instances are Burstable Performance Instances that provide a baseline level of CPU performance with the ability to burst above the baseline. The baseline performance and ability to burst are governed by CPU Credits. Each T2 instance receives CPU Credits continuously at a set rate depending on the instance size. T2 instances accrue CPU Credits when they are idle, and use CPU credits when they are active. T2 instances are a good choice for workloads that don’t use the full CPU often or consistently, but occasionally need to burst (e.g. web servers, developer environments and databases). For more information see Burstable Performance Instances.
T2 comes in seven sizes, from the smallest, nano, to the largest, 2xlarge, as shown in the figure below; see the linked page for details.
Fig. 2 Amazon EC2 T2 Instance Types.
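The basic idea of T2 is that CPU capacity you do not use is banked (the system grants CPU Credits to each T2 instance at a fixed hourly rate) and can later be spent to run the CPU above its baseline.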
Burstable Performance Instances provide a baseline level of CPU performance with the ability to burst above the baseline.
One CPU credit is equal to one vCPU running at 100% utilization for one minute. Other combinations of vCPUs, utilization, and time are also equal to one CPU credit; for example, one vCPU running at 50% utilization for two minutes or two vCPUs running at 25% utilization for two minutes.
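One CPU Credit lets one vCPU run at full speed for one minute, so running the CPU flat out 24 hours a day would take 60 CPU Credits per hour. Each T2 size has its own baseline CPU performance. For t2.small it is 20%, meaning sustained CPU utilization of only 20%, which works out to 12 credits per hour (60 * 20%). Unused CPU Credits (for example, while the system is idle) accumulate and expire after 24 hours; under heavy load the instance spends the accumulated credits to raise CPU performance above the baseline (a bit like overclocking).
CPU Credits are described in T2 Instances CPU Credits; an excerpt: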
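The arithmetic above can be written down as a small sketch. The baseline percentages (10% for t2.micro, 20% for t2.small) and the 24-hour expiry come from this post; the function names are made up for illustration, and the baseline is assumed to be expressed as a percentage of a single vCPU.

```python
MINUTES_PER_HOUR = 60  # one credit = one vCPU at 100% for one minute


def credits_per_hour(baseline_percent):
    """Credits earned per hour for a baseline given as % of one vCPU."""
    return MINUTES_PER_HOUR * baseline_percent / 100


def max_credit_balance(baseline_percent, expiry_hours=24):
    """Largest balance that can build up if credits expire after expiry_hours."""
    return credits_per_hour(baseline_percent) * expiry_hours


if __name__ == "__main__":
    for name, baseline in [("t2.micro", 10), ("t2.small", 20)]:
        print(name, credits_per_hour(baseline), max_credit_balance(baseline))
```

For t2.small this gives the same 12 credits per hour as the calculation in the text.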
For example, a t2.small instance receives credits continuously at a rate of 12 CPU Credits per hour. This capability provides baseline performance equivalent to 20% of a CPU core. If at any moment the instance does not need the credits it receives, it stores them in its CPU Credit balance for up to 24 hours. If and when your t2.small needs to burst to more than 20% of a core, it draws from its CPU Credit balance to handle this surge seamlessly. Over time, if you find your workload needs more CPU Credits than you have, or your instance does not maintain a positive CPU Credit balance, we recommend either a larger T2 size, such as the t2.medium, or a Fixed Performance Instance type.
What happens once all accumulated CPU Credits are used up? From the original text:
If your instance uses all of its CPU credit balance, performance remains at the baseline performance level. If your instance is running low on credits, your instance’s CPU credit consumption (and therefore CPU performance) is gradually lowered to the base performance level over a 15-minute interval, so you will not experience a sharp performance drop-off when your CPU credits are depleted. If your instance consistently uses all of its CPU credit balance, we recommend a larger T2 size or a fixed performance instance type such as M3 or C3.
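With that background, let's see how many resources the running instance actually uses. Amazon CloudWatch offers a rich set of metrics under EC2 > Per-Instance Metrics (23 in my case). The metrics are explained in detail in List the Available CloudWatch Metrics for Your Instances and Amazon EBS Metrics and Dimensions.
CPU Credits are probably the most important concept for T2 instances. Below are four metrics related to CPU Credits, quoted from the documentation, along with data from my t2.micro instance (4 weeks, 1-hour period, Statistic set to Average).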
The number of CPU credits consumed during the specified period (Units: Count). This metric identifies the amount of time during which physical CPUs were used for processing instructions by virtual CPUs allocated to the instance.
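A t2.micro earns 6 CPU Credits per hour, yet the hourly average consumption on my instance peaks at only 2, and even the hourly maximum peaks at only 4.7. So is t2.micro's CPU actually enough for me? In that case, are the occasional outages most likely caused by insufficient memory, with too many simultaneous visitors crashing the database?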
The number of CPU credits that an instance has accumulated (Units: Count). This metric is used to determine how long an instance can burst beyond its baseline performance level at a given rate.
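The chart shows the CPU Credit balance at 0 for quite a few periods. Usage is heavier during the day and lighter at night, when some CPU Credits can accumulate.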
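To get a feel for what a given balance buys, here is a rough sketch that estimates how long an instance could hold a utilization level before the balance reaches zero. It is a simplification under stated assumptions: a single vCPU, a constant load, and no modeling of the gradual 15-minute slowdown described in the excerpt above. The function name and the sample numbers are hypothetical.

```python
def burst_minutes(balance, earn_per_hour, utilization_percent):
    """Estimate minutes of sustained load before the credit balance hits zero.

    balance: current CPU Credit balance
    earn_per_hour: credits earned per hour (6 for t2.micro, 12 for t2.small)
    utilization_percent: sustained CPU utilization of one vCPU
    """
    spend_per_hour = 60 * utilization_percent / 100
    net_drain = spend_per_hour - earn_per_hour
    if net_drain <= 0:
        # At or below baseline the balance never shrinks.
        return float("inf")
    return balance / net_drain * 60


if __name__ == "__main__":
    # Hypothetical case: a t2.micro with 30 banked credits running at 50%.
    print(burst_minutes(30, 6, 50))
```

With a balance of 0, as in several periods on my chart, the result is 0 minutes: the instance is held at its baseline.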
The percentage of allocated EC2 compute units that are currently in use on the instance (Units: Percent). This metric identifies the processing power required to run an application upon a selected instance.
Note: Depending on the instance type, tools in your operating system may show a lower percentage than CloudWatch when the instance is not allocated a full processor core.
Fig. 5: CPU Credit Utilization
The base performance (CPU Utilization) of t2.micro is only 10%. The chart shows that the instance needs to burst fairly often.
Provides information about the percentage of I/O credits (for gp2) or throughput credits (for st1 and sc1) remaining in the burst bucket (Units: Percent). Data is reported to CloudWatch only when the volume is active. If the volume is not attached, no data is reported.
Note: Used with General Purpose SSD (gp2), Throughput Optimized HDD (st1), and Cold HDD (sc1) volumes only.
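I use Elastic Block Store (EBS) General Purpose (SSD) volumes; see Amazon EBS Volume Types for details. Judging from the chart, my instance does very little I/O?
I decided to upgrade from t2.micro to t2.small, mainly for the memory (1 GB to 2 GB). Because it is a reserved instance, the instance family stays t2, and the region does not change, the upgrade is simple. The steps: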
Instances --> Actions --> Instance State --> Stop
Instances --> Instance Settings --> Change Instance Type --> t2.small
Actions --> Instance State --> Start
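The same three console steps could also be scripted. The sketch below is not what I ran; it assumes the boto3 library with configured credentials, and `resize_instance` is a name invented for this example. The client is passed in so the function itself has no hard dependency on boto3.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an instance, change its type, and start it again.

    ec2 is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # Step 1: stop the instance and wait until it is fully stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Step 2: the type can only be changed while the instance is stopped.
    ec2.modify_instance_attribute(
        InstanceId=instance_id, InstanceType={"Value": new_type}
    )

    # Step 3: start it again and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Usage (the instance ID is a placeholder):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "t2.small")
```

Keep in mind that a stop/start cycle normally assigns a new public IP address unless an Elastic IP is attached.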
Memory is now 2 GB, as shown below:
$ free -m
             total       used       free     shared    buffers     cached
Mem:          2000        485       1514         64         21        227
-/+ buffers/cache:        236       1763
Swap:         1023          0       1023
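To compare EC2 instance types, you can use EC2Instances.info. Here is an example of a price comparison.
A few notes on renewal: paying upfront is noticeably cheaper, and All Upfront and Partial Upfront cost almost the same. A 3-year term is much cheaper than a 1-year term, but I don't think it is worth buying three years at once, because prices keep falling (for example, the t2.micro I bought last year cost 85 USD and now costs only 53 USD). Overall, my recommendation is one year with partial upfront payment (Standard 1-year Term + Partial Upfront).
With the server's CPU and memory doubled, the MPM prefork parameters I had lowered to save memory can be raised again. Edit /etc/apache2/mods-available/mpm_prefork.conf so that it ends up as follows: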
<IfModule mpm_prefork_module>
StartServers 2
MinSpareServers 2
MaxSpareServers 10
MaxRequestWorkers 500
MaxConnectionsPerChild 25
</IfModule>
Restart Apache (service apache2 restart) for the change to take effect.
Use the df command (display free disk space) to check disk usage, for example:
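A common rule of thumb for sanity-checking MaxRequestWorkers under prefork is to divide the memory left for Apache by the typical size of one Apache process. The sketch below shows only that arithmetic; the 500 MB reserved for the OS and database and the 30 MB per process are illustrative assumptions, not measurements from my server, and the function name is made up.

```python
def estimate_max_request_workers(total_mb, reserved_mb, per_process_mb):
    """Estimate how many prefork workers fit in RAM.

    total_mb: physical memory on the instance
    reserved_mb: memory set aside for the OS, database, etc. (assumption)
    per_process_mb: typical resident size of one Apache child (assumption)
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    workers = (total_mb - reserved_mb) // per_process_mb
    return max(1, int(workers))


if __name__ == "__main__":
    # 2000 MB total as reported by free -m on the t2.small.
    print(estimate_max_request_workers(2000, 500, 30))
```

If the estimate comes out well below the configured value, the server may start swapping under peak load before it reaches the limit. Note also that with prefork, a MaxRequestWorkers above 256 only takes effect if ServerLimit is raised as well.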
ubuntu@ip-XX-XX-XX-XX:~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 9.8G 6.7G 2.6G 73% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 492M 12K 492M 1% /dev
tmpfs 100M 336K 99M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 497M 0 497M 0% /run/shm
none 100M 0 100M 0% /run/user
Use the free command (display amount of free and used memory in the system) to check memory usage, for example (-m reports in megabytes, -h stands for --human):
$ free -m
             total       used       free     shared    buffers     cached
Mem:           992        691        301         45         74        358
-/+ buffers/cache:        258        733
Swap:         1023         99        924
$ free -h
             total       used       free     shared    buffers     cached
Mem:          992M       695M       296M        45M        74M       360M
-/+ buffers/cache:       260M       731M
Swap:         1.0G        99M       924M
Worth noting [2]:
Linux likes to use any extra memory to cache hard drive blocks. So you don’t want to look at just the free Mem. You want to look at the free column of the -/+ buffers/cache: row. This shows how much memory is available to applications.
So in my example (free -m), 733 MB of memory is actually unused.
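For monitoring, that number can be pulled out of the command output with a few lines of code. This is a minimal sketch that assumes the older free layout shown above, with a "-/+ buffers/cache:" row; newer versions of free drop that row and report an "available" column instead, which this parser does not handle. The function name is invented for the example.

```python
def available_mb(free_output):
    """Return the free column of the '-/+ buffers/cache:' row of `free -m`."""
    for line in free_output.splitlines():
        line = line.strip()
        if line.startswith("-/+ buffers/cache:"):
            # The row holds two numbers after the label: used, then free.
            used, free = line.split(":", 1)[1].split()
            return int(free)
    raise ValueError("no '-/+ buffers/cache:' row found")


SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:           992        691        301         45         74        358
-/+ buffers/cache:        258        733
Swap:         1023         99        924
"""

if __name__ == "__main__":
    print(available_mb(SAMPLE))  # 733 for the t2.micro output above
```

On a live system, the text could come from something like subprocess.check_output(["free", "-m"], text=True).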
References:
[1] StackOverflow: How to safely upgrade an Amazon EC2 instance from t1.micro to large?
[2] AskUbuntu: How can I monitor the memory usage?
[3] StackOverflow: Upgrading ec2 from t2.micro to t2.medium or t2.large
[4] StackOverflow: How do you increase the max number of concurrent connections in Apache?
[5] serverfault: Optimal values for ServerLimit, MaxClients, MaxRequestsPerChild directives
[6] How to optimize apache web server for maximum concurrent connections or increase max clients in apache