阿里云优惠活动,点击链接进行购买: 一年仅需89元即可以购买服务器~。我个人服务器为2核4G配置,也更加推荐购买 2核4G三年799元 配置的服务器。
你可以跟着我的笔记 当我有一台服务器时,我做了什么 来开始维护服务器并搭建应用,将引导你使用 docker 和 k8s 搭建一个自己的服务器开发集群。

date: 2020-04-10 14:00


# ORM 层分页查询过慢优化

这是山月修改的第三千七百六十五个bug

# 情景再现

今天产品在生产环境例行巡视并且与往常一样点点点时,发现某一页数据加载过慢,于是照例找到了我。

借助于山月我自己搭建的高效率的性能查询平台,很快就根据慢接口的 API 与 requestId 定位到了瓶颈相关 SQL,并分析如下。

相关 API 是关于请求数据库某资源的分页数据,但并不是 offset 过大问题,而问题出现在 ORM 生成的 SQL 上。

# 捉虫

通过 requestId 找到对应的慢 SQL

该 API 请求资源 Practice 分页数据,在 ORM 层使用了 sequelize,并通过 findAndCountAll 获取资源总数及列表。

Practice.findAndCountAll({
  where: {
    status: 'Visible',
    column_a_id: 1,
    column_b_id: 1
  },
  distinct: true,
  include: [{
    model: PracticeRef
  }, {
    model: PracticeSchedule
  }],
  limit: 0,
  offset: 10
})

由上可见,Practice 资源关联了四个表,从日志中可以得知生成的 SQL 如下

SELECT count(DISTINCT("Practice"."id")) AS "count" FROM "practice" AS "Practice"
LEFT OUTER JOIN "practice_ref" AS "ref" ON "Practice"."id" = "ref"."practice_id"
LEFT OUTER JOIN "practice_schedule" AS "schedules" ON "Practice"."id" = "schedules"."practice_id" AND "schedules"."user_id" = 10086
WHERE "Practice"."status" = 'Visible' AND "Practice"."column_a_id" = 1 AND "Practice"."column_b_id" = 1;

通过 exaplain 分析,发现 schedule 全表扫描,导致 totalCost 过大,从执行计划还可以看出与 schedules 表关联还没有索引,造成数据索引时间过长

+-------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                              |
|-------------------------------------------------------------------------------------------------------------------------|
| Limit  (cost=14500.36..14500.38 rows=1 width=4)                                                                         |
|   ->  Aggregate  (cost=14500.36..14500.38 rows=1 width=4)                                                               |
|         ->  Nested Loop Left Join  (cost=0.29..14500.36 rows=2 width=4)                                                 |
|               ->  Nested Loop Left Join  (cost=0.00..14496.02 rows=1 width=4)                                           |
|                     Join Filter: ("Practice".id = schedules.practice_id)                                                |
|                     ->  Seq Scan on practice "Practice"  (cost=0.00..561.19 rows=1 width=4)                             |
|                           Filter: (((status)::text = 'Visible'::text) AND (subject_id = 1) AND (topic_id = 111))        |
|                     ->  Seq Scan on practice_schedule schedules  (cost=0.00..13934.06 rows=61 width=4)                           |
|                           Filter: (user_id = 7411238)                                                                   |
|               ->  Index Only Scan using ref_practice_index on practice_ref ref (cost=0.29..4.32 rows=2 width=4) |
|                     Index Cond: (practice_id = "Practice".id)                                                           |
+-------------------------------------------------------------------------------------------------------------------------+
EXPLAIN
+-------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                        |
|-------------------------------------------------------------------------------------------------------------------|
| Aggregate  (cost=1118.39..1118.40 rows=1 width=4)                                                                 |
|   ->  Nested Loop Left Join  (cost=6.01..1118.38 rows=2 width=4)                                                  |
|         ->  Nested Loop Left Join  (cost=5.72..1114.04 rows=1 width=4)                                            |
|               ->  Seq Scan on practice "Practice"  (cost=0.00..520.80 rows=1 width=4)                             |
|                     Filter: (((status)::text = 'Visible'::text) AND (subject_id = 1) AND (topic_id = 111))        |
|               ->  Bitmap Heap Scan on schedule schedules  (cost=5.72..593.23 rows=1 width=4)                      |
|                     Recheck Cond: ("Practice".id = practice_id)                                                   |
|                     Filter: (user_id = 7411238)                                                                   |
|                     ->  Bitmap Index Scan on schedule_practice_id_idx  (cost=0.00..5.72 rows=173 width=0)         |
|                           Index Cond: ("Practice".id = practice_id)                                               |
|         ->  Index Only Scan using question_practice_index on question questions  (cost=0.29..4.32 rows=2 width=4) |
|               Index Cond: (practice_id = "Practice".id)                                                           |
+-------------------------------------------------------------------------------------------------------------------+
+----------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                       |
|----------------------------------------------------------------------------------------------------------------------------------|
| Aggregate  (cost=533.60..533.61 rows=1 width=4)                                                                                  |
|   ->  Nested Loop Left Join  (cost=0.71..533.60 rows=2 width=4)                                                                  |
|         ->  Nested Loop Left Join  (cost=0.42..529.26 rows=1 width=4)                                                            |
|               ->  Seq Scan on practice "Practice"  (cost=0.00..520.80 rows=1 width=4)                                            |
|                     Filter: (((status)::text = 'Visible'::text) AND (subject_id = 1) AND (topic_id = 111))                       |
|               ->  Index Only Scan using schedule_practice_id_user_id_idx on schedule schedules  (cost=0.42..8.45 rows=1 width=4) |
|                     Index Cond: ((practice_id = "Practice".id) AND (user_id = 7411238))                                          |
|         ->  Index Only Scan using question_practice_index on question questions  (cost=0.29..4.32 rows=2 width=4)                |
|               Index Cond: (practice_id = "Practice".id)                                                                          |
+----------------------------------------------------------------------------------------------------------------------------------+

0.01s

+----------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                       |
|----------------------------------------------------------------------------------------------------------------------------------|
| Aggregate  (cost=21.11..21.12 rows=1 width=4)                                                                                    |
|   ->  Nested Loop Left Join  (cost=1.00..21.10 rows=2 width=4)                                                                   |
|         ->  Nested Loop Left Join  (cost=0.71..16.76 rows=1 width=4)                                                             |
|               ->  Index Scan using practice_subject_id_topic_id_idx on practice "Practice"  (cost=0.29..8.31 rows=1 width=4)     |
|                     Index Cond: ((subject_id = 1) AND (topic_id = 111))                                                          |
|                     Filter: ((status)::text = 'Visible'::text)                                                                   |
|               ->  Index Only Scan using schedule_practice_id_user_id_idx on schedule schedules  (cost=0.42..8.45 rows=1 width=4) |
|                     Index Cond: ((practice_id = "Practice".id) AND (user_id = 7411238))                                          |
|         ->  Index Only Scan using question_practice_index on question questions  (cost=0.29..4.32 rows=2 width=4)                |
|               Index Cond: (practice_id = "Practice".id)                                                                          |
+----------------------------------------------------------------------------------------------------------------------------------+

然而实际上呢,我们只需要一个 count 聚合函数用来获取分页的总页数

  1. dinstinct practice.id,表明无需关心关联表的个数
  2. where 全在 practice 表上,表明只需要查询 practice

因此,我们完全可以只在 Practice 表中进行 count 查询

+------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                       |
|------------------------------------------------------------------------------------------------------------------|
| Aggregate  (cost=8.31..8.32 rows=1 width=4)                                                                      |
|   ->  Index Scan using practice_subject_id_topic_id_idx on practice "Practice"  (cost=0.29..8.31 rows=1 width=4) |
|         Index Cond: ((subject_id = 1) AND (topic_id = 111))                                                      |
|         Filter: ((status)::text = 'Visible'::text)                                                               |
+------------------------------------------------------------------------------------------------------------------+

# 解决

通过 ORM 把 findAndCountAll 分为两个 API 分别查询

const records = await Practice.findAll()
const count = await Practice.count()

优化后的查询如下,大幅度解决了查询速度

SELECT count(DISTINCT("Practice"."id")) AS "count" FROM "practice" AS "Practice"
WHERE "Practice"."status" = 'Visible' AND "Practice"."column_a_id" = 1 AND "Practice"."column_b_id" = 1;

# 反思

明者远见于未萌,而智者避危于未形。

# 总结

  1. 基础运维建设可以大幅提高工作效率,如无法查找某条 API 请求
  2. findAndCountAll 有时可单独对 count 进行优化
  3. 养成使用 explain 优化数据库的习惯,并了解常用的 IndexScan Type

扫码关注公众号全栈成长之路,并发送

即可在关注期间无限制浏览本站全部文章内容

点击关闭

你也可以在文章关于回复公众号扫码解锁全站的技术实现中获得解锁代码,永久解锁本站全部文章

Last Updated: 6/30/2020, 3:14:05 AM