В настоящее время я работаю над веб-службой, которая поддерживает несколько баз данных. Я пытаюсь оптимизировать таблицы и исправить отсутствующие индексы. Ниже приведен запрос MySQL:
SELECT 'UTC' AS timezone, pak.id AS package_id, rel.unique_id AS relay, sns.unique_id AS sensor, pak.rtime AS time,
sns.units AS sensor_units, typ.name AS sensor_type, dat.data AS sensor_data,
loc.altitude AS altitude, Y(loc.location) AS latitude, X(loc.location) as longitude,
loc.speed as speed, loc.climb as climb, loc.track as track,
loc.longitude_error as longitude_error, loc.latitude_error as latitude_error, loc.altitude_error as altitude_error,
loc.speed_error as speed_error, loc.climb_error as climb_error, loc.track_error as track_error
FROM sensor_data dat
LEFT OUTER JOIN package_location loc on dat.package_id = loc.package_id
LEFT OUTER JOIN data_package pak ON dat.package_id = pak.id
LEFT OUTER JOIN relays rel ON pak.relay_id = rel.id
LEFT OUTER JOIN sensors sns ON dat.sensor_id = sns.id
LEFT OUTER JOIN sensor_types typ ON sns.sensor_type = typ.id
WHERE typ.name='Temperature'
AND rel.unique_id='OneWireTester'
AND pak.rtime > '2015-01-01'
AND pak.rtime < '2016-01-01'
а объяснение...
+----+-------------+-------+--------+------------------------------------------+----------------------+---------+------------------------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------------------------+----------------------+---------+------------------------+------+----------------------------------------------------+
| 1 | SIMPLE | rel | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using where |
| 1 | SIMPLE | pak | ref | PRIMARY,fk_package_relay_id | fk_package_relay_id | 9 | BigSense.rel.id | 1 | Using index condition; Using where |
| 1 | SIMPLE | dat | ref | fk_sensor_package_id,fk_sensor_sensor_id | fk_sensor_package_id | 9 | BigSense.pak.id | 1 | NULL |
| 1 | SIMPLE | sns | eq_ref | PRIMARY,fk_sensors_type_id | PRIMARY | 8 | BigSense.dat.sensor_id | 1 | NULL |
| 1 | SIMPLE | loc | eq_ref | PRIMARY | PRIMARY | 8 | BigSense.pak.id | 1 | NULL |
| 1 | SIMPLE | typ | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+--------+------------------------------------------+----------------------+---------+------------------------+------+----------------------------------------------------+
... кажется довольно прямолинейным. Мне нужно добавить индекс в таблицу relays
и sensor_types
, чтобы оптимизировать запрос.
Таблицы для версии PostgreSQL практически идентичны. Однако, когда я использую следующий запрос:
SELECT 'UTC' AS timezone, pak.id AS package_id, rel.unique_id AS relay, sns.unique_id AS sensor, pak.rtime AS time,
sns.units AS sensor_units, typ.name AS sensor_type, dat.data AS sensor_data,
loc.altitude AS altitude, ST_Y(loc.location::geometry) AS latitude, ST_X(loc.location::geometry) as longitude,
loc.speed as speed, loc.climb as climb, loc.track as track,
loc.longitude_error as longitude_error, loc.latitude_error as latitude_error, loc.altitude_error as altitude_error,
loc.speed_error as speed_error, loc.climb_error as climb_error, loc.track_error as track_error
FROM sensor_data dat
LEFT OUTER JOIN package_location loc on dat.package_id = loc.package_id
LEFT OUTER JOIN data_package pak ON dat.package_id = pak.id
LEFT OUTER JOIN relays rel ON pak.relay_id = rel.id
LEFT OUTER JOIN sensors sns ON dat.sensor_id = sns.id
LEFT OUTER JOIN sensor_types typ ON sns.sensor_type = typ.id
WHERE typ.name='Temperature'
AND rel.unique_id='OneWireTester'
AND pak.rtime > '2015-01-01'
AND pak.rtime < '2016-01-01';
Если я делаю анализ объяснения, я получаю следующее:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=36.23..131.80 rows=1 width=477) (actual time=0.074..3.933 rows=76 loops=1)
-> Nested Loop (cost=36.09..131.60 rows=1 width=349) (actual time=0.068..3.782 rows=76 loops=1)
-> Nested Loop (cost=35.94..130.58 rows=4 width=267) (actual time=0.062..2.472 rows=620 loops=1)
-> Hash Join (cost=35.67..128.73 rows=4 width=247) (actual time=0.053..0.611 rows=620 loops=1)
Hash Cond: (dat.sensor_id = sns.id)
-> Seq Scan on sensor_data dat (cost=0.00..89.46 rows=946 width=21) (actual time=0.007..0.178 rows=1006 loops=1)
-> Hash (cost=35.64..35.64 rows=2 width=238) (actual time=0.037..0.037 rows=11 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Hash Join (cost=20.68..35.64 rows=2 width=238) (actual time=0.019..0.035 rows=11 loops=1)
Hash Cond: (sns.sensor_type = typ.id)
-> Seq Scan on sensors sns (cost=0.00..13.60 rows=360 width=188) (actual time=0.002..0.005 rows=31 loops=1)
-> Hash (cost=20.62..20.62 rows=4 width=66) (actual time=0.010..0.010 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on sensor_types typ (cost=0.00..20.62 rows=4 width=66) (actual time=0.006..0.008 rows=1 loops=1)
Filter: ((name)::text = 'Temperature'::text)
Rows Removed by Filter: 4
-> Index Scan using data_package_pkey on data_package pak (cost=0.28..0.45 rows=1 width=20) (actual time=0.002..0.002 rows=1 loops=620)
Index Cond: (id = dat.package_id)
Filter: ((rtime > '2015-01-01 00:00:00'::timestamp without time zone) AND (rtime < '2016-01-01 00:00:00'::timestamp without time zone))
-> Index Scan using relays_pkey on relays rel (cost=0.14..0.24 rows=1 width=94) (actual time=0.002..0.002 rows=0 loops=620)
Index Cond: (id = pak.relay_id)
Filter: ((unique_id)::text = 'OneWireTester'::text)
Rows Removed by Filter: 1
-> Index Scan using package_location_pkey on package_location loc (cost=0.14..0.18 rows=1 width=140) (actual time=0.001..0.001 rows=0 loops=76)
Index Cond: (dat.package_id = package_id)
Planning time: 0.959 ms
Execution time: 4.030 ms
(27 rows)
Схема таблицы имеет те же внешние ключи и общую структуру, поэтому я ожидаю увидеть те же требуемые индексы. Однако я просмотрел несколько руководств по оператору проверки pgsql, и из того, что я собрал, операторы Seq Scan являются индикаторами отсутствующих индексов, что означает отсутствие индексов на sensors
, sensor_data
и sensor_type
.
Правильно ли я интерпретирую результаты этих проверок? Что я должен искать, чтобы оптимизировать обе базы данных?
and from what I've gathered, the Seq Scan statements are indicators of missing indexes,
неправильно. - person wildplasser   schedule 30.07.2015