SQL queries being slow is one of the main complaints directed at MySQL. Why are queries slow? What’s making SQL queries slow? How to make them faster? We will explore some suggestions which could (hopefully) point developers in the right direction – we will look at optimizing all four CRUD operations – Create, Read, Update and Delete.
As far as MySQL is concerned, developers have a few choices when dealing with the RDBMS. They can choose the table storage engines. Correctly choosing a table storage engine is crucial – wrong choice in this scenario can cost both time and resources in the long run. There are seven frequently used table storage engines:
We won’t be taking a deep look at table engines in this post, however, it is widely accepted that when dealing with data that is important we should use the InnoDB engine – the only advantages of MyISAM engine are that the MyISAM tables occupy less space on disk compared to uncompressed InnoDB tables and that the COUNT(*) query on MyISAM is quite a bit faster than on InnoDB, that’s a topic for another day though.
Before you ask why are your queries slow, remember that queries are tasks that are composed of subtasks. In order to optimize a query, in a nutshell, you optimize the performance of the subtasks by making them quickier, making them happen fewer times or not happen at all.
While accomplishing all of these tasks, the query goes through multiple “stages” – network, CPU, operations (planning etc.), locking and fetching rows. The speed of those operations depend on memory, CPU and the disk speed – I/O operations are also important.
A lot of times when a query is slow, that is because it is fetching much more rows or using more columns than necessary. That might be the case if a SELECT statement that uses all columns is used – we’re talking about SELECT *
. What you should try to do instead is only ask for data you actually need – for example, if your table has id, name and purchase date columns and you only need to fetch the name of the customer, use SELECT name
instead of SELECT *
. Avoiding fetching things you don’t need can prevent needless operations, and in turn, make your query run faster.
On the other hand, if the database you’re dealing with doesn’t have many rows and you need to simplify the development of your project by, for example, using the same block of code in multiple places, using the SELECT *
approach can be reasonable. Just make sure you do it after fully understanding its impact on the database.
Another thing you might need to consider is your database actually containing needless data. If you only need distinct values in the database, but it is full of duplicates, the queries will obviously run slower. This might not be as noticeable on small databases, but as the database gets larger, the problem might become more and more apparent.
In order to avoid storing needless data, examine your needs upfront, and don’t store data you know you won’t need to access. This can be accomplished by removing duplicate values by utilizing an UNIQUE INDEX
(we will cover indexes later) or including a DISTINCT
statement in your SELECT
query.
MySQL query profiling can enable you to figure out in what stage of the query it gets stuck. In order to properly profile a query, perform these steps:
INSERT INTO performance_schema.setup_actors (HOST,USER,ROLE,ENABLED,HISTORY) VALUES ('localhost', 'root', '%', 'YES', 'YES');
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES' WHERE NAME LIKE '%statement/%';
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES' WHERE NAME LIKE '%stage/%';
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE '%events_statements_%';
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE '%events_stages_%';
SELECT * FROM test_table WHERE id = 1;
SELECT EVENT_ID, TRUNCATE(TIMER_WAIT / 1000000000000, 6) AS Duration, SQL_TEXT FROM performance_schema.events_statements_history_long WHERE SQL_TEXT LIKE '%Event ID%';
SELECT event_name AS Stage, TRUNCATE(TIMER_WAIT / 1000000000000, 6) AS Duration FROM performance_schema.events_stages_history_long WHERE NESTING_EVENT_ID = Event ID;
When you accomplish all of those things, you should be see a result set that should look something like this:
Stage | Duration |
stage/sql/starting | 0.000142 |
stage/sql/checking permissions | 0.000010 |
stage/sql/Opening tables | 0.000163 |
stage/sql/init | 0.000036 |
stage/sql/System lock | 0.000007 |
stage/sql/optimizing | 0.000011 |
stage/sql/statistics | 0.010762 |
stage/sql/preparing | 0.000022 |
stage/sql/executing | 0.000000 |
Another (though almost a deprecated as of MySQL 5.6.7) way to perform profiling is by:
SET profiling = 1;
in the database;SHOW PROFILES;
SHOW PROFILE FOR QUERY x
where x is the ID of the query.The MySQL EXPLAIN
statement is intended to be used together with a query that needs explaining. EXPLAIN
works on SELECT
, INSERT
, UPDATE
, REPLACE
and DELETE
statements. The statement provides information about how does an MySQL database execute a query and can be a great way to quickly understand the bottlenecks of query performance.
The EXPLAIN
statement can be a great way to quickly identify the causes of queries being slow, but when that doesn’t help, it’s time to dive deeper.
Before you optimize your queries for speed, you must know what statement are you optimizing and what result is good enough. How fast is fast enough? Is a query being executed in a second fast? What about five seconds? Ten seconds? A minute?
Then, decide what kind of a query you need to be optimizing. Is it a SELECT
statement? An INSERT
statement? An UPDATE
statement? A DELETE
statement? Maybe you need to optimize an ALTER TABLE
statement?
Each of those statements accomplish different tasks and so, every one of them needs to be approached differently. We’re going to be starting with SELECT
query optimization.
Indexing is one of the first things you will hear when talking about SELECT
query optimization. An index, simply put, is a data structure that improves the speed of operations in a table. There are multiple kinds on indexes each suited for different things. Indexing works slightly differently in each engine too – we won’t be covering everything in this chapter, only the basics.
There are five types of indexes:
MATCH()
and AGAINST()
statements. A type of index that finds keywords in the text instead of comparing values directly to the values in the index. This type of index has its own nuances – stopwords, search modes etc.;An index optimizes query performance by avoiding a full table scan. However, indexes are only one player in a book. There many others, such as partitioning.
Partitioning in MySQL allows a table to be split into multiple subtables but still be treated as one table in the SQL layer. There are multiple types of partitioning:
VALUES LESS THAN
operator;Partitioning can make queries faster if there’s a need of dealing with big data sets. It is worth mentioning that since from the SQL perspective partitions are treated as tables, developers can restore individual partitions.
In order to optimize INSERT
queries, keep away from indexes (try to avoid inserting data into a column containing an index – the more indexes your table has, the slower INSERT
s will become), utilize extended INSERT
s or consider loading data into a table in chunks.
One thing to note is that you should avoid INSERT queries when dealing with big data sets – consider bulk importing instead.
When running UPDATE
queries in MySQL, bear in mind that the number of records being updated has a rather huge impact on the query performance.
In order to improve UPDATE
query perfomance, consider adding a WHERE
clause to the query – this way only rows that match the clause will be updated. Bear in mind that UPDATE
queries with a WHERE
clause will be faster if you use an index.
Also consider delaying updates, then after some time locking your table and performing multiple UPDATE
s at once – doing so is much quicker than performing one UPDATE
at a time.
In order to optimize DELETE
query performance, use the MySQL-specific LIMIT
clause to tell MySQL how many rows should be deleted at once and repeat the statement by slightly altering it afterwards – there’s not that much you can do in this scenario.
In short, in order to optimize query performance, make sure you’re only asking the database to return rows you actually need, don’t make the database examine unnecessary things, where appropriate, use indexes (bear in mind that indexes are only effective when the WHERE
clause is used – indexes slow down INSERT
performance), take a look at partitioning, EXPLAIN
and profile your queries if needed, also consider utilizing the LIMIT
clause.
There have been rumors about a data breach targeting Schneider Electric. Did a data breach…
There have been rumors about the Fiskars Group – the company behind Fiskars scissors and…
Russia has fined Google more than two undecillion roubles because Google has refused to pay…
Why does RockYou 2024.txt look like a binary file when you open it up? Find…
Duolicious is a dating app that connects people who are “chronically online.” Did the Duolicious…
This blog will tell you what RockYou 2024 is, how RockYou 2024.txt came to be,…