Tuesday, August 27, 2024

Don't use SELECT * in Production Query

 Hello guys, if you are doing a code review and see a SELECT * in production code, would you allow it? Well, its not a simple question as it looks like but let's find out pros and cons about using SELECT * in a production SQL query and then decide. I have read many articles on the internet where people suggest that using SELECT * in SQL queries is a bad practice and you should always avoid that, but they never care to explain why? Some of them will say you should always use an explicit list of columns in your SQL query, which is a good suggestion and one of the SQL best practices I teach to junior programmers, but many of them don't explain the reason behind it. 


Unless you explain some reasons why one should not use SELECT * in queries, it's difficult to convince many SQL developers,  many of whom have started learning SQL by doing SELECT * from EMP in the Oracle database. 

In this article, I will try to bridge that gap by giving some practical reasons why using SELECT * in the SQL query is not a good idea.


7 Reasons why using  SELECT * in a Production SQL Query is a Bad Practice

Without wasting any more of your time, here are my seven reasons which explain why using the SELECT star in a SQL query is a bad practice and you should avoid it.


1. Unnecessary IO

By using SELECT * you can be returning unnecessary data that will just be ignored but fetching that data is not free of cost. This results in some wasteful IO cycles at the DB end, since you will be reading all of that data off the pages, then perhaps you could have read the data from index pages. This can make your query a little bit slow as well. See these free SQL and Database courses to learn more.

Why Programmers Should not use SELECT * in SQL Query



2. Increased network traffic

SELECT * returns more data than required to the client which in turn will use more network bandwidth. This increase in network bandwidth also means that data will take a longer time to reach the client application which could be SSMS or your Java application server.


3. More application memory

due to this increase in data, your application may require more memory just to hold unnecessary data which it will not be used but coming from Microsoft SQL Server or any other database you are connecting to.


4. Dependency on Order of Columns on ResultSet

When you use the SELECT * query in your application and have any dependency on the order of columns, which you should not, the ordering of the result set will change if you add a new column or change the order of columns.




5. Breaking Views while adding new columns to a table

When you use SELECT * in views then you create subtle bugs if a new column has been added and the old one is removed from the table. Why? because your view will not break but start returning an incorrect result.

To avoid that, you should always use WITHSCHEMABINDING with views. This will also prevent you from using SELECT * in views. 


6. Conflict in JOIN Query 

When you use SELECT * in the JOIN query, you can introduce complications when multiple tables have columns with the same name like status, active, name, etc.

On a simple query, this might be fine but when you try to order by one of these columns or use the query in a CTE or derived table, you will need to make some adjustments. See Microsoft SQL for Beginners to learn more about JOINs in SQL Server.

Why you should not use SELECT * in SQL Query



7.  Copying data from one table to other

When you use SELECT * into INSERT .. SELECT statement, which is a common way to copy data from one table to another, you could potentially copy incorrect data into the incorrect column if the order of column is not the same between two tables.

Some programmers think that using SELECT * vs SELECT 1 in your EXISTS code is faster because the query parser had to do extra work to validate the static value.

That might have been true long ago but nowadays parser has become smart enough to know that within an EXISTS clause, the SELECT list is completely irrelevant.


That's all about why you should not use SELECT * in a Production SQL query anymore. It's always better to use the explicit column list in the SELECT query than a * wildcard. It not only improves the performance but also makes your code more explicit. It also helps you to create maintainable code, which will not break when you add new columns to your table especially if you have views that refer to the original table.


Other Programming Articles you may like to explore:

Thanks for reading this article so far. If you think that these points make sense then please share with your friends and colleagues. If you have any questions or feedback then please drop a comment.

P. S. - If you are keen to learn and improve your SQL skills but looking for free resources to start with, you can also check out this list of Free SQL and Database Courses for Programmers and Developers. 

No comments:

Post a Comment