count(*), count(1) 차이

count(*), count(1) 차이

2020. 9. 10. 23:29ㆍDB/MySQL

728x90

count(*)과 count(column명)에 대해서는 모든 사람들이 '차이가 있다, 다르다'고 말하지만

count(*)과 count(1)에 대해서는 간혹 의견이 엇갈리는 경우가 있다.

많은 사람들이 count(*)나 count(1)이나 똑같다고 말하지만,

나는 이 사람의 답변을 좀 더 파보고 싶다.

stackoverflow.com/questions/1221559/count-vs-count1-sql-server

I work on the SQL Server team and I can hopefully clarify a few points in this thread (I had not seen it previously, so I am sorry the engineering team has not done so previously).

First, there is no semantic difference between select count(1) from table vs. select count(*) from table. They return the same results in all cases (and it is a bug if not). As noted in the other answers, select count(column) from table is semantically different and does not always return the same results as count(*).

Second, with respect to performance, there are two aspects that would matter in SQL Server (and SQL Azure): compilation-time work and execution-time work. The Compilation time work is a trivially small amount of extra work in the current implementation. There is an expansion of the * to all columns in some cases followed by a reduction back to 1 column being output due to how some of the internal operations work in binding and optimization. I doubt it would show up in any measurable test, and it would likely get lost in the noise of all the other things that happen under the covers (such as auto-stats, xevent sessions, query store overhead, triggers, etc.). It is maybe a few thousand extra CPU instructions. So, count(1) does a tiny bit less work during compilation (which will usually happen once and the plan is cached across multiple subsequent executions). For execution time, assuming the plans are the same there should be no measurable difference. (One of the earlier examples shows a difference - it is most likely due to other factors on the machine if the plan is the same).

As to how the plan can potentially be different. These are extremely unlikely to happen, but it is potentially possible in the architecture of the current optimizer. SQL Server's optimizer works as a search program (think: computer program playing chess searching through various alternatives for different parts of the query and costing out the alternatives to find the cheapest plan in reasonable time). This search has a few limits on how it operates to keep query compilation finishing in reasonable time. For queries beyond the most trivial, there are phases of the search and they deal with tranches of queries based on how costly the optimizer thinks the query is to potentially execute. There are 3 main search phases, and each phase can run more aggressive(expensive) heuristics trying to find a cheaper plan than any prior solution. Ultimately, there is a decision process at the end of each phase that tries to determine whether it should return the plan it found so far or should it keep searching. This process uses the total time taken so far vs. the estimated cost of the best plan found so far. So, on different machines with different speeds of CPUs it is possible (albeit rare) to get different plans due to timing out in an earlier phase with a plan vs. continuing into the next search phase. There are also a few similar scenarios related to timing out of the last phase and potentially running out of memory on very, very expensive queries that consume all the memory on the machine (not usually a problem on 64-bit but it was a larger concern back on 32-bit servers). Ultimately, if you get a different plan the performance at runtime would differ. I don't think it is remotely likely that the difference in compilation time would EVER lead to any of these conditions happening.

Net-net: Please use whichever of the two you want as none of this matters in any practical form. (There are far, far larger factors that impact performance in SQL beyond this topic, honestly).

I hope this helps. I did write a book chapter about how the optimizer works but I don't know if its appropriate to post it here (as I get tiny royalties from it still I believe). So, instead of posting that I'll post a link to a talk I gave at SQLBits in the UK about how the optimizer works at a high level so you can see the different main phases of the search in a bit more detail if you want to learn about that. Here's the video link: https://sqlbits.com/Sessions/Event6/inside_the_sql_server_query_optimizer

share edit follow

edited Jan 15 '19 at 3:14

David Browne - Microsoft

41k4 gold badges20 silver badges43 bronze badges

answered Jan 15 '19 at 3:05

Conor Cunningham MSFT

2,7191 gold badge10 silver badges15 bronze badges

2

my belief is that 1 also undergoes the same expansion. I base this on the perf tests here stackoverflow.com/questions/1597442/… also see the example in that answer of a query using 1 failing unexpectedly when column level permissions are in play – Martin Smith Oct 14 '19 at 13:50

728x90

'DB > MySQL' 카테고리의 다른 글

MySQL8 root 권한 부여 (0)	2022.07.19
윈도우10 APM 설치시 'MySQL 서비스를 시작할 수 없습니다' (0)	2022.07.19
MySQL select max returns wrong result (0)	2020.09.05
mysql ERROR 1045 (28000): Access denied for user (0)	2020.01.20
ERROR 1932 (42S02): Table doesn't exist in engine (0)	2020.01.10

스터디룸

스터디룸

태그

최근글

댓글

공지사항

아카이브

'DB > MySQL' 카테고리의 다른 글

관련글

티스토리툴바