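Which is faster in Python: x**.5 or math.sqrt(x)?
I've been wondering about this for some time. As the title says, which is faster: the actual function, or simply raising to the half power?
UPDATE
This is not a matter of premature optimization. It is simply a question of how the underlying code actually works. What is the theory of how Python code works?
I sent Guido van Rossum an email because I really wanted to know the differences between these methods.
My email:
There are at least 3 ways to do a square root in Python: math.sqrt, the '**' operator and pow(x, .5). I'm just curious about the differences in the implementation of each of these. When it comes to efficiency, which is better?
His response:
pow and ** are equivalent; math.sqrt doesn't work for complex numbers, and links to the C sqrt() function. As to which one is faster, I have no idea...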
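The distinction drawn in the reply can be checked directly. The following is a minimal sketch that assumes Python 3 (an assumption; the code in this thread targets Python 2). It also uses cmath.sqrt, which the email does not mention, as the standard-library square root that accepts negative input.

```python
import cmath
import math

x = -4.0

# The ** operator and the built-in pow() give the same result.
power_result = x ** 0.5
pow_result = pow(x, 0.5)

# math.sqrt only handles real, non-negative input.
try:
    math.sqrt(x)
    sqrt_error = None
except ValueError as exc:
    sqrt_error = exc

# cmath.sqrt is the complex-aware counterpart.
complex_root = cmath.sqrt(x)

print(power_result == pow_result)
print(type(power_result).__name__)
print(repr(sqrt_error))
print(complex_root)
```

So the three spellings are interchangeable only for non-negative real input, which is the case every benchmark in this thread measures.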
As per the comments, I've updated the code:
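# Python 2 code (xrange and the print statement), matching the timings below.
import time
import math

def timeit1():
    # Time the ** operator.
    s = time.time()
    for i in xrange(750000):
        z = i**.5
    print "Took %f seconds" % (time.time() - s)

def timeit2(arg=math.sqrt):
    # Time math.sqrt, bound to a default argument so that it is a local name.
    s = time.time()
    for i in xrange(750000):
        z = arg(i)
    print "Took %f seconds" % (time.time() - s)

timeit1()
timeit2()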
Now the math.sqrt function is bound directly to a local argument, meaning it has the fastest possible lookup.
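To see why binding the function to an argument changes the lookup, one can compare the bytecode of the two call styles. This is a sketch that assumes CPython 3; the function names via_module, via_default and opnames are hypothetical helpers, and the exact opcode names differ between CPython versions.

```python
import dis
import math

def via_module(x):
    # 'math' is looked up as a global, then 'sqrt' is fetched from it.
    return math.sqrt(x)

def via_default(x, sqrt=math.sqrt):
    # 'sqrt' is a parameter, so it is an ordinary local variable.
    return sqrt(x)

def opnames(func):
    """Return the names of the bytecode instructions in func."""
    return [ins.opname for ins in dis.get_instructions(func)]

print(opnames(via_module))
print(opnames(via_default))
```

The first listing should contain a LOAD_GLOBAL for math plus an attribute or method load for sqrt, while the second only loads local names. Whether that difference is measurable depends on the interpreter version.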
UPDATE: The Python version seems to matter here. I used to think that timeit1 would be faster, since when Python parses "i**.5" it knows, syntactically, which method to call (__pow__ or some variant), so it doesn't have to go through the lookup overhead that the math.sqrt variant does. But I might be wrong:
Python 2.5: 0.191000 vs. 0.224000
Python 2.6: 0.195000 vs. 0.139000
Also, psyco seems to deal with math.sqrt better:
Python 2.5 + Psyco 2.0: 0.109000 vs. 0.043000
Python 2.6 + Psyco 2.0: 0.128000 vs. 0.067000
| Interpreter | x**.5, | sqrt, | sqrt faster, % |
| | seconds | seconds | |
|----------------+---------+---------+----------------|
| Python 3.2rc1+ | 0.32 | 0.27 | 19 |
| Python 3.1.2 | 0.136 | 0.088 | 55 |
| Python 3.0.1 | 0.155 | 0.102 | 52 |
| Python 2.7 | 0.132 | 0.079 | 67 |
| Python 2.6.6 | 0.121 | 0.075 | 61 |
| PyPy 1.4.1 | 0.083 | 0.0159 | 422 |
| Jython 2.5.1 | 0.132 | 0.22 | -40 |
| Python 2.5.5 | 0.129 | 0.125 | 3 |
| Python 2.4.6 | 0.131 | 0.123 | 7 |
#+TBLFM: $4=100*($2-$3)/$3;%.0f
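The last column follows the table formula $4=100*($2-$3)/$3, that is, how much longer x**.5 takes relative to sqrt, in percent. A short sketch of the same arithmetic, using three rows copied from the table (the function name sqrt_faster_percent is hypothetical):

```python
def sqrt_faster_percent(pow_seconds, sqrt_seconds):
    """Mirror the table formula $4 = 100*($2-$3)/$3."""
    return 100 * (pow_seconds - sqrt_seconds) / sqrt_seconds

# Timings copied from three rows of the table.
rows = [
    ("Python 2.7", 0.132, 0.079),
    ("PyPy 1.4.1", 0.083, 0.0159),
    ("Jython 2.5.1", 0.132, 0.22),
]

for name, pow_seconds, sqrt_seconds in rows:
    percent = sqrt_faster_percent(pow_seconds, sqrt_seconds)
    print("%-13s %.0f" % (name, percent))
```

A negative value, as for Jython, means sqrt was the slower of the two.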
The table results were produced on this machine:
$ uname -vms
Linux #42-Ubuntu SMP Thu Dec 2 02:41:37 UTC 2010 x86_64
$ cat /proc/cpuinfo | grep 'model name' | head -1
model name : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
To reproduce the results:
- get the source:
git clone git://gist.github.com/783011.git gist-783011
- install tox:
pip install tox
- run tox from the directory containing the tox.ini file.
- The first rule of optimization: don't do it
- The second rule: don't do it, yet
Here are some timings (Python 2.5.2, Windows):
$ python -mtimeit -s"from math import sqrt; x = 123" "x**.5"
1000000 loops, best of 3: 0.445 usec per loop
$ python -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
1000000 loops, best of 3: 0.574 usec per loop
$ python -mtimeit -s"import math; x = 123" "math.sqrt(x)"
1000000 loops, best of 3: 0.727 usec per loop
This test shows that x**.5 is slightly faster than sqrt(x).
For Python 3.0 the result is the opposite:
$ \Python30\python -mtimeit -s"from math import sqrt; x = 123" "x**.5"
1000000 loops, best of 3: 0.803 usec per loop
$ \Python30\python -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
1000000 loops, best of 3: 0.695 usec per loop
$ \Python30\python -mtimeit -s"import math; x = 123" "math.sqrt(x)"
1000000 loops, best of 3: 0.761 usec per loop
math.sqrt(x) is always faster than x**.5 on another machine (Ubuntu, Python 2.6 and 3.1):
$ python -mtimeit -s"from math import sqrt; x = 123" "x**.5"
10000000 loops, best of 3: 0.173 usec per loop
$ python -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
10000000 loops, best of 3: 0.115 usec per loop
$ python -mtimeit -s"import math; x = 123" "math.sqrt(x)"
10000000 loops, best of 3: 0.158 usec per loop
$ python3.1 -mtimeit -s"from math import sqrt; x = 123" "x**.5"
10000000 loops, best of 3: 0.194 usec per loop
$ python3.1 -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
10000000 loops, best of 3: 0.123 usec per loop
$ python3.1 -mtimeit -s"import math; x = 123" "math.sqrt(x)"
10000000 loops, best of 3: 0.157 usec per loop
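The same three measurements can also be run from a script with the standard timeit module instead of the command line. This is a sketch for Python 3; the loop count of 100000 is an arbitrary choice, and the numbers it prints will differ from the ones quoted in this answer.

```python
import timeit

NUMBER = 100000  # executions per timing run (arbitrary choice)
setup = "import math; from math import sqrt; x = 123"
statements = ["x**.5", "sqrt(x)", "math.sqrt(x)"]

results = {}
for stmt in statements:
    # Repeat the measurement and keep the fastest run,
    # like the "best of 3" figures reported by the timeit CLI runs.
    runs = timeit.repeat(stmt, setup=setup, repeat=3, number=NUMBER)
    results[stmt] = min(runs) / NUMBER * 1e6  # microseconds per loop

for stmt, usec in results.items():
    print("%-14s %.3f usec per loop" % (stmt, usec))
```

Taking the minimum over repeats reduces the influence of background load, which matters when the differences are a few hundredths of a microsecond.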
How many square roots are you really performing? Are you trying to write some 3D graphics engine in Python? If not, then why choose code that is cryptic over code that is easy to read? The time difference would be less than anybody could notice in just about any application I could foresee. I really don't mean to put down your question, but it seems that you're going a little too far with premature optimization.
In these micro-benchmarks, math.sqrt will be slower, because of the slight time it takes to look up sqrt in the math namespace. You can improve it slightly with
from math import sqrt
Even then, running a few variations through timeit shows a slight (4-5%) performance advantage for x**.5.
Interestingly, doing
import math
sqrt = math.sqrt
sped it up even more, to within 1% difference in speed, with very little statistical significance.
I will repeat Kibbee, and say that this is probably a premature optimization.
In Python 2.6 the (float).__pow__() function uses the C pow() function and the math.sqrt() function uses the C sqrt() function.
In glibc the implementation of pow(x,y) is quite complex and it is well optimized for various exceptional cases. For example, calling C pow(x,0.5) simply calls the sqrt() function.
The difference in speed between using ** and math.sqrt is caused by the wrappers used around the C functions, and the speed strongly depends on the optimization flags and C compiler used on the system.
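Whether pow(x, 0.5) is actually routed to sqrt() depends on the C library, which cannot be observed from Python. What can be checked is that the two routes agree numerically. The sketch below assumes Python 3; the sample size, value range and tolerance are arbitrary choices.

```python
import math
import random

random.seed(0)  # fixed seed so the sample is reproducible
samples = [random.uniform(0.0, 1e6) for _ in range(10000)]

# Count inputs where the two routes give bit-identical floats.
identical = sum(1 for x in samples if x ** 0.5 == math.sqrt(x))

# Check that any differences stay within floating-point rounding noise.
all_close = all(
    math.isclose(x ** 0.5, math.sqrt(x), rel_tol=1e-12) for x in samples
)

print("identical results: %d of %d" % (identical, len(samples)))
print("all within rel_tol=1e-12:", all_close)
```

If every result is identical on a given system, that is consistent with pow special-casing an exponent of 0.5, but it does not prove it.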
Edit:
Here are the results of Claudiu's algorithm on my machine. I got different results:
zoltan@host:~$ python2.4 p.py
Took 0.173994 seconds
Took 0.158991 seconds
zoltan@host:~$ python2.5 p.py
Took 0.182321 seconds
Took 0.155394 seconds
zoltan@host:~$ python2.6 p.py
Took 0.166766 seconds
Took 0.097018 seconds
Using Claudiu's code on my machine, x**.5 is faster even with "from math import sqrt", but with psyco.full(), sqrt(x) becomes much faster, by at least 200%.
Most likely math.sqrt(x), because it's optimized for square rooting.
Benchmarks will provide you the answer you are looking for.
For what it's worth (see Jim's answer). On my machine, running python 2.5:
PS C:\> python -m timeit -n 100000 10000**.5
100000 loops, best of 3: 0.0543 usec per loop
PS C:\> python -m timeit -n 100000 -s "import math" math.sqrt(10000)
100000 loops, best of 3: 0.162 usec per loop
PS C:\> python -m timeit -n 100000 -s "from math import sqrt" sqrt(10000)
100000 loops, best of 3: 0.0541 usec per loop
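One thing to keep in mind with a literal expression such as 10000**.5 is that the compiler may evaluate it once at compile time, so the timing loop never performs the power operation. The sketch below checks this on CPython 3 (an assumption; it says nothing certain about the Python 2.5 interpreter used in the run above).

```python
import dis

# Compile the literal expression from the benchmark and a variable-based one.
literal_code = compile("10000**.5", "<literal>", "eval")
variable_code = compile("x**.5", "<variable>", "eval")

# If the compiler folded the literal, its value appears among the constants.
print(literal_code.co_consts)
print(variable_code.co_consts)

# The disassembly shows whether a power operation is left at run time.
dis.dis(literal_code)
```

Benchmarking with a variable, as in x = 123 followed by x**.5, avoids this effect.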
Someone commented about the "fast Newton-Raphson square root" from Quake 3... I implemented it with ctypes, but it's super slow in comparison to the native versions. I'm going to try a few optimizations and alternate implementations.
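from ctypes import c_float, c_int32, byref, POINTER, cast

def sqrt(num):
    # Approximate sqrt(num) as num * (1/sqrt(num)), using the Quake 3 bit trick.
    xhalf = 0.5*num
    x = c_float(num)
    # Reinterpret the 32-bit float as a 32-bit integer (c_long is 64-bit on many platforms).
    i = cast(byref(x), POINTER(c_int32)).contents.value
    i = c_int32(0x5f375a86 - (i>>1))
    x = cast(byref(i), POINTER(c_float)).contents.value
    # Two Newton-Raphson refinement steps for 1/sqrt(num).
    x = x*(1.5-xhalf*x*x)
    x = x*(1.5-xhalf*x*x)
    return x * num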
Here's another method using struct. It comes out about 3.6x faster than the ctypes version, but is still 1/10 the speed of C.
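from struct import pack, unpack

def sqrt_struct(num):
    xhalf = 0.5*num
    # Reinterpret the float's 32 bits as an unsigned int ('I' is 4 bytes on common platforms; native 'L' may be 8).
    i = unpack('I', pack('f', num))[0]
    i = 0x5f375a86 - (i>>1)
    x = unpack('f', pack('I', i))[0]
    # Two Newton-Raphson refinement steps for 1/sqrt(num).
    x = x*(1.5-xhalf*x*x)
    x = x*(1.5-xhalf*x*x)
    return x * num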
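Speed aside, the bit trick only approximates the square root, so it is worth checking how close it gets. The following sketch for Python 3 applies the same idea with explicit little-endian 32-bit formats and compares it with math.sqrt. The name approx_sqrt and the test values are hypothetical, and the function is only meant for positive inputs in the float32 range.

```python
import math
import struct

def approx_sqrt(num):
    """Approximate sqrt(num) for positive num via the inverse-square-root bit trick."""
    xhalf = 0.5 * num
    # Reinterpret the float32 bit pattern as a 32-bit unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", num))[0]
    i = 0x5f375a86 - (i >> 1)
    x = struct.unpack("<f", struct.pack("<I", i))[0]
    # Two Newton-Raphson steps refine the estimate of 1/sqrt(num).
    x = x * (1.5 - xhalf * x * x)
    x = x * (1.5 - xhalf * x * x)
    return x * num

for value in (0.25, 2.0, 28.0, 10000.0):
    exact = math.sqrt(value)
    rel_error = abs(approx_sqrt(value) - exact) / exact
    print("%10.2f  relative error %.2e" % (value, rel_error))
```

The result is close but not exact, so it is a trade of accuracy for speed that, per the timings reported here, does not pay off in pure Python.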
Claudiu's results differ from mine. I'm using Python 2.6 on Ubuntu on an old P4 2.4Ghz machine... Here's my results:
>>> timeit1()
Took 0.564911 seconds
>>> timeit2()
Took 0.403087 seconds
>>> timeit1()
Took 0.604713 seconds
>>> timeit2()
Took 0.387749 seconds
>>> timeit1()
Took 0.587829 seconds
>>> timeit2()
Took 0.379381 seconds
sqrt is consistently faster for me... Even Codepad.org NOW seems to agree that sqrt, in the local context, is faster (http://codepad.org/6trzcM3j). Codepad seems to be running Python 2.5 presently. Perhaps they were using 2.4 or older when Claudiu first answered?
In fact, even using math.sqrt(i) in place of arg(i), I still get better times for sqrt. In this case timeit2() took between 0.53 and 0.55 seconds on my machine, which is still better than the 0.56-0.60 figures from timeit1.
I'd say, on modern Python, use math.sqrt and definitely bring it to local context, either with somevar=math.sqrt or with from math import sqrt.
You might want to benchmark the fast Newton-Raphson square root as well. Shouldn't take much to convert to Python.
The problem SQRMINSUM I solved recently requires computing square roots repeatedly on a large dataset. The oldest 2 submissions in my history, made before I applied other optimizations, differ solely by replacing **0.5 with sqrt(), which reduced the runtime from 3.74s to 0.51s in PyPy. This is almost twice the already massive 400% improvement that Claudiu measured.
The Pythonic thing to optimize for is readability. For this I think explicit use of the sqrt function is best. Having said that, let's investigate performance anyway.
I updated Claudiu's code for Python 3 and also made it impossible to optimize away the calculations (something a good Python compiler may do in the future):
from sys import version
from time import time
from math import sqrt, pi, e

print(version)

N = 1_000_000

def timeit1():
    z = N * e
    s = time()
    for n in range(N):
        z += (n * pi) ** .5 - z ** .5
    print(f"Took {(time() - s):.4f} seconds to calculate {z}")

def timeit2():
    z = N * e
    s = time()
    for n in range(N):
        z += sqrt(n * pi) - sqrt(z)
    print(f"Took {(time() - s):.4f} seconds to calculate {z}")

def timeit3(arg=sqrt):
    z = N * e
    s = time()
    for n in range(N):
        z += arg(n * pi) - arg(z)
    print(f"Took {(time() - s):.4f} seconds to calculate {z}")

timeit1()
timeit2()
timeit3()
Results vary, but a sample output is:
3.6.6 (default, Jul 19 2018, 14:25:17)
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)]
Took 0.3747 seconds to calculate 3130485.5713865166
Took 0.2899 seconds to calculate 3130485.5713865166
Took 0.2635 seconds to calculate 3130485.5713865166
What would be even faster is if you went into math.py and copied the function "sqrt" into your program. It takes time for your program to find math.py, then open it, find the function you are looking for, and then bring that back to your program. If that function is faster even with the "lookup" steps, then the function itself has to be awfully fast. It will probably cut your time in half. In summary:
- Go to math.py
- Find the function "sqrt"
- Copy it
- Paste function into your program as the sqrt finder.
- Time it.
Reference URL: https://stackoverflow.com/questions/327002/which-is-faster-in-python-x-5-or-math-sqrtx