[pypy-issue] Issue #2202: strange results from uuid.uuid4() when generating many uuid4 values in a short period of time (pypy/pypy)

Brian Corbin issues-reply at bitbucket.org
Fri Dec 4 16:07:13 EST 2015


New issue 2202: strange results from uuid.uuid4() when generating many uuid4 values in a short period of time
https://bitbucket.org/pypy/pypy/issues/2202/strange-results-from-uuiduuid4-when

Brian Corbin:

Hi,

I've been working to switch a project over to run with pypy.
Initial results show some significant speed ups in key parts of the project!
I'm down to just a couple of test cases failing in this project when running under pypy.
One of the test failures seems to be related to some strange behavior when generating a lot of uuid4 values back-to-back. One of these tests creates some sample dictionaries that each have a uuid associated with them. The test case uses these uuids for a few things, and it fails intermittently because some of the dictionaries end up with the same uuid when we expect every value to be unique.

Here's a snippet to demonstrate the behavior I've been seeing with PyPy 4.0.1 in an Ubuntu VM:

```
$ uname -a
Linux vagrant-ubuntu-trusty-64 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
$ python
Python 2.7.10 (5f8302b8bf9f, Nov 18 2015, 10:46:46)
[PyPy 4.0.1 with GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>> import uuid
>>>> uuid._uuid_generate_random
<_ctypes.function.CFuncPtrFast object at 0x00007f9fe88565a8>
>>>> for i in range(1000):
....     x = str(uuid.uuid4())
....     if '00000' in x:
....         print(x)
....
04000000-0000-0000-0000-000000000000
ffffffff-ffff-ffff-788f-050000000000
ffffffff-ffff-ffff-788f-050000000000
04000000-0000-0000-ffff-ffffffffffff
908823e8-9f7f-0000-2805-000000000000
01000000-0000-0000-00c8-3de89f7f0000
000133e8-9f7f-0000-0000-000000000000
183fcbe8-9f7f-0000-0000-000000000000
00010000-0000-0000-000f-050000000000
00013ee8-9f7f-0000-788f-050000000000
ffffffff-ffff-ffff-788f-050000000000
01000000-0000-0000-ffff-ffffffffffff
ffffffff-ffff-ffff-000f-050000000000
01000000-0000-0000-ffff-ffffffffffff
02000000-0000-0000-0832-cce89f7f0000
0001cde8-9f7f-0000-0000-000000000000
00000000-0000-0000-a834-cde89f7f0000
ffffffff-ffff-ffff-e015-080000000000
00000000-0000-0000-d6ff-ffffffffffff
00000000-0000-0000-0000-000000000000
14000000-0000-0000-0000-000000000000
d6ffffff-ffff-ffff-4090-060000000000
60e93de8-9f7f-0000-0000-000000000000
01000000-0000-0000-783b-cfe89f7f0000
00000000-0000-0000-387b-080000000000
ffffffff-ffff-ffff-0100-000000000000
ffffffff-ffff-ffff-0000-000000000000
00000000-0000-0000-0000-000000000000
b01fcfe8-9f7f-0000-2805-000000000000
00000000-0000-0000-ffff-ffffffffffff
00000000-0000-0000-809b-40e89f7f0000
01000000-0000-0000-7868-d0e89f7f0000
01000000-0000-0000-0100-000000000000
01000000-0000-0000-0001-3de89f7f0000
48770100-0000-0000-0000-000000000000
00000000-0000-0000-2805-000000000000
00000000-0000-0000-0200-000000000000
00000000-0000-0000-0000-000000000000
08000000-0000-0000-0300-000000000000
00000000-0000-0000-20cd-4dee9f7f0000
00000000-0000-0000-20c7-d1e89f7f0000
02000000-0000-0000-e00a-d2e89f7f0000
18480700-0800-0000-0000-000000000000
48770100-0000-0000-0000-000000000000
04000000-0000-0000-60d4-31e89f7f0000
2a000000-0000-0000-58b2-d2e89f7f0000
28050000-0000-0000-0200-000000000000
10710100-0000-0000-f065-080000000000
01000000-0000-0000-0101-cfe89f7f0000
00000000-0000-0000-0000-000000000000
```
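None of the values printed above carries the version digit `4` in its third group, so they can also be flagged without searching for runs of zeros. Here is a minimal sketch of such a check, assuming a well-formed uuid4 should report the RFC 4122 variant and version 4. The helper name `looks_like_uuid4` is hypothetical and not part of the `uuid` module.

```python
import uuid


def looks_like_uuid4(text):
    """Return True if text parses as a UUID with RFC 4122 variant and version 4.

    Hypothetical helper for illustration only.
    """
    try:
        value = uuid.UUID(text)
    except ValueError:
        # Not a parseable UUID string at all.
        return False
    # UUID.version is None unless the variant is RFC 4122.
    return value.variant == uuid.RFC_4122 and value.version == 4


if __name__ == "__main__":
    # Two of the suspicious values from the session above.
    for sample in ("04000000-0000-0000-0000-000000000000",
                   "ffffffff-ffff-ffff-788f-050000000000"):
        print(sample, looks_like_uuid4(sample))
```

Running the same check over the output of `uuid.uuid4()` in a loop would count how many generated values are malformed, rather than only those that happen to contain `00000`.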

It seems like the `_uuid_generate_random` path here https://bitbucket.org/pypy/pypy/src/80ce6004b46af7a7ece6e30f1a53f8c98ad97203/lib-python/2.7/uuid.py?at=default&fileviewer=file-view-default#uuid.py-609 is producing different results from the following:
```
import os
return UUID(bytes=os.urandom(16), version=4)
```

The latter approach actually works as expected and our test suite passes when that's used under pypy.  

For now, I've switched the project over to just use `UUID(bytes=os.urandom(16), version=4)` in place of `uuid.uuid4()` while trying to determine what's going on there.  
I wanted to share this in case anyone else runs into strange behavior when they're generating a number of uuid4 values in a short period of time with PyPy 4.0.1.
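For anyone who wants to try the same workaround, it could be wrapped roughly as follows. This is only a sketch: the names `uuid4_from_urandom` and `count_duplicates` are made up for illustration, and the duplicate counter is just a quick way to compare generators, not a proof of correctness.

```python
import os
import uuid


def uuid4_from_urandom():
    """Build a version-4 UUID from os.urandom instead of calling uuid.uuid4().

    Hypothetical wrapper around the workaround described above.
    """
    return uuid.UUID(bytes=os.urandom(16), version=4)


def count_duplicates(make_uuid, n=1000):
    """Call make_uuid n times and count values that were already seen."""
    seen = set()
    duplicates = 0
    for _ in range(n):
        value = make_uuid()
        if value in seen:
            duplicates += 1
        seen.add(value)
    return duplicates


if __name__ == "__main__":
    print("uuid.uuid4 duplicates:", count_duplicates(uuid.uuid4))
    print("os.urandom duplicates:", count_duplicates(uuid4_from_urandom))
```

Comparing the two counts on an affected interpreter should show whether `uuid.uuid4()` is the source of the repeated values.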

Thank y'all for all the hard work on pypy!

Brian



