Description
I was reading the Introduction of the book and found myself trying to understand why `random_walk_faster` is ~7 times faster than `random_walk`. That's a huge difference!
These are the functions:
```python
def random_walk(n):
    position = 0
    walk = [position]
    for i in range(n):
        position += 2*random.randint(0, 1)-1
        walk.append(position)
    return walk
```
```python
def random_walk_faster(n=1000):
    from itertools import accumulate
    # Only available from Python 3.6
    steps = random.choices([-1, +1], k=n)
    return [0] + list(accumulate(steps))
```
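For reference, the gap can be reproduced with a quick `timeit` sketch (a rough measurement; the exact ratio depends on machine and Python version):

```python
import random
import timeit
from itertools import accumulate

def random_walk(n):
    position = 0
    walk = [position]
    for i in range(n):
        position += 2 * random.randint(0, 1) - 1
        walk.append(position)
    return walk

def random_walk_faster(n=1000):
    steps = random.choices([-1, +1], k=n)
    return [0] + list(accumulate(steps))

# Time both on the same walk length; both return n+1 positions.
n = 1000
t_slow = timeit.timeit(lambda: random_walk(n), number=200)
t_fast = timeit.timeit(lambda: random_walk_faster(n), number=200)
print(f"speedup: ~{t_slow / t_fast:.1f}x")
```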
Running some tests, I figured out that most of the speed difference comes from the way we compute the next random step, not from using a "vectorized" approach instead of a procedural one, as stated in the explanation:

> In fact, we've just vectorized our function. Instead of looping for picking sequential steps and add them to the current position, we first generated all the steps at once and used the accumulate function to compute all the positions. We got rid of the loop and this makes things faster.

This last statement is also not accurate. Actually, we are just replacing an explicit loop with other loops hidden in Python's machinery (which can, of course, rely on faster functions implemented in C).
Looking at the implementation of `random.choices`, the trick to gain speed is to use something like `population[floor(random.random() * len(population))]` to compute each next step.
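The per-call cost of the two ways of drawing a step can be timed in isolation (again just a sketch; absolute numbers will vary, but `random.randint` carries noticeably more argument-checking overhead per call):

```python
import random
import timeit
from math import floor

population = [-1, +1]

# The original way: randint goes through randrange's argument checks.
t_randint = timeit.timeit(lambda: 2 * random.randint(0, 1) - 1,
                          number=100_000)

# The choices-style trick: one call to random() plus a list index.
t_index = timeit.timeit(
    lambda: population[floor(random.random() * len(population))],
    number=100_000)

print(f"randint: {t_randint:.3f}s  index: {t_index:.3f}s")
```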
Applying this in `random_walk`, we have:
```python
from math import floor

def random_walk(n):
    population = [-1, +1]
    position = 0
    walk = [position]
    for i in range(n):
        position += population[floor(random.random() * 2)]
        walk.append(position)
    return walk
```
Now the speed difference for 10k steps is only ~1.6x.