Ioannis Bekiaris

PHP Generators: To Yield or Not to Yield?

When I first worked on a Scala project, I came across yield. According to Scala’s official documentation, “yield is part of for comprehensions … Scala’s ‘for comprehension’ is nothing more than syntactic sugar for composition of multiple monadic operations”.

Only a few years ago I realised that yield existed in PHP as well. Generators were proposed in an RFC back in 2012 and shipped with PHP 5.5 in 2013.

In PHP, any function containing yield is a generator function. A generator function looks just like a normal function, except that, instead of returning a value, a generator yields as many values as it needs to.

In PHP, a generator is largely syntactic sugar for an Iterator implementation: calling a generator function returns a Generator object, which implements the Iterator interface for you.
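To make the comparison concrete, here is a rough sketch of what a hand-written Iterator might look like next to a generator doing the same job. The names NumbersRangeIterator and countUpTo are made up for this illustration; they are not part of PHP.

```php
<?php declare(strict_types=1);

// Hypothetical hand-written iterator: counts from 0 up to (not including) $max.
final class NumbersRangeIterator implements Iterator
{
    private int $max;
    private int $position = 0;

    public function __construct(int $max)
    {
        $this->max = $max;
    }

    public function current(): int { return $this->position; }
    public function key(): int { return $this->position; }
    public function next(): void { $this->position++; }
    public function rewind(): void { $this->position = 0; }
    public function valid(): bool { return $this->position < $this->max; }
}

// Hypothetical generator version: the same sequence without the boilerplate.
function countUpTo(int $max): Generator
{
    for ($i = 0; $i < $max; $i++) {
        yield $i;
    }
}

// Both are Iterators, so foreach treats them the same way.
var_dump(countUpTo(3) instanceof Iterator); // bool(true)
```

The class needs five methods and its own state to do what the generator expresses with a loop and a single yield.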

So what is so special about generator functions?

Let’s try to figure this out by using a simple example:

We will create a generator function that yields the numbers from 0 up to (but not including) a given maximum, using a fixed step of 1.

<?php declare(strict_types=1);

// Yields 0, 1, ..., $max - 1, one value at a time.
function numbersRange(int $max): Generator {
    for ($i = 0; $i < $max; $i++) {
        yield $i;
    }
}

foreach (numbersRange(10) as $number) {
    echo $number, PHP_EOL;
}

“We could do the same by building and returning a plain array”, someone could say; and they would be right.

<?php declare(strict_types=1);

// Builds the whole range in memory before returning it.
function numbersRange(int $max): array {
    $numbers = [];
    for ($i = 0; $i < $max; $i++) {
        $numbers[] = $i;
    }
    return $numbers;
}

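But let’s start thinking more like engineers. What if the max number is extremely big? What if we pass, for example, PHP_INT_MAX as an argument to both functions?

The first one starts printing values right away and its memory use stays flat, because only one number exists at a time (finishing the loop would take impractically long, but that is a separate matter). The second one fails with a fatal error before it returns anything, because it tries to hold every number in memory at once: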

Fatal error: Allowed memory size of n bytes exhausted

The heart of a generator function is the yield keyword. In its simplest form, a yield statement looks much like a return statement, except that, instead of stopping execution of the function and returning, yield provides a value to the code looping over the generator and pauses execution of the generator function.

This is the reason why in the first example we do not run out of memory.
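The pause-and-resume behaviour can be observed directly by recording when each side runs. The sketch below is only an illustration: tracedNumbers and the $log array are hypothetical names introduced here.

```php
<?php declare(strict_types=1);

// Hypothetical helper: yields 0 .. $max - 1 and records each step in $log.
function tracedNumbers(int $max, array &$log): Generator
{
    $log[] = 'generator started';
    for ($i = 0; $i < $max; $i++) {
        $log[] = "about to yield $i";
        yield $i; // execution pauses here until the consumer asks for more
    }
    $log[] = 'generator finished';
}

$log = [];
$numbers = tracedNumbers(2, $log);
$log[] = 'generator created'; // nothing inside the function has run yet

foreach ($numbers as $number) {
    $log[] = "consumer received $number";
}

echo implode(PHP_EOL, $log), PHP_EOL;
```

The log starts with "generator created", and after that the generator's entries alternate with the consumer's. The function body does not run when the function is called, and it only advances as far as the next yield each time foreach asks for a value.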

So far so good. What about processing time?

Some people raise concerns about processing time when using generators.

According to the official PHP documentation:

“A generator allows you to write code that uses foreach to iterate over a set of data without needing to build an array in memory, which may cause you to exceed a memory limit, or require a considerable amount of processing time to generate.”

If you care only about the iteration itself, then iterating over a generator takes longer than iterating over an array that already exists, since the generator has to compute each value at the moment it is requested.

However, when benchmarking (comparing processing time), bear in mind that you need time NOT only to traverse the array but also to create it!
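A fairer comparison therefore measures creation and traversal together. The following is a rough sketch rather than a rigorous benchmark; rangeAsArray, rangeAsGenerator and measure are names invented for this example, and the timings will vary with the machine and the PHP version.

```php
<?php declare(strict_types=1);

// Hypothetical array-based variant: builds everything first.
function rangeAsArray(int $max): array
{
    $numbers = [];
    for ($i = 0; $i < $max; $i++) {
        $numbers[] = $i;
    }
    return $numbers;
}

// Hypothetical generator-based variant: produces values on demand.
function rangeAsGenerator(int $max): Generator
{
    for ($i = 0; $i < $max; $i++) {
        yield $i;
    }
}

// Times creation plus traversal; returns [sum, elapsed seconds].
function measure(callable $factory, int $max): array
{
    $start = microtime(true);
    $sum = 0;
    foreach ($factory($max) as $number) {
        $sum += $number;
    }
    return [$sum, microtime(true) - $start];
}

foreach (['rangeAsArray', 'rangeAsGenerator'] as $factory) {
    [$sum, $seconds] = measure($factory, 50000);
    printf("%s: sum=%d, %.4f s%s", $factory, $sum, $seconds, PHP_EOL);
}
```

Because the clock starts before the array is built, the array variant pays for its construction inside the measurement, which is the cost that traversal-only benchmarks leave out.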

Bear in mind, too, that generator functions are not intended to replace arrays. Generators provide an easy, boilerplate-free way of implementing iterators.

Conclusion

In short, feel free to use generators wherever you have to traverse amounts of data large enough to exhaust memory and crash the application if loaded all at once. Big files or large database tables can be handled comfortably with a generator function!
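As an illustration of the file case, a generator can wrap fgets so that only one line is held in memory at a time. This is a minimal sketch: readLines is a hypothetical helper, and the temporary file exists only so that the example runs on its own.

```php
<?php declare(strict_types=1);

// Hypothetical helper: yields the lines of a file one at a time.
function readLines(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle); // also runs when the generator is destroyed early
    }
}

// Create a small throwaway file so the sketch is runnable as-is.
$path = tempnam(sys_get_temp_dir(), 'gen');
file_put_contents($path, "first\nsecond\nthird\n");

foreach (readLines($path) as $lineNumber => $line) {
    echo $lineNumber + 1, ': ', $line, PHP_EOL;
}

unlink($path);
```

The same shape applies to database tables, with a loop that fetches one row at a time taking the place of fgets.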