How does PHP 'foreach' actually work?



Answers

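In example 3 you don't modify the array. In all the other examples you modify either the contents or the internal array pointer. This matters for PHP arrays because of the semantics of the assignment operator.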

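The assignment operator for arrays in PHP works more like a lazy clone. Assigning one variable to another that contains an array clones the array, unlike in most languages. However, the actual cloning is not done unless it is needed. This means the clone takes place only when either of the variables is modified (copy-on-write).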

Here is an example:

$a = array(1,2,3);
$b = $a;  // This is lazy cloning of $a. For the time
          // being $a and $b point to the same internal
          // data structure.

$a[] = 3; // Here $a changes, which triggers the actual
          // cloning. From now on, $a and $b are two
          // different data structures. The same would
          // happen if there were a change in $b.

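Coming back to your test cases, you can easily imagine that foreach creates some kind of iterator holding a reference to the array. This reference works exactly like the variable $b in my example. However, the iterator and its reference live only for the duration of the loop, and then both are discarded. Now you can see that in every case except 3 the array is modified during the loop while this extra reference is alive. This triggers a clone, and that explains what is going on here!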
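As a rough illustration of that mental model (not of how the engine is actually implemented), the iterator's extra reference can be imitated with an explicit variable. The name $snapshot is made up for this sketch; it plays the role of $b above, and the loop reads from it while the body writes to $array.

```php
<?php
$array = array(1, 2, 3, 4, 5);

// $snapshot stands in for the reference held by foreach's iterator.
// The assignment is a lazy clone: nothing is copied yet.
$snapshot = $array;

$seen = array();
foreach ($snapshot as $item) {
    $seen[] = $item;
    // The first write to $array triggers the real copy (copy-on-write),
    // so $snapshot keeps its original five elements.
    $array[] = $item;
}

echo implode(' ', $seen), "\n"; // 1 2 3 4 5
echo count($snapshot), "\n";    // 5
echo count($array), "\n";       // 10
```

The loop sees only the five original values because it reads from $snapshot, which is never written to.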

This article describes another side effect of this copy-on-write behaviour: The PHP Ternary Operator: Fast or not?

Question

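Let me prefix this by saying that I know what foreach is, what it does and how to use it. This question concerns how it works under the bonnet, and I don't want any answers along the lines of "this is how you loop an array with foreach".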

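For a long time I assumed that foreach worked with the array itself. Then I found many references to the fact that it works with a copy of the array, and I assumed that was the end of the story. But I recently got into a discussion on the matter, and after a little experimentation found that this was not in fact 100% true.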

Let me show what I mean. For the following test cases, we will be working with the following array:

$array = array(1, 2, 3, 4, 5);

Test case 1

foreach ($array as $item) {
  echo "$item\n";
  $array[] = $item;
}
print_r($array);

/* Output in loop:    1 2 3 4 5
   $array after loop: 1 2 3 4 5 1 2 3 4 5 */

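This clearly shows that we are not working directly with the source array; otherwise the loop would continue forever, since we are constantly pushing items onto the array during the loop. But just to be sure this is the case: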

Test case 2

foreach ($array as $key => $item) {
  $array[$key + 1] = $item + 2;
  echo "$item\n";
}

print_r($array);

/* Output in loop:    1 2 3 4 5
   $array after loop: 1 3 4 5 6 7 */
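One way to check the "copy" reading of test case 2 is to run it twice: once as written, and once iterating over an explicitly assigned copy while writing to the original. The variable names below ($direct, $explicit, $copy) are invented for this sketch. If foreach behaves as though it loops over a copy, both runs should agree.

```php
<?php
// Direct version: iterate $direct while writing to it (test case 2).
$direct = array(1, 2, 3, 4, 5);
$seenDirect = array();
foreach ($direct as $key => $item) {
    $direct[$key + 1] = $item + 2;
    $seenDirect[] = $item;
}

// Explicit version: iterate a separate copy, write to the original.
$explicit = array(1, 2, 3, 4, 5);
$copy = $explicit;
$seenExplicit = array();
foreach ($copy as $key => $item) {
    $explicit[$key + 1] = $item + 2;
    $seenExplicit[] = $item;
}

var_dump($seenDirect === $seenExplicit); // bool(true)
var_dump($direct === $explicit);         // bool(true)
echo implode(' ', $direct), "\n";        // 1 3 4 5 6 7
```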

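This backs up our initial conclusion: we are working with a copy of the source array during the loop, otherwise we would see the modified values during the loop. But...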

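If we look in the manual, we find this statement:

When foreach first starts executing, the internal array pointer is automatically reset to the first element of the array.

Right... this seems to suggest that foreach relies on the array pointer of the source array. But we've just proved that we're not working with the source array, right? Well, not entirely.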

Test case 3

// Move the array pointer on one to make sure it doesn't affect the loop
var_dump(each($array));

foreach ($array as $item) {
  echo "$item\n";
}

var_dump(each($array));

/* Output
  array(4) {
    [1]=>
    int(1)
    ["value"]=>
    int(1)
    [0]=>
    int(0)
    ["key"]=>
    int(0)
  }
  1
  2
  3
  4
  5
  bool(false)
*/

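So despite the fact that we are not working directly with the source array, we are working directly with the source array pointer: the fact that the pointer is at the end of the array at the end of the loop shows this. Except this can't be true; if it was, then test case 1 would loop forever.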
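each() was deprecated in PHP 7.2 and removed in PHP 8.0, so test case 3 cannot be run unchanged on current versions. Below is a minimal sketch of the same probe using next() and current(), which are available in every version. Note the assumption: the values in the comments are for PHP 7.0 and later, where, according to the PHP 7.0 migration notes, foreach no longer moves the internal array pointer. On PHP 5 the second probe would instead report the end of the array, matching the output above.

```php
<?php
$array = array(1, 2, 3, 4, 5);

// Move the array pointer on one, as in test case 3.
next($array);
$before = current($array);

foreach ($array as $item) {
    // Body intentionally empty: only the pointer is of interest.
}

$after = current($array);

var_dump($before); // int(2)
var_dump($after);  // int(2) on PHP 7.0+; bool(false) on PHP 5, per the output above
```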

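The PHP manual also states:

As foreach relies on the internal array pointer, changing it within the loop may lead to unexpected behavior.

Well, let's find out what that "unexpected behavior" is (technically, any behavior is unexpected since I no longer know what to expect).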

Test case 4

foreach ($array as $key => $item) {
  echo "$item\n";
  each($array);
}

/* Output: 1 2 3 4 5 */

Test case 5

foreach ($array as $key => $item) {
  echo "$item\n";
  reset($array);
}

/* Output: 1 2 3 4 5 */

...nothing that unexpected there; in fact it seems to support the "copy of source" theory.
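For anyone repeating test cases 4 and 5 on a version without each(), here is a hedged variant that substitutes next() in test case 4 and collects the loop output in arrays instead of echoing it. The substitution and the $seenWithNext / $seenWithReset names are assumptions of this sketch, not part of the original tests.

```php
<?php
$array = array(1, 2, 3, 4, 5);

// Test case 4, with next() standing in for each().
$seenWithNext = array();
foreach ($array as $key => $item) {
    $seenWithNext[] = $item;
    next($array);
}

// Test case 5, unchanged apart from collecting the output.
$seenWithReset = array();
foreach ($array as $key => $item) {
    $seenWithReset[] = $item;
    reset($array);
}

echo implode(' ', $seenWithNext), "\n";  // 1 2 3 4 5
echo implode(' ', $seenWithReset), "\n"; // 1 2 3 4 5
```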

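The Question

What is going on here? My C-fu is not good enough for me to extract a proper conclusion simply by looking at the PHP source code, and I would appreciate it if someone could translate it into English for me.

It seems to me that foreach works with a copy of the array, but sets the array pointer of the source array to the end of the array after the loop.

  • Is this correct and the whole story?
  • If not, what is it really doing?
  • Is there any situation where using functions that adjust the array pointer (each(), reset() et al.) during a foreach could affect the outcome of the loop?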



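Explanation (quote from php.net):

The first form loops over the array given by array_expression. On each iteration, the value of the current element is assigned to $value and the internal array pointer is advanced by one (so on the next iteration, you'll be looking at the next element).

So, in your first example you only have one element in the array, and when the pointer is moved the next element does not exist, so after you add the new element foreach ends because it has already "decided" that it was on the last element.

In your second example, you start with two elements, and the foreach loop is not at the last element, so it evaluates the array on the next iteration and thus realises that there is a new element in the array.

I believe this is all a consequence of the "On each iteration" part of the explanation in the documentation, which probably means that foreach does all its logic before it calls the code in {}.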

Test case

If you run this:

<?php
    $array = Array(
        'foo' => 1,
        'bar' => 2
    );
    foreach($array as $k=>&$v) {
        $array['baz']=3;
        echo $v." ";
    }
    print_r($array);
?>

You will get this output:

1 2 3 Array
(
    [foo] => 1
    [bar] => 2
    [baz] => 3
)

Which means that it accepted the modification and went through it, because it was modified "in time". But if you do this:

<?php
    $array = Array(
        'foo' => 1,
        'bar' => 2
    );
    foreach($array as $k=>&$v) {
        if ($k=='bar') {
            $array['baz']=3;
        }
        echo $v." ";
    }
    print_r($array);
?>

You will get:

1 2 Array
(
    [foo] => 1
    [bar] => 2
    [baz] => 3
)

Which means that the array was modified, but since we modified it when foreach was already at the last element of the array, it "decided" not to loop any more; even though we added a new element, we added it "too late" and it was not looped through.
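The "too late" result depends on the PHP version. The sketch below repeats the second script but collects the values in a hypothetical $seen array so the result is easy to compare. The PHP 5 line in the comment restates the output quoted above. The PHP 7 line relies on the PHP 7.0 migration notes, which say that elements appended during by-reference iteration are now iterated over as well.

```php
<?php
$array = array('foo' => 1, 'bar' => 2);

$seen = array();
foreach ($array as $k => &$v) {
    if ($k == 'bar') {
        $array['baz'] = 3; // appended while the loop is on the last element
    }
    $seen[] = $v;
}
unset($v); // drop the reference left behind by the by-reference loop

echo implode(' ', $seen), "\n";
// PHP 5 (as described above): 1 2
// PHP 7.0 and later:          1 2 3
echo count($array), "\n"; // 3 on either version
```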

A detailed explanation can be read at How does PHP 'foreach' actually work?, which explains the internals behind this behaviour.




Great question, because many developers, even experienced ones, are confused by the way PHP handles arrays in foreach loops. In the standard foreach loop, PHP makes a copy of the array that is used in the loop. The copy is discarded immediately after the loop finishes. This is transparent in the operation of a simple foreach loop. For example:

$set = array("apple", "banana", "coconut");
foreach ( $set AS $item ) {
    echo "{$item}\n";
}

This outputs:

apple
banana
coconut

So the copy is created but the developer doesn't notice, because the original array isn't referenced within the loop or after the loop finishes. However, when you attempt to modify the items in a loop, you find that they are unmodified when you finish:

$set = array("apple", "banana", "coconut");
foreach ( $set AS $item ) {
    $item = strrev ($item);
}

print_r($set);

This outputs:

Array
(
    [0] => apple
    [1] => banana
    [2] => coconut
)

No changes from the original can be noticed; in fact there are no changes from the original, even though you clearly assigned a value to $item. This is because you are operating on $item as it appears in the copy of $set being worked on. You can override this by grabbing $item by reference, like so:

$set = array("apple", "banana", "coconut");
foreach ( $set AS &$item ) {
    $item = strrev($item);
}
print_r($set);

This outputs:

Array
(
    [0] => elppa
    [1] => ananab
    [2] => tunococ
)
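Two practical notes on the by-reference form, sketched under the assumption that only the values need rewriting. First, $item stays bound to the last element after the loop, which is why the PHP manual recommends calling unset($item) afterwards. Second, the same result can be had without references by writing back through the key. The $other, $key and $value names are illustrative only.

```php
<?php
$set = array("apple", "banana", "coconut");

// By reference, followed by unset() so that later assignments to $item
// do not silently overwrite the last element of $set.
foreach ($set as &$item) {
    $item = strrev($item);
}
unset($item);

// Without references: write back through the key instead.
$other = array("apple", "banana", "coconut");
foreach ($other as $key => $value) {
    $other[$key] = strrev($value);
}

var_dump($set === $other);     // bool(true)
echo implode(' ', $set), "\n"; // elppa ananab tunococ
```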

So it is evident and observable that when $item is operated on by reference, the changes made to $item are made to the members of the original $set. Using $item by reference also prevents PHP from creating the array copy. To test this, first we'll show a quick script demonstrating the copy:

$set = array("apple", "banana", "coconut");
foreach ( $set AS $item ) {
    $set[] = ucfirst($item);
}
print_r($set);

This outputs:

Array
(
    [0] => apple
    [1] => banana
    [2] => coconut
    [3] => Apple
    [4] => Banana
    [5] => Coconut
)

As it is shown in the example, PHP copied $set and used it to loop over, but when $set was used inside the loop, PHP added the variables to the original array, not the copied array. Basically, PHP is only using the copied array for the execution of the loop and the assignment of $item. Because of this, the loop above only executes 3 times, and each time it appends another value to the end of the original $set, leaving the original $set with 6 elements, but never entering an infinite loop.
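The iteration count can be checked directly with a counter. This is the same script as above with a hypothetical $iterations variable added.

```php
<?php
$set = array("apple", "banana", "coconut");

$iterations = 0;
foreach ($set as $item) {
    $set[] = ucfirst($item); // grows $set, not the array being looped over
    $iterations++;
}

echo $iterations, "\n"; // 3
echo count($set), "\n"; // 6
```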

However, what if we had used $item by reference, as I mentioned before? A single character added to the above test:

$set = array("apple", "banana", "coconut");
foreach ( $set AS &$item ) {
    $set[] = ucfirst($item);
}
print_r($set);

Results in an infinite loop. Note that this actually is an infinite loop; you'll have to either kill the script yourself or wait for your OS to run out of memory. I added the following line to my script so PHP would run out of memory very quickly; I suggest you do the same if you're going to run these infinite loop tests:

ini_set("memory_limit","1M");

So in this previous example with the infinite loop, we see the reason why PHP was written to create a copy of the array to loop over. When a copy is created and used only by the structure of the loop construct itself, the array stays static throughout the execution of the loop, so you'll never run into issues.
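A gentler way to observe the runaway loop than exhausting memory is to cap the number of iterations. The cap ($maxIterations) and the counter are additions for this sketch only; the loop body is otherwise the by-reference test from above.

```php
<?php
$set = array("apple", "banana", "coconut");

$maxIterations = 10; // hypothetical safety cap, not part of the original test
$iterations = 0;
foreach ($set as &$item) {
    $set[] = ucfirst($item);
    if (++$iterations >= $maxIterations) {
        break; // without this the loop keeps finding the elements it appends
    }
}
unset($item);

echo $iterations, "\n"; // 10
echo count($set), "\n"; // 13
```

The loop runs well past the three original elements and stops only because of the cap, which is the behaviour the by-value copy protects against.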



