



PHP: Perform HTTP requests in parallel

PHP

Ever had to request multiple HTTP resources in your web application? Often you need data from the first request before you can make the second – in that case there is little you can do but wait for the first to return. However, if the requests are independent of each other, you can use a pretty cool trick: curl_multi_*.

Let’s say you want to fetch the public data for VG.no and tech.vg.no from Facebook’s Graph API. You might use file_get_contents (sketched after the example below) or a standard cURL call:

<?php
$urls = array(
    'http://graph.facebook.com/http://tech.vg.no',
    'http://graph.facebook.com/http://www.vg.no',
);

// Fetch each URL sequentially, blocking on every request
foreach ($urls as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    echo curl_exec($ch);
    curl_close($ch);
}
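For reference, the file_get_contents variant mentioned above is even shorter, but behaves the same way – strictly sequential (a minimal sketch, reusing the $urls array from above; it requires allow_url_fopen to be enabled):

<?php
foreach ($urls as $url) {
    // Blocks until each request has completed – one at a time
    echo file_get_contents($url);
}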

The problem with this approach is obviously that it waits for each request to return before proceeding to the next. If each request takes 5 seconds to perform, we’d have to wait 10 seconds before we’d be able to process all our data and return it to the user.
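You can verify this by wrapping the loop in a simple timer. A minimal sketch – the httpbin.org URLs are just a convenient stand-in for endpoints that each take about two seconds to respond:

<?php
$urls = array(
    'https://httpbin.org/delay/2',
    'https://httpbin.org/delay/2',
);

$start = microtime(true);

foreach ($urls as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    curl_close($ch);
}

// Sequential: roughly 2 + 2 = 4 seconds in total
printf("Took %.2f seconds\n", microtime(true) - $start);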

So, how do we make these requests run in parallel? curl_multi_*. These functions have been around since PHP 5, but I’m sure many (like me) have not come across them before. Let’s take a look at how we could optimize our code using these functions:

<?php
$urls = array(
    'http://graph.facebook.com/http://tech.vg.no',
    'http://graph.facebook.com/http://www.vg.no',
);

$multi = curl_multi_init();
$channels = array();

// Loop through the URLs, create curl-handles
// and attach the handles to our multi-request
foreach ($urls as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    curl_multi_add_handle($multi, $ch);

    $channels[$url] = $ch;
}

// While we're still active, execute curl
$active = null;
do {
    $mrc = curl_multi_exec($multi, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    // Wait for activity on any curl-connection. On some systems,
    // curl_multi_select() can return -1 right away; sleep briefly
    // in that case to avoid busy-waiting
    if (curl_multi_select($multi) === -1) {
        usleep(100);
        continue;
    }

    // Continue to exec until curl is ready to
    // give us more data
    do {
        $mrc = curl_multi_exec($multi, $active);
    } while ($mrc == CURLM_CALL_MULTI_PERFORM);
}

// Loop through the channels and retrieve the received
// content, then remove the handle from the multi-handle
foreach ($channels as $channel) {
    echo curl_multi_getcontent($channel);
    curl_multi_remove_handle($multi, $channel);
}

// Close the multi-handle and return our results
curl_multi_close($multi);

Using this technique, we’ve reduced the total time down to that of the slowest request in the batch. Nice!
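One thing the example above glosses over is error handling: a failed transfer simply yields an empty body from curl_multi_getcontent. If you need to know which requests actually succeeded, you can drain the result queue with curl_multi_info_read after the exec loop, before the handles are removed and the multi-handle is closed (a small addition, not part of the original example):

// Check each finished transfer for errors before removing the handles
while ($info = curl_multi_info_read($multi)) {
    if ($info['result'] !== CURLE_OK) {
        // curl_error() reports the failure on the finished easy handle
        echo 'Request failed: ' . curl_error($info['handle']) . PHP_EOL;
    }
}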

However, I’m sure many (like me) are looking at this code and thinking: “Wow. That is a lot of code, and far from readable”. I agree. Thanks to a wonderful PHP HTTP client called Guzzle, we can achieve the same result with much more readable code:

<?php
use Guzzle\Http\Client,
    Guzzle\Common\Exception\MultiTransferException;

$client = new Client('http://graph.facebook.com');

try {
    $responses = $client->send(array(
        $client->get('/' . urlencode('http://tech.vg.no')),
        $client->get('/' . urlencode('http://www.vg.no')),
    ));

    foreach ($responses as $response) {
        echo $response->getBody();
    }
} catch (MultiTransferException $e) {
    echo 'The following exceptions were encountered:' . PHP_EOL;
    foreach ($e as $exception) {
        echo $exception->getMessage() . PHP_EOL;
    }
}
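Since each entry in $responses is a full response object rather than a raw body string, you can also inspect the result before using it – for instance the HTTP status code (a small usage example on top of the code above):

foreach ($responses as $response) {
    // getStatusCode() and getBody() are part of Guzzle's response API
    printf("%d: %s\n", $response->getStatusCode(), $response->getBody());
}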

I’ve put together a simple demo repository showing the difference in speed between these approaches. It’s available on GitHub for anyone who wants to take a look.

Happy HTTP’ing!

Developer at VG with a passion for Node.js, React, PHP and the web platform as a whole. espen.codes - @rexxars


8 comments

  • Juni

    Nice! Never seen either trick before. Thanks!


  • Wesam Alalem

    Thank you for sharing, great trick indeed.


  • Sergey

    There is also a great PHP library for using multithread curl: https://github.com/barbushin/multirequest


  • Digest of interesting news and material from the PHP world over the last two weeks (15.07.2013 — 28.07.2013) - Juds

    [...] Performing HTTP requests in parallel — A short post on using the curl_multi_* functions. The author also recommends the Guzzle library, which simplifies building RESTful clients in PHP. [...]


  • Ruslan Bekenev

    I’d never heard about Guzzle before. Thank you, it’s a great client.


  • arjun

    Really cool solution!
    curl vs php_pthreads: which one is faster and more reliable for making parallel requests?

    thank you!


    • Espen Hovlandsdal

      I would argue that the cURL solution (especially when using Guzzle or similar) is a simpler approach. Performance-wise it shouldn’t really matter much – your bottleneck will be the network, not local resources.

  • Giles Wells

    I have used this technique before. We were doing a custom social button implementation, and to get all the counts of likes, comments, tweets, +1’s and a couple of other miscellaneous counts, I had to make up to 24 requests to hit all the APIs we needed. Thankfully, Facebook has a batch API, or it would have been 35 requests. It dropped the total time for that AJAX call to just 2.5 seconds, down from closer to 15-20 per page load.

