🐥 Efficiently downloading stuff with Req and concurrency in Elixir

When working with APIs that return large amounts of data, fetching resources sequentially can be slow and inefficient. If you’re using Elixir and the Req library to accomplish this, you can improve performance by making requests concurrently while ensuring that you don’t overload the system.

In this post, we’ll explore how to do this efficiently using Task.async_stream/3 to limit concurrency.

Making API requests in parallel speeds up data retrieval, but sending too many requests at once may:

Exceed API rate limits
Overload the server
Consume too many system resources

By setting a maximum concurrency level, we ensure that requests are handled efficiently without overwhelming the system.
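To see the cap in action without touching a real API, here is a minimal sketch that replaces the HTTP call with a short sleep and records how many jobs are running at once. The module name ConcurrencyDemo and the function peak_concurrency/2 are made up for this illustration; only Task.async_stream/3 and Erlang's :counters module are real.

```elixir
defmodule ConcurrencyDemo do
  # Runs `count` simulated jobs with at most `limit` in flight and
  # returns the highest number of jobs any job saw running at once.
  def peak_concurrency(count, limit) do
    # Slot 1 of the counter holds the number of jobs currently running.
    running = :counters.new(1, [:atomics])

    1..count
    |> Task.async_stream(
      fn _id ->
        :counters.add(running, 1, 1)
        seen = :counters.get(running, 1)
        # Stand-in for a slow HTTP request (assumption: 20 ms per call).
        Process.sleep(20)
        :counters.sub(running, 1, 1)
        seen
      end,
      max_concurrency: limit
    )
    |> Enum.map(fn {:ok, seen} -> seen end)
    |> Enum.max()
  end
end
```

Calling ConcurrencyDemo.peak_concurrency(10, 3) should never return more than 3, because a new task is only started once a running one has finished.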

The Task.async_stream/3 function provides a simple way to process a list of items concurrently with a specified limit. Below is an implementation for downloading multiple items efficiently:

defmodule Client do
  def fetch_items(item_ids) do
    max_concurrency = 5

    item_ids
    |> Task.async_stream(&fetch_item/1, max_concurrency: max_concurrency, timeout: 10_000)
    |> Enum.to_list()
  end

  defp fetch_item(item_id) do
    Req.get!("https://api.someserver.com/v1/download/#{item_id}").body
  end
end

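Task.async_stream/3: Runs the function over the list in separate tasks, keeping at most max_concurrency of them running at once.
max_concurrency: 5: At most 5 API requests are in flight at any moment. If the option is omitted, it defaults to System.schedulers_online/0.
timeout: 10_000: Each task has 10 seconds to finish (the default is 5 seconds). With the default on_timeout: :exit, a task that takes longer makes the calling process exit; pass on_timeout: :kill_task to get an {:exit, :timeout} entry in the results instead.
Results Handling: Enum.to_list/1 runs the stream and collects the results, in input order, as {:ok, result} tuples. {:exit, reason} tuples only show up with on_timeout: :kill_task or when the caller traps exits, because the tasks are linked to the caller.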
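If a slow request should not take the whole batch down, one option is on_timeout: :kill_task. The sketch below is an illustration under assumptions, not part of the Client module above: TimeoutDemo, fetch_all/3, and the fetch_fun argument are hypothetical names, and fetch_fun stands in for fetch_item/1 so the example runs without a network.

```elixir
defmodule TimeoutDemo do
  # `fetch_fun` is a hypothetical stand-in for the post's fetch_item/1.
  # `item_ids` is expected to be a list, since it is traversed twice.
  def fetch_all(item_ids, fetch_fun, timeout_ms) do
    results =
      item_ids
      |> Task.async_stream(fetch_fun,
        max_concurrency: 5,
        timeout: timeout_ms,
        # Kill a task that exceeds the timeout and emit {:exit, :timeout}
        # for it instead of exiting the calling process.
        on_timeout: :kill_task
      )
      |> Enum.to_list()

    # Results are emitted in input order by default (ordered: true),
    # so zipping pairs every result with the id it belongs to.
    item_ids
    |> Enum.zip(results)
    |> Enum.map(fn
      {id, {:ok, body}} -> {id, {:ok, body}}
      {id, {:exit, reason}} -> {id, {:error, reason}}
    end)
  end
end
```

Pairing each result with its id makes it easy to retry or log exactly the items that timed out.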

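External API calls can fail due to network issues, and Req.get!/2 raises when they do. Because the tasks are linked to the caller, one failed request can take the whole batch down. To handle failures more gracefully, use Req.get/2 and return tagged tuples: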

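defp fetch_item(item_id) do
  case Req.get("https://api.someserver.com/v1/download/#{item_id}") do
    # Any HTTP response, including 4xx and 5xx, arrives as {:ok, response}.
    {:ok, response} -> {:ok, response.body}
    # Transport-level failures (DNS, refused connections, timeouts) end up here.
    {:error, reason} -> {:error, item_id, reason}
  end
end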
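With this version of fetch_item/1, every stream element is wrapped twice: Task.async_stream/3 adds its own {:ok, ...} around the tagged tuple. A small helper can separate the successes from the failures. The following is a sketch; ResultSplitter and split/1 are hypothetical names, and the :unknown placeholder is an assumption for tasks that exited without reporting their id.

```elixir
defmodule ResultSplitter do
  # Splits Task.async_stream/3 output into {bodies, failures},
  # preserving the original order within each list.
  def split(stream_results) do
    {oks, errs} =
      Enum.reduce(stream_results, {[], []}, fn
        # Task finished and the request succeeded.
        {:ok, {:ok, body}}, {oks, errs} ->
          {[body | oks], errs}

        # Task finished but the request failed.
        {:ok, {:error, id, reason}}, {oks, errs} ->
          {oks, [{id, reason} | errs]}

        # Task itself exited (for example a timeout with on_timeout: :kill_task).
        {:exit, reason}, {oks, errs} ->
          {oks, [{:unknown, reason} | errs]}
      end)

    {Enum.reverse(oks), Enum.reverse(errs)}
  end
end
```

The stream returned by Task.async_stream/3 can be piped straight into ResultSplitter.split/1 in place of Enum.to_list/1.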

By using Task.async_stream/3, we can efficiently fetch multiple items concurrently while maintaining a controlled level of parallelism. This approach balances performance and system stability, making it ideal for API-heavy applications.
