Download failure under heavy load #974

Open
@sgillies

Description

Under high download concurrency, httpcore and httpx errors propagate up from the StreamingBody instance at https://github.com/planetlabs/planet-client-python/blob/main/planet/clients/orders.py#L259. These errors do not manifest at lower concurrency. Streaming responses is a strategy used to keep the memory footprint of programs manageable while downloading multiple large (up to ~100 MB) TIFFs concurrently.

Possible lead: the same kind of asyncio.exceptions.CancelledError is mentioned at agronholm/anyio#534. That issue was closed with the conclusion that callers have to expect read timeouts and work around them.
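If callers do have to work around read timeouts, a retry wrapper with exponential backoff is one option. This is a sketch under assumptions: `retry_async`, `make_attempt`, and the default values are hypothetical, and the caller chooses which exception types to retry (the tracebacks below end in httpx.ReadTimeout and httpx.RemoteProtocolError, which would be the natural candidates).

```python
import asyncio
import random


async def retry_async(make_attempt, retry_on, max_attempts=5, base_delay=1.0):
    """Await make_attempt() until it succeeds or max_attempts is exhausted.

    `make_attempt` is a zero-argument callable returning a fresh coroutine,
    because a coroutine object cannot be awaited twice.
    `retry_on` is an exception class or a tuple of exception classes.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_attempt()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the last error
            # Exponential backoff with jitter so retries do not synchronize.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
```

Note that a retried download starts over from the beginning unless the underlying call supports resuming, so this trades bandwidth for reliability.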

Possible workaround: separate order creation from order download. Order creation is more reliable and, when it does fail, it fails differently. Retrying order downloads is probably less complicated if they are de-interleaved from order creation. This project has tended to document order creation and download as tasks that are done together, but that may not be a best practice for large batches of orders.
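The de-interleaved workflow could be sketched roughly as follows. Everything here is illustrative: `create_then_download`, `create_order`, and `download_order` are hypothetical callables supplied by the caller, not this project's API. The sketch also passes `return_exceptions=True` to `asyncio.gather`, so that one failed download is recorded instead of propagating while the other downloads are still in flight.

```python
import asyncio


async def create_then_download(requests, create_order, download_order, limit=4):
    """Create all orders first, then download them with bounded concurrency.

    `create_order(request)` is assumed to return an order ID and
    `download_order(order_id)` to download that order's files.
    Returns (order_ids, failed_ids) so the failures can be retried later.
    """
    # Phase 1: creation only. Nothing is downloaded until every order exists.
    order_ids = [await create_order(request) for request in requests]

    # Phase 2: downloads only, at most `limit` at a time.
    semaphore = asyncio.Semaphore(limit)

    async def download(order_id):
        async with semaphore:
            await download_order(order_id)

    outcomes = await asyncio.gather(
        *(download(order_id) for order_id in order_ids),
        return_exceptions=True,  # collect errors instead of raising the first
    )
    failed_ids = [
        order_id
        for order_id, outcome in zip(order_ids, outcomes)
        if isinstance(outcome, BaseException)
    ]
    return order_ids, failed_ids
```

Because the order IDs survive a failed download, a later run can retry only `failed_ids` without creating new orders.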

Traceback 1:

Traceback (most recent call last):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/anyio/streams/tls.py", line 130, in _call_sslobject_method
    result = func(*args)
             ^^^^^^^^^^^
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/ssl.py", line 921, in read
    v = self._sslobj.read(len)
        ^^^^^^^^^^^^^^^^^^^^^^
ssl.SSLWantReadError: The operation did not complete (read) (_ssl.c:2576)
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/backends/asyncio.py", line 33, in read
    return await self._stream.receive(max_bytes=max_bytes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/anyio/streams/tls.py", line 195, in receive
    data = await self._call_sslobject_method(self._ssl_object.read, max_bytes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/anyio/streams/tls.py", line 137, in _call_sslobject_method
    data = await self.transport_stream.receive()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1265, in receive
    await self._protocol.read_event.wait()
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/asyncio/locks.py", line 213, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
    yield
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/backends/asyncio.py", line 31, in read
    with anyio.fail_after(timeout):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/anyio/_core/_tasks.py", line 118, in __exit__
    raise TimeoutError
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 239, in __aiter__
    async for part in self._httpcore_stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 346, in __aiter__
    async for part in self._stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 300, in __aiter__
    raise exc
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 293, in __aiter__
    async for chunk in self._connection._receive_response_body(**kwargs):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 165, in _receive_response_body
    event = await self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 177, in _receive_event
    data = await self._network_stream.read(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/backends/asyncio.py", line 30, in read
    with map_exceptions(exc_map):
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc)
httpcore.ReadTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 139, in run_task
    await asyncio.gather(*[
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 120, in create_and_download
    await client.download_order(order['id'], directory, progress_bar=True, overwrite=True)
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/clients/orders.py", line 298, in download_order
    filenames = [
                ^
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/clients/orders.py", line 299, in <listcomp>
    await self.download_asset(i['location'],
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/clients/orders.py", line 259, in download_asset
    await body.write(dl_path,
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/models.py", line 156, in write
    async for chunk in self._response.aiter_bytes():
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/models.py", line 72, in aiter_bytes
    async for c in self._http_response.aiter_bytes():
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_models.py", line 914, in aiter_bytes
    async for raw_bytes in self.aiter_raw():
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_models.py", line 972, in aiter_raw
    async for raw_stream_bytes in self.stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_client.py", line 146, in __aiter__
    async for chunk in self._stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 238, in __aiter__
    with map_httpcore_exceptions():
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ReadTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/taskpro/bin/taskpro", line 33, in <module>
    sys.exit(load_entry_point('taskpro', 'console_scripts', 'taskpro')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 144, in activate_and_download_orders
    asyncio.run(run_task())
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 125, in run_task
    async with planet.Session(auth=auth) as s:
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/http.py", line 290, in __aexit__
    await self.aclose()
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/http.py", line 293, in aclose
    await self._client.aclose()
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_client.py", line 1968, in aclose
    await self._transport.aclose()
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 365, in aclose
    await self._pool.aclose()
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 312, in aclose
    raise RuntimeError(
RuntimeError: The connection pool was closed while 27 HTTP requests/responses were still in-flight.

Traceback 2:

Traceback (most recent call last):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
    yield
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 174, in _receive_event
    event = self._h11_state.next_event()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/h11/_connection.py", line 425, in next_event
    event = self._extract_next_receive_event()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/h11/_connection.py", line 375, in _extract_next_receive_event
    event = self._reader.read_eof()
            ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/h11/_readers.py", line 117, in read_eof
    raise RemoteProtocolError(
h11._util.RemoteProtocolError: peer closed connection without sending complete message body (received 775503406 bytes, expected 1459578601)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 239, in __aiter__
    async for part in self._httpcore_stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 346, in __aiter__
    async for part in self._stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 300, in __aiter__
    raise exc
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 293, in __aiter__
    async for chunk in self._connection._receive_response_body(**kwargs):
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 165, in _receive_response_body
    event = await self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 173, in _receive_event
    with map_exceptions({h11.RemoteProtocolError: RemoteProtocolError}):
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc)
httpcore.RemoteProtocolError: peer closed connection without sending complete message body (received 775503406 bytes, expected 1459578601)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 139, in run_task
    await asyncio.gather(*[
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 120, in create_and_download
    await client.download_order(order['id'], directory, progress_bar=True, overwrite=True)
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/clients/orders.py", line 298, in download_order
    filenames = [
                ^
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/clients/orders.py", line 299, in <listcomp>
    await self.download_asset(i['location'],
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/clients/orders.py", line 259, in download_asset
    await body.write(dl_path,
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/models.py", line 156, in write
    async for chunk in self._response.aiter_bytes():
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/models.py", line 72, in aiter_bytes
    async for c in self._http_response.aiter_bytes():
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_models.py", line 914, in aiter_bytes
    async for raw_bytes in self.aiter_raw():
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_models.py", line 972, in aiter_raw
    async for raw_stream_bytes in self.stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_client.py", line 146, in __aiter__
    async for chunk in self._stream:
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 238, in __aiter__
    with map_httpcore_exceptions():
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: peer closed connection without sending complete message body (received 775503406 bytes, expected 1459578601)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/taskpro/bin/taskpro", line 33, in <module>
    sys.exit(load_entry_point('taskpro', 'console_scripts', 'taskpro')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 144, in activate_and_download_orders
    asyncio.run(run_task())
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/aayush.malik/Desktop/apollo-taskpro/taskpro/cli.py", line 125, in run_task
    async with planet.Session(auth=auth) as s:
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/http.py", line 290, in __aexit__
    await self.aclose()
  File "/opt/anaconda3/envs/taskpro/lib/python3.11/site-packages/planet/http.py", line 293, in aclose
    await self._client.aclose()
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_client.py", line 1968, in aclose
    await self._transport.aclose()
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 365, in aclose
    await self._pool.aclose()
  File "/Users/aayush.malik/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 312, in aclose
    raise RuntimeError(
RuntimeError: The connection pool was closed while 24 HTTP requests/responses were still in-flight.

cc @aayushmalik
