Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Network error request retries #5415

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shauns
Copy link
Contributor

@shauns shauns commented Feb 18, 2025

WHY are these changes introduced?

Network errors during API calls can be transient and should be automatically retried to improve reliability. Currently, the CLI only retries on explicit rate limit errors (code 429), but should also handle common network issues like DNS failures or connection timeouts, where we are confident this is the cause of the issue.

WHAT is this pull request doing?

Network level retries are based on an elapsed time limit -- there are 10 seconds between starting the first attempt at the request and the retry loop being stopped. On each iteration of the retry loop, the delay is doubled. The first delay is 300ms, then 600 and so on.

The retry logic is implemented at a low level, and will affect all GraphQL and fetch requests made through CLI Kit. This includes non-GraphQL calls to identity.

Response-based retries -- using the HTTP response code & headers to trigger a retry -- are still supported and sit at a higher level than this.

  • Adds network-level retry capability to API requests when encountering common network errors
  • Implements exponential backoff for retries with configurable time limits
  • Adds ability to disable network retries via SHOPIFY_CLI_SKIP_NETWORK_LEVEL_RETRY environment variable
  • Includes comprehensive test coverage for the retry behaviour

Testing

Adjusting --path, run:

SHOPIFY_CLI_SKIP_NETWORK_LEVEL_RETRY=1 p shopify app init --path=<app folder> --verbose

When you get to the Remix/JS question, switch off wifi. Observe that the CLI fails immediately after.

Then:

p shopify app init --path=<app folder> --verbose

Again, switch off wifi when you get to Remix/JS question. Observe that the CLI starts to retry requests with an slower gap in between each retry. It should take no more than 10s before exiting.

Run it one more time. This time reactivate wifi once retries have begun. Observe that the CLI recovers.

Measuring impact

  • n/a - this doesn't need measurement, e.g. a linting rule or a bug-fix

Checklist

  • I've considered possible cross-platform impacts (Mac, Linux, Windows)
  • I've considered possible documentation changes

Copy link
Contributor Author

shauns commented Feb 18, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@shauns shauns marked this pull request as ready for review February 18, 2025 10:05
@shauns shauns requested a review from a team as a code owner February 18, 2025 10:05

This comment has been minimized.

Copy link
Contributor

github-actions bot commented Feb 18, 2025

Coverage report

St.
Category Percentage Covered / Total
🟡 Statements
76.04% (+0.13% 🔼)
9169/12058
🟡 Branches
71.19% (+0.13% 🔼)
4474/6285
🟡 Functions
75.61% (+0.09% 🔼)
2387/3157
🟡 Lines
76.59% (+0.13% 🔼)
8660/11307
Show new covered files 🐣
St.
File Statements Branches Functions Lines
🟢
... / sleep-with-backoff.ts
100% 100% 100% 100%

Test suite run success

2075 tests passing in 926 suites.

Report generated by 🧪jest coverage report action from 3daad05

@shauns shauns force-pushed the shauns/02-18-network_error_request_retries branch 2 times, most recently from 1beb73c to 51d4257 Compare February 18, 2025 10:32
@shauns shauns requested a review from a team as a code owner February 18, 2025 10:32
@shauns shauns force-pushed the shauns/02-18-network_error_request_retries branch from 51d4257 to b7c11d5 Compare February 18, 2025 10:47
Copy link
Contributor Author

shauns commented Feb 18, 2025

/snapit

Copy link
Contributor

🫰✨ Thanks @shauns! Your snapshot has been published to npm.

Test the snapshot by installing your package globally:

pnpm i -g @shopify/[email protected]

Tip

If you get an ETARGET error, install it with NPM and the flag --@shopify:registry=https://registry.npmjs.org

Caution

After installing, validate the version by running just shopify in your terminal.
If the versions don't match, you might have multiple global instances installed.
Use which shopify to find out which one you are running and uninstall it.

Copy link
Contributor

@graygilmore graygilmore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really like this change, code makes sense to me. Didn't 🎩

@shauns shauns force-pushed the shauns/02-18-network_error_request_retries branch from b7c11d5 to 2cb3b44 Compare February 19, 2025 09:38
Copy link
Contributor

@gonzaloriestra gonzaloriestra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I don't understand why we don't have this since the beginning...

But something is not working as expected. It hangs forever when I enable the Wifi during the retries:

retry.fail.mp4

@shauns shauns force-pushed the shauns/02-18-network_error_request_retries branch from 2cb3b44 to 9090835 Compare February 19, 2025 15:19
Copy link
Contributor

@karreiro karreiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR, @shauns! Excellent stuff 🚀

I've 🎩 with themes and works as expected:

demo.mp4

Thanks!

}
}

if (import.meta.vitest) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wat?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in calculateBackoffDelay is covered by other tests. But, I felt that testing it in isolation was helpful documentation, and that our API surface should not grow further by exporting it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's indeed helpful documentation, but I'd look for it in the test file, as always. If there's no easy way to test this through the exported function, then I prefer to export it than hiding it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't export things that aren't intended to be used by other modules. We have lots of cases of that and it makes understanding the code and correctly using it, and changing it, difficult. So of the options of export things that aren't used, not have the test, or use a headline feature of vitest, I think the latter is best. If you feel strongly then I think deletion is second best.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it has enough complexity to deserve tests, then maybe it should belong to a different file and be exported.

I also prefer to export only the required functions, but I don't think it's a big deal. But changing the expected place for tests in the project, even if it's a Vitest feature, it's a problem IMO.

My preferences in order are:

  • export or extract
  • test through sleepWithBackoffUntil
  • delete
  • in source

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exporting is making the project increasingly unwieldy, its a non-starter. We already have test coverage at the higher layer, its just less descriptive because its focused on the wider behaviour. So I'll delete it, but am on record as saying we need to be willing to use our tools capabilities more.

@shauns shauns force-pushed the shauns/02-18-network_error_request_retries branch from 9090835 to 3daad05 Compare February 21, 2025 14:18
Copy link
Contributor

Differences in type declarations

We detected differences in the type declarations generated by Typescript for this branch compared to the baseline ('main' branch). Please, review them to ensure they are backward-compatible. Here are some important things to keep in mind:

  • Some seemingly private modules might be re-exported through public modules.
  • If the branch is behind main you might see odd diffs, rebase main into this branch.

New type declarations

packages/cli-kit/dist/private/node/sleep-with-backoff.d.ts
interface BackoffResult {
    remainingMs: number;
    iterations: number;
}
/**
 * Generator that sleeps with exponential backoff between yields, stopping before exceeding a time limit
 *
 * Yields the amount of time slept in milliseconds.
 *
 * @param maxTimeMs - Maximum total time in milliseconds before stopping
 * @param firstDelayMs - First delay in milliseconds
 * @returns Information about the backoff sequence: remaining time and iteration count
 */
export declare function sleepWithBackoffUntil(maxTimeMs?: number, firstDelayMs?: number): AsyncGenerator<number, BackoffResult, unknown>;
export {};

Existing type declarations

packages/cli-kit/dist/public/node/environment.d.ts
@@ -55,4 +55,13 @@ export declare function jsonOutputEnabled(environment?: NodeJS.ProcessEnv): bool
  *
  * @returns True if the SHOPIFY_CLI_NEVER_USE_PARTNERS_API environment variable is set.
  */
-export declare function blockPartnersAccess(): boolean;
\ No newline at end of file
+export declare function blockPartnersAccess(): boolean;
+/**
+ * If true, the CLI should not use the network level retry.
+ *
+ * If there is an error when calling a network API that looks like a DNS or connectivity issue, the CLI will by default
+ * automatically retry the request.
+ *
+ * @returns True if the SHOPIFY_CLI_SKIP_NETWORK_LEVEL_RETRY environment variable is set.
+ */
+export declare function skipNetworkLevelRetry(): boolean;
\ No newline at end of file
packages/cli-kit/dist/private/node/api.d.ts
@@ -9,6 +9,23 @@ export declare function simpleRequestWithDebugLog<T extends {
     headers: Headers;
     status: number;
 }>({ request, url }: RequestOptions<T>, errorHandler?: (error: unknown, requestId: string | undefined) => unknown): Promise<T>;
+/**
+ * Makes a HTTP request to some API, retrying if response headers indicate a retryable error.
+ *
+ * If a request fails with a 429, the retry-after header determines a delay before an automatic retry is performed.
+ *
+ * If unauthorizedHandler is provided, then it will be called in the case of a 401 and a retry performed. This allows
+ * for a token refresh for instance.
+ *
+ * If there's a network error, e.g. DNS fails to resolve, then API calls are automatically retried.
+ *
+ * @param request - A function that returns a promise of the response
+ * @param url - The URL to request
+ * @param errorHandler - A function that handles errors
+ * @param unauthorizedHandler - A function that handles unauthorized errors
+ * @param retryOptions - Options for the retry
+ * @returns The response from the request
+ */
 export declare function retryAwareRequest<T extends {
     headers: Headers;
     status: number;
@@ -16,5 +33,6 @@ export declare function retryAwareRequest<T extends {
     limitRetriesTo?: number;
     defaultDelayMs?: number;
     scheduleDelay: (fn: () => void, delay: number) => void;
+    enableNetworkLevelRetry: boolean;
 }): Promise<T>;
 export {};
\ No newline at end of file
packages/cli-kit/dist/private/node/constants.d.ts
@@ -32,6 +32,7 @@ export declare const environmentVariables: {
     themeKitAccessDomain: string;
     json: string;
     neverUsePartnersApi: string;
+    skipNetworkLevelRetry: string;
 };
 export declare const defaultThemeKitAccessDomain = "theme-kit-access.shopifyapps.com";
 export declare const systemEnvironmentVariables: {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants