Stop Writing Manual Retries: Use Resilience4j the Right Way with Spring Boot 3 and Feign 🔥

Victor L. Batista

February 12, 2026

In modern distributed systems, failures are not exceptions — they are expected behavior.

Yet many Java applications still implement retry logic using manual try/catch blocks, nested conditionals, and ad-hoc exception handling.

It works… until it doesn't.

If your Spring Boot service depends on an external API via Feign, and you're manually deciding which exceptions are retryable, it's time to level up.

This is where Resilience4j becomes essential.

The Real Problem: Retry Logic Scattered Across the Codebase

A common pattern looks like this:

try {
    return feignClient.call();
} catch (FeignException ex) {
    if (ex.status() == 503 || ex.status() == 504) {
        // retry logic here
    } else {
        throw ex;
    }
}

Now multiply that across multiple services.

Problems:

Retry rules are duplicated
Business logic gets polluted
No centralized configuration
No metrics
Hard to maintain
Hard to evolve

Senior engineers will almost always push back on this approach — and for good reason.

What the Market Uses Today

The most common production stack in the Spring ecosystem:

Spring Boot 3
Spring Cloud OpenFeign
Resilience4j
Micrometer for metrics

This setup allows you to:

Configure retry centrally
Separate infrastructure from business logic
Apply fallback cleanly
Add observability automatically

Retry vs Fallback: Understanding the Difference

Before writing code, understand the responsibilities:

Retry

Used when failure is transient.

Examples:

Timeout
Connection reset
502, 503, 504
429 (with backoff)

Retry assumes the system might succeed if attempted again.
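The examples above can be collapsed into a single predicate. The following is a minimal sketch in plain Java; `TransientFailures` is a hypothetical helper invented for illustration and is not part of Resilience4j or Feign.

```java
import java.io.IOException;
import java.util.Set;

/** Hypothetical helper, for illustration only: decides whether a failure looks transient. */
final class TransientFailures {

    // Status codes from the list above: throttling plus gateway/availability errors.
    private static final Set<Integer> RETRYABLE_STATUS = Set.of(429, 502, 503, 504);

    private TransientFailures() {
    }

    /** True when the HTTP status suggests the same request may succeed later. */
    static boolean isRetryableStatus(int status) {
        return RETRYABLE_STATUS.contains(status);
    }

    /**
     * True for I/O-level failures. SocketTimeoutException (timeouts) and
     * SocketException (connection reset) are both subclasses of IOException.
     */
    static boolean isRetryableException(Throwable error) {
        return error instanceof IOException;
    }
}
```

Anything that does not match, such as a 400 or a validation error, should fail immediately instead of being attempted again.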

Fallback

Used when retry still fails — or when you want controlled degradation.

Examples:

Return default classification
Use cached data
Send message to a queue
Return partial response

Fallback is not retry.

Fallback is your safety net.
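As one illustration of the "use cached data" option, here is a small plain-Java sketch. `CachedFallback` and its constructor parameters are hypothetical names, and the sketch assumes the live call never returns null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical wrapper: serves the last good value when the live call fails. */
final class CachedFallback<K, V> {

    private final Map<K, V> lastGood = new ConcurrentHashMap<>();
    private final Function<K, V> liveCall;
    private final V defaultValue;

    CachedFallback(Function<K, V> liveCall, V defaultValue) {
        this.liveCall = liveCall;
        this.defaultValue = defaultValue;
    }

    V get(K key) {
        try {
            V fresh = liveCall.apply(key);
            lastGood.put(key, fresh); // remember the latest successful answer
            return fresh;
        } catch (RuntimeException ex) {
            // Degrade: prefer stale data, use the default if nothing was cached yet.
            return lastGood.getOrDefault(key, defaultValue);
        }
    }
}
```

The caller always gets an answer, but it is a degraded one. In a real service you would also log or count the fallback so the degradation stays visible.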

Step 1 — Add Resilience4j to Spring Boot 3

Dependency (Gradle example):

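implementation 'io.github.resilience4j:resilience4j-spring-boot3'
implementation 'org.springframework.cloud:spring-cloud-starter-openfeign'
// The @Retry annotation is applied through Spring AOP, so the AOP starter is required
implementation 'org.springframework.boot:spring-boot-starter-aop'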

This is a common combination in production Spring systems.

Step 2 — Annotate Your Service with Retry

Instead of manual retry logic, do this:

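import io.github.resilience4j.retry.annotation.Retry;
import org.springframework.stereotype.Service;

@Service
public class ClassificationService {

    private final ExternalFeignClient feignClient;

    public ClassificationService(ExternalFeignClient feignClient) {
        this.feignClient = feignClient;
    }

    // "externalServiceRetry" must match the instance name in application.yml
    @Retry(name = "externalServiceRetry", fallbackMethod = "fallbackClassification")
    public Classification classify(String id) {
        ExternalResponse response = feignClient.getData(id);
        return Classification.from(response);
    }

    // Same parameters as classify() plus the exception as the last argument.
    // Runs once retries are exhausted, and also for exceptions that were never
    // retried (a 400, for example), because the parameter type is Throwable.
    private Classification fallbackClassification(String id, Throwable throwable) {
        // Controlled degradation
        return Classification.UNKNOWN;
    }
}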

No try/catch.

No duplicated retry logic.

Business logic remains clean.
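Conceptually, the annotation wraps the method in something like the loop below. This is a simplified plain-Java sketch and not Resilience4j's actual implementation: `RetrySketch` and its parameters are hypothetical names, the wait is fixed, and it retries every RuntimeException, whereas the real library consults the configured rules.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical illustration of retry-then-fallback; not Resilience4j code. */
final class RetrySketch {

    private RetrySketch() {
    }

    /**
     * Calls the supplier up to maxAttempts times, sleeping waitMillis between
     * failed attempts. If every attempt fails, the last exception goes to the fallback.
     */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long waitMillis,
                               Function<Throwable, T> fallback) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException ex) {
                last = ex;
                if (attempt < maxAttempts) {
                    sleep(waitMillis); // no wait after the final attempt
                }
            }
        }
        return fallback.apply(last);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }
}
```

Seeing the loop written out makes the benefit clearer: this is the code you no longer repeat in every service method.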

Step 3 — Define What Is Retryable (The Right Way)

In application.yml:

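resilience4j:
  retry:
    instances:
      externalServiceRetry:
        max-attempts: 3            # includes the initial call
        wait-duration: 500ms
        enable-exponential-backoff: true
        exponential-backoff-multiplier: 2
        retry-exceptions:
          - feign.RetryableException
          - java.net.SocketTimeoutException
          - java.io.IOException
          # 5xx responses surface as FeignException subclasses, so list them explicitly
          - feign.FeignException$BadGateway
          - feign.FeignException$ServiceUnavailable
          - feign.FeignException$GatewayTimeout
        ignore-exceptions:
          - feign.FeignException$BadRequest
          - feign.FeignException$Unauthorized
          - feign.FeignException$Forbidden
          - feign.FeignException$NotFound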

This is key.

You define retry behavior in configuration — not inside business methods.
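To make the numbers concrete: `max-attempts` counts the initial call, so three attempts means at most two waits, and with a multiplier of 2 each wait is double the previous one. The sketch below works out that arithmetic in plain Java. `BackoffSchedule` is a hypothetical helper, not a Resilience4j class, and it ignores jitter.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the waits between attempts under exponential backoff. */
final class BackoffSchedule {

    private BackoffSchedule() {
    }

    /**
     * maxAttempts counts the initial call, so there are maxAttempts - 1 waits.
     * The n-th wait is initial * multiplier^(n - 1).
     */
    static List<Duration> waits(int maxAttempts, Duration initial, double multiplier) {
        List<Duration> result = new ArrayList<>();
        double millis = initial.toMillis();
        for (int retry = 1; retry < maxAttempts; retry++) {
            result.add(Duration.ofMillis(Math.round(millis)));
            millis *= multiplier;
        }
        return result;
    }
}
```

Calling `BackoffSchedule.waits(3, Duration.ofMillis(500), 2.0)` with the values from the configuration above yields waits of 500 ms and 1000 ms. A call that fails all three attempts therefore spends about 1.5 seconds waiting, on top of the time spent in the calls themselves.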

Advanced: Retry Only for Specific HTTP Status Codes

Sometimes you need more control.

For example, retry only for 429, 502, 503, and 504 responses, plus I/O failures.

In that case, use a custom RetryConfig:

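import feign.FeignException;
import feign.RetryableException;
import io.github.resilience4j.retry.RetryConfig;
import java.io.IOException;
import java.time.Duration;
import org.springframework.context.annotation.Bean;

// Note: Spring Boot does not apply a bare RetryConfig bean to named instances by
// itself; pass it to the RetryRegistry, or move the same predicate into a
// RetryConfigCustomizer for "externalServiceRetry".
@Bean
public RetryConfig retryConfig() {
    return RetryConfig.custom()
            .maxAttempts(3) // includes the initial call
            .waitDuration(Duration.ofMillis(400))
            .retryOnException(ex -> {
                if (ex instanceof RetryableException) {
                    return true; // Feign wraps I/O failures in RetryableException
                }
                if (ex instanceof FeignException fe) {
                    // Retry only throttling and gateway/availability errors
                    int status = fe.status();
                    return status == 429 ||
                           status == 502 ||
                           status == 503 ||
                           status == 504;
                }
                return ex instanceof IOException;
            })
            .build();
}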

This is much cleaner than spreading status checks everywhere.

Why This Is Better Than Manual Retry

Using Resilience4j gives you:

1. Centralized Policy

Retry rules live in one place.

2. Metrics Out of the Box

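With Actuator and Micrometer on the classpath, the starter publishes retry metrics such as resilience4j.retry.calls, tagged by instance name and by outcome (successful or failed, with or without retry).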

3. Clean Business Code

Your service focuses only on logic.

4. Production-Ready Observability

Integrates easily with Micrometer, Prometheus, and Grafana.

5. Easy Evolution

Need to change retry from 3 attempts to 5? Just change the config.

Common Mistakes to Avoid

🚫 Retrying 4xx errors

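Retrying invalid requests wastes resources: a 400 or 404 will fail the same way every time. The one exception is 429, which is worth retrying with backoff.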

🚫 No backoff strategy

Instant retries can overload a struggling service.

🚫 Using fallback to hide systemic problems

Fallback is controlled degradation, not silent failure.

🚫 Forgetting circuit breakers

Retry alone is not enough in unstable systems.
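To illustrate why a circuit breaker complements retry, here is a deliberately small consecutive-failure breaker in plain Java. It is a teaching sketch with hypothetical names (`SimpleCircuitBreaker`, `failureThreshold`, `openMillis`), not how Resilience4j's CircuitBreaker is implemented; the real one works with sliding windows and failure-rate thresholds.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical minimal breaker: opens after N consecutive failures, then fails fast. */
final class SimpleCircuitBreaker {

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so the behaviour can be tested without sleeping

    private int consecutiveFailures;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // synchronized keeps the sketch simple; it also serializes calls, which a real breaker avoids.
    synchronized <T> T call(Supplier<T> action) {
        if (openedAt >= 0 && clock.getAsLong() - openedAt < openMillis) {
            // Open: reject immediately instead of adding load to a struggling service.
            throw new IllegalStateException("circuit open");
        }
        try {
            T result = action.get(); // closed, or a trial call after the open period
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException ex) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong(); // open (or re-open) the circuit
            }
            throw ex;
        }
    }
}
```

While the circuit is open, callers fail fast and no request reaches the downstream service. That relief is exactly what retry alone cannot provide, since retry only ever adds calls.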

Final Thoughts

Manual retry logic might work in small projects.

But in real-world distributed systems, it becomes technical debt.

Resilience4j with Spring Boot 3 and OpenFeign gives you:

Structured resilience
Configurable retry behavior
Safe fallbacks
Production-grade patterns

Resilience is not about preventing failure.

It's about designing systems that behave predictably when failure happens.

And that's what separates junior implementations from production-ready architecture.

#java #microservices #software-development #software-engineering #spring-boot
