
Circuit Breaker in Practice

How to stop a slow or failing dependency from dragging the rest of your system down just because every request keeps failing the same way.

Andrews Ribeiro

Founder & Engineer

The problem

Some integrations get sick and drag everything else with them.

The flow usually looks like this:

  • service A depends on service B
  • service B becomes slow or unstable
  • service A keeps trying on every request
  • threads, connections, and internal queues start filling up
  • suddenly B’s problem becomes A’s problem too

That happens because insisting too much is also a kind of failure.

Mental model

A circuit breaker is a mechanism for answering one question:

when a dependency is failing too much, does it still make sense to keep trying right now?

If the answer is no, the system stops insisting for a while and fails fast.

The name helps because it works like an electrical breaker:

  • normal: circuit closed
  • too much failure: circuit opens
  • after some time: test whether it can close again

It is not magic.

It is damage control.
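The breaker analogy above can be sketched as a tiny state machine. This is a minimal illustration of the transitions, not any particular library's API; all names here are made up.

```python
from enum import Enum

class State(Enum):
    CLOSED = "closed"        # normal: calls go through
    OPEN = "open"            # too much failure: calls are rejected
    HALF_OPEN = "half_open"  # after a pause: a few test calls allowed

# The only transitions a circuit breaker ever makes:
TRANSITIONS = {
    (State.CLOSED, "failure_threshold_hit"): State.OPEN,
    (State.OPEN, "cooldown_elapsed"): State.HALF_OPEN,
    (State.HALF_OPEN, "probe_succeeded"): State.CLOSED,
    (State.HALF_OPEN, "probe_failed"): State.OPEN,
}

def next_state(state, event):
    # Any other (state, event) pair leaves the state unchanged.
    return TRANSITIONS.get((state, event), state)
```

Everything else in the pattern is bookkeeping around these four transitions.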

Breaking the problem down

Closed: the normal behavior

While the dependency is healthy, calls continue normally.

Closed means:

  • calls are allowed
  • success and failure are measured
  • the dependency keeps being observed

Open: stop insisting

When the error rate or timeout rate crosses a threshold, the circuit opens.

Opening means:

  • do not send new calls for a period
  • fail fast
  • free the rest of the system to continue with fallback, controlled error, or degradation

That point matters a lot.

Opening the circuit does not heal the dependency.

It only stops you from burning resources on something that is already unhealthy.

Half-open: test recovery

After an interval, the system allows a few controlled attempts to test recovery.

If they work, it closes again. If they keep failing, it opens again.

That avoids two bad extremes:

  • never testing again
  • releasing all traffic too early
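The three states can be combined into one small breaker. The sketch below is illustrative, not a specific library: the class name, the thresholds, and the consecutive-failure counting are all assumptions (production breakers often use sliding windows instead).

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock        # injectable for testing
        self.failures = 0         # consecutive failures while closed
        self.opened_at = None     # when the circuit last opened

    def _state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.cooldown_seconds:
            return "half_open"    # cooldown elapsed: allow a probe call
        return "open"

    def call(self, fn, *args, **kwargs):
        state = self._state()
        if state == "open":
            raise RuntimeError("circuit open: failing fast")  # no call made
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure(state)
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.failures = 0
        self.opened_at = None     # half-open probe worked: close again

    def _on_failure(self, state):
        if state == "half_open":
            self.opened_at = self.clock()  # probe failed: open again
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # threshold hit: open
```

Note that the breaker never heals anything: it only decides, per call, whether the call is worth making.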

Circuit breaker does not replace timeout

This mistake shows up a lot.

Without a timeout, the call may hang for too long.

Without well-designed retries, a small transient failure turns into an unnecessary error.

Without circuit breaker, a sick dependency keeps pulling your resources down with it.

Each mechanism solves a different part:

  • timeout limits waiting
  • retry handles transient failure
  • circuit breaker stops insistence once the pattern already looks like collapse
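One way to layer the three mechanisms looks like the sketch below. It is an assumption-heavy illustration: the function names are invented, the real timeout belongs inside `fn` (as a client option on the HTTP or database call, not custom code), and the breaker is represented abstractly as an `is_open` check plus a `record_failure` hook.

```python
import time

def resilient_call(fn, is_open, record_failure, retries=2, backoff=0.1):
    """Each layer covers a different part of the problem:
    - fn must enforce its own timeout (a client option, not shown here)
    - this loop retries transient failures with exponential backoff
    - the breaker check stops the insisting once the circuit is open
    """
    for attempt in range(retries + 1):
        if is_open():                      # circuit breaker: fail fast,
            raise RuntimeError(            # even between retry attempts
                "circuit open: not calling the dependency")
        try:
            return fn()
        except Exception:
            record_failure()               # feed the breaker's statistics
            if attempt == retries:
                raise                      # retries exhausted
            time.sleep(backoff * (2 ** attempt))  # retry: transient failures
```

Checking the breaker inside the retry loop matters: otherwise a retry storm keeps hammering a dependency the breaker has already given up on.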

Fallback has to be honest

When the circuit opens, the system may:

  • return a clear error
  • serve degraded data
  • use cache
  • skip a secondary feature

The important part is that the fallback stays coherent with the business.

It does not help to return dangerous data just to hide that the dependency is down.
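A fallback along these lines keeps the degraded path explicit and honest. A minimal sketch; the recommendation scenario, the cache-as-dict, and every name in it are assumptions for illustration.

```python
def product_recommendations(user_id, fetch_live, cache):
    """Serve live data when possible, a cached (possibly stale) copy when
    the dependency is down, and an honest empty state otherwise."""
    try:
        result = fetch_live(user_id)
        cache[user_id] = result            # refresh the degraded-path copy
        return result, "live"
    except Exception:
        if user_id in cache:
            return cache[user_id], "stale" # degraded but coherent data
        return [], "unavailable"           # clear, safe empty state
```

Returning the status alongside the data lets the caller decide how to present degradation instead of silently passing stale data off as fresh.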

Simple example

Imagine checkout calling an antifraud service.

If antifraud starts responding in 8 seconds or fails 70 percent of the time, every new payment accumulates more waiting.

Without protection:

  • the user waits too long
  • the connection pool gets stuck
  • the internal queue grows
  • checkout starts looking broken even when the rest is fine

With a circuit breaker:

  1. the system detects strong degradation
  2. it opens the circuit
  3. new requests fail fast or enter a degraded path
  4. after some time, a few attempts test recovery

That is not only pretty in a diagram.

It is the kind of decision that stops a local incident from becoming a cascade.
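The checkout scenario can be sketched end to end. Everything here is invented for illustration, including the policy of auto-approving only low-value orders while antifraud is down: that threshold is a business decision, not a technical one.

```python
def approve_payment(order, antifraud_check, circuit_is_open):
    """Decide a payment while the antifraud dependency may be degraded."""
    if circuit_is_open():
        # Degraded path: do not block all checkout, do not approve blindly.
        if order["amount"] <= 50:
            return "approved_degraded"     # low risk: accept without the check
        return "manual_review"             # high value: a human decides
    try:
        return "approved" if antifraud_check(order) else "rejected"
    except Exception:
        return "manual_review"             # single failure: do not guess
```

The point is that the open-circuit branch fails fast with a decision the business can live with, instead of leaving the user hanging on a pool of stuck connections.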

Common mistakes

  • Assuming retry always helps.
  • Opening a circuit without a well-defined timeout.
  • Using generic thresholds without considering the dependency’s actual behavior.
  • Having no fallback or no clear message for the open state.
  • Measuring only explicit errors and ignoring timeouts or bad latency as degradation signals.

How a senior thinks about it

Someone with more experience looks at a shaky integration and asks:

“If this dependency gets sick right now, will I get sick with it or can I isolate the damage?”

That question changes the design.

It moves the conversation away from “keep trying until it works” and toward resilience.

Because in production, blind insistence only looks like perseverance until the first cascade.

What the interviewer wants to see

In interviews, circuit breaker usually appears as part of service resilience.

The evaluator wants to see whether you understand the behavior, not just the pattern name.

Good signals:

  • you distinguish timeout, retry, and circuit breaker
  • you talk about failing fast
  • you mention closed, open, and half-open states
  • you connect it to resource protection and controlled degradation

A strong answer often sounds like this:

“If the dependency is failing above some threshold, I stop insisting for a while. That reduces latency, protects local resources, and prevents its failure from turning into a cascade in my system.”

Circuit breaker does not make the third-party service healthy. It only stops its illness from turning into internal bleeding in yours.

