
Inflight Testing

What is Inflight Testing?

Inflight testing (also called production smoke testing or synthetic monitoring) is the practice of running automated tests against a live production or staging environment, using real or synthetic traffic, while the system is running and serving users.

Unlike pre-deployment tests that validate the build, inflight tests validate the deployed, running system in its actual environment.


When Inflight Testing Occurs

Deployment starts

New version receives 5% traffic (canary)

┌─────────────────────────────────┐
│ INFLIGHT TESTS RUN NOW │
│ - Synthetic API calls │
│ - Health endpoint polling │
│ - Business metric validation │
└─────────────────────────────────┘

Pass? → Increase traffic to 25% → 50% → 100%
Fail? → Automatic rollback triggered
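
The flow above can be sketched as a small promotion loop. This is a minimal illustration, not a real deployment controller: the class name `CanaryRollout` and the three hooks (`setTrafficPercent`, `inflightTestsPass`, `rollback`) are hypothetical stand-ins for whatever your traffic router and test runner provide. Only the traffic steps (5, 25, 50, 100) come from the diagram.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.IntConsumer;

/** Hypothetical sketch of the canary flow: shift traffic, run inflight tests, promote or roll back. */
final class CanaryRollout {

    /** Traffic percentages taken from the diagram above. */
    static final int[] TRAFFIC_STEPS = {5, 25, 50, 100};

    /**
     * @param setTrafficPercent hypothetical hook that routes the given share of traffic to the new version
     * @param inflightTestsPass hypothetical hook that runs the inflight suite and reports pass/fail
     * @param rollback          hypothetical hook that restores the previous version
     * @return true if the new version reached 100% traffic, false if it was rolled back
     */
    static boolean run(IntConsumer setTrafficPercent, BooleanSupplier inflightTestsPass, Runnable rollback) {
        for (int percent : TRAFFIC_STEPS) {
            setTrafficPercent.accept(percent);
            // Inflight tests run after every traffic shift, not only the first one
            if (!inflightTestsPass.getAsBoolean()) {
                rollback.run();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> steps = new ArrayList<>();
        boolean promoted = run(steps::add, () -> true, () -> System.out.println("rollback"));
        System.out.println("promoted=" + promoted + " steps=" + steps);
    }
}
```

Keeping the hooks as plain functional interfaces means the same loop can be exercised in a unit test with fakes before it is wired to a real router.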

Types of Inflight Tests

1. Synthetic Monitoring

Automated scripts simulate real user actions on production:

  • Login and session management
  • Core transactional workflows
  • Search and read operations

Runs every 1–5 minutes continuously (not just at deployment).
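
A continuously running synthetic check can be sketched with a plain `ScheduledExecutorService`. This is a minimal sketch under stated assumptions: `SyntheticMonitor`, the `check` supplier (standing in for a scripted login or search flow) and the `onAlert` hook are hypothetical names, and a real setup would more likely use a dedicated monitoring product.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: runs one synthetic check at a fixed interval and alerts on every failure. */
final class SyntheticMonitor {

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    /** Starts the periodic check, e.g. with an interval between 1 and 5 minutes. */
    void start(BooleanSupplier check, Duration interval, Runnable onAlert) {
        scheduler.scheduleAtFixedRate(
                () -> runOnce(check, onAlert), 0, interval.toMillis(), TimeUnit.MILLISECONDS);
    }

    /** Runs the check once; a thrown exception counts as a failure. */
    void runOnce(BooleanSupplier check, Runnable onAlert) {
        boolean passed;
        try {
            passed = check.getAsBoolean();
        } catch (RuntimeException e) {
            // scheduleAtFixedRate suppresses all later runs if a task throws,
            // so the exception is converted into a failed check instead
            passed = false;
        }
        if (!passed) {
            onAlert.run();
        }
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

A caller would pass something like `monitor.start(loginFlow, Duration.ofMinutes(1), pageOnCall)`, where `loginFlow` and `pageOnCall` are whatever check and alerting hook the team already has.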

2. Health Probe Checks

Verify Spring Boot Actuator endpoints after deployment:

#!/bin/bash
# inflight-health-check.sh
# Usage: ./inflight-health-check.sh <base-url>
BASE_URL=${1:?"usage: $0 <base-url>"}
MAX_RETRIES=30
RETRY_INTERVAL=10

echo "Checking health of $BASE_URL"

for i in $(seq 1 "$MAX_RETRIES"); do
  # Poll the readiness probe; curl reports 000 if the connection fails
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    "$BASE_URL/actuator/health/readiness")

  if [ "$STATUS" -eq 200 ]; then
    echo "✅ Health check passed (attempt $i)"

    # Verify business-critical downstream checks.
    # Components are listed only if the service exposes health details.
    # The pipeline's exit status is python3's, so a component that is not UP
    # (or an unreadable response) fails the script instead of reaching "exit 0".
    if ! curl -sf "$BASE_URL/actuator/health" | \
      python3 -c "
import sys, json
health = json.load(sys.stdin)
components = health.get('components', {})
for name, comp in components.items():
    status = comp.get('status')
    if status != 'UP':
        print(f'❌ Component {name} is {status}', file=sys.stderr)
        sys.exit(1)
print('✅ All components healthy')
"; then
      exit 1
    fi
    exit 0
  fi

  echo "⏳ Attempt $i/$MAX_RETRIES: status=$STATUS, retrying in ${RETRY_INTERVAL}s..."
  sleep "$RETRY_INTERVAL"
done

echo "❌ Health check failed after $MAX_RETRIES attempts"
exit 1

3. Business Metric Validation

Verify key metrics return to baseline within expected time after deployment:

Metric                    Acceptable Range   Rollback Trigger
------------------------  -----------------  --------------------------
HTTP 5xx rate             < 0.1%             > 1% for > 2 minutes
API p99 latency           < 500 ms           > 1500 ms for > 5 minutes
Transaction success rate  > 99.5%            < 98% for > 3 minutes
Kafka consumer lag        < 5 min behind     > 30 min behind
Active DB connections     < 80% of pool      > 95% for > 1 minute
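
Most rollback triggers in the table above combine a threshold with a duration ("> 1% for > 2 minutes"), so a single bad sample should not fire them. The sketch below shows one way to evaluate such a rule. `SustainedBreachTrigger` and its method names are hypothetical, and the samples are assumed to come from whatever metrics backend is already in place.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch: fires only when a metric stays above its threshold for longer than a window. */
final class SustainedBreachTrigger {

    private final double threshold;
    private final Duration window;
    private Instant breachStart; // null while the metric is within limits

    SustainedBreachTrigger(double threshold, Duration window) {
        this.threshold = threshold;
        this.window = window;
    }

    /**
     * Records one sample and reports whether rollback should fire.
     * Samples are expected in chronological order.
     */
    boolean record(Instant at, double value) {
        if (value <= threshold) {
            breachStart = null; // metric recovered, so the breach clock resets
            return false;
        }
        if (breachStart == null) {
            breachStart = at;   // first sample of a new breach
        }
        // Strictly longer than the window, matching "for > 2 minutes"
        return Duration.between(breachStart, at).compareTo(window) > 0;
    }
}
```

For "higher is better" metrics such as transaction success rate, the same class can be fed the failure rate instead (for example 100 minus the success percentage, with a threshold of 2).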

4. Canary Analysis

Automated statistical comparison between canary (new) and baseline (old) versions:

  • Uses tools like Kayenta (the canary analysis service from Netflix and Google that Spinnaker uses) or custom Grafana alerts
  • Compares latency percentiles, error rates, and business metrics between old and new pods
  • Automatically promotes or rolls back based on configured thresholds
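
The comparison described above can be illustrated with a deliberately simplified sketch. It compares p99 latency and error rate against fixed tolerances; it is not how Kayenta or any other tool actually scores a canary, and a production setup would use a proper statistical test. `CanaryComparison`, `Verdict` and the tolerance parameters are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical, simplified canary-vs-baseline comparison using fixed tolerances. */
final class CanaryComparison {

    enum Verdict { PROMOTE, ROLLBACK }

    /** Nearest-rank percentile of the samples, for p in (0, 100]. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /**
     * @param maxLatencyRatio      canary p99 may be at most this multiple of baseline p99 (e.g. 1.2)
     * @param maxErrorRateIncrease canary error rate may exceed baseline by at most this much (e.g. 0.005)
     */
    static Verdict compare(double[] baselineLatenciesMs, double[] canaryLatenciesMs,
                           double baselineErrorRate, double canaryErrorRate,
                           double maxLatencyRatio, double maxErrorRateIncrease) {
        double baselineP99 = percentile(baselineLatenciesMs, 99);
        double canaryP99 = percentile(canaryLatenciesMs, 99);
        boolean latencyOk = canaryP99 <= baselineP99 * maxLatencyRatio;
        boolean errorsOk = canaryErrorRate - baselineErrorRate <= maxErrorRateIncrease;
        return latencyOk && errorsOk ? Verdict.PROMOTE : Verdict.ROLLBACK;
    }
}
```

Comparing the canary against a baseline running at the same time, rather than against static thresholds, keeps the verdict from being skewed by time-of-day load.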

Implementing Inflight Tests in Java/Spring

Dedicated Smoke Test Profile

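import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.lessThan;

import java.util.concurrent.TimeUnit;

import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.ActiveProfiles;

// Runs against an already deployed instance, so no embedded web server is started
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.NONE)
@ActiveProfiles("inflight")
@Tag("inflight")
class InflightSmokeTest {

    @Value("${inflight.base-url}")
    private String baseUrl;

    @Value("${inflight.api-key}")
    private String apiKey;

    @Test
    @DisplayName("Health endpoint returns UP")
    void healthEndpoint_returnsUp() {
        given()
            .baseUri(baseUrl)
        .when()
            .get("/actuator/health")
        .then()
            .statusCode(200)
            .body("status", equalTo("UP"));
    }

    @Test
    @DisplayName("Transaction list API is responsive")
    void transactionListApi_isResponsive() {
        // REST Assured measures the response time itself, so a separate
        // wall-clock check (which would also include client start-up time)
        // is not needed
        given()
            .baseUri(baseUrl)
            .header("X-API-Key", apiKey)
        .when()
            .get("/api/v1/transactions?page=0&size=1")
        .then()
            .statusCode(200)
            .time(lessThan(500L), TimeUnit.MILLISECONDS);
    }

    @Test
    @DisplayName("Unauthenticated requests return 401")
    void unauthenticatedRequest_returns401() {
        // No API key header is sent on purpose
        given()
            .baseUri(baseUrl)
        .when()
            .get("/api/v1/transactions")
        .then()
            .statusCode(401);
    }
}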

# application-inflight.yml
inflight:
  base-url: ${INFLIGHT_BASE_URL:https://production.yourapp.com}
  api-key: ${INFLIGHT_API_KEY}

Automated Rollback Integration

Integrate inflight test results with deployment orchestration:

# .github/workflows/deploy.yml (excerpt)
- name: Run inflight smoke tests
  id: inflight
  run: |
    mvn test -Dgroups="inflight" \
      -Dinflight.base-url=${{ env.PROD_URL }} \
      -Dinflight.api-key=${{ secrets.INFLIGHT_API_KEY }}

- name: Rollback on inflight failure
  if: failure() && steps.inflight.outcome == 'failure'
  run: |
    echo "🔴 Inflight tests failed: triggering rollback"
    kubectl rollout undo deployment/transaction-service
    ./scripts/notify-slack.sh "Deployment rolled back: inflight tests failed"

Best Practices

Practice          Guidance
----------------  --------
Non-destructive   Inflight tests must not create or modify real production data; use dedicated test accounts or read-only operations
Idempotent        Safe to re-run without side effects
Fast              Each test < 5 seconds; full suite < 2 minutes
Alert on failure  Failed inflight tests page the on-call engineer immediately
Run continuously  Synthetic monitoring runs every 5 minutes, not just at deployment

Critical Rule

Inflight tests run against production. They must never create, modify, or delete real user data. Always use a dedicated synthetic test account with clearly marked test data.
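
One way to enforce this rule in the test harness itself is a guard that every inflight request passes through before it is sent. The sketch below is an assumption about how such a guard could look, not part of any library. `InflightSafetyGuard` and the account-prefix convention are hypothetical, and it assumes that GET, HEAD and OPTIONS endpoints are genuinely read-only in your API.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical guard: allows read-only calls, and writes only for synthetic test accounts. */
final class InflightSafetyGuard {

    // Assumption: these methods never modify data in the API under test
    private static final Set<String> READ_ONLY_METHODS = Set.of("GET", "HEAD", "OPTIONS");

    private final String syntheticAccountPrefix;

    InflightSafetyGuard(String syntheticAccountPrefix) {
        if (syntheticAccountPrefix == null || syntheticAccountPrefix.isEmpty()) {
            throw new IllegalArgumentException("a non-empty synthetic account prefix is required");
        }
        this.syntheticAccountPrefix = syntheticAccountPrefix;
    }

    /** Returns true if the request is safe to send from an inflight test. */
    boolean isAllowed(String httpMethod, String accountId) {
        if (READ_ONLY_METHODS.contains(httpMethod.toUpperCase(Locale.ROOT))) {
            return true;
        }
        // Writes are allowed only against clearly marked synthetic accounts
        return accountId != null && accountId.startsWith(syntheticAccountPrefix);
    }

    /** Throws instead of sending a request that could touch real user data. */
    void check(String httpMethod, String accountId) {
        if (!isAllowed(httpMethod, accountId)) {
            throw new IllegalStateException(
                    "Refusing " + httpMethod + " for non-synthetic account " + accountId);
        }
    }
}
```

Failing closed (rejecting any write whose account cannot be identified as synthetic) means a misconfigured test is blocked before it reaches production data.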