Managing Failed Messages with Dead Letter Queue - QueueSaaS Blog
QueueSaaS Team

Managing Failed Messages with Dead Letter Queue

Learn how to handle failed message deliveries using QueueSaaS Dead Letter Queue. Retry failed messages, analyze failures, and maintain message reliability.

Even with automatic retries, some messages may fail to deliver after exhausting all retry attempts. QueueSaaS Dead Letter Queue (DLQ) provides a centralized place to manage these failed messages, allowing you to investigate, retry, or archive them.

What is a Dead Letter Queue?

A Dead Letter Queue is a special queue that holds messages that couldn’t be delivered after all retry attempts have been exhausted. A message is moved to the DLQ automatically when every attempt, including retries, fails for one of these reasons:

  • The endpoint returns a non-2xx status code
  • The endpoint times out
  • The endpoint is unreachable
  • A network error occurs
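
The conditions above boil down to a small routing rule: a failed attempt is retried until the retry budget runs out, and only then is the message dead-lettered. Here is a minimal sketch of that rule. The names `AttemptOutcome` and `routeAfterAttempt` are made up for illustration and are not part of the SDK; the service's real internals may differ.

```typescript
// Hypothetical model of one delivery attempt; not an SDK type.
type AttemptOutcome =
  | { kind: 'response'; statusCode: number }
  | { kind: 'timeout' }
  | { kind: 'unreachable' }
  | { kind: 'network_error' };

type NextStep = 'delivered' | 'retry' | 'dead_letter';

// Decide what happens after an attempt.
// `retriesUsed` is the number of retries already spent before this attempt,
// and `maxRetries` is the configured retry budget.
function routeAfterAttempt(
  outcome: AttemptOutcome,
  retriesUsed: number,
  maxRetries: number,
): NextStep {
  const succeeded =
    outcome.kind === 'response' &&
    outcome.statusCode >= 200 &&
    outcome.statusCode < 300;
  if (succeeded) return 'delivered';
  // Any failure is retried while budget remains, then dead-lettered.
  return retriesUsed < maxRetries ? 'retry' : 'dead_letter';
}

console.log(routeAfterAttempt({ kind: 'response', statusCode: 503 }, 0, 3)); // retry
console.log(routeAfterAttempt({ kind: 'timeout' }, 3, 3)); // dead_letter
```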

Why Use a DLQ?

Benefits:

  • Visibility: See all failed messages in one place
  • Recovery: Retry messages after fixing endpoint issues
  • Analysis: Understand why messages are failing
  • Reliability: Failed messages are kept in the DLQ instead of being dropped

Accessing the Dead Letter Queue

List Failed Messages

import { Client } from '@anlyonhq/scheduler';

const client = new Client({
  apiKey: process.env.ANLYON_API_KEY!,
});

// List all failed messages
const dlq = await client.dlq.list();
console.log(`Total failed messages: ${dlq.data.length}`);

dlq.data.forEach(message => {
  console.log(`Message ${message.messageId}:`);
  console.log(`  URL: ${message.url}`);
  console.log(`  Failed at: ${message.failedAt}`);
  console.log(`  Error: ${message.error}`);
  console.log(`  Retry count: ${message.retryCount}`);
});
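
The listing code above reads a handful of fields from each message. The interface below simply collects those fields in one place. It is an assumed shape inferred from this post, not the SDK's published type, and `formatDlqMessage` is a hypothetical helper for turning a message into one log line.

```typescript
// Assumed shape, inferred from the fields used in this post.
// The SDK's real type may differ; check the API reference.
interface DlqMessageSummary {
  messageId: string;
  url: string;
  failedAt: string; // assumed to be an ISO 8601 timestamp
  error: string;
  retryCount: number;
}

// Render one failed message as a single, grep-friendly log line.
function formatDlqMessage(m: DlqMessageSummary): string {
  return `${m.messageId} ${m.url} failed at ${m.failedAt} after ${m.retryCount} retries: ${m.error}`;
}

// Sample data for illustration only.
const sampleDlqMessage: DlqMessageSummary = {
  messageId: 'msg_123',
  url: 'https://api.example.com/hook',
  failedAt: '2024-05-01T12:00:00Z',
  error: 'HTTP 503',
  retryCount: 3,
};

console.log(formatDlqMessage(sampleDlqMessage));
```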

Get DLQ Statistics

const stats = await client.dlq.getStats();
console.log(`Total failed: ${stats.data.total}`);
console.log(`Failed today: ${stats.data.failedToday}`);
console.log(`Failed this week: ${stats.data.failedThisWeek}`);

Get Specific Failed Message

const message = await client.dlq.get('message-id');
console.log(`Original URL: ${message.data.url}`);
console.log(`Failure reason: ${message.data.error}`);
console.log(`Last attempt: ${message.data.lastAttemptAt}`);
console.log(`Original body:`, message.data.body);

Retrying Failed Messages

Retry a Single Message

After fixing the endpoint issue, retry a specific message:

await client.dlq.retry('message-id');
console.log('Message queued for retry');

Retry All Failed Messages

Retry all messages in the DLQ:

const result = await client.dlq.retryAll();
console.log(`Retried ${result.data.retried} messages`);

Retry with Filter

To retry only the messages that match specific criteria, list the DLQ and filter on the client (check the API docs for any server-side filter options):

// Retry messages for a specific URL
const messages = await client.dlq.list();
const filtered = messages.data.filter(m => m.url.includes('api.example.com'));

for (const message of filtered) {
  await client.dlq.retry(message.messageId);
}
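
Two details are worth tightening in a loop like the one above. First, a substring check on the URL would also match a host such as `api.example.com.evil.test`. Second, one rejected `retry` call would abort the whole loop. The sketch below handles both. `DlqRetrier`, `hasHost`, and `retryMatching` are hypothetical names, and the only assumption about the client is that `retry` takes a message ID and returns a promise.

```typescript
// Minimal slice of the client this sketch needs; hypothetical, for illustration.
interface DlqRetrier {
  retry(messageId: string): Promise<unknown>;
}

interface RetryCandidate {
  messageId: string;
  url: string;
}

interface RetryReport {
  retried: string[];
  failed: { messageId: string; reason: string }[];
}

// Compare the parsed hostname exactly instead of using a substring match.
function hasHost(url: string, host: string): boolean {
  try {
    return new URL(url).hostname === host;
  } catch {
    return false; // unparseable URLs never match
  }
}

// Retry every message accepted by `matches`, one at a time, and keep going
// when an individual retry call rejects.
async function retryMatching(
  dlq: DlqRetrier,
  messages: RetryCandidate[],
  matches: (m: RetryCandidate) => boolean,
): Promise<RetryReport> {
  const report: RetryReport = { retried: [], failed: [] };
  for (const m of messages.filter(matches)) {
    try {
      await dlq.retry(m.messageId);
      report.retried.push(m.messageId);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      report.failed.push({ messageId: m.messageId, reason });
    }
  }
  return report;
}
```

With the client from earlier, this could be called as `retryMatching(client.dlq, messages.data, m => hasHost(m.url, 'api.example.com'))`, assuming `client.dlq` satisfies the `DlqRetrier` shape.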

Understanding Failure Reasons

Each failed message includes detailed error information:

const message = await client.dlq.get('message-id');

console.log(`Status Code: ${message.data.lastStatusCode}`);
console.log(`Error: ${message.data.error}`);
console.log(`Error Type: ${message.data.errorType}`);

// Common error types:
// - "HTTP_ERROR" - Non-2xx status code
// - "TIMEOUT" - Request timed out
// - "NETWORK_ERROR" - Network connectivity issue
// - "DNS_ERROR" - DNS resolution failed
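
The error type and status code together give a rough idea of whether a retry is likely to help. The sketch below encodes one possible rule of thumb. The triage policy is an assumption for illustration (for example, treating 408, 429, and 5xx as transient), not behavior documented by the service, and `triageFailure` is a hypothetical helper.

```typescript
type FailureTriage =
  | 'retry_after_recovery' // likely transient; retry once the endpoint is healthy
  | 'fix_before_retry'     // retrying as-is will probably fail the same way
  | 'investigate';         // not enough information to decide

// Rule-of-thumb triage based on the error types listed above.
function triageFailure(errorType: string | undefined, lastStatusCode?: number): FailureTriage {
  switch (errorType) {
    case 'TIMEOUT':
    case 'NETWORK_ERROR':
      return 'retry_after_recovery';
    case 'DNS_ERROR':
      // Often a wrong or expired hostname, so check the URL first.
      return 'fix_before_retry';
    case 'HTTP_ERROR':
      if (lastStatusCode === undefined) return 'investigate';
      if (lastStatusCode === 408 || lastStatusCode === 429 || lastStatusCode >= 500) {
        return 'retry_after_recovery';
      }
      if (lastStatusCode >= 400) return 'fix_before_retry';
      return 'investigate';
    default:
      return 'investigate';
  }
}

console.log(triageFailure('HTTP_ERROR', 401)); // fix_before_retry
console.log(triageFailure('TIMEOUT')); // retry_after_recovery
```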

Real-World Scenarios

Scenario 1: Endpoint Was Down

Your endpoint was temporarily unavailable, but it’s back online now:

// Check how many messages are waiting in the DLQ
const dlq = await client.dlq.list();
console.log(`Retrying ${dlq.data.length} failed messages`);

// Retry all messages
await client.dlq.retryAll();
console.log('All messages queued for retry after endpoint recovery');

Scenario 2: Authentication Issue

Your endpoint started requiring authentication, causing messages to fail:

// 1. Fix the authentication issue so the endpoint accepts requests again
// 2. Retry failed messages
const failed = await client.dlq.list();
for (const message of failed.data) {
  if (message.lastStatusCode === 401) {
    await client.dlq.retry(message.messageId);
  }
}

Scenario 3: Analyzing Failure Patterns

Understand why messages are failing:

const messages = await client.dlq.list();

// Group failures by error type
const failuresByType = messages.data.reduce<Record<string, number>>((acc, msg) => {
  const type = msg.errorType || 'UNKNOWN';
  acc[type] = (acc[type] || 0) + 1;
  return acc;
}, {});

console.log('Failure breakdown:', failuresByType);

// Group by URL to find problematic endpoints
const failuresByUrl = messages.data.reduce<Record<string, number>>((acc, msg) => {
  acc[msg.url] = (acc[msg.url] || 0) + 1;
  return acc;
}, {});

console.log('Failures by endpoint:', failuresByUrl);
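
Once the DLQ holds more than a few distinct endpoints, a raw count object gets hard to scan. The sketch below shows one way to generalize the grouping and surface only the worst offenders. `countBy` and `topN` are hypothetical helpers written for this post, not SDK functions.

```typescript
// Count items by an arbitrary string key.
function countBy<T>(items: T[], keyOf: (item: T) => string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const item of items) {
    const key = keyOf(item);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Return the `n` keys with the highest counts, largest first.
function topN(counts: Map<string, number>, n: number): [string, number][] {
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, n);
}

// Sample data for illustration only.
const sampleFailures = [
  { url: 'https://api.example.com/hook', errorType: 'TIMEOUT' },
  { url: 'https://api.example.com/hook', errorType: 'HTTP_ERROR' },
  { url: 'https://billing.example.com/events', errorType: 'HTTP_ERROR' },
];

console.log(topN(countBy(sampleFailures, (m) => m.url), 1));
console.log(topN(countBy(sampleFailures, (m) => m.errorType), 2));
```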

Scenario 4: Scheduled Cleanup

Delete failed messages older than 30 days (archive them first if you need a record):

const messages = await client.dlq.list();
const thirtyDaysAgo = new Date();
thirtyDaysAgo.setDate(thirtyDaysAgo.getDate() - 30);

for (const message of messages.data) {
  const failedAt = new Date(message.failedAt);
  if (failedAt < thirtyDaysAgo) {
    // Archive or delete old messages
    await client.dlq.delete(message.messageId);
    console.log(`Deleted old message: ${message.messageId}`);
  }
}
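
Since deletion is destructive, it can help to separate the "which messages are old?" decision from the delete call so it can be tested and dry-run first. The sketch below does that with a hypothetical `selectExpired` helper. It assumes `failedAt` is a timestamp string that `Date.parse` understands, and it skips unparseable values rather than deleting on bad data.

```typescript
interface AgedMessage {
  messageId: string;
  failedAt: string; // assumed to be a parseable timestamp such as ISO 8601
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return the IDs of messages that failed more than `maxAgeDays` ago.
// `now` is injectable so the function is deterministic in tests.
function selectExpired(
  messages: AgedMessage[],
  maxAgeDays: number,
  now: Date = new Date(),
): string[] {
  const cutoffMs = now.getTime() - maxAgeDays * MS_PER_DAY;
  const expired: string[] = [];
  for (const m of messages) {
    const failedAtMs = Date.parse(m.failedAt);
    if (Number.isNaN(failedAtMs)) continue; // never delete on a bad timestamp
    if (failedAtMs < cutoffMs) expired.push(m.messageId);
  }
  return expired;
}

// Dry run: log what would be deleted before calling the delete API.
const wouldDelete = selectExpired(
  [
    { messageId: 'old', failedAt: '2024-02-01T00:00:00Z' },
    { messageId: 'recent', failedAt: '2024-03-15T00:00:00Z' },
  ],
  30,
  new Date('2024-03-31T00:00:00Z'),
);
console.log(wouldDelete); // [ 'old' ]
```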

Best Practices

1. Monitor DLQ Regularly

Set up alerts when DLQ size exceeds a threshold:

const stats = await client.dlq.getStats();
if (stats.data.total > 100) {
  // Send alert to monitoring system
  console.warn('DLQ has exceeded 100 messages!');
}
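
A single threshold tends to be either too noisy or too late. One common refinement is to use two levels, as in the sketch below. The threshold values and the `dlqAlertLevel` helper are illustrative assumptions. Pick numbers that match your own traffic.

```typescript
interface DlqThresholds {
  warnAbove: number;
  criticalAbove: number;
}

type DlqAlertLevel = 'ok' | 'warn' | 'critical';

// Map the current DLQ size to an alert level.
function dlqAlertLevel(total: number, thresholds: DlqThresholds): DlqAlertLevel {
  if (total > thresholds.criticalAbove) return 'critical';
  if (total > thresholds.warnAbove) return 'warn';
  return 'ok';
}

// Example thresholds; these numbers are placeholders, not recommendations.
const exampleThresholds: DlqThresholds = { warnAbove: 100, criticalAbove: 1000 };

console.log(dlqAlertLevel(42, exampleThresholds)); // ok
console.log(dlqAlertLevel(250, exampleThresholds)); // warn
console.log(dlqAlertLevel(5000, exampleThresholds)); // critical
```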

2. Investigate Root Causes

Don’t just retry; understand why messages failed:

const message = await client.dlq.get('message-id');
console.log('Failure details:', {
  url: message.data.url,
  statusCode: message.data.lastStatusCode,
  error: message.data.error,
  retryCount: message.data.retryCount,
  failedAt: message.data.failedAt,
});

3. Fix Endpoints Before Retrying

Retrying without fixing the underlying issue will just fail again:

// ❌ Bad: Retry without fixing
await client.dlq.retryAll();

// ✅ Good: Fix endpoint, then retry
// 1. Fix authentication/endpoint issue
// 2. Test endpoint manually
// 3. Then retry
await client.dlq.retryAll();
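
The "test first, then retry" order can also be enforced in code rather than left to memory. The sketch below gates the bulk retry on a health probe. `retryAllIfHealthy` is a hypothetical helper, and both the probe and the retry call are passed in as functions so the gate makes no assumptions about your endpoint or the SDK.

```typescript
// Run `retryAll` only when `probe` reports the endpoint as healthy.
// Resolves to true when the bulk retry was triggered, false when it was skipped.
async function retryAllIfHealthy(
  probe: () => Promise<boolean>,
  retryAll: () => Promise<unknown>,
): Promise<boolean> {
  let healthy = false;
  try {
    healthy = await probe();
  } catch {
    healthy = false; // a probe that throws counts as unhealthy
  }
  if (!healthy) return false;
  await retryAll();
  return true;
}

// Example wiring (assumes Node 18+ for global fetch and a hypothetical /health route):
// const ran = await retryAllIfHealthy(
//   async () => (await fetch('https://api.example.com/health')).ok,
//   () => client.dlq.retryAll(),
// );
```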

4. Use DLQ for Debugging

The DLQ preserves the original message, making it perfect for debugging:

const message = await client.dlq.get('message-id');
console.log('Original request:', {
  method: message.data.method,
  url: message.data.url,
  headers: message.data.headers,
  body: message.data.body,
});
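
With the original request in hand, a quick way to reproduce a failure is to replay it by hand. The sketch below turns the preserved request into a `curl` command you can paste into a terminal. `toCurl` and `shellQuote` are hypothetical helpers, and the field shapes (`headers` as a string map, `body` as a string or JSON-serializable value) are assumptions. Stored headers may contain secrets, so be careful where you log the result.

```typescript
// Assumed shape of the preserved request; the SDK's real type may differ.
interface OriginalRequest {
  method: string;
  url: string;
  headers?: Record<string, string>;
  body?: unknown;
}

// Quote a value for a POSIX shell by wrapping it in single quotes
// and escaping any embedded single quote.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build a curl command that replays the original request.
function toCurl(req: OriginalRequest): string {
  const parts: string[] = ['curl', '-X', req.method.toUpperCase(), shellQuote(req.url)];
  const headers = req.headers ?? {};
  for (const headerName of Object.keys(headers)) {
    parts.push('-H', shellQuote(`${headerName}: ${headers[headerName]}`));
  }
  if (req.body !== undefined && req.body !== null) {
    const payload = typeof req.body === 'string' ? req.body : JSON.stringify(req.body);
    parts.push('--data-raw', shellQuote(payload));
  }
  return parts.join(' ');
}

console.log(
  toCurl({
    method: 'post',
    url: 'https://api.example.com/hook',
    headers: { 'Content-Type': 'application/json' },
    body: { id: 1 },
  }),
);
```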

5. Set Up Alerts

Configure notifications so your team is alerted when delivery failures spike:

// Use QueueSaaS notifications API
await client.notifications.updatePreferences({
  enabledTypes: ['delivery_failure_spike'],
  emailRecipients: ['ops@example.com'],
});

DLQ vs Regular Retries

Automatic Retries

QueueSaaS automatically retries failed messages:

  • Configurable retry count (default: 3 to 5, depending on plan)
  • Exponential backoff between retries
  • Automatic retry for transient failures
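
To make "exponential backoff" concrete, the sketch below computes the wait before each retry. The base delay, growth factor, and cap are illustrative assumptions. The post does not state the values the service actually uses, and `backoffDelaysMs` is a hypothetical helper.

```typescript
// Compute the delay before each retry: baseMs * factor^attempt, capped at capMs.
function backoffDelaysMs(
  maxRetries: number,
  baseMs: number = 1000,
  factor: number = 2,
  capMs: number = 60000,
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(factor, attempt)));
  }
  return delays;
}

console.log(backoffDelaysMs(5)); // [ 1000, 2000, 4000, 8000, 16000 ]
```

Production systems often add random jitter to each delay so that many failed messages don't retry at the same instant.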

Dead Letter Queue

Messages enter the DLQ when:

  • All automatic retries are exhausted
  • Endpoint consistently fails
  • Manual intervention is needed

Integration with Monitoring

Integrate DLQ monitoring with your observability stack:

// Export DLQ metrics to monitoring system
const stats = await client.dlq.getStats();

// Send to Prometheus, Datadog, etc. (`metrics` is your own monitoring client)
metrics.gauge('dlq.total_messages', stats.data.total);
metrics.gauge('dlq.failed_today', stats.data.failedToday);
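
The snippet above assumes a `metrics` object already exists. One way to keep the export code independent of any particular vendor is to depend on a tiny interface and plug in the real client behind it. In the sketch below, `GaugeSink` and `publishDlqGauges` are hypothetical names, and the stats fields mirror the ones used earlier in this post.

```typescript
// The only capability the exporter needs from a monitoring client.
interface GaugeSink {
  gauge(name: string, value: number): void;
}

// Assumed shape of the stats payload, based on the fields used in this post.
interface DlqStatsData {
  total: number;
  failedToday: number;
  failedThisWeek: number;
}

// Publish every DLQ stat as a gauge under a common prefix.
function publishDlqGauges(sink: GaugeSink, data: DlqStatsData, prefix: string = 'dlq'): void {
  sink.gauge(`${prefix}.total_messages`, data.total);
  sink.gauge(`${prefix}.failed_today`, data.failedToday);
  sink.gauge(`${prefix}.failed_this_week`, data.failedThisWeek);
}

// Console-backed sink, handy for local runs before wiring in a real client.
const consoleSink: GaugeSink = {
  gauge: (metricName, value) => console.log(`${metricName}=${value}`),
};

publishDlqGauges(consoleSink, { total: 12, failedToday: 3, failedThisWeek: 9 });
```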

Next Steps

Keep your messages reliable with Dead Letter Queue! 🔄