
Mahmoud Alatrash
The Black Box Problem
Navigation is the backbone of every mobile app. Yet, in most production environments, it's completely invisible. Most teams instrument API calls and screen views, but the actual user journey between screens is treated as a black box.
This is a catastrophic mistake at scale.
When navigation breaks, you're debugging blind. When product asks for user flows, you're guessing. When QA finds an edge case, you can't reproduce it.
I built a production-grade navigation observability system to fix this. Not a logging hack. A real architectural solution with zero performance impact, strict privacy by default, and Clean Architecture.
This article breaks down exactly how I built it, and why certain architectural trade-offs matter more than textbook rules.
Let's dive in.
Building observability isn't just about dumping data into logs. After evaluating third-party tools and building prototypes, I realized that real production observability must stand on three non-negotiable pillars:
1. Zero Performance Impact
Users navigate constantly (50-100 times per session). If your observability adds even 5ms per route, you are degrading the UX and burning battery at scale.
My hard limit was < 0.5ms per navigation. Not "acceptable" overhead, but invisible overhead. I achieved this not through brute-force optimization, but through architectural design: zero-copy paths and early returns.
2. Privacy by Default
Route arguments carry highly sensitive data: tokens, user IDs, payment details. One logging accident, and you're facing a GDPR nightmare.
Most solutions rely on developers "remembering" to enable sanitization. Humans forget. The architecture shouldn't.
My approach: Strict by Default. In production, 14 sensitive patterns are automatically redacted. The system doesn't trust the developer's memory; it relies on paranoid code.
3. Clean Architecture
Clean Architecture often gets called "over-engineered", until you need to swap your routing package or test a deeply coupled analytics logger.
Clean Architecture isn't about being fancy; it's about being free. Free to test, change, and extend.
The implementation uses 4 distinct layers. The domain layer has zero Flutter dependencies. Want to switch from auto_route to go_router? Just write a new adapter. Need to add Firebase Analytics? Add a new listener. Zero breaking changes.
We built the system as four distinct layers. No monolithic classes. No "God objects." Just clean separation of concerns:
Domain Layer (Business Logic)
↓ defines interfaces
Data Layer (Event Bus Implementation)
↓ provides infrastructure
Infrastructure Layer (Adapters & Listeners)
↓ consumes events
DI Layer (Lifecycle Management)
Each layer has one job. Each layer is independently testable. Each layer can be mocked, swapped, or extended without touching the others.
This is what Clean Architecture looks like when you actually build it, not just talk about it.
Now that you understand why we built this, let me show you how. We start at the core—the domain layer where everything begins.
The beauty of starting with pure business logic is that you're forced to think clearly. No framework noise. No implementation details. Just: "What is a navigation event, really?"
At the heart of the system is an immutable event entity that captures everything we need:
class NavigationEvent {
final NavigationEventType type; // push, pop, replace, tabChange
final RouteInfo? from; // Source route (null for app launch)
final RouteInfo to; // Destination route
final Map<String, dynamic>? arguments; // Sanitized route arguments
final DateTime timestamp; // For ordering and timing analysis
final String? sessionId; // Reserved for future session tracking
}
The entity is deliberately dumb—no logic, just data. This makes it trivial to serialize, test, and reason about.
The RouteInfo class wraps route metadata in a way that's completely decoupled from Flutter's Route class:
class RouteInfo {
final String name; // Route name (e.g., 'ProfileRoute')
final String path; // Full path (e.g., '/profile')
final Map<String, String>? parameters; // Reserved for query params
final String? title; // Reserved for screen titles
const RouteInfo({
required this.name,
required this.path,
this.parameters,
this.title,
});
}
No Flutter imports. No framework coupling. Pure Dart. This abstraction lets us swap routing libraries without touching the domain layer—exactly what platform independence means in practice.
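To make the "pure Dart" claim concrete, here is a minimal sketch of the two entities that compiles without any Flutter import. The enum's string values and the `describe` helper are assumptions for illustration; the article only shows that each event type exposes a `value`. Enhanced enums need Dart 2.17 or later.

```dart
/// String values are hypothetical; the article does not list them.
enum NavigationEventType {
  push('push'),
  pop('pop'),
  replace('replace'),
  tabChange('tab_change');

  const NavigationEventType(this.value);
  final String value;
}

class RouteInfo {
  final String name;
  final String path;
  const RouteInfo({required this.name, required this.path});
}

class NavigationEvent {
  final NavigationEventType type;
  final RouteInfo? from; // null for app launch
  final RouteInfo to;
  final DateTime timestamp;

  const NavigationEvent({
    required this.type,
    required this.to,
    required this.timestamp,
    this.from,
  });
}

/// Hypothetical helper: one-line summary of an event.
String describe(NavigationEvent event) {
  final fromName = event.from?.name ?? 'App Start';
  return '$fromName -> ${event.to.name} (${event.type.value})';
}

void main() {
  final launch = NavigationEvent(
    type: NavigationEventType.push,
    to: const RouteInfo(name: 'HomeRoute', path: '/home'),
    timestamp: DateTime.now(),
  );
  print(describe(launch)); // App Start -> HomeRoute (push)
}
```

Because nothing here touches the widget layer, the same file runs under plain `dart run`, which is what keeps the domain layer testable in isolation.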
We needed a way to broadcast navigation events to multiple consumers without tight coupling. The interface is deliberately simple:
abstract class INavigationEventBus {
void publish(NavigationEvent event);
Stream<NavigationEvent> subscribe();
bool get hasActiveListeners;
Future<void> dispose();
}
Notice what's missing: no buffering, no replay, no persistence. Navigation events are ephemeral—if a listener subscribes late, it doesn't receive historical events. This keeps the system simple and prevents memory leaks from unbounded event queues.
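The "no replay" behavior falls out of how Dart broadcast streams work: an event added while nobody is listening is simply dropped. A minimal sketch, using a hypothetical `DemoEventBus` that carries plain strings instead of `NavigationEvent`:

```dart
import 'dart:async';

/// Hypothetical stand-in for the event bus, reduced to String events.
class DemoEventBus {
  final StreamController<String> _controller =
      StreamController<String>.broadcast();

  void publish(String event) {
    if (_controller.isClosed) return;
    _controller.add(event);
  }

  Stream<String> subscribe() => _controller.stream;

  bool get hasActiveListeners => _controller.hasListener;

  Future<void> dispose() => _controller.close();
}

/// Publishes one event before anyone listens and one after, then returns
/// what the late subscriber actually received.
Future<List<String>> lateSubscriberDemo() async {
  final bus = DemoEventBus();
  final received = <String>[];

  bus.publish('home -> profile'); // no listener yet: dropped, not buffered

  final sub = bus.subscribe().listen(received.add);
  bus.publish('profile -> settings');

  await Future<void>.delayed(Duration.zero); // let the stream deliver
  await sub.cancel();
  await bus.dispose();
  return received;
}

Future<void> main() async {
  print(await lateSubscriberDemo()); // [profile -> settings]
}
```

The late subscriber sees only the second event, so memory use stays bounded no matter how long the app runs without listeners.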
Listeners follow a standard lifecycle:
abstract class INavigationListener {
String get name;
void startListening(INavigationEventBus eventBus);
Future<void> stopListening();
bool get isListening;
}
This contract enforces explicit lifecycle management. Listeners must be started, and they must be stopped. No implicit subscriptions that leak.
The event bus implementation uses Dart's StreamController.broadcast():
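import 'dart:async';
import 'dart:developer';

class NavigationEventBusImpl implements INavigationEventBus {
  late final StreamController<NavigationEvent> _controller;

  NavigationEventBusImpl() {
    _controller = StreamController<NavigationEvent>.broadcast();
  }

  @override
  void publish(NavigationEvent event) {
    if (_controller.isClosed) return;
    try {
      _controller.add(event);
    } catch (e, stack) {
      // add() throws StateError if the controller is already closed.
      log('Error publishing: $e', error: e, stackTrace: stack);
    }
  }
  // subscribe(), hasActiveListeners, and dispose() omitted for brevity
}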
If you're an experienced engineer reading this, your instincts are probably screaming right now. The code checks isClosed, then calls add(). What if dispose() gets called in that exact microsecond in between? It throws a StateError.
"You need a Mutex!" "Add a Lock!" "This will crash in production!"
And you are... technically correct. There IS a race condition. But here's what makes this interesting:
That race only happens during app shutdown. And when it does, the try-catch handles it gracefully. The app is already closing. The user doesn't care. The event was never going to be processed anyway.
Now, here's the real trade-off: adding a Mutex or Lock would fix the race. It would also add ~0.01-0.1ms overhead to every single navigation event. Every single one.
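Think about what that means: guarding against an edge case that only occurs during app shutdown and is already absorbed by the try-catch, by imposing overhead on every navigation event in every session.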
The architectural decision: I chose to keep the "bug." I documented it thoroughly in the codebase and explained the trade-off.
This is what pragmatic engineering looks like. Not textbook perfect. Production perfect.
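The failure mode being tolerated is easy to reproduce in isolation. The sketch below uses a hypothetical `tryPublish` helper that skips the isClosed check on purpose, so the StateError path is exercised deterministically rather than waiting for a shutdown-time coincidence:

```dart
import 'dart:async';

/// Returns true if the controller accepted the event, false if it rejected
/// it. Mirrors the try/catch guard in publish(), without the isClosed check.
bool tryPublish(StreamController<String> controller, String event) {
  try {
    controller.add(event);
    return true;
  } on StateError {
    // add() after close() throws StateError; swallow it and move on.
    return false;
  }
}

Future<void> main() async {
  final controller = StreamController<String>.broadcast();
  print(tryPublish(controller, 'before close')); // true
  await controller.close();
  print(tryPublish(controller, 'after close')); // false
}
```

The rejected event costs one caught exception and nothing else, which is the whole argument for not paying for a lock on every navigation.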
Here's the architectural reality: auto_route is a third-party routing library. Today it's auto_route. Tomorrow it could be go_router or some custom solution. If I let auto_route's types leak into my domain layer—into NavigationEvent, into INavigationEventBus—I'm locked in. Swapping routing libraries means rewriting the entire observability system.
Unacceptable.
The Adapter Pattern solves this. I build a thin infrastructure layer that extends AutoRouteObserver, listens for route changes, translates them into my platform-agnostic domain events, and publishes them to the event bus. The domain layer never sees Route. It never sees AutoRouteObserver. It only sees NavigationEvent and RouteInfo.
class AutoRouteObserverAdapter extends AutoRouteObserver {
final INavigationEventBus _eventBus;
final ArgumentSanitizationConfig _sanitizationConfig;
AutoRouteObserverAdapter(
this._eventBus, [
ArgumentSanitizationConfig? sanitizationConfig,
]) : _sanitizationConfig = sanitizationConfig ?? ArgumentSanitizationConfig.strict;
@override
void didPush(Route route, Route? previousRoute) {
_publishEvent(
type: NavigationEventType.push,
route: route,
previousRoute: previousRoute,
);
}
@override
void didPop(Route route, Route? previousRoute) {
_publishEvent(
type: NavigationEventType.pop,
route: route,
previousRoute: previousRoute,
);
}
@override
void didReplace({Route? newRoute, Route? oldRoute}) {
if (newRoute == null) return;
_publishEvent(
type: NavigationEventType.replace,
route: newRoute,
previousRoute: oldRoute,
);
}
@override
void didInitTabRoute(TabPageRoute route, TabPageRoute? previousRoute) {
_publishEvent(
type: NavigationEventType.tabChange,
route: route as Route,
previousRoute: previousRoute as Route?,
);
}
@override
void didRemove(Route route, Route? previousRoute) {
_publishEvent(
type: NavigationEventType.pop,
route: route,
previousRoute: previousRoute,
);
}
void _publishEvent({
required NavigationEventType type,
required Route route,
Route? previousRoute,
}) {
if (!_eventBus.hasActiveListeners) return;
try {
final event = NavigationEvent(
type: type,
from: previousRoute != null ? _extractRouteInfo(previousRoute) : null,
to: _extractRouteInfo(route),
arguments: _extractArguments(route),
timestamp: DateTime.now(),
);
_eventBus.publish(event);
} catch (e, stack) {
log('Error creating navigation event: $e', name: 'AutoRouteObserverAdapter', error: e, stackTrace: stack);
}
}
RouteInfo _extractRouteInfo(Route route) {
final rawName = route.settings.name;
if (rawName == null || rawName.isEmpty) {
return const RouteInfo(name: 'UnknownRoute', path: '/unknown');
}
final startsWithSlash = rawName.startsWith('/');
final name = startsWithSlash ? rawName.substring(1) : rawName;
final path = startsWithSlash ? rawName : '/$rawName';
return RouteInfo(name: name, path: path);
}
Map<String, dynamic>? _extractArguments(Route route) {
try {
final args = route.settings.arguments;
if (args == null) return null;
if (args is Map<String, dynamic>) {
return _sanitizeArguments(args);
}
if (args is Map) {
final converted = <String, dynamic>{};
args.forEach((key, value) {
if (key is String) {
converted[key] = value;
}
});
return _sanitizeArguments(converted);
}
return {'value': args.toString()};
} catch (e, stack) {
log('Error extracting route arguments: $e', name: 'AutoRouteObserverAdapter', error: e, stackTrace: stack);
return null;
}
}
Map<String, dynamic> _sanitizeArguments(Map<String, dynamic> args) {
if (!_sanitizationConfig.enabled) return args;
// Performance: Zero-copy fast-path if no sensitive keys (80%+ of cases)
if (!args.keys.any(_sanitizationConfig.isSensitiveKey)) return args;
// Only allocate when needed
return args.map((key, value) => MapEntry(
key,
_sanitizationConfig.isSensitiveKey(key) ? _sanitizationConfig.placeholder : value,
));
}
}
The adapter is the only place in the codebase that imports auto_route. It's the only place that knows about Flutter's Route class. Everything downstream—the event bus, the listeners, the domain entities—operates on pure Dart abstractions.
Notice the critical optimization: if (!_eventBus.hasActiveListeners) return;. If no listeners are registered, the adapter does zero work. No route info extraction. No argument sanitization. No event creation. This keeps the hot path lean when observability is disabled.
The _extractRouteInfo method handles the string manipulation (stripping leading slashes) and caches the startsWith('/') result to avoid duplicate checks. The _sanitizeArguments method implements the zero-copy fast path—checking for sensitive keys before allocating a new map.
The adapter gets registered in the DI container:
final sanitizationConfig = kDebugMode
? ArgumentSanitizationConfig.disabled
: ArgumentSanitizationConfig.strict;
it.registerLazySingleton<AutoRouteObserverAdapter>(
() => AutoRouteObserverAdapter(
it<INavigationEventBus>(),
sanitizationConfig,
),
);
And injected into the router:
MaterialApp.router(
routerDelegate: _appRouter.delegate(
navigatorObservers: () => [
inject.sl<AutoRouteObserverAdapter>(),
],
),
routeInformationParser: _appRouter.defaultRouteParser(),
)
That's the entire surface area. One adapter. One injection point. If I swap to go_router tomorrow, I write a new adapter that implements their observer interface. The domain layer never changes. The event bus never changes. The listeners never change.
This is what architectural boundaries actually look like in production code.
Early profiling showed argument sanitization eating 60% of the overhead. Not good. It needed to be faster.
This led me to a critical architectural question: "How often do routes actually carry sensitive arguments?"
After analyzing the real-world navigation patterns, the answer was clear: Less than 20% of the time.
Most navigation is simple: Home → Profile → Settings. No tokens. No passwords. No credit cards. Just route names.
This realization triggered the main breakthrough: the zero-copy fast path.
Map<String, dynamic> _sanitizeArguments(Map<String, dynamic> args) {
if (!_sanitizationConfig.enabled) return args;
// The magic: check BEFORE allocating
if (!args.keys.any(_sanitizationConfig.isSensitiveKey)) return args;
// Only pay the cost when we actually need to
return args.map((key, value) => MapEntry(
key,
_sanitizationConfig.isSensitiveKey(key)
? _sanitizationConfig.placeholder
: value,
));
}
Look at that second line. if (!args.keys.any(_sanitizationConfig.isSensitiveKey)) return args;
That single line checks if ANY key is sensitive. If not (and this is 80%+ of the time) we return the original map. No allocation. No copy. Just one cheap pass over the keys, then return.
Zero. Copy.
When we DO find sensitive keys, we use args.map() to create a new map with redacted values. The functional style is clean, and it builds the redacted copy in a single pass over the entries.
But here's what's beautiful: we only pay that cost when we actually need to. Four out of five times, we just... don't.
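The "zero copy" claim can be checked directly with `identical`, which is true only when both references point at the same object. This is a minimal sketch: the two-entry pattern list and the `[REDACTED]` placeholder are assumptions, since the article's 14 defaults and its placeholder string are not shown.

```dart
/// Hypothetical patterns for illustration only.
const sensitivePatterns = ['token', 'password'];

bool isSensitiveKey(String key) {
  final lower = key.toLowerCase();
  return sensitivePatterns.any((pattern) => lower.contains(pattern));
}

Map<String, dynamic> sanitize(Map<String, dynamic> args) {
  // Fast path: one pass over the keys, no new map.
  if (!args.keys.any(isSensitiveKey)) return args;
  // Slow path: allocate a copy with sensitive values replaced.
  return args.map(
    (key, value) => MapEntry(key, isSensitiveKey(key) ? '[REDACTED]' : value),
  );
}

void main() {
  final safe = <String, dynamic>{'tab': 'settings', 'page': 2};
  final risky = <String, dynamic>{'userName': 'sam', 'authToken': 'abc123'};

  print(identical(sanitize(safe), safe)); // true: same map instance
  print(identical(sanitize(risky), risky)); // false: a redacted copy
  print(sanitize(risky)); // {userName: sam, authToken: [REDACTED]}
}
```

One consequence worth knowing: on the fast path the caller's map is returned as-is, so a listener that mutated it would also mutate the route's own arguments.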
The isSensitiveKey() check could be a performance bottleneck if implemented naively. We pre-compute a Set<String> of lowercase patterns at construction:
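class ArgumentSanitizationConfig {
  final bool enabled;
  final List<String> sensitiveKeys;
  // Lowercased once at construction so isSensitiveKey() never repeats the work.
  late final Set<String> _sensitiveKeysLowerSet;

  ArgumentSanitizationConfig({
    required this.enabled,
    required this.sensitiveKeys,
  }) {
    _sensitiveKeysLowerSet = sensitiveKeys.map((k) => k.toLowerCase()).toSet();
  }

  bool isSensitiveKey(String key) {
    if (!enabled) return false;
    final keyLower = key.toLowerCase();
    // Substring match: 'accessToken' matches the pattern 'token'.
    return _sensitiveKeysLowerSet.any((pattern) => keyLower.contains(pattern));
  }
  // placeholder and the static presets are omitted here
}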
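The late final modifier matters here: it lets the field be assigned exactly once in the constructor body and rejects any later reassignment. Lowercasing the patterns once at construction means isSensitiveKey() never repeats that work. Note that matching uses contains(), so each check still walks the pattern set rather than doing an O(1) hash lookup; with only 14 short patterns, that scan stays cheap.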
We deliberately chose behavior-based naming over environment-based:
static final strict = ArgumentSanitizationConfig(
enabled: true,
sensitiveKeys: defaultSensitiveKeys,
);
static final disabled = ArgumentSanitizationConfig(
enabled: false,
sensitiveKeys: [],
);
Not .production() and .development(), but .strict and .disabled. Why? Environment-based names don't scale. When you have dev, staging, QA, pre-prod, and prod environments, "production" becomes ambiguous. Behavior-based names describe what the config does, not where it runs.
This is a subtle but important architectural decision that future-proofs the design.
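A runnable sketch of the two presets side by side shows what "behavior-based" means at the call site. The class name `SanitizationConfigSketch`, the pattern list, and the placeholder default are all assumptions for illustration.

```dart
class SanitizationConfigSketch {
  final bool enabled;
  final String placeholder;
  late final Set<String> _patternsLower;

  SanitizationConfigSketch({
    required this.enabled,
    required List<String> sensitiveKeys,
    this.placeholder = '[REDACTED]', // hypothetical default
  }) {
    _patternsLower = sensitiveKeys.map((k) => k.toLowerCase()).toSet();
  }

  // Presets named after what they do, not where they run.
  static final strict = SanitizationConfigSketch(
    enabled: true,
    sensitiveKeys: ['Token', 'password'], // hypothetical patterns
  );
  static final disabled = SanitizationConfigSketch(
    enabled: false,
    sensitiveKeys: [],
  );

  bool isSensitiveKey(String key) {
    if (!enabled) return false;
    final keyLower = key.toLowerCase();
    return _patternsLower.any((p) => keyLower.contains(p));
  }
}

void main() {
  final strict = SanitizationConfigSketch.strict;
  print(strict.isSensitiveKey('accessToken')); // true
  print(strict.isSensitiveKey('USER_PASSWORD')); // true
  print(strict.isSensitiveKey('tokenizer')); // true: substring false positive
  print(strict.isSensitiveKey('profileId')); // false
  print(SanitizationConfigSketch.disabled.isSensitiveKey('accessToken')); // false
}
```

The `tokenizer` case shows the cost of substring matching: it over-redacts harmless keys, which is the safe direction for a privacy default.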
GetIt registration with disposal callbacks is the secret sauce that prevents memory leaks:
class NavigationServiceProvider implements ServiceProvider {
Future<void> register(GetIt it) async {
// Event bus with disposal
it.registerLazySingleton<INavigationEventBus>(
() => NavigationEventBusImpl(),
dispose: (bus) => bus.dispose(),
);
final eventBus = it<INavigationEventBus>();
// Listener with disposal
final loggingListener = LoggingNavigationListener();
loggingListener.startListening(eventBus);
it.registerSingleton<LoggingNavigationListener>(
loggingListener,
dispose: (listener) => listener.stopListening(),
);
}
}
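When GetIt is reset (for example between integration tests) or a registration is removed, these disposal callbacks run: the StreamController closes, the StreamSubscriptions cancel, and we leak nothing. GetIt does not hook into app termination by itself, so if the OS simply kills the process the callbacks never fire, which is harmless because the memory is reclaimed anyway.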
This is production-grade lifecycle management. It works in hot reload, in integration tests, and in production.
Because of the Clean Architecture, consuming navigation events is trivial. The listener has zero coupling to auto_route, zero coupling to Flutter's Route class. It just subscribes to the stream and reacts.
class LoggingNavigationListener implements INavigationListener {
StreamSubscription<NavigationEvent>? _subscription;
@override
String get name => 'LoggingNavigationListener';
@override
void startListening(INavigationEventBus eventBus) {
if (_subscription != null) {
log('Warning: $name already listening', name: 'Navigation');
return;
}
_subscription = eventBus.subscribe().listen(
_onNavigationEvent,
onError: _onError,
cancelOnError: false, // Continue listening even if errors occur
);
log('$name started', name: 'Navigation');
}
void _onNavigationEvent(NavigationEvent event) {
// Silent in production - single boolean check (~0.001ms overhead)
if (!kDebugMode) return;
try {
// Basic navigation log
final fromRoute = event.from?.name ?? 'App Start';
log('Navigation: $fromRoute → ${event.to.name} (${event.type.value})', name: 'Navigation');
// Additional details (paths, arguments, timestamp)
_logDebugDetails(event);
} catch (e, stack) {
// Errors in logging never affect navigation
log('Error in $name while logging event: $e', name: 'Navigation', error: e, stackTrace: stack);
}
}
void _logDebugDetails(NavigationEvent event) {
if (event.from != null) {
log(' From: ${event.from!.path}', name: 'Navigation', level: 500);
}
log(' To: ${event.to.path}', name: 'Navigation', level: 500);
if (event.arguments != null && event.arguments!.isNotEmpty) {
log(' Arguments: ${event.arguments}', name: 'Navigation', level: 500);
}
log(' Timestamp: ${event.timestamp.toIso8601String()}', name: 'Navigation', level: 500);
// This is where Firebase Analytics, Datadog, or Mixpanel would go
// analytics.logEvent('navigation', {...});
}
void _onError(Object error, StackTrace stack) {
log('Error in $name event stream: $error', name: 'Navigation', error: error, stackTrace: stack);
}
@override
Future<void> stopListening() async {
await _subscription?.cancel();
_subscription = null;
}
@override
bool get isListening => _subscription != null;
}
Notice the early return: if (!kDebugMode) return;. In production builds, this is a single boolean check. No string formatting. No I/O. No allocations. Overhead: ~0.001ms.
In debug builds, the listener provides full structured logging with route paths, arguments, and timestamps. Exactly what developers need during development, at effectively zero cost in production.
The listener knows nothing about auto_route. Nothing about adapters. It just consumes NavigationEvent objects from a stream.
Want to add Firebase Analytics? Implement INavigationListener. Want to track in Datadog? Implement INavigationListener. Want to store locally for crash reporting? Implement INavigationListener.
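class FirebaseNavigationListener implements INavigationListener {
  FirebaseNavigationListener(this._analytics);

  final FirebaseAnalytics _analytics;
  StreamSubscription<NavigationEvent>? _subscription;

  @override
  void startListening(INavigationEventBus eventBus) {
    _subscription = eventBus.subscribe().listen((event) {
      // firebase_analytics takes the event name as a named argument, and
      // recent versions type parameters as Map<String, Object>, so avoid nulls.
      _analytics.logEvent(
        name: 'navigation',
        parameters: {
          'from': event.from?.name ?? 'App Start',
          'to': event.to.name,
          'type': event.type.value,
        },
      );
    });
  }
  // ...rest of interface implementation
}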
Add listeners without touching the router, the adapter, or the event bus. Open for extension, closed for modification. This is what the Open-Closed Principle actually looks like in production code.
Let's break down the overhead per navigation event:
| Operation | Cost | Frequency |
|---|---|---|
| hasActiveListeners check | ~0.001ms | Every navigation |
| Route info extraction | ~0.1ms | Every navigation |
| Argument sanitization (fast-path) | ~0.001ms | 80% of navigations |
| Argument sanitization (copy) | ~0.2ms | 20% of navigations |
| Event creation | ~0.05ms | Every navigation |
| Stream publish | ~0.05ms | Every navigation |
| Total (typical) | ~0.3ms | Per navigation |
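Summing the rows gives about 0.2ms on the fast path and about 0.4ms when a redacted copy is needed, so ~0.3ms is a fair typical figure. For context, a typical route transition animation takes 300ms, so the observability overhead is about 0.1% of the transition time. Imperceptible to users, negligible in profiling.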
There's one final architectural benefit that can't be ignored: testing.
Testing navigation in Flutter usually means writing slow, flaky WidgetTests that depend on the actual Router. But because the domain layer has zero Flutter dependencies, testing a listener requires zero UI code. You just inject a fake event bus:
test('Analytics listener logs navigation events', () {
final fakeEventBus = FakeNavigationEventBus();
final listener = AnalyticsNavigationListener(mockAnalytics);
listener.startListening(fakeEventBus);
fakeEventBus.publish(NavigationEvent.test());
verify(() => mockAnalytics.logEvent('navigation', any())).called(1);
});
No real navigator. No real routes. Pure, lightning-fast unit tests.
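One detail matters when writing such tests: a default broadcast StreamController delivers events asynchronously, so an assertion made immediately after publish() can run before the listener has seen the event. The sketch below is a dependency-free version of the same idea. `NavEvent`, `FakeNavigationEventBus`, and `RecordingListener` are hypothetical stand-ins, and the recorder replaces a mocked analytics client.

```dart
import 'dart:async';

/// Minimal event type for the sketch (the real NavigationEvent has more fields).
class NavEvent {
  final String from;
  final String to;
  const NavEvent(this.from, this.to);
}

/// Fake bus: a thin wrapper over a broadcast controller.
class FakeNavigationEventBus {
  final StreamController<NavEvent> _controller =
      StreamController<NavEvent>.broadcast();

  void publish(NavEvent event) => _controller.add(event);
  Stream<NavEvent> subscribe() => _controller.stream;
  Future<void> dispose() => _controller.close();
}

/// Listener under test: records events instead of calling an analytics SDK.
class RecordingListener {
  final List<String> recorded = [];
  StreamSubscription<NavEvent>? _subscription;

  void startListening(FakeNavigationEventBus bus) {
    _subscription ??=
        bus.subscribe().listen((e) => recorded.add('${e.from}->${e.to}'));
  }

  Future<void> stopListening() async {
    await _subscription?.cancel();
    _subscription = null;
  }
}

Future<List<String>> runListenerTest() async {
  final bus = FakeNavigationEventBus();
  final listener = RecordingListener()..startListening(bus);

  bus.publish(const NavEvent('Home', 'Profile'));
  await Future<void>.delayed(Duration.zero); // delivery is asynchronous

  await listener.stopListening();
  bus.publish(const NavEvent('Profile', 'Settings')); // no listener: dropped

  await bus.dispose();
  return listener.recorded;
}

Future<void> main() async {
  print(await runListenerTest()); // [Home->Profile]
}
```

An alternative is to build the fake on `StreamController.broadcast(sync: true)`, trading the explicit wait for synchronous delivery inside the test double.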
.any() and .map() are not just cleaner; they are heavily optimized.
Using ArgumentSanitizationConfig.strict instead of .production decoupled the logic from the environment and saved future refactoring.
The real win isn't the code. The win is what the code enables. Product gains visibility into user journeys. Engineering reproduces bugs with exact event sequences. QA has systematic coverage instead of blind spots. Support understands user context during troubleshooting.
This is foundational infrastructure. It's the difference between flying blind and making data-driven decisions. Between a three-day debugging marathon and a five-minute root-cause analysis.
Architecture isn't about being fancy. It's about being free. Free to test. Free to change. Free to extend. That's what Clean Architecture provides. That's what pragmatic trade-offs preserve. That's what profiling-first optimization delivers.
The complete implementation is available in the project repository under lib/core/navigation/: 11 files, with zero third-party dependencies beyond auto_route and GetIt.
Full Source Code:
🐙 Flutter Production Architecture on GitHub
Questions or improvements? Open an issue or PR on the GitHub repo. This is a living architecture—feedback makes it better.