The Ludonarrative Engine
The ludonarrative engine implements patterns generally useful for story-driven locative cultural heritage (CH) games. Through our research in D5.1 and D5.2, and through ethnographic research, we have found that CH games in most cases support the grouping of content into time slices. The specific navigation mechanics can be left to the game designer; the engine can suggest some and makes the “sliced” content available through its API.
The history and its significant events can be told by ‘eyewitnesses’ to give the player or visitor to the site an understanding of what happened at that location at that specific point in time. The player travels through time to visit the site’s respective time layers and meet the characters who must be found to fill a treasure chest of knowledge and unfair advantages.
The engine is a content-agnostic game framework that ingests data and content to build the experience. Game designers using the engine need a set of assets produced specifically for their CH site: area boundary coordinates, music and other audio, dialogues, avatars, icons and maps, as well as historical background information about the CH, its significant events and their timestamps.
Characteristics:
(a) Heavy on audio, light on visuals, to ensure a feeling of presence in the physical site without compromising immersion in the story content. A key reason for choosing audio over visuals is that most visitors to CH sites have indicated that they prefer not to look at a screen when their surroundings are, in most cases, spectacular. The spatial-awareness capabilities now common in modern headsets make audio a great advantage both for navigation mechanics that help players find their way around the CH site and for deepening the player’s immersion in the game.
With regard to audio, the following considerations have been made:
- Spatial / quality: The advantage of spatial audio is that the player can hear the direction from which a sound is coming, which helps them navigate around the CH site.
- Navigation cues: Every character has a distinct navigation cue: a short sound clip that varies in duration and volume depending on the player’s direction (warm/cold).
- Character identification (music): Every character has a distinct personal piece of music to help the player recognise who they are about to interact with.
- Background/ambient as a way of identifying the time layer: Every time layer has an appropriate piece of background music and ambient sounds to emphasise the notion of time travel to the player. There are no tractor sounds in Neolithic times, but there are in the present.
- Music as another layer of educating the player: We’ve applied music to educate players about their encounters with characters in a time-appropriate context.
- Dialogue, what the characters say when in proximity of the player: Once the player is within reach of a character, the character greets the player and starts a dialogue.
(b) Makes use of time travel through layers of time that can be visited, with strong protagonist control (vis-à-vis the FDG paper).
Features:
The main features of the engine are described below. They are all available to any app for any CH, but all are optional.
- Indicate the presence of a conflict. Describe the conflict and the historical events that took place at the CH site.
- The use of human as well as non-human, fictional or mythological characters (e.g., fox, battle oak, salmon of knowledge) for storytelling. There is no need to be 100% historically accurate; game designers are allowed poetic freedom to improve gameplay and to reduce the tour-guide feel of the experience.
- Indicate the influence of a person/entity who was not historically present at the site (e.g., Louis XIV).
- Indicate different uses of the site over time (time travel). The engine should allow a range of different time navigation mechanics (as decided by a developer using the engine).
- Indicate multiple (possibly conflicting) perspectives, contested heritage. In the case of a battle, it is possible to educate on both sides of the conflict. What would the world at the CH site look like if the outcome of the conflict had been the opposite of what happened?
- How can music be best used to facilitate ludonarrative storytelling? Game designers are encouraged to provide era-appropriate / geo-relevant music to deliver context that supports the above, but also to educate the player, and to use sound and music to influence the player’s emotional experience (hunting pack sounds, battle clashes, horns, horses, etc.).
- Partner up. The game allows spectators to follow the player’s progress remotely, and someone can play by proxy on the player’s behalf (with less mobile visitors in mind). Spectators have access to the player’s location within the game and can also view the contents of the player’s treasure chest.
Audio Navigation System (Game Design)
Locative games for cultural heritage sites typically rely heavily on visual interfaces such as maps, dialogues and other in-game entities (characters, artifacts), which can detract from visitors’ engagement with the physical environment. Our Battle of the Boyne application takes a different approach by using spatial audio as the primary navigation and information delivery mechanic. This technique overlays the physical world with directional sound, creating an immersive experience without requiring constant screen interaction.
Dual-Mode Audio Mechanics
The application primarily implements two distinct audio modes that create a comprehensive player experience:
Wander Mode
When no points of interest are in immediate proximity, the system enters “Wander Mode”, designed to facilitate exploration. While in this mode, the system identifies the nearest undiscovered points of interest within a given radius (around 200 m) and generates spatially positioned audio cues that appear to originate from those locations. The audio cues are played in an alternating pattern so that users can differentiate between them, while the cue frequency is adjusted based on distance to provide navigation feedback.
Player navigation in wander mode is controlled by a target-locking mechanism that enhances the directional guidance. When a player turns in the direction of one of the audio cues and maintains that orientation for a predetermined lock-in period (typically 5 seconds), the system enters a “focused navigation” state wherein the selected POI becomes the target, other audio cues are silenced, and only the audio cue for the target POI continues to play. The frequency of this cue dynamically increases as the player approaches the target, providing intuitive proximity feedback.
If the player significantly deviates from the direction of the locked target POI, the system reverts to the “standard navigation” state, reactivating all nearby audio cues. This responsive targeting system encourages deliberate exploration while still maintaining the flexibility for players to change their minds and pursue different points of interest.
By offering multiple options rather than a single directed pathway, the system empowers users with meaningful choice, creating a non-linear exploration experience where visitors decide their own journey through the historical landscape. This approach transforms what could have been a prescribed, linear tour into a personalized discovery experience. Moreover, by focusing on just one cognitive task at a time—first direction when multiple cues are playing, then exclusively on cue frequency to determine proximity once a target is locked—the system minimizes cognitive demands on the user, creating an intuitive and effortless navigational experience.
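The following sketch illustrates the target-locking logic in Unity-style C#. The class, field and threshold names (WanderNavigator, PointOfInterest, lock-in constants) are illustrative assumptions rather than the engine’s actual API.

using System.Collections.Generic;
using UnityEngine;

public class WanderNavigator : MonoBehaviour
{
    const float LockInSeconds = 5f;        // time the player must face a cue before it locks
    const float FacingToleranceDeg = 20f;  // how closely the heading must match a POI bearing
    const float BreakLockDeg = 60f;        // deviation that releases a locked target

    float facingTimer;
    public PointOfInterest LockedTarget { get; private set; }

    // Called each frame with the player's compass heading and the nearby undiscovered POIs.
    public void UpdateNavigation(float headingDeg, List<PointOfInterest> nearbyPois)
    {
        if (LockedTarget != null)
        {
            // Focused navigation: only the locked POI's cue plays; a large deviation unlocks it.
            float deviation = Mathf.Abs(Mathf.DeltaAngle(headingDeg, LockedTarget.BearingDeg));
            if (deviation > BreakLockDeg)
            {
                LockedTarget = null;   // revert to standard navigation; all cues reactivate
                facingTimer = 0f;
            }
            return;
        }

        // Standard navigation: check whether the player is currently facing one of the cues.
        PointOfInterest faced = null;
        foreach (var poi in nearbyPois)
        {
            if (Mathf.Abs(Mathf.DeltaAngle(headingDeg, poi.BearingDeg)) < FacingToleranceDeg)
            {
                faced = poi;
                break;
            }
        }

        if (faced == null)
        {
            facingTimer = 0f;
            return;
        }

        facingTimer += Time.deltaTime;
        if (facingTimer >= LockInSeconds)
            LockedTarget = faced;       // enter focused navigation on this POI
    }
}

public class PointOfInterest
{
    public float BearingDeg;   // bearing from the player to the POI, degrees from north
    public float DistanceM;    // distance from the player to the POI, metres
}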

Figure 1: Map of the Battle of the Boyne (1690 time layer), depicting wander mode with navigational audio cues and interaction mode: inner zone and outer zone.
Interact Mode
When a user enters the proximity of a point of interest, the system transitions to “Interact Mode”. In this mode, the navigation cues are silenced to focus attention on the POI in proximity whilst maintaining spatial positioning for all audio sources. The area that constitutes this mode is divided into two distinct interaction zones.
The Outer Zone
When users first enter a character’s proximity radius (typically 20-50 meters), they encounter the outer interaction zone, wherein the character-specific thematic music begins to play while navigation cues from all POIs are silenced to focus attention. Throughout, spatial positioning is maintained and the music volume subtly increases as users move closer to the inner zone, guiding them toward the optimal position.
The outer zone also functions as a transitional space that gradually shifts the user’s cognitive focus from navigation to information reception, thereby preparing users for upcoming content.
The Inner Zone
As users move closer to the character’s exact location (typically within 5-15 meters), they enter the inner interaction zone. Music volume automatically reduces through dynamic parameter-controlled ducking (in FMOD) and character dialogue begins, spatially positioned at the exact location where the character is placed.
While the music continues at reduced volume, maintaining contextual continuity, the dialogues provide explicit historical information and narrative, forming the main part of the content delivered. Technical implementation uses FMOD parameters to create a sophisticated audio mix that responds dynamically to user position.
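A minimal sketch of the two interaction zones is shown below, assuming one FMOD event per character for music and one for dialogue, with the ducking driven by event parameters (the parameter names “Approach” and “Ducking” are assumptions; the FMOD project’s actual parameters may differ).

using UnityEngine;
using FMOD.Studio;

public class CharacterInteractionZone
{
    const float OuterRadiusM = 50f;  // outer zone: thematic music starts
    const float InnerRadiusM = 15f;  // inner zone: dialogue starts, music ducks

    readonly EventInstance music;
    readonly EventInstance dialogue;
    bool dialogueStarted;

    public CharacterInteractionZone(EventInstance music, EventInstance dialogue)
    {
        this.music = music;
        this.dialogue = dialogue;
    }

    // Called whenever the player's distance to the character is updated.
    public void UpdateForDistance(float distanceM)
    {
        if (distanceM > OuterRadiusM) return;    // outside the zones; wander mode applies

        // Outer zone: music volume rises subtly as the player approaches the inner zone.
        float approach = Mathf.InverseLerp(OuterRadiusM, InnerRadiusM, distanceM);
        music.setParameterByName("Approach", approach);

        if (distanceM <= InnerRadiusM)
        {
            // Inner zone: duck the music and start the spatially positioned dialogue.
            music.setParameterByName("Ducking", 1f);
            if (!dialogueStarted)
            {
                dialogue.start();
                dialogueStarted = true;
            }
        }
        else
        {
            music.setParameterByName("Ducking", 0f);
        }
    }
}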
Spatial Audio Implementation and Sensor Integration (Game Development)
To create a convincing spatial audio experience in a mobile locative game, a sophisticated integration of sensor data is required. Our system combines GPS positioning, device orientation tracking, and audio spatialization to create a cohesive auditory landscape that attempts to accurately represent the physical environment.
Sensor Integration for Spatial Positioning
The spatial audio implementation relies on two primary sensor inputs:
GPS and Location Data
The application uses the device GPS to determine the user’s absolute position. The GPS data is used to calculate distances between the user and points of interest, determine directional relationships (bearings) between the user and POIs, trigger proximity-based audio zones and behaviours, and map real-world geography to audio space.
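A possible implementation of these calculations uses the standard haversine and forward-azimuth formulas, sketched below for illustration (not necessarily the exact code used in the application).

using System;

public static class GeoMath
{
    const double EarthRadiusM = 6371000.0;

    // Great-circle distance in metres between two lat/lon points (degrees).
    public static double DistanceM(double lat1, double lon1, double lat2, double lon2)
    {
        double dLat = Deg2Rad(lat2 - lat1);
        double dLon = Deg2Rad(lon2 - lon1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(Deg2Rad(lat1)) * Math.Cos(Deg2Rad(lat2))
                 * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        return EarthRadiusM * 2 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1 - a));
    }

    // Initial bearing (degrees clockwise from north) from point 1 to point 2.
    public static double BearingDeg(double lat1, double lon1, double lat2, double lon2)
    {
        double dLon = Deg2Rad(lon2 - lon1);
        double y = Math.Sin(dLon) * Math.Cos(Deg2Rad(lat2));
        double x = Math.Cos(Deg2Rad(lat1)) * Math.Sin(Deg2Rad(lat2))
                 - Math.Sin(Deg2Rad(lat1)) * Math.Cos(Deg2Rad(lat2)) * Math.Cos(dLon);
        return (Rad2Deg(Math.Atan2(y, x)) + 360.0) % 360.0;
    }

    static double Deg2Rad(double d) => d * Math.PI / 180.0;
    static double Rad2Deg(double r) => r * 180.0 / Math.PI;
}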
Device Orientation Data
The system uses the device’s gyroscope (and could be extended to other head-tracking sensors) to track head orientation, which is crucial for maintaining correct audio directionality. The approach collects orientation data to detect which way the user is facing, uses a calibration system to correct for drift and maintain accuracy, falls back to accelerometer data when the gyroscope is unavailable, and applies smoothing to prevent jitter while maintaining responsiveness.
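As a simplified illustration, the snippet below smooths the device heading with an exponential low-pass filter; the drift calibration and accelerometer fallback mentioned above are omitted, and using the compass-based heading as the raw input is an assumption of the sketch.

using UnityEngine;

public class HeadingSmoother : MonoBehaviour
{
    [Range(0f, 1f)] public float smoothing = 0.15f; // higher = more responsive, more jitter
    public float SmoothedHeadingDeg { get; private set; }

    void Start()
    {
        Input.location.Start();          // compass heading requires location services
        Input.compass.enabled = true;
        Input.gyro.enabled = true;       // gyro data improves heading stability on most devices
    }

    void Update()
    {
        float raw = Input.compass.trueHeading;   // degrees clockwise from north
        // LerpAngle handles the 359-to-0 degree wrap-around correctly.
        SmoothedHeadingDeg = Mathf.LerpAngle(SmoothedHeadingDeg, raw, smoothing);
    }
}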

Figure 2: High-level technical architecture diagram.
Audio Spatialization Techniques
Once the spatial relationships are determined, FMOD’s 3D audio capabilities create the directional audio experience:
Coordinate Transformation
The system transforms GPS coordinates into a local audio space through the following process:
- The user’s position becomes the origin of the audio coordinate system
- POI positions are calculated relative to this origin
- Head orientation determines the rotation of this coordinate system
- The resulting coordinates are used to position audio sources in 3D space
This transformation ensures that audio sources remain correctly positioned relative to the user as they move through and look around the environment.
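The sketch below shows one way to express this transformation, assuming the FMOD listener sits at the origin facing forward; the helper name (AudioSpatializer) and the convention of +z ahead / +x right are illustrative.

using UnityEngine;
using FMOD.Studio;
using FMODUnity;

public static class AudioSpatializer
{
    // Position an FMOD event instance at a POI, relative to the listener at the origin.
    public static void PositionEvent(EventInstance instance,
                                     double poiDistanceM, double poiBearingDeg,
                                     float playerHeadingDeg)
    {
        // Bearing relative to where the player is facing (0 = straight ahead).
        float relativeDeg = Mathf.DeltaAngle(playerHeadingDeg, (float)poiBearingDeg);
        float rad = relativeDeg * Mathf.Deg2Rad;

        // Local audio space: +z ahead of the listener, +x to the right, y ignored.
        var localPos = new Vector3(
            (float)poiDistanceM * Mathf.Sin(rad),
            0f,
            (float)poiDistanceM * Mathf.Cos(rad));

        instance.set3DAttributes(RuntimeUtils.To3DAttributes(localPos));
    }
}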
Distance-Based Audio Behaviors
The system implements sophisticated distance-based audio behaviors:
- Volume attenuation follows a linear falloff curve, providing a more gradual and controllable transition with distance than the physically accurate inverse-square law
- Audio cue frequency increases as users approach sources, creating a sense of urgency and proximity
- Future enhancements could include low-pass filtering that increases with distance (simulating atmospheric absorption) and distance-based reverb to provide additional depth cues
These techniques create a realistic sense of space and distance, helping users intuitively understand their proximity to points of interest.
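For illustration, the linear falloff and the distance-to-cue-frequency mapping can be expressed as simple functions; the constants below are assumptions, not the tuned values used in the application.

using UnityEngine;

public static class DistanceAudio
{
    const float MaxAudibleM = 200f;   // beyond this distance, cues are silent
    const float MinIntervalS = 0.5f;  // cue repetition interval when very close
    const float MaxIntervalS = 4.0f;  // cue repetition interval at the audible limit

    // Linear falloff: 1 at the source, 0 at the audible limit.
    public static float Volume(float distanceM) =>
        Mathf.Clamp01(1f - distanceM / MaxAudibleM);

    // Cues repeat faster as the player gets closer, signalling proximity.
    public static float CueIntervalSeconds(float distanceM) =>
        Mathf.Lerp(MinIntervalS, MaxIntervalS, Mathf.Clamp01(distanceM / MaxAudibleM));
}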
Dynamic Audio Mixing
The system employs FMOD’s parameter system to create context-sensitive audio mixes:
- Character proximity zones trigger different audio content
- Music-dialogue balance adjusts based on proximity and content priority
- Navigation cues are intelligently mixed or muted based on context (given by FMOD parameters)
- Environmental sounds maintain continuity between interactive moments
This parameter-driven approach allows for fluid transitions between audio states without abrupt changes that would break immersion. The system’s ability to create these complex audio behaviors with minimal performance impact was a key factor in selecting FMOD as the audio engine.
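As an illustration of the parameter-driven approach, a single global FMOD parameter can describe the current context and let the FMOD project rebalance navigation cues, music and dialogue accordingly. The parameter name “GameContext” and its values are assumptions for this sketch, not taken from the project.

using FMODUnity;

public static class AudioMixController
{
    // 0 = wander mode (cues audible), 1 = interact outer zone, 2 = interact inner zone.
    public static void SetGameContext(float context)
    {
        RuntimeManager.StudioSystem.setParameterByName("GameContext", context);
    }
}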
The integration of these techniques creates a responsive and immersive spatial audio experience that effectively guides users through the physical heritage site without requiring constant visual attention to a screen. The technical implementation balances accuracy and performance, ensuring that the application remains responsive on typical mobile devices while maintaining the spatial audio illusion.
Spectator Mode
The game provides a spectator mode designed to open the experience to a diverse range of players. Research revealed that CH sites are often visited by family members of different generations. To accommodate less mobile family members, players can share their experience with them. For each game session, a unique code is created and stored locally on the player’s device. This code can be shared with others, and when entered on another device it grants access to the same game session, allowing the spectator to follow the player’s progress in real time. The spectator sees where the player is and has access to the ‘treasure chest’, which holds music, artefacts and (historical) information about the characters the player has already encountered.
Game state and progress are shared through Firebase, ensuring that all connected devices are synchronized. The choice of Firebase was motivated by its real-time synchronization capabilities, simplified implementation and easy integration with Unity. The data transmitted to and retrieved from the server is in JSON format and is structured as follows:
{
  "sessionId": "unique_session_identifier", // the unique session id
  "playerID": "player_device_identifier",
  "gameProgress": {
    "latitude": 34.0522, // Player's current latitude
    "longitude": -118.2437, // Player's current longitude
    "heading": 90.0, // Player's current heading (orientation) in degrees
    "discoveredPOIs": ["POI_id_1", "POI_id_2"], // List of POIs the player has discovered
    "inventory": {
      "musicTracks": ["track1", "track2"], // Music tracks unlocked by the player
      "artifacts": ["artifact1", "artifact2"], // Artifacts collected by the player
      "characterInfo": ["character1", "character2"] // Historical characters discovered by the player
    }
  }
}
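The snippet below sketches how such a structure could be written to and observed from the Firebase Realtime Database via the Firebase Unity SDK; the GameProgress type and database paths mirror the JSON above but are illustrative rather than the exact schema used.

using Firebase.Database;
using UnityEngine;

[System.Serializable]
public class GameProgress
{
    public double latitude;
    public double longitude;
    public float heading;
    public string[] discoveredPOIs;
}

public static class SessionSync
{
    // Player device: push the latest progress for this session.
    public static void PushProgress(string sessionId, GameProgress progress)
    {
        string json = JsonUtility.ToJson(progress);
        FirebaseDatabase.DefaultInstance
            .GetReference("sessions").Child(sessionId).Child("gameProgress")
            .SetRawJsonValueAsync(json);
    }

    // Spectator device: subscribe to progress updates in real time.
    public static void Observe(string sessionId, System.Action<string> onJson)
    {
        FirebaseDatabase.DefaultInstance
            .GetReference("sessions").Child(sessionId).Child("gameProgress")
            .ValueChanged += (sender, args) =>
            {
                if (args.Snapshot != null && args.Snapshot.Exists)
                    onJson(args.Snapshot.GetRawJsonValue());
            };
    }
}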
Time travel
The game leans heavily on the concept of time travel. Temporal collisions are created by defining time layers for significant parts of the CH site’s history. In the example of the Battle of the Boyne, there are three: Neolithic, 1690 (the actual day of the battle) and the present. Players travel between these time layers by meeting special characters in the game (Fox/Raven).
The time travel system is implemented through a layered audio architecture in FMOD Studio, controlled by a central TimeLayerManager in the game engine. Each time period (Neolithic, 1690, and Modern) exists as a separate audio bank with period-specific ambient soundscapes, character dialogues, and navigation cues.
When a player encounters a time travel character (such as the Salmon of Knowledge), the system triggers a transition sequence. This is accomplished through parameter-driven events in FMOD where a global “TimePeriod” parameter shifts between values, creating a crossfade between audio environments. The transition could be synchronized with visual feedback, including subtle map overlay changes that reflect the landscape’s historical evolution.
The POI system will be extended to include TimeTravelPOI components that can trigger these shifts. These special POIs communicate with the TimeLayerManager, which broadcasts time period changes to all listening systems via an event-based architecture. This ensures that audio, visual elements, and available interaction points all update coherently when a time change occurs.
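A minimal sketch of this behaviour is given below, using the global “TimePeriod” parameter described above; the enum values, event name and manager shape are illustrative assumptions rather than the engine’s actual implementation.

using System;
using FMODUnity;

public enum TimePeriod { Neolithic = 0, Battle1690 = 1, Modern = 2 }

public static class TimeLayerManager
{
    // Listening systems (map overlays, POI filters, etc.) subscribe to this event.
    public static event Action<TimePeriod> TimePeriodChanged;

    public static void TravelTo(TimePeriod period)
    {
        // FMOD crossfades between the period-specific audio environments
        // as this global parameter moves between values.
        RuntimeManager.StudioSystem.setParameterByName("TimePeriod", (float)period);
        TimePeriodChanged?.Invoke(period);
    }
}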
The Ludonarrative Engine team is:
Dr Mads Haahr – Principal Investigator
Joris Vreeke – Research Fellow
Karun Manoharan – Researcher
Svetlana Rudenko – Researcher