Version:

Lamb's problem

Part 2 - Verification and Benchmark

To validate the solutions generated with Salvus, we will compare our results with semi-analytical ones computed using EX2DDIR.
Copy
%matplotlib inline
# This notebook will use this variable to determine which
# remote site to run on.
import os

SALVUS_FLOW_SITE_NAME = os.environ.get("SITE_NAME", "local")
PROJECT_DIR = "project"
import pathlib
import numpy as np
import salvus.namespace as sn
The main purpose of SalvusProject is to manage all data associated with simulations and/or inversions. This ensures reproducibility and avoids computing the same task more than once. The cell below reloads the project state after the first part of the tutorial.
print("Opening existing project.")
p = sn.Project(path=PROJECT_DIR)
Opening existing project.
To verify that we indeed still have all the information available, we can do a few sanity checks like plotting the domain and querying a list of events and simulations.
p.viz.nb.domain()
p.events.list()
['event_0']
p.simulations.list()
['my_first_simulation']
Indeed, all entities from part 1 are still available in the project :)
With our domain initialized and our project reloaded, we can now go right ahead and start preparing some scientific data. We want to compare the simulated data to a semi-analytic solution to Lamb's problem, as described in the introduction of part 1. These data are stored in an HDF5 file named reference_data.h5 in the data directory.
Some data formats, such as ASDF or SEGY describe their data with associated headers. We'll see how to add these types of data in a later tutorial, but in this case we are just reading in raw waveform traces with little to no meta information. Because of this we'll need to assist Salvus a little and tell the project to what events this raw data refers to.
p.waveforms.add_external(
    data_name="reference",
    event="event_0",
    data_filename="data/reference_data.h5",
)
We've already used the function p.viz.nb.waveforms() to visualize seismograms in part 1. To compare the simulated seismograms to the reference data, we just need to pass a list instead of a single simulation name.
p.viz.nb.waveforms(
    ["EXTERNAL_DATA:reference", "my_first_simulation"],
    receiver_field="displacement",
)
Hmm. While the basic shape and behaviour of the waveforms looks good, there are certainly some discrepancies between what we've computed and the reference solution. Now our goal is to see if we can get the signals to match perfectly within the time which they overlap.

Challenge 1: Why doesn't it match?

The most obvious issue with the signals above is that they differ greatly at later times. There's quite an obvious issue here: the reference solution was computed assuming an infinite domain, but the domain we defined is finite in extent. In fact this is the main issue: we need to add absorbing boundaries.
To preserve the stability of the wavefield solution in the presence of complex or anisotropic media, Salvus employs a two-stage approach to absorbing boundaries. First, we apply absorbing boundary conditions at the edge of the mesh as outlined here. These conditions provide good absorbing characteristics for wave impacting the boundary at close to normal incidence, and are sufficient for most cases. If a more substantial absorbing profile is desired, one can also pad the simulated domain with a damping layer. This approach follows that given in this paper. Adding damping layers are advantageous in that they can almost completely cancel any boundary reflections, but do require one to enlarge the computational domain and therefore increase the cost of the resultant simulations. We have found that damping layers provide a good quality / performance tradeoff when 3.5 or more wavelengths are present in the absorbing layer.
In previous versions of Salvus absorbing boundary attachment was unfortunately a manual and tedious process. Fortunately, we now provide an interface to automatically extend the domain in a more user-friendly manner. To activate this feature, we first need to set a few parameters to tell the simulation that we do indeed want a layer of extended absorbing boundaries.
abp = sn.AbsorbingBoundaryParameters(
    reference_velocity=3000.0,
    number_of_wavelengths=3.5,
    reference_frequency=15.0,
)
And now we just simply add a new simulation configuration with the boundaries to our project.
sc = p.entities.get(
    entity_type="simulation_configuration", entity_name="my_first_simulation"
)
p += sn.SimulationConfiguration(
    name="simulation_2",
    max_frequency_in_hertz=sc.max_frequency_in_hertz,
    elements_per_wavelength=sc.elements_per_wavelength,
    model_configuration=sc.model_configuration,
    event_configuration=sc.event_configuration,
    absorbing_boundaries=abp,
)
Note that you can either use += or add_to_project to add new entities to the project.
Visualizing this new configuration, we can see that the mesh has been padded with absorbing boundaries on all sides but the free surface.
p.visualizations.nb.simulation_setup(
    simulation_configuration="simulation_2",
    events=p.events.list(),
)
[2024-11-15 13:04:55,469] INFO: Creating mesh. Hang on.
<salvus.flow.simple_config.simulation.waveform.Waveform object at 0x743cd20ce0d0>
Now, let's take another look at the simulated data and see how we've done.
p.simulations.launch(
    ranks_per_job=2,
    site_name=SALVUS_FLOW_SITE_NAME,
    events=p.events.list(),
    simulation_configuration="simulation_2",
)
[2024-11-15 13:04:55,963] INFO: Submitting job ...
Uploading 1 files...

🚀  Submitted job_2411151304141410_4978f36e23@local
1
p.simulations.query(block=True)
True
p.viz.nb.waveforms(
    ["EXTERNAL_DATA:reference", "simulation_2"], receiver_field="displacement"
)
Looking a lot better!

Challenge 2: Why does it still not match?

While the signals now look a lot closer, there are still some minor differences. To remedy this, let's take another look at the spectrum of our source.
sc.event_configuration.wavelet.plot()
Indeed while there is almost no energy above 30Hz30Hz, there is still at least some. This is where the final differences are coming from. In the cell below, create a new simulation configuration which matches the data even better. Feel free to play with either the max_frequency_in_hertz parameter, or the elements_per_wavelength parameter, noting the increase or decrease in simulation time as you go.
p += sn.SimulationConfiguration(
    name="simulation_3",
    max_frequency_in_hertz=sc.max_frequency_in_hertz,
    elements_per_wavelength=2.0,
    model_configuration=sc.model_configuration,
    event_configuration=sc.event_configuration,
    absorbing_boundaries=abp,
)
p.simulations.launch(
    ranks_per_job=2,
    site_name=SALVUS_FLOW_SITE_NAME,
    events=p.events.list(),
    simulation_configuration="simulation_3",
)
[2024-11-15 13:04:58,132] INFO: Creating mesh. Hang on.
[2024-11-15 13:04:58,234] INFO: Submitting job ...
Uploading 1 files...

🚀  Submitted job_2411151304238952_9c0b04d050@local
1
p.simulations.query(block=True)
True
p.viz.nb.waveforms(
    ["EXTERNAL_DATA:reference", "simulation_3"],
    receiver_field="displacement",
)
In the end, the accuracy of the simulation is always a tradeoff between the time to completion and the error in the resultant solution. In a real full-waveform inversion, where there is significant noise present in the data, you may be ok with a 5% error in your waveforms if it results in an appreciable speedup in each simulation. Such an error level will likely be below the noise level of your data. If you are looking to do a true convergence test, it may be worth it to increase the elements_per_wavelength parameter and pay the extra simulation cost. As with many things, it is a fine balance between science, experience, and art.
This concludes the second part. With the benchmark data we've gained confidence in the accuracy of our simulations.
PAGE CONTENTS