Version:

This documentation is not for the latest stable Salvus version.

Mapping between simulation and inversion space

It's not the destination, it's the journey
Full-waveform inversion is a non-convex problem, and the misfit landscape usually contains plenty of local minima. Therefore a suitable parameterization of the model and design of the descent method are crucial to not get trapped.
Salvus distinguishes between the model parameterizations used for the simulation and the inversion, respectively. The simulation requires a complete description of all material coefficients entering the wave equation defined on the entire domain. These are absolute quantities like velocities in m/s or density in kg/m3^3.
However, the parameters used in the inversion can differ, for instance, by covering only a subset of the domain, a subset of material properties, different scaling relations, etc.
The link between the representation of a model in simulation space, on the one hand, and the inversion space, on the other hand, is established through the Mapping class in SalvusOpt. This gives a lot of flexibility, for instance, to use different discretizations (e.g., a coarser mesh for the inversion or an event-dependent mesh for the simulation) or model parameterizations (e.g., inverting only for a subset of the physical parameters).
This micro-tutorial presents strategies to mitigate unwanted sensitivities in the model gradients, in particular close to sources or receivers.
Copy
%matplotlib inline
%config Completer.use_jedi = False
import os
SALVUS_FLOW_SITE_NAME = os.environ.get("SITE_NAME", "local")
import matplotlib.pyplot as plt
import numpy as np
import pathlib
import time
import xarray as xr

import salvus.namespace as sn

Project setup

To demonstrate different ways to use the mapping function, we construct a simple 2D domain with a 1D background model and a line of receivers at the surface.
nx, ny = 10, 10
x = np.linspace(0.0, 3000.0, nx)
y = np.linspace(-1000.0, 0.0, nx)
xx, yy = np.meshgrid(x, y, indexing="ij")

vp = 1500.0 - yy
rho = 1000.0 - yy

ds = xr.Dataset(
    data_vars={
        "vp": (["x", "y"], vp),
        "rho": (["x", "y"], rho),
    },
    coords={"x": x, "y": y},
)
ds.vp.T.plot()
<matplotlib.collections.QuadMesh at 0x7fc138734110>
p = sn.Project.from_volume_model(
    path="project",
    volume_model=sn.model.volume.cartesian.GenericModel(name="model", data=ds),
    load_if_exists=True,
)
src = sn.simple_config.source.cartesian.ScalarPoint2D(x=500.0, y=-500.0, f=1.0)
rec = sn.simple_config.receiver.cartesian.collections.ArrayPoint2D(
    x=np.linspace(100.0, 2900.0, 10), y=0.0, fields=["phi"]
)
p += sn.Event(event_name="event", sources=src, receivers=rec)
p.viz.nb.domain()
ec = sn.EventConfiguration(
    waveform_simulation_configuration=sn.WaveformSimulationConfiguration(
        end_time_in_seconds=2.0
    ),
    wavelet=sn.simple_config.stf.Ricker(center_frequency=5.0),
)
p += sn.SimulationConfiguration(
    name="sim_model",
    elements_per_wavelength=2,
    tensor_order=4,
    max_frequency_in_hertz=10.0,
    model_configuration=sn.ModelConfiguration(
        background_model=None, volume_models="model"
    ),
    event_configuration=ec,
    absorbing_boundaries=sn.AbsorbingBoundaryParameters(
        reference_velocity=2000.0,
        number_of_wavelengths=3.5,
        reference_frequency=5.0,
    ),
)
Next, we create synthetic data for testing different gradient parameterizations. To keep things simple, we work without observed data, and just compute the L2 norm of the signal at the receivers. With the definition of the misfit, we have everything needed to compute gradients.
p += sn.MisfitConfiguration(
    name="misfit",
    observed_data=None,
    misfit_function="L2_energy_no_observed_data",
    receiver_field="phi",
)
gradients = {}
while not gradients:
    gradients = p.actions.inversion.compute_gradients(
        simulation_configuration="sim_model",
        misfit_configuration="misfit",
        wavefield_compression=sn.WavefieldCompression(
            forward_wavefield_sampling_interval=1
        ),
        events=p.events.list(),
        site_name=SALVUS_FLOW_SITE_NAME,
        ranks_per_job=4,
    )
    time.sleep(2.0)
raw_gradient = sn.UnstructuredMesh.from_h5(gradients["event"])
[2024-03-15 09:29:20,147] INFO: Creating mesh. Hang on.
[2024-03-15 09:29:20,589] INFO: Submitting job ...
[2024-03-15 09:29:20,763] INFO: Launched simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:22,907] INFO: Submitting job ...
Uploading 1 files...

🚀  Submitted job_2403150929908948_246f7ec1bb@local
[2024-03-15 09:29:22,948] INFO: Launched adjoint simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:24,974] INFO: 1 events have already been submitted. They will not be submitted again.
Gradients are always computed in the simulation space first. This is the result of the adjoint simulation. We call this the raw_gradient. In the widget below, we notice that the sensitivity at the source location and - to a smaller degree - at the receiver locations has significantly higher amplitudes than everywhere else in the domain. This is the result of all energy passing through these points, which is why the waveforms are clearly most sensitive to changes at those locations. However, this clearly does not look like a reasonable model update.
raw_gradient
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc12b147c10>
This section demonstrates different preconditioning options to obtain gradients in the inversion space that contain more desirable features for model updates and put stronger emphasis on the sensitivity in certain areas. To this end, we use several configurations for mapping the raw gradients from the simulation into the inversion space. We focus on sensitivities with respect to VP, but could do the same for RHO.
The following utility function helps to analyze different preconditioning options for the gradient by selecting the relevant region from the domain, symmetrizing the color scale around zero and by optionally clipping it at a certain threshold.
def visualize_gradient(grad, clip=None):
    g = grad.model.copy()
    mask = np.logical_and(
        g.get_element_centroid()[:, 1] > -1000.0,
        np.abs(g.get_element_centroid()[:, 0] - 1500.0) < 1500.0,
    )
    g = g.apply_element_mask(mask)

    if clip:
        scale = (
            clip
            * np.max(np.abs(g.elemental_fields["VP"]))
            * np.ones_like(g.elemental_fields["VP"])
        )
        g.elemental_fields["VP"] = np.minimum(g.elemental_fields["VP"], scale)
        g.elemental_fields["VP"] = np.maximum(g.elemental_fields["VP"], -scale)

    display(g)
Some mapping routines require a prior model, for instance, to compute relative perturbations. Here, the prior model is just the mesh of the initial simulation.
prior = p.simulations.get_mesh("sim_model")
The first mapping is just the identity operator. By parameterizing the inversion space using an absolute scaling of the physical parameters, there is no difference between both discretizations. Hence the mapped gradient is the same as the raw gradient.
map1 = sn.Mapping(
    inversion_parameters=["VP"],
    scaling="absolute",
)
grad1 = map1.adjoint(
    mesh=raw_gradient.copy(),
    prior=prior,
)
visualize_gradient(grad1, clip=None)
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc16209c4d0>
Only after clipping the color scale at 10% of the maximum amplitude, we start seeing some sensitivity to changes of VP in locations different from the source / receivers.
visualize_gradient(grad1, clip=0.1)
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc1620bb290>
To remove the strong source imprint, we can cutout sensitivities around the source location. Here, we specify a radius of 200 m. The mapped gradient now does not contain any sensitivity in areas closer than 200 m to the source location.
map2 = sn.Mapping(
    inversion_parameters=["VP"],
    scaling="absolute",
    source_cutout_radius_in_meters=200.0,
)
grad2 = map2.adjoint(
    mesh=raw_gradient.copy(),
    prior=prior,
    event=p.waveforms.get(data_name="sim_model", events="event")[0],
)
visualize_gradient(grad2, clip=0.1)
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc161dd8850>
Similarly, we impose cutouts around the receiver locations.
map3 = sn.Mapping(
    inversion_parameters=["VP"],
    scaling="absolute",
    source_cutout_radius_in_meters=200.0,
    receiver_cutout_radius_in_meters=100.0,
)
grad3 = map3.adjoint(
    mesh=raw_gradient.copy(),
    prior=prior,
    event=p.waveforms.get(data_name="sim_model", events="event")[0],
)
visualize_gradient(grad3, clip=0.1)
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc162100250>
Sensitivities at depths above the source are now much more pronounced. They remain visible when adjusting the clipping to an 80% threshold.
visualize_gradient(grad3, clip=0.8)
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc161d4c4d0>
Even with the receiver cutout, there is strong sensitivity at the surface, which often introduces traps of local minima. For a conservative start of the inversion, it is often useful to completely remove sensitivities to very shallow depths independent of the receiver locations. This can be done by defining a region of interest. This is not necessarily limited to the surface of the domain, but a general tool to disregard sensitivities in certain areas of the domain.
The region of interest needs to be defined on the same mesh like the other model parameters. Here, we apply a mask on the coordinates of all the grid points in the mesh and define the region of interest as the area below 100 m depth.
mesh = p.simulations.get_mesh(simulation_configuration="sim_model")
roi = np.zeros_like(mesh.connectivity)
mask = mesh.points[mesh.connectivity][:, :, 1] < -100.0
roi[mask] = 1.0
mesh.attach_field("region_of_interest", roi)

map4 = sn.Mapping(
    inversion_parameters=["VP"],
    scaling="absolute",
    source_cutout_radius_in_meters=200.0,
    region_of_interest=roi,
)
grad4 = map4.adjoint(
    mesh=raw_gradient.copy(),
    prior=prior,
    event=p.waveforms.get(data_name="sim_model", events="event")[0],
)
visualize_gradient(grad4, clip=0.8)
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc161e14c10>
The mapped gradient is starting to get into a form that can provide meaningful model updates. Because of the increasing velocity at greater depths, the sensitivity decreases. This is a result of the longer wavelengths at higher velocities. To put more emphasis on deeper regions, we can switch to a parameterization of the inversion space that considers the relative deviation from the prior model.
In this example, the differences are subtle, but you will notice slightly higher sensitivities at depths below the source location when comparing the mapped gradient to the previous one.
map5 = sn.Mapping(
    inversion_parameters=["VP"],
    scaling="relative_deviation_from_prior",
    source_cutout_radius_in_meters=200.0,
    region_of_interest=roi,
)
grad5 = map5.adjoint(
    mesh=raw_gradient.copy(),
    prior=prior,
    event=p.waveforms.get(data_name="sim_model", events="event")[0],
)
visualize_gradient(grad5, clip=0.8)
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc122881090>
Both, cutouts and a region of interest introduce discontinuities in the mapped gradient, which is not desired in the model update. A smoothing preconditioner, which can be built into the computation of the model update, ensures a smooth transition in the proposed model update. As a side effect, we see non-zero updates even within the region of interest of the cutout radii.
As an example, we apply an anisotropic Gaussian smoothing with smoothing lengths of 100 m in horizontal and 50 m in vertical direction.
from salvus.opt.models import UnstructuredModel
smoothed_gradient = UnstructuredModel(
    name="smoothed_gradient",
    model=p.actions.inversion.smooth_model(
        model=grad5.model,
        smoothing_configuration=sn.ConstantSmoothing(
            smoothing_lengths_in_meters={"VP": [100.0, 50.0]}
        ),
        site_name="local",
        ranks_per_job=1,
    ),
    fields=["VP"],
)
visualize_gradient(smoothed_gradient, clip=0.8)

<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc161daec50>
Computing gradients with respect to the material properties in time-domain simulations requires access to the forward wavefield during the adjoint run. This can lead to a significant memory overhead, because forward and adjoint simulations
A common mitigation is the use of checkpoints, which enable the on-the-fly recomputation of chunks of the forward wavefield during the adjoint simulation. To fit larger chunks in RAM and thus reduce the number of required checkpoints, the recomputed forward wavefield can be buffered on a coarser temporal grid. Although this yields slightly inconsistent gradients, the savings in memory usually outweigh the loss of accuracy.
Setting the forward_wavefield_sampling_interval in WavefieldCompression is thus an important tuning parameter. Depending on the application and meshing strategy a factor between 5 and 100 is typically achieved.
The example below shows the effect of using different sampling intervals. Note that this does not influence the accuracy of the waveforms, but only affects the gradients.
gradients = {}
for i in [1, 5, 10, 20]:
    gradient_files = {}
    while not gradient_files:
        gradient_files = p.actions.inversion.compute_gradients(
            simulation_configuration="sim_model",
            misfit_configuration="misfit",
            wavefield_compression=sn.WavefieldCompression(
                forward_wavefield_sampling_interval=i
            ),
            events=p.events.list(),
            site_name=SALVUS_FLOW_SITE_NAME,
            ranks_per_job=4,
        )
        time.sleep(2.0)
    gradients[i] = sn.UnstructuredMesh.from_h5(gradient_files["event"])
[2024-03-15 09:29:31,624] INFO: The following events have been simulated before, but checkpoints are not available for this combination of `site_name` and `ranks_per_job` and wavefield compression settings. They will be run again: ['event']
[2024-03-15 09:29:31,643] INFO: Submitting job ...
[2024-03-15 09:29:31,691] INFO: Launched simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:33,836] INFO: Submitting job ...
Uploading 1 files...

🚀  Submitted job_2403150929838191_a0846cf62d@local
[2024-03-15 09:29:33,875] INFO: Launched adjoint simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:35,903] INFO: 1 events have already been submitted. They will not be submitted again.
[2024-03-15 09:29:38,033] INFO: The following events have been simulated before, but checkpoints are not available for this combination of `site_name` and `ranks_per_job` and wavefield compression settings. They will be run again: ['event']
[2024-03-15 09:29:38,053] INFO: Submitting job ...
[2024-03-15 09:29:38,100] INFO: Launched simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:40,217] INFO: Submitting job ...
Uploading 1 files...

🚀  Submitted job_2403150929219247_c01275291d@local
[2024-03-15 09:29:40,255] INFO: Launched adjoint simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:42,279] INFO: 1 events have already been submitted. They will not be submitted again.
[2024-03-15 09:29:44,411] INFO: The following events have been simulated before, but checkpoints are not available for this combination of `site_name` and `ranks_per_job` and wavefield compression settings. They will be run again: ['event']
[2024-03-15 09:29:44,430] INFO: Submitting job ...
[2024-03-15 09:29:44,481] INFO: Launched simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:46,612] INFO: Submitting job ...
Uploading 1 files...

🚀  Submitted job_2403150929614099_5770682148@local
[2024-03-15 09:29:46,655] INFO: Launched adjoint simulations for 1 events. Please check again to see if they are finished.
[2024-03-15 09:29:48,686] INFO: 1 events have already been submitted. They will not be submitted again.
Now, we compare the mapped gradients before applying the preconditioner. While there is hardly any visual difference for sampling intervals of 1, 5, and 10, ringing artifacts apprear for a sampling interval of 20. This is commonly observed when the re-sampling of forward wavefield is too coarse.
If a smoothing preconditioner is used during the inversion a small degree of ringing is acceptable as this will vanish after smoothing.
for i in [1, 5, 10, 20]:
    grad = map5.adjoint(
        mesh=gradients[i].copy(),
        prior=prior,
        event=p.waveforms.get(data_name="sim_model", events="event")[0],
    )
    print(f"----------------------------------------------------------")
    print(f"Mapped gradient with sampling interval {i}:")
    visualize_gradient(grad, clip=0.8)
----------------------------------------------------------
Mapped gradient with sampling interval 1:
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc1226c6b90>
----------------------------------------------------------
Mapped gradient with sampling interval 5:
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc12c36dc90>
----------------------------------------------------------
Mapped gradient with sampling interval 10:
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc161e3dc90>
----------------------------------------------------------
Mapped gradient with sampling interval 20:
<salvus.mesh.unstructured_mesh.UnstructuredMesh at 0x7fc1220bc710>
Feel free to check out the other inversion tutorials to see the interplay of mapping function, wavefield compression and preconditioner in a more realistic setting.
PAGE CONTENTS