Download this notebook here
.
Step 2: Generate data samples¶
In this section, we will use the amplitude model that we created with the expert system in the previous step to generate a data sample via hit & miss Monte Carlo. We do this with the tensorwaves.data.generate
module.
First, we load a YAML recipe file that was created in the previous step:
[1]:
import yaml
with open("amplitude_model_helicity.yml") as input_file:
recipe = yaml.load(input_file.read(), Loader=yaml.SafeLoader)
2.1 Define kinematics of the particle reaction¶
A recipe file defines the kinematics, the particles involved in the reaction,
the dynamics used for the model on which to perform the eventual optimization,
etc. In Step 1, we decided to use the helicity
formalism to analyse the problem, so here we need to use the
HelicityKinematics
class.
[2]:
from tensorwaves.physics.helicity_formalism.kinematics import HelicityKinematics
from tensorwaves.physics.particle import extract_particles
kin = HelicityKinematics.from_recipe(recipe)
particles = extract_particles(recipe)
print("Initial state mass:", kin.reaction_kinematics_info.initial_state_masses)
print("Final state masses:", kin.reaction_kinematics_info.final_state_masses)
print("Involved particles:", list(particles))
Initial state mass: [3.0969]
Final state masses: [0.0, 0.1349766, 0.1349766]
Involved particles: ['J/psi', 'f0(1500)', 'f0(980)', 'gamma', 'pi0']
A Kinematics
object defines the kinematics of the particle reaction we are
studying. From the final state masses listed here, we can see we are dealing
with the reaction \(J/\psi \to \gamma\pi^0\pi^0\).
2.2 Generate phase space sample¶
Kinematics define the constraints of the phase-space. As such, we have enough
information to generate a phase-space sample for this particle reaction.
We do this with the generate_phsp
function. By default, this function uses
TFPhaseSpaceGenerator
as a, well… phase-space generator (using tensorflow
in the back-end) and generates random numbers with
TFUniformRealNumberGenerator
. You can specify this with the arguments of
generate_phsp
function.
[3]:
from tensorwaves.data.generate import generate_phsp
phsp_sample = generate_phsp(300000, kin)
phsp_sample.shape
[3]:
(3, 300000, 4)
As you can see, the phase-space sample is a three-dimensional array: 300.000 events of four-momentum tuples for three particles.
2.3 Generate intensity-based sample¶
‘Data samples’ are more complicated than phase space samples in that they
represent the intensity profile resulting from a reaction. You therefore need
an IntensityTF
object (or, more generally, a Function
instance) and a phase space over which to generate that intensity
distribution. We call such a data sample an intensity-based sample.
An intensity-based sample is generated with the function generate_data
. Its
usage is similar to generate_phsp
, but now you have to give an
IntensityTF
in addition to the Kinematics
object. An IntensityTF
object can be created with the IntensityTF
class:
[4]:
from tensorwaves.physics.helicity_formalism.amplitude import IntensityBuilder
builder = IntensityBuilder(particles, kin, phsp_sample)
intensity = builder.create_intensity(recipe)
We first decrease the debugging level a bit to avoid some redundant warnings.
[5]:
import logging
logging.disable(logging.WARNING)
logging.getLogger().setLevel(logging.INFO)
That’s it, now we have enough info to create an intensity-based data sample. Notice how the structure is the sample as the phase-space sample we saw before:
[6]:
from tensorwaves.data.generate import generate_data
data_sample = generate_data(30000, kin, intensity)
data_sample.shape
[6]:
(3, 30000, 4)
2.4 Visualize kinematic variables¶
We now have a phase space sample and an intensity-based sample. Their data
structure isn’t the most informative though: it’s just a collection of
four-momentum tuples. However, we can use the Kinematics
class to convert
those four-momentum tuples to a data set of kinematic variables.
Now we can use the Kinematics.convert()
method to convert the phase
space and data samples of four-momentum tuples to kinematic variables.
[7]:
phsp_set = kin.convert(phsp_sample)
data_set = kin.convert(data_sample)
list(data_set.keys())
[7]:
['mSq_2_3_4',
'mSq_3_4',
'mSq_2',
'mSq_3',
'mSq_4',
'theta+3_4+2',
'phi+3_4+2',
'theta+3+4_vs_2',
'phi+3+4_vs_2']
The data set is just a dict
of kinematic variables (keys are the names,
values is a list of computed values for each event). The numbers you see here
are final state IDs as defined in the YAML recipe file.
The format of this data set dict
allows us to easily convert it to a
pandas.DataFrame
and plot its content in the form of a histogram:
[8]:
import pandas as pd
data_frame = pd.DataFrame(data_set)
phsp_frame = pd.DataFrame(phsp_set)
data_frame
[8]:
mSq_2_3_4 | mSq_3_4 | mSq_2 | mSq_3 | mSq_4 | theta+3_4+2 | phi+3_4+2 | theta+3+4_vs_2 | phi+3+4_vs_2 | |
---|---|---|---|---|---|---|---|---|---|
0 | 9.59079 | 2.177779 | 0.0 | 0.018219 | 0.018219 | 2.235770 | 2.233095 | 2.667842 | -2.262883 |
1 | 9.59079 | 0.926526 | 0.0 | 0.018219 | 0.018219 | 1.643615 | 2.534691 | 0.555462 | -1.843396 |
2 | 9.59079 | 2.234231 | 0.0 | 0.018219 | 0.018219 | 2.508784 | -0.713266 | 2.602192 | -1.401680 |
3 | 9.59079 | 2.368775 | 0.0 | 0.018219 | 0.018219 | 1.358385 | 1.280403 | 1.550725 | -0.283965 |
4 | 9.59079 | 2.804714 | 0.0 | 0.018219 | 0.018219 | 2.699407 | 2.404664 | 2.746580 | 2.064780 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
29995 | 9.59079 | 0.624491 | 0.0 | 0.018219 | 0.018219 | 1.076987 | -3.009379 | 1.871969 | -0.215016 |
29996 | 9.59079 | 2.331755 | 0.0 | 0.018219 | 0.018219 | 0.943062 | 0.577190 | 0.872151 | -2.718025 |
29997 | 9.59079 | 1.719634 | 0.0 | 0.018219 | 0.018219 | 1.066655 | -0.749248 | 1.906211 | 0.170694 |
29998 | 9.59079 | 2.553971 | 0.0 | 0.018219 | 0.018219 | 1.111827 | -2.476996 | 0.384796 | -0.601401 |
29999 | 9.59079 | 1.072305 | 0.0 | 0.018219 | 0.018219 | 2.753851 | 2.361013 | 1.740276 | 1.062023 |
30000 rows × 9 columns
[9]:
phsp_frame["mSq_3_4"].hist(bins=100)
[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fefe833ff60>
[10]:
data_frame["mSq_3_4"].hist(bins=100, alpha=0.5, density=True)
[10]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fefda3a7e10>
2.5 Export data sets¶
tensorwaves
currently has no export functionality for data samples. However, as we work with numpy.ndarray
, it’s easy to just ‘pickle’ these data samples with numpy.save
:
[11]:
import numpy as np
np.save("data_sample", data_sample)
np.save("phsp_sample", phsp_sample)
In the next step, we will illustrate how to ‘perform a fit’ with tensorwaves
by optimizing the intensity model to these data samples.