Dataset for: Deep learning prediction of noise-driven nonlinear instabilities in fibre optics

Dataset

People

  • Yassin Boussafa (Creator)
  • Lynn Sader (Creator)
  • Van Thuy Hoang (Creator)
  • Bruno P. Chaves (Creator)
  • Alexis Bougaud (Creator)
  • Marc Fabert (Creator)
  • Alessandro Tonello (Creator)
  • John M. Dudley (Creator)
  • Michael Kues (Creator)
  • Benjamin Wetzel (Creator)

Research organisations

External organisations

  • XLIM UMR CNRS 7252
  • Universite de Limoges
  • Centre national de la recherche scientifique (CNRS)
  • Institute FEMTO-ST

Details

Date made available: 9 Apr. 2025
Publisher: Zenodo
Contact person: Michael Kues

Description

This dataset accompanies the study "Deep learning prediction of noise-driven nonlinear instabilities in fibre optics" and includes four curated datasets used to train and evaluate artificial neural networks (ANNs) for predicting spectral features resulting from modulation instability (MI) in nonlinear fibre propagation.
The datasets are:
1. Numerical – 2 seeds
GNLSE-based simulations with two coherent input seeds. Each seeding scenario (90 000 in total) includes:
- Input seed parameters (wavelengths and spectral phases)
- Output average spectra
- Output spectral correlation maps computed from 500 Monte Carlo realizations
2. Numerical – 4 seeds
Same as above, with four coherent seeds per scenario (105 000 in total). Spectral correlation maps are likewise computed from 500 GNLSE simulations per configuration.
3. Experimental – 2 seeds
Real-time DFT measurements of MI with two coherent input seeds. Each case includes:
- Input seed parameters (defined via programmable filtering)
- Output average spectra
- Output spectral correlation maps computed from 1000 sequential DFT traces
4. Experimental – 4 seeds
Same as above, using four coherent seeds. Spectral correlation maps are derived from 1000 DFT measurements per seeding configuration.
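For readers unfamiliar with spectral correlation maps, the sketch below shows one common way to obtain an average spectrum and a correlation map from an ensemble of single-shot spectra (Monte Carlo realizations or DFT traces). It assumes the map is the Pearson correlation of intensity fluctuations between every pair of wavelength bins, which is the usual definition in the MI literature; the exact procedure used for this dataset is the one described in the manuscript. The function name, array layout, and synthetic input are illustrative assumptions, not part of the dataset.

```python
import numpy as np


def spectral_correlation_map(spectra):
    """Return (average spectrum, correlation map) for an ensemble of spectra.

    spectra: array of shape (n_shots, n_wavelengths) holding single-shot
    spectral intensities (hypothetical layout, not the dataset's file format).
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)                # ensemble-averaged spectrum
    fluct = spectra - mean                     # shot-to-shot fluctuations
    cov = fluct.T @ fluct / spectra.shape[0]   # covariance between wavelength bins
    std = np.sqrt(np.diag(cov))
    denom = np.outer(std, std)
    # Bins with no fluctuation have an undefined correlation; set them to 0 here.
    with np.errstate(invalid="ignore", divide="ignore"):
        rho = np.where(denom > 0, cov / denom, 0.0)
    return mean, rho


# Synthetic stand-in for 500 realizations on 64 wavelength bins.
rng = np.random.default_rng(0)
shots = rng.random((500, 64))
shots[:, 10] = shots[:, 5]        # bin 10 fully correlated with bin 5
shots[:, 20] = 1.0 - shots[:, 5]  # bin 20 fully anti-correlated with bin 5

mean_spectrum, rho = spectral_correlation_map(shots)
```

In this toy ensemble, `rho[5, 10]` is close to +1 and `rho[5, 20]` is close to -1, while unrelated bins give values near zero.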
Notes:
All data are provided in physical units prior to ANN standardization, ensuring transparency and compatibility with custom preprocessing pipelines.
The data provided are those used to train the networks presented in Figs. 3, 5, 6 and 7 of the main manuscript. Trace windowing and sampling were, however, performed as described in the "Methods" section of the manuscript, to keep the data size reasonable and compatible with ANN processing. Because of their large size, the full raw data (including all GNLSE realizations and unprocessed DFT traces) are available upon request.
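Since the data are distributed in physical units, users need to apply their own standardization before training. A minimal sketch of one generic option, per-feature z-scoring with statistics taken from the training split only, is given below. This is an assumed, commonly used scheme and not necessarily the one applied in the manuscript; the function names and the synthetic array are hypothetical.

```python
import numpy as np


def fit_standardizer(train):
    """Compute per-feature mean and standard deviation on the training split."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # leave constant features unscaled
    return mean, std


def standardize(x, mean, std):
    """Map physical units to zero-mean, unit-variance features."""
    return (x - mean) / std


def destandardize(z, mean, std):
    """Map network outputs back to physical units."""
    return z * std + mean


# Synthetic stand-in: 200 samples of 8 features in arbitrary physical units.
rng = np.random.default_rng(1)
train = rng.normal(loc=1550.0, scale=5.0, size=(200, 8))
train[:, 3] = 1550.0  # a constant feature, to exercise the zero-variance guard

mean, std = fit_standardizer(train)
z = standardize(train, mean, std)
restored = destandardize(z, mean, std)
```

Reusing the same `mean` and `std` for validation and test data avoids leaking information from those splits into training.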