4. Train ConvNN Model: Regression Method

In the following steps, you will:

  • Load the brown dwarf dataset used to train the ML models.

  • Prepare the X and y variables to deploy the trained ML models.

  • Visualize them for a few cases.

We will need the following modules from TelescopeML:

  • DataMaster: to prepare the synthetic brown dwarf dataset (splitting, scaling, and feature engineering).

  • DeepTrainer: to build and train the CNN models.

  • Predictor: to prepare the observational brown dwarf dataset and deploy the trained ML models.

  • StatVisAnalyzer: to provide statistical tests and plotting functions.

  • IO_utils: to provide functions to load the trained ML models.

from TelescopeML.DataMaster import *
from TelescopeML.DeepTrainer import *
from TelescopeML.Predictor import *
from TelescopeML.IO_utils import load_or_dump_trained_model_CNN
from TelescopeML.StatVisAnalyzer import *

3.1 Data Preparation

3.1.1 Load the Synthetic spectra - training dataset

We computed low-resolution spectra (\(R\)=200) using the Sonora-Bobcat atmospheric brown dwarf grid model for the spectral range \(\sim\)0.9-2.4 \(\mu\)m. PICASO, an open-source atmospheric radiative transfer Python package, was used to generate these datasets. The dataset encompasses 30,888 synthetic spectra (or instances or rows).

Each spectrum has 104 wavelengths (i.e., 0.897, 0.906, …, 2.512 μm) and 4 output atmospheric parameters:

  • gravity (log g)

  • temperature (Teff)

  • carbon-to-oxygen ratio (C/O)

  • metallicity ([M/H])

import os

__reference_data_path__ = os.getenv("TelescopeML_reference_data")

# Note: insert the directory of the reference_data if you get an error reading the reference data!!!
# __reference_data_path__ = 'INSERT_DIRECTORY_OF_reference_data'
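If the environment variable is not set, `os.getenv` returns `None`, and the later `os.path.join` call fails with a `TypeError` that does not point at the real cause. A minimal sketch of an early check is shown here; the helper name `resolve_reference_data_path` is hypothetical and not part of TelescopeML.

```python
import os


def resolve_reference_data_path(env_var="TelescopeML_reference_data"):
    """Return the directory stored in env_var, or raise a readable error.

    Hypothetical helper for illustration; it is not part of TelescopeML.
    """
    path = os.getenv(env_var)
    if path is None:
        raise EnvironmentError(
            f"Environment variable {env_var!r} is not set; "
            "set it or assign __reference_data_path__ by hand."
        )
    if not os.path.isdir(path):
        raise FileNotFoundError(f"{path!r} is not an existing directory.")
    return path
```

Calling `resolve_reference_data_path()` in place of the bare `os.getenv(...)` would surface a missing or mistyped directory before any file is read.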


Load the dataset and check a few instances

train_BD = pd.read_csv(os.path.join(__reference_data_path__,
                                    'browndwarf_R100_v4_newWL_v3.csv.bz2'), compression='bz2')
   gravity  temperature  c_o_ratio  metallicity         2.512         2.487         2.462         2.438         2.413         2.389  ...         0.981         0.971         0.962         0.952         0.943         0.933         0.924         0.915         0.906         0.897
0      5.0         1100       0.25         -1.0  9.103045e-08  1.181658e-07  1.307868e-07  1.269229e-07  1.159179e-07  8.925110e-08  ...  1.257751e-07  9.640859e-08  7.612550e-08  6.901364e-08  6.247359e-08  4.112384e-08  5.127995e-08  4.897355e-08  4.087795e-08  2.791689e-08
1      5.0         1100       0.25         -0.7  9.103045e-08  1.181658e-07  1.307868e-07  1.269229e-07  1.159179e-07  8.925110e-08  ...  1.257751e-07  9.640859e-08  7.612550e-08  6.901364e-08  6.247359e-08  4.112384e-08  5.127995e-08  4.897355e-08  4.087795e-08  2.791689e-08
2      5.0         1100       0.25         -0.5  9.103045e-08  1.181658e-07  1.307868e-07  1.269229e-07  1.159179e-07  8.925110e-08  ...  1.257751e-07  9.640859e-08  7.612550e-08  6.901364e-08  6.247359e-08  4.112384e-08  5.127995e-08  4.897355e-08  4.087795e-08  2.791689e-08
3      5.0         1100       0.25         -0.3  9.103045e-08  1.181658e-07  1.307868e-07  1.269229e-07  1.159179e-07  8.925110e-08  ...  1.257751e-07  9.640859e-08  7.612550e-08  6.901364e-08  6.247359e-08  4.112384e-08  5.127995e-08  4.897355e-08  4.087795e-08  2.791689e-08
4      5.0         1100       0.25          0.0  9.103045e-08  1.181658e-07  1.307868e-07  1.269229e-07  1.159179e-07  8.925110e-08  ...  1.257751e-07  9.640859e-08  7.612550e-08  6.901364e-08  6.247359e-08  4.112384e-08  5.127995e-08  4.897355e-08  4.087795e-08  2.791689e-08

5 rows × 108 columns

3.1.2 Check atmospheric parameters

  • gravity (log g)

  • temperature (Teff)

  • carbon-to-oxygen ratio (C/O)

  • metallicity ([M/H])

output_names = ['gravity', 'temperature', 'c_o_ratio', 'metallicity']
   gravity  temperature  c_o_ratio  metallicity
0      5.0         1100       0.25         -1.0
1      5.0         1100       0.25         -0.7
2      5.0         1100       0.25         -0.5
3      5.0         1100       0.25         -0.3
4      5.0         1100       0.25          0.0
wavelength_names = [item for item in train_BD.columns.to_list() if item not in output_names]
['2.512', '2.487', '2.462', '2.438', '2.413']
wavelength_values = [float(item) for item in wavelength_names]
[2.512, 2.487, 2.462, 2.438, 2.413, 2.389, 2.366, 2.342, 2.319, 2.296]
wl_synthetic = pd.read_csv(os.path.join(__reference_data_path__,
0 2.511960
1 2.486966
2 2.462220
3 2.437720
4 2.413464
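The wavelength grid is recovered from the column labels: every column that is not one of the four outputs is a wavelength stored as a string. A small sketch on a toy DataFrame follows; the flux values and the three-column grid are invented for illustration, and only the column-filtering logic mirrors the cells above.

```python
import pandas as pd

# Toy stand-in for train_BD: 4 output columns plus 3 flux columns (made-up values)
toy = pd.DataFrame({
    "gravity": [5.0], "temperature": [1100], "c_o_ratio": [0.25], "metallicity": [-1.0],
    "2.512": [9.1e-08], "2.487": [1.2e-07], "0.897": [2.8e-08],
})
output_names = ["gravity", "temperature", "c_o_ratio", "metallicity"]

# Column labels that are not outputs are wavelengths stored as strings
wavelength_names = [c for c in toy.columns if c not in output_names]
wavelength_values = [float(c) for c in wavelength_names]

# The columns run from long to short wavelengths, so reverse them
# when an ascending wavelength axis is wanted (e.g. for plotting)
ascending = wavelength_values[::-1]
```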

3.1.3 Prepare Inputs and outputs for ML models (X,y)

  • X: 104 wavelengths and their corresponding flux values

  • y: output variables: ‘gravity’, ‘temperature’, ‘c_o_ratio’, ‘metallicity’

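# Training variables: keep the 104 flux columns by dropping the four output columns
X = train_BD.drop(columns=['gravity', 'temperature', 'c_o_ratio', 'metallicity'])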

# Target/Output feature variables
y = train_BD[['gravity', 'c_o_ratio', 'metallicity', 'temperature', ]]

Log-transform the ‘temperature’ variable to reduce the skewness of the data, making its distribution more symmetric and closer to normal for the ML model

y.loc[:, 'temperature'] = np.log10(y['temperature'])
# check the output variables
   gravity  c_o_ratio  metallicity  temperature
0      5.0       0.25         -1.0     3.041393
1      5.0       0.25         -0.7     3.041393
2      5.0       0.25         -0.5     3.041393
3      5.0       0.25         -0.3     3.041393
4      5.0       0.25          0.0     3.041393
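The value 3.041393 in the table is log10(1100). The sketch below reproduces the transform on a toy DataFrame and shows the inverse step needed to report a temperature in kelvin; the second temperature (2000 K) is made up, and working on an explicit copy is a choice made here to keep the toy input unchanged.

```python
import numpy as np
import pandas as pd

# Toy targets; 1100 K comes from the table above, 2000 K is a made-up value
y_toy = pd.DataFrame({"gravity": [5.0, 5.0], "temperature": [1100.0, 2000.0]})

y_log = y_toy.copy()  # work on a copy so y_toy stays untouched
y_log["temperature"] = np.log10(y_log["temperature"])

# Values expressed in log space are mapped back to kelvin with 10**x
recovered = 10 ** y_log["temperature"]
```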

3.2 Build the CNN Model and Process the Data

Here we instantiate the DataProcessor class, imported from the DataMaster module, to prepare the datasets and hold the trained CNN (Convolutional Neural Network) for us:

  • Take the synthetic datasets

  • Process them, e.g.

    • Divide them into three sets: train, validation, and test sets

    • Scale y variables

    • Scale X variables

    • Create new features

3.2.1 Instantiate the DataProcessor class from the DataMaster module

data_processor = DataProcessor(

3.2.2 Split the dataset into train, validate and test sets
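The idea behind this step can be sketched with plain NumPy: shuffle the row indices once, then cut them into three disjoint parts. The function name, the 80/10/10 fractions, and the seed below are assumptions for illustration; they are not taken from TelescopeML, whose own splitting method is not shown here.

```python
import numpy as np


def split_indices(n_rows, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle row indices and cut them into train/validation/test parts.

    Hypothetical helper; fractions and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    n_val = int(round(n_rows * val_fraction))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return train_idx, val_idx, test_idx


# 30,888 is the number of synthetic spectra in the dataset
train_idx, val_idx, test_idx = split_indices(30888)
```

The same index arrays would be applied to both X and y so that each spectrum stays paired with its parameters.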


3.2.3 Standardize X Variables Row-wise

# Standardize the X features row-wise (each spectrum is scaled on its own)
            data = data_processor.X_train_standardized_rowwise[:, ::-1],
            title='Scaled main 104 Features',
            xlabel=r'Wavelength [$\mu$m]',
            ylabel='Scaled Values',
            fig_size=(18, 5),
            saved_file_name = 'Scaled_input_fluxes',
            __reference_data__ = __reference_data_path__,
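A minimal sketch of row-wise standardization, assuming it means that each spectrum is shifted and scaled by its own mean and standard deviation (this is how the attribute name `X_train_standardized_rowwise` reads; the library's implementation may differ in detail). The function name and the toy spectra are hypothetical.

```python
import numpy as np


def standardize_rowwise(X):
    """Scale each row (spectrum) to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    return (X - mean) / std


# Two made-up spectra with the same shape but a 100x difference in flux level
X_toy = np.array([[1.0, 2.0, 4.0, 3.0],
                  [100.0, 200.0, 400.0, 300.0]])
Z = standardize_rowwise(X_toy)
```

Because every row uses its own statistics, no information flows between the train, validation, and test sets in this step.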

3.2.4 Standardize y Variables Column-wise

# Standardize the y features using Standard Scaler
            data = data_processor.y_train_standardized_columnwise,
            title='Scaled main 104 Features',
            ylabel='Scaled Output Values',
            xticks_list=['', r'$\log g$', 'T$_{eff}$', 'C/O ratio', '[M/H]'],
            fig_size=(5, 5),
            saved_file_name = 'Scaled_output_parameters',
            __reference_data__ = __reference_data_path__,
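Column-wise standardization treats each output parameter separately. The sketch below assumes the usual practice of computing the mean and standard deviation on the training set only and reusing them for the other sets and for mapping predictions back to physical units. The function names and toy values are hypothetical, not TelescopeML's API.

```python
import numpy as np


def fit_columnwise(y_train):
    """Per-column mean and standard deviation, computed on the training set."""
    y_train = np.asarray(y_train, dtype=float)
    return y_train.mean(axis=0), y_train.std(axis=0)


def transform_columnwise(y, mean, std):
    return (np.asarray(y, dtype=float) - mean) / std


def inverse_columnwise(z, mean, std):
    """Undo the scaling, e.g. to express predictions in the original units."""
    return np.asarray(z, dtype=float) * std + mean


# Made-up targets with three columns
y_train_toy = np.array([[5.0, 0.25, -1.0], [4.0, 0.5, 0.0], [3.0, 1.0, 0.5]])
y_test_toy = np.array([[4.5, 0.7, -0.3]])

mean, std = fit_columnwise(y_train_toy)
z_train = transform_columnwise(y_train_toy, mean, std)
z_test = transform_columnwise(y_test_toy, mean, std)  # reuses training statistics
```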

3.2.5 Feature engineering: Take Min and Max of each row (BD spectra)

# train
data_processor.X_train_min = data_processor.X_train.min(axis=1)
data_processor.X_train_max = data_processor.X_train.max(axis=1)

# validation
data_processor.X_val_min = data_processor.X_val.min(axis=1)
data_processor.X_val_max = data_processor.X_val.max(axis=1)

# test
data_processor.X_test_min = data_processor.X_test.min(axis=1)
data_processor.X_test_max = data_processor.X_test.max(axis=1)
df_MinMax_train = pd.DataFrame((data_processor.X_train_min, data_processor.X_train_max)).T
df_MinMax_val = pd.DataFrame((data_processor.X_val_min, data_processor.X_val_max)).T
df_MinMax_test = pd.DataFrame((data_processor.X_test_min, data_processor.X_test_max)).T
df_MinMax_train.rename(columns={0:'min', 1:'max'}, inplace=True)
df_MinMax_val.rename(columns={0:'min', 1:'max'}, inplace=True)
df_MinMax_test.rename(columns={0:'min', 1:'max'}, inplace=True)
            min           max
0  8.265340e-12  3.445259e-08
1  8.080712e-22  8.397132e-14
2  2.734403e-07  8.632182e-06
3  4.414951e-16  3.373262e-10
4  3.722576e-07  6.859888e-06
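One way to see why the per-row minimum and maximum are useful extra inputs: if row-wise standardization uses each spectrum's own mean and standard deviation (an assumption about the scaling used here), two spectra that differ only by a constant factor become identical, and the absolute flux level is lost. The min and max values keep that information. The toy fluxes below are made up.

```python
import numpy as np

# Made-up fluxes: the second spectrum is the first one scaled by 1e-6
X_toy = np.array([[2.0e-07, 5.0e-07, 9.0e-07],
                  [2.0e-13, 5.0e-13, 9.0e-13]])

# Row-wise standardization (assumed form: per-row mean and standard deviation)
Z = (X_toy - X_toy.mean(axis=1, keepdims=True)) / X_toy.std(axis=1, keepdims=True)

# Per-row extrema, the same quantities as X_train_min and X_train_max above
row_min = X_toy.min(axis=1)
row_max = X_toy.max(axis=1)

# After standardization the two rows coincide ...
same_shape = np.allclose(Z[0], Z[1])
# ... while the min feature still separates them (relative comparison only)
different_level = not np.isclose(row_min[0], row_min[1], rtol=1e-3, atol=0.0)
```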

3.2.6 Scale Min Max features - ColumnWise

                                        X_train = df_MinMax_train.to_numpy(),
                                        X_val   = df_MinMax_val.to_numpy(),
                                        X_test  = df_MinMax_test.to_numpy(),
            data = data_processor.X_test_standardized_columnwise,
            title='Scaled Min Max Inputs - ColumnWise',
            ylabel='Scaled Output Values',
            xticks_list= ['','Min','Max'],
            fig_size=(5, 5),
            saved_file_name = 'Scaled_input_Min_Max_fluxes',
            __reference_data__ = __reference_data_path__,

3.3 Train CNN model

3.3.1 Instantiate the TrainRegressorCNN class from the DeepTrainer module

ML pipeline

train_cnn_model = TrainRegressorCNN(
            # input dataset: StandardScaled instances
            X1_train = data_processor.X_train_standardized_rowwise,
            X1_val   = data_processor.X_val_standardized_rowwise,
            X1_test  = data_processor.X_test_standardized_rowwise,

            # input dataset: Min Max of each instance
            X2_train = data_processor.X_train_standardized_columnwise,
            X2_val   = data_processor.X_val_standardized_columnwise,
            X2_test  = data_processor.X_test_standardized_columnwise,

            # 1st target
            y1_train = data_processor.y_train_standardized_columnwise[:,0],
            y1_val   = data_processor.y_val_standardized_columnwise[:,0],
            y1_test  = data_processor.y_test_standardized_columnwise[:,0],

            # 2nd target
            y2_train = data_processor.y_train_standardized_columnwise[:,1],
            y2_val   = data_processor.y_val_standardized_columnwise[:,1],
            y2_test  = data_processor.y_test_standardized_columnwise[:,1],

            # 3rd target
            y3_train = data_processor.y_train_standardized_columnwise[:,2],
            y3_val   = data_processor.y_val_standardized_columnwise[:,2],
            y3_test  = data_processor.y_test_standardized_columnwise[:,2],

            # 4th target
            y4_train = data_processor.y_train_standardized_columnwise[:,3],
            y4_val   = data_processor.y_val_standardized_columnwise[:,3],
            y4_test  = data_processor.y_test_standardized_columnwise[:,3],
            )

3.3.2 Define the Hyperparameters

hyperparameters = {
         'Conv__MaxPooling1D': 3,
         'Conv__NumberBlocks': 2,
         'Conv__NumberLayers': 3,
         'Conv__filters': 32,
         'Conv__kernel_size': 4,
         'FC1__NumberLayers': 3,
         'FC1__dropout': 0.0013358917126831819,
         'FC1__units': 256,
         'FC2__NumberBlocks': 1,
         'FC2__NumberLayers': 4,
         'FC2__dropout': 0.0018989744374361271,
         'FC2__units': 128,
         'lr': 0.00018890368162236508
}

3.3.3 Build a CNN model

train_cnn_model.build_model(config=hyperparameters, )
Model: "model"
 Layer (type)                Output Shape                 Param #   Connected to
 input_1 (InputLayer)        [(None, 104, 1)]             0         []

 Conv__B1_L1 (Conv1D)        (None, 104, 32)              160       ['input_1[0][0]']

 Conv__B1_L2 (Conv1D)        (None, 104, 128)             16512     ['Conv__B1_L1[0][0]']

 Conv__B1_L3 (Conv1D)        (None, 104, 288)             147744    ['Conv__B1_L2[0][0]']

 Conv__B1__MaxPooling1D (Ma  (None, 34, 288)              0         ['Conv__B1_L3[0][0]']

 Conv__B2_L1 (Conv1D)        (None, 34, 128)              147584    ['Conv__B1__MaxPooling1D[0][0]

 Conv__B2_L2 (Conv1D)        (None, 34, 288)              147744    ['Conv__B2_L1[0][0]']

 Conv__B2_L3 (Conv1D)        (None, 34, 512)              590336    ['Conv__B2_L2[0][0]']

 Conv__B2__MaxPooling1D (Ma  (None, 11, 512)              0         ['Conv__B2_L3[0][0]']

 flatten (Flatten)           (None, 5632)                 0         ['Conv__B2__MaxPooling1D[0][0]

 FC1__B1_L1 (Dense)          (None, 256)                  1442048   ['flatten[0][0]']

 FC1__B1_L2 (Dense)          (None, 1024)                 263168    ['FC1__B1_L1[0][0]']

 FC1__B1_L3 (Dense)          (None, 2304)                 2361600   ['FC1__B1_L2[0][0]']

 FC1__B1_L3__Dropout (Dropo  (None, 2304)                 0         ['FC1__B1_L3[0][0]']

 input_2 (InputLayer)        [(None, 2)]                  0         []

 Concatenated_Layer (Concat  (None, 2306)                 0         ['FC1__B1_L3__Dropout[0][0]',
 enate)                                                              'input_2[0][0]']

 FC2__B1_L1 (Dense)          (None, 128)                  295296    ['Concatenated_Layer[0][0]']

 FC2__B1_L2 (Dense)          (None, 512)                  66048     ['FC2__B1_L1[0][0]']

 FC2__B1_L3 (Dense)          (None, 1152)                 590976    ['FC2__B1_L2[0][0]']

 FC2__B1_L4 (Dense)          (None, 2048)                 2361344   ['FC2__B1_L3[0][0]']

 FC2__B1_L4__Dropout (Dropo  (None, 2048)                 0         ['FC2__B1_L4[0][0]']

 output__gravity (Dense)     (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

 output__c_o_ratio (Dense)   (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

 output__metallicity (Dense  (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

 output__temperature (Dense  (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

Total params: 8438756 (32.19 MB)
Trainable params: 8438756 (32.19 MB)
Non-trainable params: 0 (0.00 Byte)
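The parameter counts in the summary can be checked by hand. A Conv1D layer has kernel_size × input_channels × filters weights plus one bias per filter, and a Dense layer has inputs × units weights plus one bias per unit. The sketch below redoes this arithmetic with the shapes read off the table; the helper names are hypothetical, and the pooling step assumes non-overlapping windows of size 3, which is consistent with the lengths 104, 34, and 11 in the summary.

```python
def conv1d_params(in_channels, filters, kernel_size):
    """Weights plus biases of a Conv1D layer."""
    return kernel_size * in_channels * filters + filters


def dense_params(in_units, units):
    """Weights plus biases of a Dense layer."""
    return in_units * units + units


kernel = 4  # 'Conv__kernel_size' in the hyperparameters

# (input channels, filters) read off the Output Shape column
conv_layers = [(1, 32), (32, 128), (128, 288), (288, 128), (128, 288), (288, 512)]
conv_total = sum(conv1d_params(c_in, f, kernel) for c_in, f in conv_layers)

# Two pooling steps with window 3 shorten the sequence: 104 -> 34 -> 11
length = 104 // 3 // 3
flat = length * 512  # size of the flattened vector

fc1 = [(flat, 256), (256, 1024), (1024, 2304)]
# The concatenation adds the 2 min/max inputs to the 2304 FC1 outputs
fc2 = [(2304 + 2, 128), (128, 512), (512, 1152), (1152, 2048)]
heads = [(2048, 1)] * 4  # one single-unit output layer per parameter
dense_total = sum(dense_params(i, u) for i, u in fc1 + fc2 + heads)

total = conv_total + dense_total
```

Here `flat` evaluates to 5632 and `total` to 8438756, matching the "Total params" line of the summary.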

3.3.4 Train the CNN model using the datasets

history, model =  train_cnn_model.fit_cnn_model(batch_size=1000,
Epoch 1/2
26/26 [==============================] - 32s 1s/step - loss: 2.0709 - output__gravity_loss: 0.8436 - output__c_o_ratio_loss: 0.4558 - output__metallicity_loss: 0.4111 - output__temperature_loss: 0.3604 - output__gravity_mae: 1.2371 - output__c_o_ratio_mae: 0.8332 - output__metallicity_mae: 0.7632 - output__temperature_mae: 0.6080 - val_loss: 0.7365 - val_output__gravity_loss: 0.3041 - val_output__c_o_ratio_loss: 0.2354 - val_output__metallicity_loss: 0.1756 - val_output__temperature_loss: 0.0214 - val_output__gravity_mae: 0.6611 - val_output__c_o_ratio_mae: 0.5466 - val_output__metallicity_mae: 0.4765 - val_output__temperature_mae: 0.1522
Epoch 2/2
26/26 [==============================] - 31s 1s/step - loss: 0.5539 - output__gravity_loss: 0.2413 - output__c_o_ratio_loss: 0.1425 - output__metallicity_loss: 0.1498 - output__temperature_loss: 0.0202 - output__gravity_mae: 0.5749 - output__c_o_ratio_mae: 0.4192 - output__metallicity_mae: 0.4384 - output__temperature_mae: 0.1561 - val_loss: 0.4836 - val_output__gravity_loss: 0.1971 - val_output__c_o_ratio_loss: 0.0843 - val_output__metallicity_loss: 0.1858 - val_output__temperature_loss: 0.0164 - val_output__gravity_mae: 0.5207 - val_output__c_o_ratio_mae: 0.3192 - val_output__metallicity_mae: 0.4981 - val_output__temperature_mae: 0.1421
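For a Keras model with several outputs, the reported total loss is the sum of the per-output losses (weighted, with weights of 1 unless specified otherwise). The log above can be checked against this; the numbers below are copied from the two epochs, and the tolerance accounts for the 4-decimal rounding in the log.

```python
# Per-output losses (gravity, C/O, metallicity, temperature) and the total loss,
# copied from the two epochs logged above
epochs = {
    1: {"train": ([0.8436, 0.4558, 0.4111, 0.3604], 2.0709),
        "val":   ([0.3041, 0.2354, 0.1756, 0.0214], 0.7365)},
    2: {"train": ([0.2413, 0.1425, 0.1498, 0.0202], 0.5539),
        "val":   ([0.1971, 0.0843, 0.1858, 0.0164], 0.4836)},
}

differences = {}
for epoch, parts in epochs.items():
    for split, (per_output, reported) in parts.items():
        # Logged values are rounded, so compare with a small tolerance
        differences[(epoch, split)] = abs(sum(per_output) - reported)

all_consistent = all(diff < 5e-4 for diff in differences.values())
```

The log also shows 26 steps per epoch at a batch size of 1000, which implies a training set of between 25,001 and 26,000 spectra, assuming each epoch makes one full pass over the training data.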

3.3.5 Check the Trained CNN Architecture and Summary

Checking the architecture of a CNN and its summary is important because it provides insights into:

  • Model design and structure

  • Model complexity

  • Hyperparameter tuning

data_processor.trained_ML_model = model
data_processor.history = history

3.3.6 Trained Model Outcomes

Note: This is not the final trained model because the training budget is very low (only 2 epochs) and the batch size is very large (1000). For this reason, we add “_TEST” to the end of the model name, and we will not use this model in the following steps.

load_or_dump_trained_model_CNN( trained_model = data_processor,
                                                        load_or_dump = 'dump')
/usr/local/anaconda3/envs/dl2/lib/python3.9/site-packages/keras/src/engine/training.py:3079: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.

3.4 Check the performance of the Trained Model

3.4.1 Load the Saved Trained CNN Models

loaded_model, history = load_or_dump_trained_model_CNN(output_indicator='tuned_bohb_batch32_v3_1000epoch_out10',
                                                      load_or_dump = 'load')
train_cnn_model.trained_model = loaded_model
train_cnn_model.trained_model_history = history

3.4.2 Double-check the Trained CNN Architecture and Summary

                        # to_file="model.png",
Model: "model_1"
 Layer (type)                Output Shape                 Param #   Connected to
 input_3 (InputLayer)        [(None, 104, 1)]             0         []

 Conv__B1_L1 (Conv1D)        (None, 104, 32)              160       ['input_3[0][0]']

 Conv__B1_L2 (Conv1D)        (None, 104, 128)             16512     ['Conv__B1_L1[0][0]']

 Conv__B1_L3 (Conv1D)        (None, 104, 288)             147744    ['Conv__B1_L2[0][0]']

 Conv__B1__MaxPooling1D (Ma  (None, 34, 288)              0         ['Conv__B1_L3[0][0]']

 Conv__B2_L1 (Conv1D)        (None, 34, 128)              147584    ['Conv__B1__MaxPooling1D[0][0]

 Conv__B2_L2 (Conv1D)        (None, 34, 288)              147744    ['Conv__B2_L1[0][0]']

 Conv__B2_L3 (Conv1D)        (None, 34, 512)              590336    ['Conv__B2_L2[0][0]']

 Conv__B2__MaxPooling1D (Ma  (None, 11, 512)              0         ['Conv__B2_L3[0][0]']

 flatten_1 (Flatten)         (None, 5632)                 0         ['Conv__B2__MaxPooling1D[0][0]

 FC1__B1_L1 (Dense)          (None, 256)                  1442048   ['flatten_1[0][0]']

 FC1__B1_L2 (Dense)          (None, 1024)                 263168    ['FC1__B1_L1[0][0]']

 FC1__B1_L3 (Dense)          (None, 2304)                 2361600   ['FC1__B1_L2[0][0]']

 FC1__B1_L3__Dropout (Dropo  (None, 2304)                 0         ['FC1__B1_L3[0][0]']

 input_4 (InputLayer)        [(None, 2)]                  0         []

 Concatenated_Layer (Concat  (None, 2306)                 0         ['FC1__B1_L3__Dropout[0][0]',
 enate)                                                              'input_4[0][0]']

 FC2__B1_L1 (Dense)          (None, 128)                  295296    ['Concatenated_Layer[0][0]']

 FC2__B1_L2 (Dense)          (None, 512)                  66048     ['FC2__B1_L1[0][0]']

 FC2__B1_L3 (Dense)          (None, 1152)                 590976    ['FC2__B1_L2[0][0]']

 FC2__B1_L4 (Dense)          (None, 2048)                 2361344   ['FC2__B1_L3[0][0]']

 FC2__B1_L4__Dropout (Dropo  (None, 2048)                 0         ['FC2__B1_L4[0][0]']

 output__gravity (Dense)     (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

 output__c_o_ratio (Dense)   (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

 output__metallicity (Dense  (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

 output__temperature (Dense  (None, 1)                    2049      ['FC2__B1_L4__Dropout[0][0]']

Total params: 8438756 (32.19 MB)
Trainable params: 8438756 (32.19 MB)
Non-trainable params: 0 (0.00 Byte)

3.4.3 Check the training history through Loss metric

# train_cnn_model.trained_model_history
plot_ML_model_loss_bokeh(trained_ML_model_history = train_cnn_model.trained_model_history,
                title = 'Hyperparameter-Tuned CNN model')

3.4.4 Plot the Performance of the trained CNN models - Regression metrics

  • Plot predicted versus actual values as scatter plots for all parameters

  • Plot residual histograms (predicted - actual)

  • Report regression metrics: \(R^2\) and skewness for training and test sets
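The two metrics in the list can be written out in a few lines of NumPy. This is a sketch of the standard definitions (coefficient of determination, and the Fisher-Pearson skewness of the residuals), not the code used inside TelescopeML's plotting function; the function names and toy values are hypothetical.

```python
import numpy as np


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def skewness(values):
    """Fisher-Pearson coefficient: third central moment over std cubed."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5


# Made-up actual and predicted values for one parameter
actual = np.array([3.0, 4.0, 5.0, 6.0])
predicted = np.array([3.1, 3.9, 5.2, 5.8])

residuals = predicted - actual  # same sign convention as the histograms
r2 = r_squared(actual, predicted)
residual_skew = skewness(residuals)
```

An \(R^2\) close to 1 indicates that predictions track the true values, and a skewness near 0 indicates residuals that are roughly symmetric around their mean.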

i = 6
                    trained_ML_model = train_cnn_model.trained_model,
                    trained_DataProcessor = data_processor,
                    Xtrain = [data_processor.X_train_standardized_rowwise[::i],

                    Xtest  = [data_processor.X_test_standardized_rowwise[::i],

                    ytrain = data_processor.y_train_standardized_columnwise[::i],

                    ytest  = data_processor.y_test_standardized_columnwise[::i],

                    target_i = 4,

                    xy_top   = [0.05, 0.7],
                    xy_bottom= [0.05, 0.85],
                    __reference_data__ = __reference_data_path__,
                    __print_results__ = False,
                    __save_plots__ = True
131/131 [==============================] - 4s 30ms/step
17/17 [==============================] - 0s 27ms/step