CMA code review#19

Merged
ntolley merged 2 commits into ntolley:cma from katduecker:cma_katduecker_update
Feb 18, 2026

Conversation

@katduecker

Hey @ntolley, this PR is related to my code review of your CMA PR.
I tried CMA on the new model and got it to work, but I changed a few things I noticed in the process:

  • I was getting a lot of values outside the pre-defined bounds (which I believe is also why you had to hard-code the 0.01 threshold for sigma when setting parameters). Added bounds.
  • The loss I got when simulating a dipole with the optimized network and comparing to the target dipole and the last value in optim.obj_ differed substantially. There were two reasons for that:
    1. obj_ was tracking the best loss over all iterations, but the best params of the last iteration were selected rather than the best params overall. This is related to Carolina's PR [WIP] COBYLA solver: Track best-performing parameters during optimization jonescompneurolab/hnn-core#1240. I am now selecting the best params overall.
    2. dt was set to 0.5 in the objective function, and the dipole is quite a bit different when using the default dt. I have added an option for the user to set dt.
  • The optimization would only run over 1 trial; I have updated the code to allow optimizing against an average over several trials.
  • As outlined in my code review, I don't think it's ideal that set_params_opt_drive didn't set the external_drives, because it will look like the synaptic weights of all drives are zero. I think this is really hard for users to interpret as we teach people to define and look for synaptic weights in external_drives. I have changed the code to update this.
  • IMO, setting the weight to 10**param is great because you can explore a wide space and learning seems to happen fast, but I think it's confusing how it is somewhat hidden in the code and not easy for users to spot. Could discuss at dev meeting?
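The best-params fix described above can be sketched as follows. This is a minimal, hypothetical loop (not hnn-core's actual Optimizer internals), with a toy quadratic `objective` standing in for the dipole loss and random sampling standing in for CMA's candidate generation:

```python
import numpy as np

def objective(params):
    # toy quadratic loss; stands in for the dipole RMSE/correlation objective
    return float(np.sum((params - 1.0) ** 2))

rng = np.random.default_rng(0)
best_loss, best_params = np.inf, None
for iteration in range(10):
    # stand-in for one CMA generation of 8 candidate parameter sets
    population = rng.normal(size=(8, 3))
    losses = [objective(p) for p in population]
    i = int(np.argmin(losses))
    # keep the best parameters seen over ALL iterations, not just the last one
    if losses[i] < best_loss:
        best_loss, best_params = losses[i], population[i].copy()
```

Without the `if` guard tracking `best_params` alongside `best_loss`, the recorded loss and the returned parameters can come from different iterations, which is exactly the mismatch between `optim.obj_` and the re-simulated loss described above.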

I believe that the enormous speed-up you achieved was because of dt=0.5. I am currently running an optimization of all drives with dt=0.025 and 5 trials at a time, and I got 3 iterations in 2 hours...
I still think the convergence is wonderful and the fits I got with dt=0.5 were great! Especially for a large parameter space over all drives this is super impressive!
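For a rough sense of why dt dominates the runtime: the number of integration steps per simulation scales as tstop / dt, so coarsening dt from 0.025 to 0.5 cuts the step count by 20x (a back-of-envelope sketch, assuming tstop=200 ms as used here):

```python
# Integration steps per simulation scale as tstop / dt.
tstop = 200.0                           # ms, as used in the scripts here
steps_coarse = round(tstop / 0.5)       # steps at dt=0.5
steps_default = round(tstop / 0.025)    # steps at the default dt=0.025
speedup = steps_default / steps_coarse  # 20x fewer steps per simulation
```

This is only the step count; actual wall-clock speed-up also depends on per-step cost, but it matches the order of magnitude of the slowdown reported here.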

Here is a script I used to make sure this works as intended.
debug_cma.py

Hope this is useful!

```python
set_params=set_params_batch,
save_outputs=False,
save_dpl=True,
dt=obj_fun_kwargs.get('dt', 0.025),
```
Owner


Ah shoot, I totally missed this! Definitely going to slow things down when optimizing with the full dt, but considering how much more steadily this converges I think it's worth the wait :)

```python
save_outputs=False,
save_dpl=True,
dt=obj_fun_kwargs.get('dt', 0.025),
n_trials=obj_fun_kwargs.get('n_trials', 1),
```
Owner


Looking at the code, I actually think this may work perfectly fine for n_trials, as:

  1. The BatchSimulate class can indeed do multi-trial simulations (but I suspect there may be a more efficient way to run it)
  2. The post-processing code down below does indeed average the dipoles over multiple trials
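The averaging in point 2 can be illustrated with plain arrays standing in for hnn-core Dipole objects (a sketch only; the real `average_dipoles` operates on Dipole objects, not raw arrays):

```python
import numpy as np

# Toy stand-ins for the dipole waveforms from n_trials simulations; the
# post-processing averages them before the loss is computed against the target.
n_trials, n_times = 5, 100
base = np.sin(np.linspace(0, 1, n_times))
trial_dpls = [base + 0.1 * t for t in range(n_trials)]  # trial-to-trial offsets
avg_dpl = np.mean(trial_dpls, axis=0)  # analogous to average_dipoles
```

Averaging before computing the loss means a single noisy trial cannot dominate the objective, which is the point of optimizing against multiple trials.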

```python
n_jobs=50,
n_jobs=obj_fun_kwargs.get('n_jobs', 50),
combinations=False,
backend='loky',
```
Owner


@katduecker to clarify from a question earlier, with 'loky' set as the backend then each core (n_jobs) is assigned just one simulation

I'll have to check, but I believe setting 'mpi' here would run every simulation serially, with each individual simulation sped up by being distributed over multiple cores

We have previously observed that there are severe diminishing returns on the speed given by MPI for n_proc > 20 so I think simulating batches as embarrassingly parallel is definitely the right way to go in this case
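The embarrassingly-parallel batch pattern described here (one independent simulation per worker, as with joblib's 'loky' backend) can be sketched with the stdlib `concurrent.futures` as a stand-in; `fake_simulation` is a hypothetical placeholder for one full network simulation:

```python
from concurrent.futures import ThreadPoolExecutor

def fake_simulation(param):
    # hypothetical stand-in for one hnn-core simulation with one parameter set
    return param ** 2

# Each worker handles one independent task, so throughput scales with the
# number of workers (n_jobs in the joblib case) rather than with per-simulation
# MPI parallelism, avoiding the diminishing returns noted above for n_proc > 20.
param_sets = list(range(8))
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_simulation, param_sets))
```

With a CMA population, each candidate parameter set is such an independent task, which is why one-core-per-simulation scales well here.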

Author


I am not using MPI in my script, and 200 iterations with a population size of 250 still takes about 27 hours for 3 trials...
See script attached
opt_ERP_cma.py

Comment on lines +693 to +696
```python
for receptor in ['ampa', 'nmda']:
    net.external_drives[name][f'weights_{receptor}'] = {
        ct: 0.0 for ct in target_cell_types
    }
```
Owner


@katduecker can you help explain why this part is necessary?

It seems like all of these values would be overwritten by L707 below every time this function is called

Author


Oh yeah, this is probably old. Does it save the external drives correctly when removing this?
What I meant to do was to save the external drives in the right place, where users can find them.

```python
dist_cell_type = ['L5_pyramidal', 'L2_pyramidal', 'L2_basket']
default_range = {'mu': (0, tstop), 'sigma': (0, 20), 'ampa': (-5, 1), 'nmda': (-5, 1)}
default_values = {'mu': tstop // 2, 'sigma': 2, 'ampa': -3, 'nmda': -3}
default_range = {'mu': (0, tstop), 'sigma': (0, 20), 'numspikes': (0, 1), 'ampa': (-5, 1), 'nmda': (-5, 1)}
```
Owner


I think I might exclude numspikes as a default parameter to optimize

Optimization over discrete parameters is a notoriously hard problem and can severely impact performance when mixed with continuous parameters

Owner


if the default range is (0,1), does that effectively make this a function which turns on/off certain drives?

Author


I think the default range is overwritten anyway downstream. I optimized numspikes and it works really well. Would be nice to give users the option, I think.

```python
# Very important this remains above 0.0
net.external_drives[name]['dynamics']['sigma'] = max(0.01, param_values[f'{name}_sigma'])
net.external_drives[name]['dynamics']['sigma'] = param_values[f'{name}_sigma']
net.external_drives[name]['dynamics']['numspikes'] = max(1, int(np.round(param_values[f'{name}_numspikes'])))
```
Owner


since the default range above is (0,1), it seems like this would always be numspikes=1?

Author


No, it's overwritten!

@ntolley ntolley merged commit 87e8e51 into ntolley:cma Feb 18, 2026
0 of 9 checks passed
@ntolley
Owner

ntolley commented Feb 18, 2026

@katduecker thanks for all these changes! I had a few clarifying questions but on the whole I agree with all the updates and will go ahead and merge into the PR

Very important comment on the speed: this is not intended to be used with MPI, and I think that is one reason why it's cripplingly slow.

Fixing the code so that the dt is correct (thanks for finding that bug!) definitely slows things down, but I was able to run your script with dt=0.025, ntrials=3, and popsize=50 for 3 iterations in about 10 minutes when I removed the MPI handler.

Note this was on an Oscar node with 64 CPU cores and 200 GB of RAM (this means each of the 50 parameter sets in the population was run in parallel, 1 core per parameter set in the population)

@ntolley
Owner

ntolley commented Feb 18, 2026

Here's the exact code I ran after modifying your script; the CMA outputs indicate a run time of ~9 minutes.

The runtime drops down to 1.5 minutes for dt=0.5, so I'm pretty confident the dt is being updated correctly.

```python
import numpy as np
import matplotlib.pyplot as plt
from hnn_core import jones_2009_model, simulate_dipole, JoblibBackend
from hnn_core.network_models import add_erp_drives_to_jones_model
from hnn_core.optimization import Optimizer, add_opt_drives, set_params_opt_drives
from hnn_core.dipole import _corr, _rmse, average_dipoles

tstop = 200
dt = 0.025  # Just used for testing, 0.025 is the default but will be slower
scaling_factor = 3000
smooth_win = 30

# Create target dipole
net_target = jones_2009_model()
add_erp_drives_to_jones_model(net_target)
target_dpl = simulate_dipole(net_target, tstop=tstop, dt=dt, verbose=False)
target_dpl = target_dpl[0].copy().smooth(smooth_win).scale(scaling_factor)

# Create base network with drives to be optimized
net_base = jones_2009_model()
net_base._verbose = False
constraints, initial_params = add_opt_drives(net_base, n_prox=2, n_dist=1)

# Run optimization
max_iter = 3
optim = Optimizer(net_base, tstop=tstop, constraints=constraints, solver='cma',
                  set_params=set_params_opt_drives, initial_params=initial_params,
                  max_iter=max_iter, obj_fun="dipole_corr")

popsize = 50
optim.fit(target=target_dpl, n_trials=5, scale_factor=scaling_factor,
          smooth_window_len=smooth_win, dt=dt, popsize=popsize)

opt_dpl = simulate_dipole(optim.net_, tstop=tstop, n_trials=5, dt=dt, verbose=False)
opt_dpl = [dpl.smooth(smooth_win).scale(scaling_factor) for dpl in opt_dpl]
avg_dpl = average_dipoles(opt_dpl)
obj = _corr(avg_dpl, target_dpl, tstop=tstop)
print(f'obj_ == corr: {np.isclose(optim.obj_[-1], obj)}')
print(f'obj_ = {optim.obj_[-1]}')
print(f'corr = {obj}')
```
