Conversation
```
set_params=set_params_batch,
save_outputs=False,
save_dpl=True,
dt=obj_fun_kwargs.get('dt', 0.025),
```
Ah shoot, I totally missed this! Definitely going to slow things down when optimizing with the full dt, but considering how much more steadily this converges, I think it's worth the wait :)
```
save_outputs=False,
save_dpl=True,
dt=obj_fun_kwargs.get('dt', 0.025),
n_trials=obj_fun_kwargs.get('n_trials', 1),
```
Looking at the code, I actually think this may work perfectly fine for n_trials:
- The BatchSimulate class can indeed do multi-trial simulations (though I suspect there may be a more efficient way to run it)
- The post-processing code down below does indeed average the dipoles over multiple trials
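The trial-averaging step the post-processing performs can be sketched with plain NumPy. The array below is a hypothetical stand-in for the per-trial dipole traces, not an hnn-core object:

```python
import numpy as np

# Hypothetical stand-ins for the dipole traces returned by the trials of
# one simulation: shape (n_trials, n_times).
trial_dipoles = np.array([
    [0.0, 1.0, 2.0],  # trial 1
    [0.0, 3.0, 4.0],  # trial 2
])

# Averaging across trials smooths the signal before the objective
# function compares it to the target dipole.
mean_dipole = trial_dipoles.mean(axis=0)
```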
```
n_jobs=50,
n_jobs=obj_fun_kwargs.get('n_jobs', 50),
combinations=False,
backend='loky',
```
@katduecker to clarify from a question earlier: with 'loky' set as the backend, each core (n_jobs) is assigned just one simulation.
I'll have to check, but I believe setting 'mpi' here would run every single simulation serially, just sped up by distributing each one over multiple cores.
We have previously observed severe diminishing returns on the speed-up from MPI for n_proc > 20, so I think treating the batch of simulations as embarrassingly parallel is definitely the right way to go in this case.
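The embarrassingly parallel dispatch pattern described above can be illustrated with the stdlib. The real code uses joblib's 'loky' process backend; this sketch uses ThreadPoolExecutor only to keep it dependency-free, and `run_sim` is a toy stand-in for a full network simulation:

```python
from concurrent.futures import ThreadPoolExecutor

def run_sim(params):
    # Toy stand-in for one full network simulation with one parameter set.
    return sum(params.values())

# One parameter set per member of the CMA-ES population.
population = [{'mu': float(i), 'sigma': 2.0} for i in range(8)]

# Each worker handles exactly one simulation at a time; there is no
# communication between simulations, so throughput scales with n_jobs.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_sim, population))
```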
I am not using MPI in my script, and 200 iterations with a population size of 250 still takes about 27 hours for 3 trials...
See script attached
opt_ERP_cma.py
```
for receptor in ['ampa', 'nmda']:
    net.external_drives[name][f'weights_{receptor}'] = {
        ct: 0.0 for ct in target_cell_types
    }
```
@katduecker can you help explain why this part is necessary?
It seems like all of these values would be overwritten by L707 below every time this function is called
Oh yeah, this is probably old. Does it save the external drives correctly when this is removed?
What I meant to do was to save the external drives in the right place, where users can find them.
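The reason the zero-initialization looks redundant can be shown with a toy dict (names here are illustrative, not the PR's actual variables):

```python
# Toy version of the drive-weights bookkeeping discussed above.
target_cell_types = ['L2_pyramidal', 'L5_pyramidal']

# Zero-init, analogous to the dict comprehension in the diff.
weights = {ct: 0.0 for ct in target_cell_types}

# If a later step always assigns every key (the overwrite referenced
# at L707), the zero-init has no observable effect and can be dropped.
for ct in target_cell_types:
    weights[ct] = 1.5  # stand-in for the downstream assignment
```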
```
dist_cell_type = ['L5_pyramidal', 'L2_pyramidal', 'L2_basket']
default_range = {'mu': (0, tstop), 'sigma': (0, 20), 'ampa': (-5, 1), 'nmda': (-5, 1)}
default_values = {'mu': tstop // 2, 'sigma': 2, 'ampa': -3, 'nmda': -3}
default_range = {'mu': (0, tstop), 'sigma': (0, 20), 'numspikes':(0,1), 'ampa': (-5, 1), 'nmda': (-5, 1)}
```
I think I might exclude numspikes as a default parameter to optimize
Optimization over discrete parameters is a notoriously hard problem and can severely degrade performance on the continuous parameters as well.
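If numspikes were kept, one common workaround is a continuous relaxation: let the optimizer search a continuous value in (0, 1) and map it onto the discrete grid only when building the network. This is a sketch of that idea, not the PR's code, and `decode_numspikes` is a hypothetical helper:

```python
import numpy as np

def decode_numspikes(x, lo=1, hi=5):
    # CMA-ES proposes a continuous x in [0, 1]; map it onto the discrete
    # grid {lo, ..., hi} only when instantiating the drive, so the
    # optimizer itself never sees the discreteness.
    return int(np.clip(np.floor(lo + x * (hi - lo + 1)), lo, hi))
```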
if the default range is (0,1), does that effectively make this a function which turns on/off certain drives?
I think the default range is overwritten anyway downstream. I optimized numspikes and it works really well. Would be nice to give users the option, I think.
```
# Very important this remains above 0.0
net.external_drives[name]['dynamics']['sigma'] = max(0.01, param_values[f'{name}_sigma'])
net.external_drives[name]['dynamics']['sigma'] = param_values[f'{name}_sigma']
net.external_drives[name]['dynamics']['numspikes'] = max(1, int(np.round(param_values[f'{name}_numspikes'])))
```
since the default range above is (0,1), it seems like this would always be numspikes=1?
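The concern can be checked directly with the same rounding/clamping expression as the diff line above (toy check, using plain NumPy rather than the network object):

```python
import numpy as np

def numspikes_from_param(x):
    # Mirrors max(1, int(np.round(...))) from the diff line above.
    return max(1, int(np.round(x)))

# Over the default range (0, 1), every proposed value rounds to 0 or 1,
# and the clamp then forces it up to 1.
values = {numspikes_from_param(x) for x in np.linspace(0.0, 1.0, 11)}
```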
@katduecker thanks for all these changes! I had a few clarifying questions, but on the whole I agree with all the updates and will go ahead and merge into the PR.

Very important comment on the speed: this is not intended to be used with MPI, and I think that is one reason why it's cripplingly slow. Fixing the code so that the

Note this was on an Oscar node with 64 CPU cores and 200 GB of RAM (this means that each of the 50 parameter sets in the population was run in parallel, 1 core per parameter set in the population).

Hey @ntolley this PR is related to my code review of your CMA PR.
I tried CMA on the new model and got it to work, but I changed a few things I noticed in the process:
I believe that the enormous speed-up you achieved was because of dt=0.5. Running an optimization of all drives with dt=0.025 and 5 trials at a time, I got just 3 iterations in 2 hours...
I still think the convergence is wonderful and the fits I got with dt=0.5 were great! Especially for a large parameter space over all drives this is super impressive!
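The dt effect is easy to quantify: the number of integration steps, and hence roughly the runtime per simulation, scales inversely with dt. A back-of-the-envelope check (tstop here is a hypothetical simulation length):

```python
tstop = 170.0  # ms; hypothetical simulation length

steps_coarse = tstop / 0.5    # dt = 0.5 ms
steps_full = tstop / 0.025    # dt = 0.025 ms

# The full-resolution run takes ~20x more timesteps per simulation,
# which lines up with the slowdown reported above.
ratio = steps_full / steps_coarse
```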
Here is a script I used to make sure this works as intended.
debug_cma.py
Hope this is useful!