Benchmarking KD

We need to benchmark the following algorithms on three datasets (MNIST, CIFAR10, CIFAR100). This is so that we are sure that our implementations are fairly accurate on most datasets. 

We also need to ensure that the distillation works with a variety of student networks. @Het-Shah has suggested that we report results on ResNet18, MobileNet v2 and ShuffleNet v2 as student networks. ResNet50 can be the teacher network for all the distillations.

- [ ] VanillaKD
- [ ] TAKD
- [ ] Noisy Teacher
- [ ] Attention
- [ ] BANN
- [ ] Bert2lstm
- [ ] RCO
- [ ] Messy Collab
- [ ] Soft Random
- [ ] CSKD
- [ ] DML
- [ ] Self-training
- [ ] Virtual Teacher
- [ ] RKD Loss
- [ ] KA/ProbShift
- [ ] KA/LabelSmoothReg

If you wish to work on any of the above algorithms, just mention the algorithms in the discussion.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Benchmarking KD #105

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Benchmarking KD #105

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions