Support embeddings with different sizes and improve evaluation script #14

Open
andreasvc wants to merge 4 commits into SapienzaNLP:master from andreasvc:master
Conversation

@andreasvc

No description provided.

- The attention() class had a hard-coded input dimension of 2048,
  which works for DeBERTa but not for other models such as mmbert;
  the input dimension is now specified with a parameter.
- Set the encoder model to training mode after loading, since
  from_pretrained() returns the model in evaluation mode by default:
  https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoModelForPreTraining.from_pretrained
- mmbert stores token embeddings in "self.encoder.tok_embeddings"
  instead of "self.encoder.word_embeddings"; both are now supported
- Suppress warnings about multiprocessing for data loaders; loading the
  data is not a bottleneck so the warnings are unnecessary
- Depending on the GPU, you may get warnings about setting matmul_precision.
  Added code to set matmul precision to medium, which seems like a good
  trade-off between speed and numerical precision (but your mileage may
  vary on different hardware).
- Disable evaluation on the test set during training. Apparently, the
  test evaluation code in the training loop has bugs and is not expected
  to work. It is therefore disabled, to avoid giving the impression that
  there is an actual issue with training the model.
  Evaluation is supposed to be performed with the evaluate.py script.
- Store the .conll output of the model, which is useful if you want to
  run other evaluation scripts on the output.
- Write output to a separate directory and use filenames of the form
  '{subset}_{modality}', e.g. 'test_output' and 'test_gold', to clearly
  indicate the type of file.
  The output is written to a directory based on the dataset:
  'experiments/xcore/myexperiment/wandb/run-2026{...}/files/{dataset}'
  A model can therefore be evaluated on multiple datasets.
- Pretty-print evaluation results.
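
For illustration, two of the changes above (the embedding lookup and the
output naming) could be sketched as follows. This is a minimal sketch, not
the code in this pull request: the helper names find_token_embeddings and
output_path, the lookup order, and the stand-in encoder objects are all
assumptions; only the attribute names and the '{subset}_{modality}' pattern
come from the description.

```python
from pathlib import Path
from types import SimpleNamespace

# Attribute names taken from the description; the lookup order is an assumption.
EMBEDDING_ATTRS = ("word_embeddings", "tok_embeddings")


def find_token_embeddings(encoder):
    """Return the first token-embedding attribute found on `encoder`.

    Hypothetical helper: tries each known attribute name in turn and
    raises AttributeError if the encoder has none of them.
    """
    for name in EMBEDDING_ATTRS:
        embeddings = getattr(encoder, name, None)
        if embeddings is not None:
            return embeddings
    raise AttributeError(
        "encoder has none of the expected attributes: "
        + ", ".join(EMBEDDING_ATTRS)
    )


def output_path(run_files_dir, dataset, subset, modality):
    """Build '<run_files_dir>/<dataset>/<subset>_<modality>' (hypothetical helper)."""
    return Path(run_files_dir) / dataset / f"{subset}_{modality}"


if __name__ == "__main__":
    # Stand-ins for real encoders; a real model would hold embedding modules here.
    deberta_like = SimpleNamespace(word_embeddings="word embedding table")
    mmbert_like = SimpleNamespace(tok_embeddings="token embedding table")
    print(find_token_embeddings(deberta_like))
    print(find_token_embeddings(mmbert_like))
    print(output_path("files", "mydataset", "test", "gold"))
```

Keeping one output directory per dataset means the same subset and modality
names can be reused across datasets without overwriting each other.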