[bug report] Standalone model export: input arguments do not take effect #524

@alexab612

Environment: Python 3.7
EasyRec version: 0.8.5
TensorFlow version: 2.9
After asynchronous training finishes normally,
I run the export command:

   python3 -m easy_rec.python.export --pipeline_config_path ${config} --model_dir ${model_path} --export_dir ${export_model_path} --export_done_file EXPORT_DONE

It reports that the model path does not exist:

allow_soft_placement: true
, '_keep_checkpoint_max': 10, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x150495f6a278>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
[2025-03-12 00:00:00,000][INFO] check_mode: False 
Traceback (most recent call last):
  File "/home/apps/miniconda3/envs/py36_tf12_env/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/apps/miniconda3/envs/py36_tf12_env/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/apps/miniconda3/envs/py36_tf12_env/lib/python3.6/site-packages/easy_rec-0.8.5-py3.6.egg/easy_rec/python/export.py", line 150, in <module>
    tf.app.run()
  File "/home/apps/miniconda3/envs/py36_tf12_env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/apps/miniconda3/envs/py36_tf12_env/lib/python3.6/site-packages/easy_rec-0.8.5-py3.6.egg/easy_rec/python/export.py", line 140, in main
    FLAGS.verbose, **extra_params)
  File "/home/apps/miniconda3/envs/py36_tf12_env/lib/python3.6/site-packages/easy_rec-0.8.5-py3.6.egg/easy_rec/python/main.py", line 801, in export
    ckpt_path = _get_ckpt_path(pipeline_config, checkpoint_path)
  File "/home/apps/miniconda3/envs/py36_tf12_env/lib/python3.6/site-packages/easy_rec-0.8.5-py3.6.egg/easy_rec/python/main.py", line 271, in _get_ckpt_path
    % pipeline_config.model_dir
AssertionError: pipeline_config.model_dir(experiments/demo) does not exist
easy_rec version: 0.8.5

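According to the log, the variable update in export.py runs first.
However, when that step checks for the pipeline.config file in the model directory, the file is not there (I don't know why it is not generated under TF 2.9; under TF 1.12 it is generated). As a result, pipeline_config_path is not updated, and model_dir is not updated either.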
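
To make that failure mode concrete, here is a minimal, self-contained sketch of the fallback as described above. It is not EasyRec's actual code: `resolve_config_path` is a hypothetical helper and `train.config` is a made-up file name. Only the file name pipeline.config comes from this report.

```python
import os
import tempfile


def resolve_config_path(model_dir, pipeline_config_path):
    """Hypothetical helper: prefer <model_dir>/pipeline.config if it exists.

    Returns (config_path, model_dir_applied). When the file is missing, the
    original config path is kept and the model_dir override is silently lost.
    """
    if model_dir:
        candidate = os.path.join(model_dir, "pipeline.config")
        if os.path.exists(candidate):
            return candidate, True
    return pipeline_config_path, False


with tempfile.TemporaryDirectory() as model_dir:
    # No pipeline.config in the model directory (the TF 2.9 case reported here).
    missing = resolve_config_path(model_dir, "train.config")

    # pipeline.config present (the TF 1.12 case reported here).
    open(os.path.join(model_dir, "pipeline.config"), "w").close()
    expected_present_path = os.path.join(model_dir, "pipeline.config")
    present = resolve_config_path(model_dir, "train.config")

print(missing)      # the original path is kept, override not applied
print(present[1])   # True once pipeline.config exists
```

In the first call nothing signals that --model_dir was ignored, which would match the symptom in the traceback: the assertion still refers to the model_dir from the config file.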

Next, export.py updates the config once,
and then main.py updates it once more.
If the config is only updated in export.py, it gets reset in main.py.
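
The reset can be reproduced without EasyRec. The sketch below is only an illustration under assumptions: a JSON file stands in for the pipeline config, `load_config` and `export_from_path` are hypothetical names, and `/data/trained_model` is a made-up override value. The value `experiments/demo` is the model_dir shown in the assertion message above.

```python
import json
import os
import tempfile


def load_config(path):
    """Stand-in for parsing a pipeline config file (JSON here for simplicity)."""
    with open(path) as f:
        return json.load(f)


def export_from_path(config_path):
    """Hypothetical callee that always re-reads the config from disk."""
    config = load_config(config_path)
    return config["model_dir"]


with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "demo.config")
    with open(config_path, "w") as f:
        json.dump({"model_dir": "experiments/demo"}, f)

    # The caller loads the config and overrides model_dir in memory only.
    config = load_config(config_path)
    config["model_dir"] = "/data/trained_model"

    # The callee receives the path, reloads the file, and the override is gone.
    used_model_dir = export_from_path(config_path)

print(used_model_dir)
```

Because the callee is handed a path rather than the modified object, any in-memory change made by the caller cannot reach it.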

So three places have to be changed by hand:

  1. In export.py, after pipeline_config is loaded, add:
  if FLAGS.model_dir:
    pipeline_config.model_dir = FLAGS.model_dir
    logging.info('update model_dir to %s' % pipeline_config.model_dir)
  2. In export.py, where main.py's export is called, pass pipeline_config instead of pipeline_config_path:
  export_out_dir = export(FLAGS.export_dir, pipeline_config,
                          FLAGS.checkpoint_path, FLAGS.asset_files,
                          FLAGS.verbose, **extra_params)
  3. In main.py, comment out this line in the export function:
  pipeline_config = config_util.get_configs_from_pipeline_file(pipeline_config)
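
The three changes amount to "apply the override once, then pass the loaded object through". A possible variant, sketched here with stand-ins rather than EasyRec code, is to let the callee accept either a path or an already loaded config instead of commenting the reload out entirely. `load_config`, `export_with_config`, and `/data/trained_model` are hypothetical, and a JSON file stands in for the pipeline config.

```python
import json
import os
import tempfile


def load_config(path):
    """Stand-in for parsing a pipeline config file (JSON here for simplicity)."""
    with open(path) as f:
        return json.load(f)


def export_with_config(config_or_path):
    """Hypothetical callee: reload only when given a path, not a loaded config."""
    if isinstance(config_or_path, str):
        config = load_config(config_or_path)
    else:
        config = config_or_path
    return config["model_dir"]


with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "demo.config")
    with open(config_path, "w") as f:
        json.dump({"model_dir": "experiments/demo"}, f)

    # Change 1: apply the command-line override right after loading.
    config = load_config(config_path)
    model_dir_flag = "/data/trained_model"  # made-up value for --model_dir
    if model_dir_flag:
        config["model_dir"] = model_dir_flag

    # Change 2: pass the loaded object, so the override survives.
    from_object = export_with_config(config)
    # Passing the path still works, but sees only what is on disk.
    from_path = export_with_config(config_path)

print(from_object)
print(from_path)
```

This keeps existing callers that pass a path working, at the cost of one type check. Whether that is preferable to removing the reload is a judgment call for the maintainers.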
