Skip to content

Improve compare tool#440

Open
vinothk-master wants to merge 50 commits intomercedes-benz:mainfrom
vinothk-master:improve_compare_tool
Open

Improve compare tool#440
vinothk-master wants to merge 50 commits intomercedes-benz:mainfrom
vinothk-master:improve_compare_tool

Conversation

@vinothk-master
Copy link
Copy Markdown
Contributor

@vinothk-master vinothk-master commented Aug 13, 2025

Hi @kayoub5

I have made the below changes,

  • provide folder with multiple pdx files as argument
  • load all pdx files in folder, sort them alphabetically, compare databases pairwise
  • When multiple pdx files are analyzed, a summary of all changes might be helpful
  • calculate metrics (number of added services, number of changed services, number of removed services, total number of services per ecu variant, ...)
  • display metrics graphically

Thanks
Vinoth Kannan

vinothk-master and others added 30 commits February 22, 2025 17:19
New changes from parent repo
@andlaus
Copy link
Copy Markdown
Member

andlaus commented Aug 13, 2025

thanks for the contribution. @kakoeh: can you review this?

Copy link
Copy Markdown
Member

@andlaus andlaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks promising, but it still needs a bit of work. (I have not yet looked at the "meat" of the PR and I have some doubts about the usefulness of this feature as it is implemented by this PR (any opinions @kakoeh ?))

One thing I noticed is that the example PDX files get modified even though there is absolutely no reason why they should be (github does not allow this to be commented inline in reviews. The reason why this is there is that there are timestamps in PDX files, thus mksomersaultpdx.py creates a slightly different file every time it is called...)

Comment thread odxtools/cli/compare.py Outdated
summary_results = []
for i in range(len(pdx_files) - 1):
file_a = pdx_files[i]
file_b = pdx_files[i + 1]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not really make sense IMO: the desired order of files to be compared pairwise might not be lexicographical. Possibly the whole pdx_files list ought to be specified as command line parameters? (this would also remove the necessity of walking over the specified directory.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andlaus
Thanks for the suggestion!

Actually as per the comment from @kayoub5,
load all pdx files in folder, sort them alphabetically, compare databases pairwise

this is the condition, should we go with alphabetically or do we have to compare each and every file without sorting ?

Could you please provide me your input?

Thanks
Vinoth Kannan

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still the opportunity to pass multiple PDX files in a custom order with the command line parameter -db. From my perspective, -f FOLDER would be used when many files exists and / or typing the list of databases is considered too cumbersome. But @andlaus you're right, lexicographical order might not be desired. Assuming that the files in the folder are already sorted as wanted, pdx_files.sort() could be omitted.

Comment thread odxtools/cli/compare.py Outdated
Comment thread odxtools/cli/compare.py Outdated
Comment thread odxtools/cli/compare.py Outdated
rich_print("Please specify either a database or variant for a comparison")
#elif args.folder:
elif hasattr(args, "folder") and args.folder:
print("Now printing the pdx files in folder")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print("Now printing the pdx files in folder")
print(f"PDX files in folder {args.folder}: {','.join(pdx_files)}")

(you need to move this to line 743)

Comment thread odxtools/cli/compare.py Outdated
if args.output:
with open(args.output, "w") as f:
json.dump(summary_results, f, indent=4)
print_change_metrics(summary_results)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't there already a function which compares two complete PDX files? why not calling that here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andlaus

I have reused the compare_dianostic_layer() function to compare different pdx files

Suggested change
print_change_metrics(summary_results)
changes = task.compare_diagnostic_layers(layer_a, layer_b)

Thanks
Vinoth

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vinothk-master,
@andlaus is right, there is already a function which compares 2 PDX files: compare_databases()
Thus, db_changes = task.compare_databases(db_a, db_b) could be used and replace the loop for name in diagnostic_layer_names where the change metrics are calculated per diagnostic layer.
Instead of printing changes for each diagnostic layer individually with print_dl_metrics([layer_a, layer_b]), task.print_database_changes(db_changes) could then be used.

Copy link
Copy Markdown
Contributor Author

@vinothk-master vinothk-master Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kakoeh,

I have made the changes as mentioned above and included a new function called print_change_metrics to print the table, especially for the services added/changed/deleted.

Thanks
Vinoth

Comment thread odxtools/cli/compare.py Outdated
if dl.short_name in task.diagnostic_layer_names
]

# print("NAMES: ",task.diagnostic_layer_names)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andlaus
Removed the dead code.
Thanks
Vinoth

Copy link
Copy Markdown
Contributor Author

@vinothk-master vinothk-master left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Team

  • Remove the dead codes
  • Made the changes from hasattr to getattr
    Made the above changes
    Thanks
    Vinoth

Comment thread odxtools/cli/compare.py Outdated
Comment thread odxtools/cli/compare.py Outdated
Comment thread odxtools/cli/compare.py Outdated
if args.output:
with open(args.output, "w") as f:
json.dump(summary_results, f, indent=4)
print_change_metrics(summary_results)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andlaus

I have reused the compare_dianostic_layer() function to compare different pdx files

Suggested change
print_change_metrics(summary_results)
changes = task.compare_diagnostic_layers(layer_a, layer_b)

Thanks
Vinoth

Comment thread odxtools/cli/compare.py Outdated
summary_results = []
for i in range(len(pdx_files) - 1):
file_a = pdx_files[i]
file_b = pdx_files[i + 1]
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andlaus
Thanks for the suggestion!

Actually as per the comment from @kayoub5,
load all pdx files in folder, sort them alphabetically, compare databases pairwise

this is the condition, should we go with alphabetically or do we have to compare each and every file without sorting ?

Could you please provide me your input?

Thanks
Vinoth Kannan

Comment thread odxtools/cli/_print_utils.py Outdated
table.add_column("Services Changed", justify="right", style="yellow")
table.add_column("Services Deleted", justify="right", style="yellow")
table.add_column("Changed Parameters", justify="right", style="yellow")
#
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variants and Diag Layer refer to the same object, therefore I wouldn't print the comparison details for each Diag Layer. See suggestion for table design here.

Copy link
Copy Markdown
Contributor Author

@vinothk-master vinothk-master Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kakoeh,

Thanks for the input, and I have made the changes as per the requirement.

Thanks
Vinoth

Comment thread odxtools/cli/_print_utils.py Outdated
table.add_row(file_pair, str(m["diag_layer"]), str(m["diag_layer_type"]),
str(m["num_variants_added"]), str(m["num_variants_changed"]),
str(m["num_variants_deleted"]), str(m["num_new_services"]),
str(m["num_deleted_services"]), str(m["num_renamed_services"]),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the column headers, the order is Services Added, Services Changed, Services Deleted. Moreover, both renamed services and services with parameter changes should be characterized as Services Changed from my point of view.

Suggested change
str(m["num_deleted_services"]), str(m["num_renamed_services"]),
str(m["num_renamed_services"]+m["num_changed_parameters"]),
str(m["num_deleted_services"]))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kakoeh
Thanks for the input and I have made the changes as per the requirement.

Thanks
Vinoth

Comment thread odxtools/cli/compare.py Outdated
Comment thread odxtools/cli/compare.py Outdated
Comment thread odxtools/cli/compare.py Outdated
Comment thread odxtools/cli/compare.py Outdated
except Exception:
old_names = [
item for sublist in changes.changed_name_of_service for item in sublist
]

This comment was marked as outdated.

Comment thread odxtools/cli/compare.py Outdated
getattr(changes, "changed_parameters_of_service", []) or [])

# collect which services had parameter changes (unique)
for param_detail in getattr(changes, "changed_parameters_of_service", []) or []:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for param_detail in getattr(changes, "changed_parameters_of_service", []) or []:
for param_detail in changes.changed_parameters_of_service:

Comment thread odxtools/cli/compare.py
"num_deleted_services": len(changes.deleted_services),
"num_renamed_services": len(changes.changed_name_of_service[0]),
"num_changed_parameters": len(changes.changed_parameters_of_service)
})

This comment was marked as outdated.

Comment thread odxtools/cli/compare.py Outdated
print(json.dumps(summary_results, indent=4))
if args.output:
with open(args.output, "w") as f:
json.dump(summary_results, f, indent=4)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea:
If the content of SpecsChangesVariants and ServiceDiff are mapped to the dictionary summary_results in an independent function, the changes could be printed to a json file as well if arguments -db or -v are given.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kakoeh
I haven't done the json implementation in this merge request. I felt that once the table output is printed as per the requirement we will move to this Json implementation. Could you please let me know will this work ?

Thanks
Vinoth Kannan

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vinothk-master
Sure, that's a good idea

Comment thread odxtools/cli/compare.py Outdated

print("New services:", services_b - services_a)
print("Deleted services:", services_a - services_b)
print_dl_metrics([layer_a, layer_b])

This comment was marked as outdated.

Comment thread pdx_files_folder/somersault.pdx Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @andlaus, based on the added code, there is no necessity to upload the files somersault.pdx, somersault_modified.pdx and somersault_modified_1.pdx. Maybe it's best to add *.pdx to your .gitignore file

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's best to add *.pdx to your .gitignore file

the problem with that is that .gitignore is version controlled, i.e., it is quite easy to accidentally commit such a change. that said, maybe PDX files in the odxtools source tree ought to be ignored by everyone, i.e. changes to the somersault pdx files would need to be explicitly staged using git add -f? opinions?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable

@andlaus
Copy link
Copy Markdown
Member

andlaus commented Aug 27, 2025

@vinothk-master: how do you intend to proceed with this PR? (if you want to continue working on it, it would IMO be great if you coordinated you efforts with @kakoeh and her work on #442 ...)

@vinothk-master
Copy link
Copy Markdown
Contributor Author

@vinothk-master: how do you intend to proceed with this PR? (if you want to continue working on it, it would IMO be great if you coordinated you efforts with @kakoeh and her work on #442 ...)

Hi @andlaus, I like to work on this PR, Let me cross check with #442 and provide my input.
Thanks
Vinoth Kannan

Copy link
Copy Markdown
Contributor

@kakoeh kakoeh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the points discussed below, the code needs some improvements. The current implementation yields
image
but

python -m odxtools compare .\examples\somersault.pdx -db .\examples\somersault_modified.pdx

shows that changes in diagnostic services appear in other diagnostic layers too.
According to the output of the database comparison and the discussion in #252, I would have expected an output such as
image

table.add_column(
"Services Changed", justify="center", style="yellow", no_wrap=False, max_width=10)
table.add_column(
"Services Deleted", justify="center", style="yellow", no_wrap=False, max_width=10)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since #442, each data object is associated with a separate color, e.g.:

  • rich table headers: bold cyan
  • file name: orange1
  • diagnostic layer name: green3
  • diagnostic layer type: medium_spring_green
  • numbers: yellow

It would be great to be consistent to this formatting here too.

Comment thread odxtools/cli/compare.py

else:
# no databases & no variants specified
rich_print("Please specify either a database or variant for a comparison")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, no information is printed if neither -db, -v nor -f were specified. To catch all combinations of optional arguments and reject superfluous ones, I suggest following structure of the if else block:

if args.database:

        if args.folder: 
            rich_print("Both options '-db' and '-f' were specified. Please select one of these options.")

        if args.variants:
            # filter considered diagnostic layers (task.diagnostic_layer_names)

        # compare specified databases

elif args.folder:

        if args.variants:
            # filter considered diagnostic layers (task.diagnostic_layer_names)

        # compare pdx files in folder
        
elif args.variants:
        # no databases or folder specified 
        # -> comparison of diagnostic layers in args.pdx_file

else:
        # no databases, variants or folder specified
        rich_print("Please specify either a database (-db DATABASE [DATABASE ...]), variant (-v VARIANT [VARIANT ...]) or folder (-f FOLDER) for a comparison")

Comment thread odxtools/cli/compare.py
# no databases & no variants specified
rich_print("Please specify either a database or variant for a comparison")
elif getattr(args, "folder", None):
pdx_files = []
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, args.pdx_file is not included in the comparison. Since this argument is mandatory, it does not make sense to skip this file, from my point of view.

Comment thread odxtools/cli/compare.py
summary_results: list[dict[str, int | str | None]] = []
file_a = pdx_files[pdx]
file_b = pdx_files[pdx + 1]
db_changes = task.compare_databases([load_file(file_a)][0], [load_file(file_b)][0])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the use of lists and indices?

Suggested change
db_changes = task.compare_databases([load_file(file_a)][0], [load_file(file_b)][0])
db_changes = task.compare_databases(load_file(file_a), load_file(file_b))

Comment thread odxtools/cli/compare.py
file_a = pdx_files[pdx]
file_b = pdx_files[pdx + 1]
db_changes = task.compare_databases([load_file(file_a)][0], [load_file(file_b)][0])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if db_changes is None:
continue

Performance can be enhanced when no changes are detected if the output of db_changes is checked. Moreover, the properties of db_changes can be annotated directly (e.g. db_changes.deleted_diagnostic_layers) as the function compare_databases returns either an object of type SpecsChangesVariants or None.

Comment thread odxtools/cli/compare.py

for layer in getattr(db_changes, "new_diagnostic_layers", []):
summary["new_layers"].append({
"short_name": getattr(layer, "short_name", None),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

short_name is a property of objects of type DiagLayer. Therefore lines 1026-1029 could be summarized to:

            for layer in db_changes.new_diagnostic_layers:
                summary["new_layers"].append(layer.short_name)

Comment thread odxtools/cli/compare.py
"services": [svc.short_name for svc in getattr(layer, "diag_comms_raw", [])],
}
summary["deleted_layers"].append(deleted_info)
summary["service_changes"] = getattr(db_changes, "service_changes", {})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db_changes has no attribute service_changes but changed_diagnostic_layers

Comment thread odxtools/cli/compare.py
"Services Deleted":
len(summary["deleted_layers"][0]["services"]
if summary["deleted_layers"] else []),
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems as if something has gotten mixed up here. The number of added / deleted services is not the same as the number of added / deleted diagnostic layers as services and diagnostic layers are different types of data objects. Please revise that section again.

@andlaus
Copy link
Copy Markdown
Member

andlaus commented Mar 6, 2026

@vinothk-master: do you intend to work on this in the not too distant future (address @kakoeh's comments) or should this PR be closed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants