The problem
Initial worksheet to client:
| field_collection_site | site_qualifier | sitenumberorname_orig |
|-----------------------+----------------+----------------------------------------------|
| CA-PLA-2907 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
| CA-PLA-2908 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
| CA-PLA-38? | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
| CA-YUB-5 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
Returned worksheet looks like:
| field_collection_site | site_qualifier | sitenumberorname_orig |
|-----------------------+----------------+----------------------------------------------|
| CA-PLA-2907 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
| CA-PLA-2908 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
| CA-PLA-38 | uncertain | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
| CA-YUB-5 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 |
*__returned_compiled looks like:
| field_collection_site | site_qualifier | sitenumberorname_orig | corrected |
|-----------------------+----------------+-----------------------------------------------+--------------------------------------|
| CA-PLA-2907 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | |
| CA-PLA-2908 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | |
| CA-PLA-38 | uncertain | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | field_collection_site|site_qualifier |
| CA-YUB-5 | | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | |
THEN, because *__corrections keeps only rows that actually make corrections, we get:
| field_collection_site | site_qualifier | sitenumberorname_orig | corrected |
|-----------------------+----------------+-----------------------------------------------+--------------------------------------|
| CA-PLA-38 | uncertain | CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | field_collection_site|site_qualifier |
*__base_job_cleaned merges "CA-PLA-38" in for all 4 rows in the base job (on which the worksheet is based):
| sitenumberorname_orig | field_collection_site | site_qualifier |
|----------------------------------------------+-----------------------+----------------|
| CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | CA-PLA-38 | uncertain |
| CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | CA-PLA-38 | uncertain |
| CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | CA-PLA-38 | uncertain |
| CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05 | CA-PLA-38 | uncertain |
This is the data returned in *__final, and merged into the migration project from there.
We have lost the three other site values.
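The chain above can be reproduced with plain Ruby hashes. This is only a minimal sketch, not kiba-extend code: the variable names mirror the job names, and it assumes the merge in *__base_job_cleaned matches on sitenumberorname_orig alone, which is what the tables suggest.

```ruby
orig = "CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05"

# Base job: one row per exploded site value, all sharing the same original value.
base_job = ["CA-PLA-2907", "CA-PLA-2908", "CA-PLA-38?", "CA-YUB-5"].map do |site|
  {sitenumberorname_orig: orig, field_collection_site: site, site_qualifier: nil}
end

# *__returned_compiled: returned worksheet rows plus the :corrected field.
returned_compiled = [
  {field_collection_site: "CA-PLA-2907", site_qualifier: nil, corrected: nil},
  {field_collection_site: "CA-PLA-2908", site_qualifier: nil, corrected: nil},
  {field_collection_site: "CA-PLA-38", site_qualifier: "uncertain",
   corrected: "field_collection_site|site_qualifier"},
  {field_collection_site: "CA-YUB-5", site_qualifier: nil, corrected: nil}
].map { |row| row.merge(sitenumberorname_orig: orig) }

# *__corrections: keep only rows that actually make corrections.
corrections = returned_compiled.reject { |row| row[:corrected].nil? }

# *__base_job_cleaned: merge corrected fields into every base row whose
# original value matches (ASSUMED match key: :sitenumberorname_orig only).
lookup = corrections.each_with_object({}) do |row, acc|
  acc[row[:sitenumberorname_orig]] = row
end

base_job_cleaned = base_job.map do |row|
  fix = lookup[row[:sitenumberorname_orig]]
  next row unless fix

  fields = fix[:corrected].split("|").map(&:to_sym)
  row.merge(fix.slice(*fields))
end

sites = base_job_cleaned.map { |row| row[:field_collection_site] }
puts sites.inspect
```

Because all four base rows share one match key, the single surviving correction row is applied to each of them.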
Potential solutions
Make *__corrections return not just rows with corrections, but the most recent corrected row for the whole-row match
NOPE.
Doesn't change how corrections will get merged in based on the :corrected row by *__base_job_cleaned
Remind self whether the collate settings might be useful for this
todo
Redo the format of mod.base_job
todo
This is likely what will need to be done. This limitation and a workaround pattern need to be documented.
Fancy magic added to IterativeCleanup mixin
There's probably some way to automagically handle this or add a setting/mode to deal with it. But heck if I have time to dive into all that right now.
Remember: both the problem and any solution that avoids creating OTHER problems are complicated by the fact that additional iterations of a worksheet may be returned and must be handled as well.
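For the "Redo the format of mod.base_job" option, one possible shape is sketched below in plain Ruby. It is an untested idea, not the decided fix and not kiba-extend code: it assumes the base job can carry an untouched copy of each exploded value in a hypothetical :site_split_orig field, so that corrections match on the pair of original values rather than on sitenumberorname_orig alone.

```ruby
orig = "CA-PLA-2907|CA-PLA-2908|CA-PLA-38?|CA-YUB-05"

# Reformatted base job: :site_split_orig (hypothetical field) preserves the
# exploded value as first sent to the client, so each row has a unique key.
base_job = ["CA-PLA-2907", "CA-PLA-2908", "CA-PLA-38?", "CA-YUB-5"].map do |site|
  {sitenumberorname_orig: orig, site_split_orig: site,
   field_collection_site: site, site_qualifier: nil}
end

# Corrections as they might look if the worksheet kept :site_split_orig intact.
corrections = [
  {sitenumberorname_orig: orig, site_split_orig: "CA-PLA-38?",
   field_collection_site: "CA-PLA-38", site_qualifier: "uncertain",
   corrected: "field_collection_site|site_qualifier"}
]

# Compound match key (assumption): original value plus untouched split value.
key = ->(row) { row.values_at(:sitenumberorname_orig, :site_split_orig) }

# If several corrections share a key, the later one in the array wins here.
lookup = corrections.each_with_object({}) { |row, acc| acc[key.call(row)] = row }

cleaned = base_job.map do |row|
  fix = lookup[key.call(row)]
  next row unless fix

  fields = fix[:corrected].split("|").map(&:to_sym)
  row.merge(fix.slice(*fields))
end

cleaned_sites = cleaned.map { |row| row[:field_collection_site] }
puts cleaned_sites.inspect
```

Since :site_split_orig is never edited, a later worksheet iteration could still match the same base row. Whether the real jobs can preserve such a field across iterations is the open question.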