When diagnosing specific SQL queries, analyze the SQL plan nodes for these patterns:

- **File I/O efficiency**: Check scan/write node metrics for `files read`, `bytes read`, `files written`, and `bytes written`. Calculate the average file size (bytes divided by file count); small files (< 3 MB) are a common hidden bottleneck.
- **Join strategy**: Look for `SortMergeJoin` nodes where one input is significantly smaller than the other. These may benefit from broadcast hints or AQE tuning.
- **Broadcast sizing**: Check the `data size` metric on `BroadcastExchange` nodes. Broadcasts > 1 GB cause excessive memory pressure and network overhead.
- **Cross joins**: Identify `BroadcastNestedLoopJoin` or `CartesianProduct` nodes. Estimate the total rows processed from the input sizes (for a cross join, the product of the two input row counts); cross joins on large tables are extremely dangerous.
- **Filter complexity**: Inspect `Filter` node conditions. Very long conditions (> 1000 chars) built from large IN-lists or OR chains should be converted to joins.
- **Partition pruning**: For Delta Lake and Iceberg tables, verify that scan nodes show partition filters being applied. Full scans on partitioned tables waste I/O.
- **Partition sizing**: Check the stage task distribution for oversized partitions (> 5 GB). These cause OOM risk, long-tail tasks, and GC pressure.

Use `sql <exec-id>` for node-level metrics and `sql-plan <exec-id> --view final` for post-AQE plan structure.
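
The node-level checks above can be sketched as a small rule function. This is only an illustration: the `node` dict shape, the `check_node` helper, and the assumption that metric values are already parsed into plain integers are all hypothetical and not part of the tool's output format. The thresholds are the ones listed in the checklist.

```python
from __future__ import annotations

MB = 1024 ** 2
GB = 1024 ** 3

# Thresholds taken from the checklist above.
SMALL_FILE_BYTES = 3 * MB
MAX_BROADCAST_BYTES = 1 * GB
MAX_FILTER_CHARS = 1000


def average_file_size(total_bytes: int, file_count: int) -> float:
    """Average bytes per file; 0.0 when no files were touched."""
    return total_bytes / file_count if file_count else 0.0


def check_node(node: dict) -> list[str]:
    """Return findings for one plan node.

    `node` is a hypothetical dict of the form
    {"name": str, "metrics": {metric name: int}, "condition": str}.
    """
    findings = []
    name = node.get("name", "")
    metrics = node.get("metrics", {})

    # File I/O efficiency: average size of the files read by a scan.
    files = metrics.get("files read", 0)
    if files:
        avg = average_file_size(metrics.get("bytes read", 0), files)
        if avg < SMALL_FILE_BYTES:
            findings.append(f"{name}: small files (avg {avg / MB:.1f} MB)")

    # Broadcast sizing.
    if name.startswith("BroadcastExchange") and metrics.get("data size", 0) > MAX_BROADCAST_BYTES:
        findings.append(f"{name}: broadcast larger than 1 GB")

    # Filter complexity.
    if name.startswith("Filter") and len(node.get("condition", "")) > MAX_FILTER_CHARS:
        findings.append(f"{name}: filter condition longer than {MAX_FILTER_CHARS} chars")

    # Cross joins.
    if name in ("BroadcastNestedLoopJoin", "CartesianProduct"):
        findings.append(f"{name}: cross join, check input row counts")

    return findings
```

A scan that read 1000 files totalling 1000 MB would be flagged for small files, while a scan averaging 128 MB per file would produce no findings.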

## Lakehouse Awareness

When analyzing workloads on Delta Lake or Apache Iceberg tables:

### Delta Lake

- **OPTIMIZE**: Recommend `OPTIMIZE` for tables where scan metrics reveal a small-file problem.
- **Z-ORDER**: Check whether queries filter on the z-ordered columns; if they do not, the z-ordering provides no benefit.
- **Liquid Clustering**: On Databricks, check whether filters on the clustering keys are being applied in scans.
- **Full scans**: Flag scans on partitioned Delta tables that apply no partition filters.
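
As a rough sketch, the Delta Lake checks above reduce to comparing the columns a query filters on against the table's z-order and partition columns. The function name, its parameters, and the recommendation strings below are hypothetical; column names are assumed to match exactly (same case), and the 3 MB small-file threshold is borrowed from the plan-node checklist earlier in this document.

```python
from __future__ import annotations

MB = 1024 ** 2
SMALL_FILE_BYTES = 3 * MB  # small-file threshold from the plan-node checklist


def delta_recommendations(
    table: str,
    avg_file_bytes: float,
    zorder_columns: list[str],
    filter_columns: list[str],
    partition_columns: list[str],
) -> list[str]:
    """Return Delta Lake recommendations for one scanned table (illustrative only)."""
    recs = []
    filters = set(filter_columns)

    # Small files detected in scan metrics: suggest compaction.
    if avg_file_bytes < SMALL_FILE_BYTES:
        recs.append(f"OPTIMIZE {table}")

    # Z-ordering only helps when queries filter on a z-ordered column.
    if zorder_columns and not filters & set(zorder_columns):
        recs.append("z-ordering unused: no filter on " + ", ".join(zorder_columns))

    # A partitioned table scanned without any partition filter is a full scan.
    if partition_columns and not filters & set(partition_columns):
        recs.append("full scan: no partition filter on " + ", ".join(partition_columns))

    return recs
```

The same set comparison could be reused for Liquid Clustering by passing the clustering keys in place of `zorder_columns`.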

### Apache Iceberg

- **Copy-on-Write overhead**: For update/delete workloads, check whether the volume of data files replaced far exceeds the number of records actually changed; this indicates COW overhead.
- **Merge-on-Read**: Recommend setting `write.merge.mode=merge-on-read` (and, where relevant, `write.update.mode` and `write.delete.mode`) for update-heavy tables.
- **Table maintenance**: Recommend `rewrite_data_files` for small-file compaction.
- **Bulk replace detection**: If > 60% of a table's files are replaced in a single operation, flag potential misuse.
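
The two ratio-based Iceberg checks above can be written out as follows. This is a minimal sketch under stated assumptions: the helper names and inputs are hypothetical, and the counts are assumed to come from snapshot or commit metrics you have already collected. Only the 60% bulk-replace threshold comes from the checklist; no threshold is given for Copy-on-Write overhead, so the second helper just reports a ratio for the analyst to judge.

```python
BULK_REPLACE_FRACTION = 0.60  # threshold from the checklist above


def is_bulk_replace(files_replaced: int, total_files: int) -> bool:
    """True when more than 60% of the table's files were replaced in one operation."""
    if total_files <= 0:
        return False
    return files_replaced / total_files > BULK_REPLACE_FRACTION


def write_amplification(records_rewritten: int, records_changed: int) -> float:
    """Records rewritten per record actually changed.

    `records_rewritten` is assumed to be the total record count of the
    replaced data files. A large ratio suggests Copy-on-Write overhead.
    """
    if records_changed == 0:
        return float("inf") if records_rewritten else 0.0
    return records_rewritten / records_changed
```

For example, rewriting files holding 1,000,000 records in order to change 100 of them gives a ratio of 10,000, which is the kind of pattern where merge-on-read is worth considering.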

### General Lakehouse Checks

- File sizes in scan/write metrics (target ~128 MB per file)
- Partition filter pushdown in scan nodes
- Table statistics availability for cost-based optimization
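
To make the file-size target concrete, the arithmetic can be sketched as below. The helper names are hypothetical; the only value taken from the checklist is the ~128 MB target, and the inputs are assumed to be total bytes and file count read from scan or write metrics.

```python
import math

MB = 1024 ** 2
TARGET_FILE_BYTES = 128 * MB  # target file size from the checklist above


def target_file_count(total_bytes: int) -> int:
    """Number of files the data would occupy at roughly 128 MB per file."""
    if total_bytes <= 0:
        return 0
    return math.ceil(total_bytes / TARGET_FILE_BYTES)


def file_size_report(total_bytes: int, file_count: int) -> dict:
    """Compare the observed file layout with the 128 MB target."""
    avg = total_bytes / file_count if file_count else 0.0
    return {
        "avg_file_mb": avg / MB,
        "file_count": file_count,
        "target_file_count": target_file_count(total_bytes),
    }
```

A table holding 10 GiB in 5000 files averages about 2 MB per file, whereas the target layout would be 80 files; a gap that large is a strong signal to recommend compaction.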