
perf: split row groups by file range (morsel splitting)#23285

Draft
Dandandan wants to merge 3 commits into apache:main from Dandandan:split-row-groups-by-range

@Dandandan Dandandan commented Jul 1, 2026

Contributor

Which issue does this PR close?

  • N/A (perf improvement, no tracking issue)

Rationale for this change

Speed up Parquet scans of files that have few row groups, in particular fewer row groups than scan partitions.

Are these changes tested?

Yes

Are there any user-facing changes?

New config option `datafusion.execution.parquet.split_row_groups_by_range` (default `true`). With the default, partitions of a ranged Parquet scan that previously returned no rows (because their byte range contained no row group start) now return a share of the rows proportional to their byte range; the total scan output is unchanged.

🤖 Generated with Claude Code

…ans (morsel splitting)

When a parquet scan is restricted to a byte range that only partially
overlaps a row group, read the proportional slice of the row group's rows
(via a RowSelection) instead of assigning the whole row group to the one
range containing its first data page.

Since FileGroupPartitioner already tiles files into byte ranges, this lets
all partitions decode disjoint slices of the same row group in parallel,
parallelizing scans of files with fewer row groups than partitions (e.g. a
single large row group). Row boundaries are computed with identical integer
arithmetic on both sides of each range boundary, so every row is read
exactly once.
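
A minimal sketch of the boundary arithmetic described above, assuming a simple linear mapping from byte offsets to row indices. The PR's actual formula is not shown in this message, and `RowGroupSpan`, `row_boundary`, and `row_slice` are hypothetical names used only for illustration. The point it demonstrates is that when both neighbours of a range boundary call the same function on the same offset, their row slices meet exactly.

```rust
use std::ops::Range;

/// Hypothetical description of one row group: where its bytes start in the
/// file, how many bytes it spans, and how many rows it holds.
#[derive(Debug, Clone, Copy)]
pub struct RowGroupSpan {
    pub byte_start: u64,
    pub byte_len: u64,
    pub num_rows: u64,
}

/// Map a file byte offset to a row index inside the row group.
/// Offsets before the row group map to 0; offsets at or past its end map to
/// `num_rows`. This linear mapping is an assumption of the sketch.
pub fn row_boundary(rg: RowGroupSpan, offset: u64) -> u64 {
    if rg.byte_len == 0 {
        // Degenerate row group: all rows belong to the range containing its start.
        return if offset > rg.byte_start { rg.num_rows } else { 0 };
    }
    let rel = offset.saturating_sub(rg.byte_start).min(rg.byte_len);
    // Widen to u128 so `rel * num_rows` cannot overflow.
    ((rel as u128 * rg.num_rows as u128) / rg.byte_len as u128) as u64
}

/// Rows of `rg` assigned to the partition that owns `range` (half-open bytes).
/// Adjacent partitions share an offset, so they compute the same boundary row.
pub fn row_slice(rg: RowGroupSpan, range: Range<u64>) -> Range<u64> {
    row_boundary(rg, range.start)..row_boundary(rg, range.end)
}
```

For a row group of 10 rows spanning bytes 100..1100, the byte ranges 0..400, 400..800, and 800..1200 map to rows 0..3, 3..7, and 7..10, which together cover every row exactly once.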

The offset index is now loaded whenever the access plan contains row
selections so each partition fetches and decodes only the pages covering
its slice.
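
To illustrate why the offset index helps here: it records the first row index of every page, so a row slice can be translated into the pages that overlap it. The following is a sketch under that assumption, not the reader's actual code; `pages_covering` and its parameters are hypothetical.

```rust
use std::ops::Range;

/// Return the indices of the pages that overlap the half-open row range `rows`.
///
/// `first_rows` holds the first row index of each page in ascending order
/// (starting at 0), as an offset index would provide; `num_rows` is the row
/// count of the row group. Both inputs are assumptions of this sketch.
pub fn pages_covering(first_rows: &[u64], num_rows: u64, rows: Range<u64>) -> Range<usize> {
    let end_row = rows.end.min(num_rows);
    if first_rows.is_empty() || rows.start >= end_row {
        return 0..0; // nothing to read
    }
    // Last page whose first row is <= rows.start, i.e. the page holding rows.start.
    let first = first_rows
        .partition_point(|&r| r <= rows.start)
        .saturating_sub(1);
    // First page whose first row is >= end_row, i.e. one past the last overlapping page.
    let end = first_rows.partition_point(|&r| r < end_row);
    first..end
}
```

With pages starting at rows 0, 100, 200, and 300 in a 400-row row group, the slice 150..250 touches only pages 1 and 2, so a partition owning that slice can skip the other two pages.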

Controlled by `datafusion.execution.parquet.split_row_groups_by_range`
(default: true).

TPC-DS SF=1 with one row group per file: 2.2x faster overall (16.8s ->
7.5s), 82/99 queries faster, up to 5.3x. TPC-H SF=1 (multi-row-group
files): 1.16x faster overall, no significant regressions.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@github-actions github-actions Bot added the documentation, sqllogictest, common, proto, and datasource labels Jul 1, 2026
@Dandandan

Contributor Author

run benchmarks

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4860089109-783-fqfvk 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing split-row-groups-by-range (6d1a25a) to 3bb9314 (merge-base) diff using: tpch
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4860089109-781-6xlcg 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (6d1a25a) to 3bb9314 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4860089109-782-xkkgl 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (6d1a25a) to 3bb9314 (merge-base) diff using: tpcds
Results will be posted here when complete



@github-actions

github-actions Bot commented Jul 1, 2026


Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details
     Cloning apache/main
    Building datafusion-common v54.0.0 (current)
       Built [  37.233s] (current)
     Parsing datafusion-common v54.0.0 (current)
      Parsed [   0.064s] (current)
    Building datafusion-common v54.0.0 (baseline)
       Built [  35.700s] (baseline)
     Parsing datafusion-common v54.0.0 (baseline)
      Parsed [   0.064s] (baseline)
    Checking datafusion-common v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.691s] 223 checks: 222 pass, 1 fail, 0 warn, 30 skip

--- failure constructible_struct_adds_field: externally-constructible struct adds field ---

Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
        ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/constructible_struct_adds_field.ron

Failed in:
  field ParquetOptions.split_row_groups_by_range in /home/runner/work/datafusion/datafusion/datafusion/common/src/config.rs:1090

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  74.949s] datafusion-common
    Building datafusion-datasource v54.0.0 (current)
       Built [  37.948s] (current)
     Parsing datafusion-datasource v54.0.0 (current)
      Parsed [   0.032s] (current)
    Building datafusion-datasource v54.0.0 (baseline)
       Built [  37.829s] (baseline)
     Parsing datafusion-datasource v54.0.0 (baseline)
      Parsed [   0.034s] (baseline)
    Checking datafusion-datasource v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.264s] 223 checks: 223 pass, 30 skip
     Summary no semver update required
    Finished [  77.247s] datafusion-datasource
    Building datafusion-datasource-parquet v54.0.0 (current)
       Built [  44.027s] (current)
     Parsing datafusion-datasource-parquet v54.0.0 (current)
      Parsed [   0.032s] (current)
    Building datafusion-datasource-parquet v54.0.0 (baseline)
       Built [  43.836s] (baseline)
     Parsing datafusion-datasource-parquet v54.0.0 (baseline)
      Parsed [   0.032s] (baseline)
    Checking datafusion-datasource-parquet v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.147s] 223 checks: 223 pass, 30 skip
     Summary no semver update required
    Finished [  89.355s] datafusion-datasource-parquet
    Building datafusion-proto v54.0.0 (current)
       Built [  60.876s] (current)
     Parsing datafusion-proto v54.0.0 (current)
      Parsed [   0.019s] (current)
    Building datafusion-proto v54.0.0 (baseline)
       Built [  58.683s] (baseline)
     Parsing datafusion-proto v54.0.0 (baseline)
      Parsed [   0.019s] (baseline)
    Checking datafusion-proto v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.267s] 223 checks: 223 pass, 30 skip
     Summary no semver update required
    Finished [ 121.046s] datafusion-proto
    Building datafusion-proto-common v54.0.0 (current)
       Built [  21.026s] (current)
     Parsing datafusion-proto-common v54.0.0 (current)
      Parsed [   0.048s] (current)
    Building datafusion-proto-common v54.0.0 (baseline)
       Built [  20.882s] (baseline)
     Parsing datafusion-proto-common v54.0.0 (baseline)
      Parsed [   0.047s] (baseline)
    Checking datafusion-proto-common v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   1.139s] 223 checks: 222 pass, 1 fail, 0 warn, 30 skip

--- failure constructible_struct_adds_field: externally-constructible struct adds field ---

Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
        ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/constructible_struct_adds_field.ron

Failed in:
  field ParquetOptions.split_row_groups_by_range in /home/runner/work/datafusion/datafusion/datafusion/proto-common/src/generated/prost.rs:826
  field ParquetOptions.split_row_groups_by_range in /home/runner/work/datafusion/datafusion/datafusion/proto-common/src/generated/prost.rs:826
  field ParquetOptions.split_row_groups_by_range in /home/runner/work/datafusion/datafusion/datafusion/proto-common/src/generated/prost.rs:826

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  44.185s] datafusion-proto-common
    Building datafusion-proto-models v54.0.0 (current)
       Built [  23.768s] (current)
     Parsing datafusion-proto-models v54.0.0 (current)
      Parsed [   0.126s] (current)
    Building datafusion-proto-models v54.0.0 (baseline)
       Built [  23.983s] (baseline)
     Parsing datafusion-proto-models v54.0.0 (baseline)
      Parsed [   0.129s] (baseline)
    Checking datafusion-proto-models v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   1.741s] 223 checks: 222 pass, 1 fail, 0 warn, 30 skip

--- failure constructible_struct_adds_field: externally-constructible struct adds field ---

Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
        ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/constructible_struct_adds_field.ron

Failed in:
  field ParquetOptions.split_row_groups_by_range in /home/runner/work/datafusion/datafusion/datafusion/proto-models/src/generated/datafusion_proto_common.rs:826
  field ParquetOptions.split_row_groups_by_range in /home/runner/work/datafusion/datafusion/datafusion/proto-models/src/generated/datafusion_proto_common.rs:826

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  50.715s] datafusion-proto-models
    Building datafusion-sqllogictest v54.0.0 (current)
       Built [ 178.048s] (current)
     Parsing datafusion-sqllogictest v54.0.0 (current)
      Parsed [   0.022s] (current)
    Building datafusion-sqllogictest v54.0.0 (baseline)
       Built [ 180.504s] (baseline)
     Parsing datafusion-sqllogictest v54.0.0 (baseline)
      Parsed [   0.021s] (baseline)
    Checking datafusion-sqllogictest v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.095s] 223 checks: 223 pass, 30 skip
     Summary no semver update required
    Finished [ 361.702s] datafusion-sqllogictest
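
The `constructible_struct_adds_field` failures reported above can be illustrated with a small sketch. `ScanOptions` and `downstream_options` are hypothetical stand-ins, not the real `ParquetOptions`. A downstream struct literal that lists every field stops compiling once a public field is added, whereas a literal that ends in `..Default::default()` keeps compiling and picks up the new field's default.

```rust
/// Hypothetical stand-in for a public options struct with public fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ScanOptions {
    pub pushdown_filters: bool,
    /// Imagine this field was added in a later release.
    pub split_row_groups_by_range: bool,
}

impl Default for ScanOptions {
    fn default() -> Self {
        Self {
            pushdown_filters: false,
            split_row_groups_by_range: true,
        }
    }
}

/// Downstream code written before the new field existed.
/// The struct update syntax fills any field not named here from `Default`,
/// so this keeps compiling after a field is added. An exhaustive literal such
/// as `ScanOptions { pushdown_filters: true }` would be rejected for a
/// missing field.
pub fn downstream_options() -> ScanOptions {
    ScanOptions {
        pushdown_filters: true,
        ..Default::default()
    }
}
```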

@github-actions github-actions Bot added the auto detected api change label Jul 1, 2026
@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query     ┃                              HEAD ┃         split-row-groups-by-range ┃       Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1  │    40.50 / 41.46 ±1.18 / 43.78 ms │    38.37 / 39.47 ±1.10 / 41.26 ms │    no change │
│ QQuery 2  │    20.51 / 21.16 ±0.53 / 22.08 ms │    22.05 / 23.03 ±0.75 / 24.19 ms │ 1.09x slower │
│ QQuery 3  │    33.50 / 35.09 ±1.29 / 36.68 ms │    35.41 / 36.31 ±0.96 / 37.59 ms │    no change │
│ QQuery 4  │    19.16 / 19.67 ±0.56 / 20.67 ms │    19.24 / 19.60 ±0.62 / 20.83 ms │    no change │
│ QQuery 5  │    41.91 / 44.26 ±1.34 / 45.94 ms │    45.13 / 46.25 ±1.25 / 48.50 ms │    no change │
│ QQuery 6  │    17.45 / 17.65 ±0.12 / 17.79 ms │    17.88 / 18.49 ±0.60 / 19.63 ms │    no change │
│ QQuery 7  │    50.51 / 50.90 ±0.36 / 51.51 ms │    50.95 / 54.17 ±1.75 / 56.07 ms │ 1.06x slower │
│ QQuery 8  │    46.07 / 46.95 ±1.13 / 49.16 ms │    50.07 / 50.86 ±0.73 / 52.25 ms │ 1.08x slower │
│ QQuery 9  │    53.36 / 54.07 ±0.57 / 54.90 ms │    57.05 / 59.52 ±1.87 / 62.82 ms │ 1.10x slower │
│ QQuery 10 │    45.36 / 46.34 ±1.19 / 48.65 ms │    50.40 / 50.94 ±0.52 / 51.91 ms │ 1.10x slower │
│ QQuery 11 │    14.66 / 14.93 ±0.17 / 15.19 ms │    15.72 / 16.54 ±1.20 / 18.91 ms │ 1.11x slower │
│ QQuery 12 │    25.20 / 27.23 ±2.04 / 30.79 ms │    25.83 / 26.53 ±0.66 / 27.40 ms │    no change │
│ QQuery 13 │    33.54 / 35.28 ±1.41 / 37.13 ms │    31.69 / 34.96 ±2.08 / 38.07 ms │    no change │
│ QQuery 14 │    25.50 / 26.44 ±1.16 / 28.71 ms │    26.91 / 27.97 ±1.17 / 30.08 ms │ 1.06x slower │
│ QQuery 15 │    33.72 / 35.21 ±0.86 / 36.41 ms │    35.29 / 35.77 ±0.31 / 36.25 ms │    no change │
│ QQuery 16 │    15.29 / 15.73 ±0.53 / 16.69 ms │    14.98 / 15.28 ±0.29 / 15.82 ms │    no change │
│ QQuery 17 │ 100.22 / 103.92 ±2.82 / 108.46 ms │ 101.37 / 104.88 ±2.13 / 106.81 ms │    no change │
│ QQuery 18 │    74.98 / 76.42 ±1.22 / 78.41 ms │    78.47 / 79.75 ±1.18 / 81.65 ms │    no change │
│ QQuery 19 │    35.01 / 35.46 ±0.41 / 36.05 ms │    37.32 / 37.64 ±0.27 / 37.98 ms │ 1.06x slower │
│ QQuery 20 │    37.42 / 38.38 ±1.12 / 40.48 ms │    39.37 / 41.09 ±1.82 / 44.17 ms │ 1.07x slower │
│ QQuery 21 │    61.98 / 64.90 ±2.92 / 70.31 ms │    62.24 / 64.06 ±1.16 / 65.40 ms │    no change │
│ QQuery 22 │    15.23 / 15.36 ±0.10 / 15.52 ms │    17.72 / 18.02 ±0.21 / 18.38 ms │ 1.17x slower │
└───────────┴───────────────────────────────────┴───────────────────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                        ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 866.81ms │
│ Total Time (split-row-groups-by-range)   │ 901.12ms │
│ Average Time (HEAD)                      │  39.40ms │
│ Average Time (split-row-groups-by-range) │  40.96ms │
│ Queries Faster                           │        0 │
│ Queries Slower                           │       10 │
│ Queries with No Change                   │       12 │
│ Queries with Failure                     │        0 │
└──────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

Metric Value
Wall time 5.0s
Peak memory 1.2 GiB
Avg memory 685.4 MiB
CPU user 24.1s
CPU sys 1.7s
Peak spill 0 B

tpch — branch

Metric Value
Wall time 5.0s
Peak memory 1.1 GiB
Avg memory 704.4 MiB
CPU user 26.1s
CPU sys 2.0s
Peak spill 0 B


@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃             split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           5.59 / 6.13 ±0.79 / 7.68 ms │           5.60 / 6.10 ±0.83 / 7.76 ms │     no change │
│ QQuery 2  │        80.37 / 80.74 ±0.32 / 81.30 ms │        38.54 / 39.20 ±0.75 / 40.56 ms │ +2.06x faster │
│ QQuery 3  │        29.54 / 29.72 ±0.13 / 29.93 ms │        16.48 / 16.59 ±0.11 / 16.79 ms │ +1.79x faster │
│ QQuery 4  │     493.81 / 499.13 ±4.33 / 504.65 ms │     236.60 / 240.54 ±2.03 / 242.21 ms │ +2.08x faster │
│ QQuery 5  │        52.00 / 52.79 ±0.46 / 53.21 ms │        47.58 / 48.44 ±0.97 / 50.24 ms │ +1.09x faster │
│ QQuery 6  │        36.64 / 37.06 ±0.28 / 37.42 ms │        18.29 / 19.22 ±0.74 / 20.07 ms │ +1.93x faster │
│ QQuery 7  │        93.85 / 95.10 ±0.71 / 95.92 ms │        37.59 / 39.06 ±1.52 / 41.76 ms │ +2.43x faster │
│ QQuery 8  │        37.13 / 39.40 ±3.81 / 47.01 ms │        23.65 / 23.97 ±0.29 / 24.50 ms │ +1.64x faster │
│ QQuery 9  │        51.96 / 56.29 ±2.45 / 58.59 ms │        61.02 / 62.59 ±1.09 / 63.80 ms │  1.11x slower │
│ QQuery 10 │        63.21 / 63.57 ±0.29 / 64.02 ms │        34.88 / 35.77 ±0.67 / 36.58 ms │ +1.78x faster │
│ QQuery 11 │     303.96 / 309.05 ±4.30 / 316.34 ms │     122.91 / 125.65 ±3.57 / 132.60 ms │ +2.46x faster │
│ QQuery 12 │        28.44 / 28.96 ±0.27 / 29.17 ms │        13.11 / 13.69 ±1.05 / 15.78 ms │ +2.12x faster │
│ QQuery 13 │     119.03 / 120.00 ±0.89 / 121.19 ms │        55.16 / 57.56 ±3.86 / 65.23 ms │ +2.08x faster │
│ QQuery 14 │     414.54 / 417.91 ±2.27 / 421.02 ms │     236.50 / 240.37 ±3.67 / 247.25 ms │ +1.74x faster │
│ QQuery 15 │        58.18 / 59.13 ±0.61 / 60.04 ms │        16.41 / 17.18 ±0.53 / 18.00 ms │ +3.44x faster │
│ QQuery 16 │           6.61 / 6.82 ±0.20 / 7.19 ms │           6.77 / 6.96 ±0.27 / 7.50 ms │     no change │
│ QQuery 17 │        80.98 / 83.16 ±1.98 / 86.70 ms │        42.74 / 44.24 ±1.19 / 45.55 ms │ +1.88x faster │
│ QQuery 18 │     123.00 / 124.02 ±0.56 / 124.48 ms │        52.20 / 54.68 ±3.04 / 60.42 ms │ +2.27x faster │
│ QQuery 19 │        41.67 / 42.29 ±0.38 / 42.80 ms │        24.31 / 24.51 ±0.16 / 24.76 ms │ +1.73x faster │
│ QQuery 20 │        35.58 / 36.44 ±0.59 / 37.06 ms │        15.63 / 15.86 ±0.12 / 15.97 ms │ +2.30x faster │
│ QQuery 21 │        17.48 / 17.89 ±0.31 / 18.41 ms │        18.35 / 18.50 ±0.13 / 18.73 ms │     no change │
│ QQuery 22 │        62.95 / 65.77 ±3.30 / 71.90 ms │        60.18 / 61.45 ±1.43 / 63.99 ms │ +1.07x faster │
│ QQuery 23 │     355.63 / 360.30 ±4.94 / 367.23 ms │     220.30 / 225.60 ±4.08 / 231.31 ms │ +1.60x faster │
│ QQuery 24 │     227.28 / 229.83 ±2.34 / 233.50 ms │     136.18 / 139.55 ±3.22 / 145.19 ms │ +1.65x faster │
│ QQuery 25 │     110.23 / 111.48 ±0.96 / 112.61 ms │        54.01 / 55.98 ±1.60 / 57.93 ms │ +1.99x faster │
│ QQuery 26 │        58.26 / 58.73 ±0.47 / 59.41 ms │        24.79 / 25.28 ±0.26 / 25.56 ms │ +2.32x faster │
│ QQuery 27 │           6.34 / 6.48 ±0.15 / 6.77 ms │           6.36 / 6.54 ±0.21 / 6.95 ms │     no change │
│ QQuery 28 │        61.29 / 61.73 ±0.36 / 62.30 ms │        54.91 / 56.13 ±0.85 / 57.22 ms │ +1.10x faster │
│ QQuery 29 │      97.30 / 100.37 ±3.93 / 108.03 ms │        46.39 / 47.61 ±0.64 / 48.25 ms │ +2.11x faster │
│ QQuery 30 │        32.29 / 32.88 ±0.49 / 33.49 ms │        25.84 / 26.32 ±0.38 / 26.98 ms │ +1.25x faster │
│ QQuery 31 │     111.56 / 112.34 ±0.51 / 112.96 ms │        71.17 / 71.95 ±0.73 / 73.14 ms │ +1.56x faster │
│ QQuery 32 │        20.51 / 22.57 ±3.34 / 29.23 ms │        14.88 / 15.01 ±0.13 / 15.26 ms │ +1.50x faster │
│ QQuery 33 │        38.57 / 39.77 ±0.81 / 40.88 ms │        30.26 / 31.39 ±1.18 / 33.25 ms │ +1.27x faster │
│ QQuery 34 │         9.97 / 10.23 ±0.32 / 10.84 ms │           8.33 / 9.20 ±0.62 / 9.90 ms │ +1.11x faster │
│ QQuery 35 │        72.88 / 74.00 ±1.02 / 75.91 ms │        39.21 / 39.72 ±0.56 / 40.47 ms │ +1.86x faster │
│ QQuery 36 │           5.94 / 6.09 ±0.22 / 6.53 ms │           5.72 / 5.86 ±0.19 / 6.24 ms │     no change │
│ QQuery 37 │           7.21 / 7.26 ±0.04 / 7.32 ms │           6.90 / 7.04 ±0.09 / 7.16 ms │     no change │
│ QQuery 38 │        63.14 / 64.13 ±1.23 / 66.53 ms │        28.81 / 30.42 ±1.34 / 32.68 ms │ +2.11x faster │
│ QQuery 39 │        86.33 / 88.37 ±1.35 / 89.93 ms │        83.35 / 85.31 ±1.28 / 87.34 ms │     no change │
│ QQuery 40 │        23.58 / 23.73 ±0.14 / 23.95 ms │        16.40 / 16.70 ±0.17 / 16.87 ms │ +1.42x faster │
│ QQuery 41 │        11.36 / 11.63 ±0.20 / 11.99 ms │        11.63 / 11.74 ±0.17 / 12.07 ms │     no change │
│ QQuery 42 │        24.15 / 24.44 ±0.38 / 25.19 ms │        14.62 / 14.78 ±0.12 / 15.00 ms │ +1.65x faster │
│ QQuery 43 │           4.83 / 4.98 ±0.16 / 5.30 ms │           4.88 / 4.99 ±0.17 / 5.32 ms │     no change │
│ QQuery 44 │           9.23 / 9.38 ±0.13 / 9.60 ms │           9.25 / 9.43 ±0.13 / 9.59 ms │     no change │
│ QQuery 45 │        38.50 / 40.56 ±1.94 / 44.24 ms │        14.96 / 15.45 ±0.34 / 15.87 ms │ +2.63x faster │
│ QQuery 46 │        12.64 / 12.87 ±0.24 / 13.27 ms │        11.67 / 11.95 ±0.25 / 12.32 ms │ +1.08x faster │
│ QQuery 47 │     233.88 / 239.74 ±4.72 / 245.27 ms │     101.84 / 103.69 ±2.92 / 109.50 ms │ +2.31x faster │
│ QQuery 48 │        95.92 / 96.52 ±0.33 / 96.91 ms │        40.61 / 40.80 ±0.21 / 41.20 ms │ +2.37x faster │
│ QQuery 49 │        77.18 / 78.54 ±1.08 / 80.49 ms │        65.67 / 66.14 ±0.35 / 66.55 ms │ +1.19x faster │
│ QQuery 50 │        59.41 / 60.88 ±0.98 / 62.34 ms │        33.33 / 36.69 ±3.16 / 42.64 ms │ +1.66x faster │
│ QQuery 51 │       98.13 / 99.98 ±1.04 / 101.31 ms │        66.80 / 68.97 ±1.87 / 72.11 ms │ +1.45x faster │
│ QQuery 52 │        24.13 / 25.60 ±1.34 / 28.04 ms │        14.80 / 15.00 ±0.14 / 15.18 ms │ +1.71x faster │
│ QQuery 53 │        30.51 / 31.14 ±0.78 / 32.64 ms │        14.61 / 14.82 ±0.13 / 15.01 ms │ +2.10x faster │
│ QQuery 54 │        56.58 / 56.85 ±0.29 / 57.33 ms │        33.41 / 34.11 ±0.59 / 35.19 ms │ +1.67x faster │
│ QQuery 55 │        23.69 / 24.02 ±0.35 / 24.58 ms │        14.27 / 14.56 ±0.21 / 14.81 ms │ +1.65x faster │
│ QQuery 56 │        39.37 / 39.86 ±0.28 / 40.18 ms │        31.81 / 33.68 ±1.38 / 35.97 ms │ +1.18x faster │
│ QQuery 57 │     178.87 / 182.69 ±5.10 / 192.72 ms │        61.05 / 62.15 ±1.27 / 64.58 ms │ +2.94x faster │
│ QQuery 58 │     114.71 / 117.25 ±1.43 / 118.73 ms │        51.66 / 53.60 ±2.25 / 57.78 ms │ +2.19x faster │
│ QQuery 59 │     117.42 / 117.96 ±0.49 / 118.78 ms │        44.43 / 44.75 ±0.26 / 45.09 ms │ +2.64x faster │
│ QQuery 60 │        39.75 / 41.40 ±2.40 / 46.17 ms │        33.37 / 34.32 ±0.97 / 35.57 ms │ +1.21x faster │
│ QQuery 61 │        12.72 / 13.22 ±0.77 / 14.74 ms │        12.84 / 14.04 ±2.06 / 18.15 ms │  1.06x slower │
│ QQuery 62 │        46.80 / 47.17 ±0.25 / 47.59 ms │        13.57 / 14.18 ±0.97 / 16.11 ms │ +3.33x faster │
│ QQuery 63 │        30.25 / 30.53 ±0.25 / 30.87 ms │        14.86 / 15.04 ±0.10 / 15.16 ms │ +2.03x faster │
│ QQuery 64 │     412.33 / 415.83 ±3.08 / 420.45 ms │     239.94 / 243.36 ±2.08 / 246.14 ms │ +1.71x faster │
│ QQuery 65 │     151.52 / 153.66 ±1.76 / 156.12 ms │     155.76 / 163.16 ±4.39 / 168.12 ms │  1.06x slower │
│ QQuery 66 │        79.27 / 80.37 ±0.89 / 81.86 ms │        53.16 / 56.09 ±3.45 / 62.81 ms │ +1.43x faster │
│ QQuery 67 │     242.18 / 248.84 ±8.88 / 266.27 ms │     105.74 / 108.11 ±1.92 / 111.47 ms │ +2.30x faster │
│ QQuery 68 │        12.29 / 14.55 ±4.06 / 22.65 ms │        11.82 / 12.33 ±0.44 / 12.99 ms │ +1.18x faster │
│ QQuery 69 │        57.48 / 58.63 ±0.75 / 59.61 ms │        32.19 / 33.11 ±0.71 / 34.07 ms │ +1.77x faster │
│ QQuery 70 │     105.92 / 109.54 ±4.42 / 117.89 ms │        68.96 / 72.06 ±3.07 / 76.01 ms │ +1.52x faster │
│ QQuery 71 │        36.55 / 39.04 ±3.78 / 46.54 ms │        28.12 / 29.64 ±1.25 / 31.60 ms │ +1.32x faster │
│ QQuery 72 │ 2114.71 / 2197.65 ±55.04 / 2288.38 ms │ 2131.18 / 2176.88 ±30.18 / 2217.96 ms │     no change │
│ QQuery 73 │          9.46 / 9.83 ±0.33 / 10.38 ms │          8.70 / 9.48 ±0.71 / 10.60 ms │     no change │
│ QQuery 74 │     171.65 / 174.45 ±2.16 / 177.36 ms │        80.66 / 84.55 ±5.07 / 94.43 ms │ +2.06x faster │
│ QQuery 75 │     149.78 / 156.59 ±7.61 / 170.33 ms │     108.45 / 115.06 ±4.82 / 122.50 ms │ +1.36x faster │
│ QQuery 76 │        35.72 / 36.14 ±0.45 / 36.98 ms │        28.49 / 29.83 ±1.04 / 31.09 ms │ +1.21x faster │
│ QQuery 77 │        61.52 / 63.91 ±4.03 / 71.96 ms │        51.09 / 53.95 ±3.54 / 60.93 ms │ +1.18x faster │
│ QQuery 78 │     184.43 / 191.00 ±7.99 / 206.64 ms │        79.79 / 82.53 ±1.79 / 84.83 ms │ +2.31x faster │
│ QQuery 79 │        66.87 / 67.48 ±0.44 / 68.05 ms │        32.17 / 33.03 ±0.45 / 33.47 ms │ +2.04x faster │
│ QQuery 80 │      99.78 / 101.44 ±1.98 / 105.10 ms │        82.32 / 83.41 ±0.98 / 85.25 ms │ +1.22x faster │
│ QQuery 81 │        25.91 / 26.75 ±1.07 / 28.86 ms │        23.79 / 24.63 ±1.02 / 26.56 ms │ +1.09x faster │
│ QQuery 82 │        16.72 / 17.24 ±0.43 / 17.70 ms │        14.75 / 15.05 ±0.20 / 15.33 ms │ +1.15x faster │
│ QQuery 83 │        40.46 / 42.30 ±2.15 / 46.43 ms │        29.83 / 31.68 ±2.89 / 37.37 ms │ +1.33x faster │
│ QQuery 84 │        30.45 / 30.97 ±0.45 / 31.62 ms │        20.77 / 22.05 ±2.22 / 26.48 ms │ +1.40x faster │
│ QQuery 85 │     107.59 / 110.62 ±3.57 / 115.27 ms │        48.97 / 49.69 ±0.64 / 50.81 ms │ +2.23x faster │
│ QQuery 86 │        25.45 / 26.03 ±0.38 / 26.63 ms │        10.34 / 10.97 ±1.03 / 13.02 ms │ +2.37x faster │
│ QQuery 87 │        63.06 / 65.06 ±1.90 / 68.62 ms │        28.93 / 31.63 ±2.75 / 36.86 ms │ +2.06x faster │
│ QQuery 88 │        63.62 / 64.02 ±0.43 / 64.79 ms │        66.80 / 67.30 ±0.38 / 67.75 ms │  1.05x slower │
│ QQuery 89 │        36.23 / 38.68 ±2.83 / 43.73 ms │        17.38 / 18.27 ±1.50 / 21.27 ms │ +2.12x faster │
│ QQuery 90 │        17.16 / 17.38 ±0.24 / 17.85 ms │        10.77 / 11.03 ±0.19 / 11.33 ms │ +1.58x faster │
│ QQuery 91 │        46.51 / 47.08 ±0.50 / 47.99 ms │        27.68 / 29.07 ±1.89 / 32.63 ms │ +1.62x faster │
│ QQuery 92 │        30.18 / 30.89 ±0.61 / 32.03 ms │        15.53 / 15.91 ±0.22 / 16.18 ms │ +1.94x faster │
│ QQuery 93 │        50.65 / 51.03 ±0.35 / 51.69 ms │        24.01 / 26.15 ±1.66 / 28.92 ms │ +1.95x faster │
│ QQuery 94 │        38.84 / 40.54 ±2.62 / 45.76 ms │        20.59 / 21.17 ±0.67 / 22.41 ms │ +1.92x faster │
│ QQuery 95 │        83.36 / 84.55 ±1.56 / 87.51 ms │        47.35 / 49.49 ±2.29 / 53.67 ms │ +1.71x faster │
│ QQuery 96 │        24.59 / 24.75 ±0.13 / 24.97 ms │        10.42 / 10.68 ±0.26 / 11.18 ms │ +2.32x faster │
│ QQuery 97 │        55.34 / 55.96 ±0.66 / 56.96 ms │        21.38 / 21.90 ±0.35 / 22.45 ms │ +2.56x faster │
│ QQuery 98 │        43.15 / 45.24 ±1.84 / 48.68 ms │        22.11 / 23.28 ±1.55 / 26.31 ms │ +1.94x faster │
│ QQuery 99 │        70.31 / 71.49 ±1.34 / 73.84 ms │        17.56 / 18.20 ±1.19 / 20.58 ms │ +3.93x faster │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 10098.39ms │
│ Total Time (split-row-groups-by-range)   │  6727.35ms │
│ Average Time (HEAD)                      │   102.00ms │
│ Average Time (split-row-groups-by-range) │    67.95ms │
│ Queries Faster                           │         83 │
│ Queries Slower                           │          4 │
│ Queries with No Change                   │         12 │
│ Queries with Failure                     │          0 │
└──────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

Metric Value
Wall time 55.0s
Peak memory 2.2 GiB
Avg memory 1.5 GiB
CPU user 222.3s
CPU sys 6.2s
Peak spill 0 B

tpcds — branch

Metric Value
Wall time 35.0s
Peak memory 2.8 GiB
Avg memory 1.9 GiB
CPU user 256.8s
CPU sys 8.5s
Peak spill 0 B
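
As a rough cross-check of the TPC-DS figures above, the headline ratios can be recomputed from the posted totals. This is only a sketch: the numbers are copied from the summary and resource usage tables in this comment, and the variable names are illustrative.

```python
# Values copied from the TPC-DS summary and resource usage tables above.
head_total_ms = 10098.39
branch_total_ms = 6727.35

# Ratio of summed per-query times, and the share of time saved.
speedup = head_total_ms / branch_total_ms
time_saved_pct = (1 - branch_total_ms / head_total_ms) * 100

# Whole-run resource figures (base vs. branch).
wall_speedup = 55.0 / 35.0
cpu_user_increase_pct = (256.8 / 222.3 - 1) * 100
peak_mem_increase_pct = (2.8 / 2.2 - 1) * 100

# The faster / slower / no-change counts should cover every query.
faster, slower, unchanged = 83, 4, 12
total_queries = faster + slower + unchanged

print(f"query time speedup: {speedup:.2f}x ({time_saved_pct:.1f}% less time)")
print(f"wall time speedup:  {wall_speedup:.2f}x")
print(f"CPU user increase:  {cpu_user_increase_pct:.1f}%")
print(f"peak mem increase:  {peak_mem_increase_pct:.1f}%")
print(f"queries counted:    {total_queries}")
```

On these numbers the branch spends about a third less summed query time while using somewhat more user CPU time and peak memory, which would be consistent with more partitions decoding in parallel.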

File an issue against this benchmark runner

@Dandandan

Contributor Author

run benchmarks

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4860255722-788-g5kjh 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing split-row-groups-by-range (6d1a25a) to 3bb9314 (merge-base) diff using: tpcds
Results will be posted here when complete



@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4860255722-789-jq7mq 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (6d1a25a) to 3bb9314 (merge-base) diff using: tpch
Results will be posted here when complete



@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4860255722-787-z7t9d 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (6d1a25a) to 3bb9314 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                  split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.22 / 3.89 ±5.27 / 14.43 ms │               1.22 / 3.95 ±5.39 / 14.73 ms │     no change │
│ QQuery 1  │             13.07 / 13.17 ±0.08 / 13.29 ms │             12.85 / 13.04 ±0.13 / 13.22 ms │     no change │
│ QQuery 2  │             35.87 / 36.26 ±0.29 / 36.77 ms │             36.26 / 37.94 ±1.20 / 39.13 ms │     no change │
│ QQuery 3  │             30.42 / 31.35 ±0.73 / 32.25 ms │             30.72 / 31.49 ±0.53 / 32.37 ms │     no change │
│ QQuery 4  │      1668.51 / 1724.82 ±35.56 / 1774.02 ms │      1620.07 / 1714.27 ±66.44 / 1787.39 ms │     no change │
│ QQuery 5  │      1646.40 / 1807.81 ±85.17 / 1896.86 ms │     1588.72 / 1710.81 ±118.10 / 1936.05 ms │ +1.06x faster │
│ QQuery 6  │                1.27 / 1.46 ±0.23 / 1.90 ms │                1.29 / 1.45 ±0.23 / 1.91 ms │     no change │
│ QQuery 7  │             14.03 / 14.18 ±0.09 / 14.30 ms │             14.22 / 14.44 ±0.14 / 14.58 ms │     no change │
│ QQuery 8  │      1999.19 / 2027.09 ±24.63 / 2057.46 ms │      1943.29 / 2015.64 ±70.46 / 2142.54 ms │     no change │
│ QQuery 9  │         453.96 / 480.58 ±19.59 / 506.89 ms │         467.58 / 499.61 ±41.20 / 579.19 ms │     no change │
│ QQuery 10 │             76.41 / 78.10 ±1.20 / 79.59 ms │             76.82 / 79.85 ±1.67 / 81.48 ms │     no change │
│ QQuery 11 │          89.18 / 100.76 ±16.95 / 133.86 ms │            91.61 / 97.92 ±6.75 / 106.34 ms │     no change │
│ QQuery 12 │      1691.87 / 1799.01 ±67.52 / 1890.86 ms │      1697.93 / 1841.96 ±81.64 / 1941.17 ms │     no change │
│ QQuery 13 │         498.42 / 596.07 ±62.55 / 671.39 ms │       473.92 / 868.32 ±466.45 / 1776.63 ms │  1.46x slower │
│ QQuery 14 │         533.47 / 547.39 ±22.09 / 591.14 ms │          551.07 / 558.80 ±6.76 / 568.87 ms │     no change │
│ QQuery 15 │      1833.53 / 1919.28 ±48.14 / 1959.04 ms │      1829.28 / 1879.44 ±43.96 / 1960.23 ms │     no change │
│ QQuery 16 │      4103.46 / 4186.73 ±49.92 / 4238.93 ms │     4225.27 / 4394.11 ±169.15 / 4711.02 ms │     no change │
│ QQuery 17 │     4036.98 / 4233.96 ±129.85 / 4424.26 ms │     4195.66 / 4333.53 ±117.15 / 4540.01 ms │     no change │
│ QQuery 18 │  17411.62 / 17894.28 ±414.69 / 18660.05 ms │  17728.52 / 17928.21 ±144.06 / 18085.77 ms │     no change │
│ QQuery 19 │             27.73 / 28.44 ±0.59 / 29.33 ms │             28.65 / 30.25 ±1.75 / 33.54 ms │  1.06x slower │
│ QQuery 20 │          514.48 / 519.95 ±6.28 / 531.98 ms │          516.23 / 524.54 ±6.02 / 534.04 ms │     no change │
│ QQuery 21 │          511.16 / 522.91 ±8.23 / 535.27 ms │          515.58 / 522.77 ±5.94 / 530.54 ms │     no change │
│ QQuery 22 │          975.57 / 984.07 ±8.82 / 998.71 ms │        994.65 / 1000.23 ±5.69 / 1009.90 ms │     no change │
│ QQuery 23 │      2989.84 / 3032.06 ±37.59 / 3101.79 ms │      3032.64 / 3065.78 ±20.86 / 3095.46 ms │     no change │
│ QQuery 24 │             41.08 / 41.50 ±0.51 / 42.48 ms │             40.69 / 41.10 ±0.31 / 41.58 ms │     no change │
│ QQuery 25 │          110.11 / 114.47 ±5.12 / 122.82 ms │          111.99 / 120.14 ±7.66 / 132.63 ms │     no change │
│ QQuery 26 │             41.52 / 42.82 ±0.72 / 43.62 ms │             41.36 / 43.17 ±1.82 / 45.83 ms │     no change │
│ QQuery 27 │          667.93 / 674.34 ±4.31 / 678.58 ms │         669.14 / 680.26 ±10.48 / 699.82 ms │     no change │
│ QQuery 28 │     3438.98 / 3572.27 ±146.00 / 3854.25 ms │      3352.12 / 3455.12 ±56.22 / 3512.29 ms │     no change │
│ QQuery 29 │             39.96 / 42.78 ±5.03 / 52.83 ms │             40.51 / 42.08 ±1.53 / 44.68 ms │     no change │
│ QQuery 30 │         545.71 / 559.49 ±11.91 / 581.34 ms │         550.39 / 568.77 ±18.68 / 595.88 ms │     no change │
│ QQuery 31 │          281.06 / 291.94 ±5.67 / 297.31 ms │          291.85 / 299.74 ±6.99 / 312.61 ms │     no change │
│ QQuery 32 │         936.49 / 969.74 ±21.00 / 998.26 ms │         941.64 / 967.88 ±18.61 / 990.06 ms │     no change │
│ QQuery 33 │ 25268.56 / 27633.79 ±1506.95 / 29653.29 ms │ 26060.75 / 28647.11 ±1420.56 / 30410.37 ms │     no change │
│ QQuery 34 │  26850.77 / 27655.21 ±899.12 / 29265.85 ms │ 26531.81 / 29335.67 ±2110.05 / 32449.60 ms │  1.06x slower │
│ QQuery 35 │      977.38 / 1077.76 ±123.93 / 1309.23 ms │       952.56 / 1002.67 ±40.12 / 1065.17 ms │ +1.07x faster │
│ QQuery 36 │         162.74 / 180.33 ±22.33 / 224.37 ms │         158.97 / 178.88 ±16.81 / 209.66 ms │     no change │
│ QQuery 37 │             36.62 / 37.10 ±0.66 / 38.39 ms │             37.14 / 40.00 ±4.45 / 48.85 ms │  1.08x slower │
│ QQuery 38 │            43.33 / 50.96 ±10.20 / 70.97 ms │             40.71 / 42.70 ±1.19 / 44.10 ms │ +1.19x faster │
│ QQuery 39 │         172.98 / 192.98 ±20.76 / 231.92 ms │         173.45 / 187.79 ±12.71 / 209.88 ms │     no change │
│ QQuery 40 │             14.13 / 14.87 ±0.71 / 15.96 ms │             14.26 / 14.67 ±0.28 / 15.04 ms │     no change │
│ QQuery 41 │             13.20 / 13.38 ±0.15 / 13.64 ms │             13.43 / 13.81 ±0.37 / 14.41 ms │     no change │
│ QQuery 42 │             12.83 / 13.04 ±0.19 / 13.35 ms │             12.90 / 13.06 ±0.11 / 13.24 ms │     no change │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 105762.39ms │
│ Total Time (split-row-groups-by-range)   │ 108862.94ms │
│ Average Time (HEAD)                      │   2459.59ms │
│ Average Time (split-row-groups-by-range) │   2531.70ms │
│ Queries Faster                           │           3 │
│ Queries Slower                           │           4 │
│ Queries with No Change                   │          36 │
│ Queries with Failure                     │           0 │
└──────────────────────────────────────────┴─────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric Value
Wall time 530.1s
Peak memory 11.8 GiB
Avg memory 6.5 GiB
CPU user 4776.9s
CPU sys 304.2s
Peak spill 0 B

clickbench_partitioned — branch

Metric Value
Wall time 545.1s
Peak memory 12.1 GiB
Avg memory 6.6 GiB
CPU user 4811.6s
CPU sys 307.2s
Peak spill 0 B
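
The `1.46x slower` entry for Query 13 in the ClickBench table is worth reading alongside its spread. A minimal sketch, assuming each cell is laid out as `min / mean ±stddev / max` (inferred from the table, not confirmed in this thread), with `parse_cell` as a hypothetical helper:

```python
import re

# Assumed cell layout: "min / mean ±stddev / max ms".
CELL = re.compile(r"([\d.]+) / ([\d.]+) ±([\d.]+) / ([\d.]+) ms")


def parse_cell(cell: str) -> dict:
    """Parse one timing cell into its four numeric components."""
    match = CELL.search(cell)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    low, mean, std, high = (float(g) for g in match.groups())
    return {"min": low, "mean": mean, "std": std, "max": high}


# Query 13 cells copied from the ClickBench table above.
head = parse_cell("498.42 / 596.07 ±62.55 / 671.39 ms")
branch = parse_cell("473.92 / 868.32 ±466.45 / 1776.63 ms")

# Ratio > 1 means the branch is slower on that statistic.
mean_ratio = branch["mean"] / head["mean"]
min_ratio = branch["min"] / head["min"]

# Relative spread of the branch measurements (stddev / mean).
branch_rel_spread = branch["std"] / branch["mean"]

print(f"mean ratio: {mean_ratio:.2f}x")
print(f"min ratio:  {min_ratio:.2f}x")
print(f"branch stddev / mean: {branch_rel_spread:.2f}")
```

Under that assumption, the branch's fastest Query 13 run is slightly quicker than HEAD's, and its standard deviation is more than half its mean, so the reported slowdown may reflect an outlier run rather than a consistent regression.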


@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃      split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 38.10 / 38.96 ±0.94 / 40.80 ms │ 36.39 / 37.12 ±1.12 / 39.34 ms │     no change │
│ QQuery 2  │ 19.45 / 20.03 ±0.57 / 21.05 ms │ 20.37 / 21.34 ±0.65 / 22.22 ms │  1.07x slower │
│ QQuery 3  │ 31.95 / 33.68 ±0.90 / 34.50 ms │ 33.06 / 34.12 ±0.64 / 34.95 ms │     no change │
│ QQuery 4  │ 17.60 / 17.76 ±0.13 / 17.96 ms │ 18.20 / 18.34 ±0.12 / 18.54 ms │     no change │
│ QQuery 5  │ 38.87 / 40.15 ±0.92 / 41.48 ms │ 41.88 / 42.44 ±0.56 / 43.48 ms │  1.06x slower │
│ QQuery 6  │ 16.35 / 16.89 ±0.81 / 18.49 ms │ 16.52 / 16.78 ±0.36 / 17.50 ms │     no change │
│ QQuery 7  │ 44.01 / 46.40 ±1.69 / 49.28 ms │ 48.94 / 50.50 ±0.90 / 51.63 ms │  1.09x slower │
│ QQuery 8  │ 43.35 / 44.69 ±1.11 / 45.97 ms │ 47.33 / 47.87 ±0.43 / 48.57 ms │  1.07x slower │
│ QQuery 9  │ 50.22 / 50.70 ±0.40 / 51.35 ms │ 53.14 / 53.86 ±0.48 / 54.54 ms │  1.06x slower │
│ QQuery 10 │ 42.61 / 42.96 ±0.49 / 43.91 ms │ 48.00 / 48.35 ±0.28 / 48.82 ms │  1.13x slower │
│ QQuery 11 │ 13.33 / 13.80 ±0.37 / 14.41 ms │ 14.46 / 15.08 ±0.59 / 15.87 ms │  1.09x slower │
│ QQuery 12 │ 23.88 / 24.59 ±0.42 / 25.18 ms │ 24.42 / 25.67 ±1.27 / 28.06 ms │     no change │
│ QQuery 13 │ 32.34 / 34.34 ±1.64 / 36.41 ms │ 30.04 / 31.43 ±1.27 / 33.79 ms │ +1.09x faster │
│ QQuery 14 │ 23.75 / 24.02 ±0.19 / 24.23 ms │ 25.21 / 25.65 ±0.39 / 26.36 ms │  1.07x slower │
│ QQuery 15 │ 31.26 / 32.40 ±0.77 / 33.50 ms │ 32.40 / 32.79 ±0.37 / 33.30 ms │     no change │
│ QQuery 16 │ 14.17 / 14.34 ±0.10 / 14.43 ms │ 13.89 / 14.76 ±0.86 / 16.21 ms │     no change │
│ QQuery 17 │ 86.98 / 88.67 ±1.15 / 90.31 ms │ 89.67 / 90.49 ±0.66 / 91.33 ms │     no change │
│ QQuery 18 │ 66.80 / 69.61 ±2.84 / 74.99 ms │ 70.47 / 71.20 ±0.99 / 73.11 ms │     no change │
│ QQuery 19 │ 33.17 / 33.50 ±0.44 / 34.35 ms │ 34.39 / 34.90 ±0.44 / 35.46 ms │     no change │
│ QQuery 20 │ 34.25 / 34.95 ±0.36 / 35.26 ms │ 35.57 / 36.60 ±0.75 / 37.45 ms │     no change │
│ QQuery 21 │ 56.56 / 57.65 ±0.83 / 58.56 ms │ 55.28 / 58.09 ±1.80 / 60.51 ms │     no change │
│ QQuery 22 │ 14.02 / 14.33 ±0.26 / 14.79 ms │ 16.83 / 17.25 ±0.34 / 17.72 ms │  1.20x slower │
└───────────┴────────────────────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                        ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 794.44ms │
│ Total Time (split-row-groups-by-range)   │ 824.61ms │
│ Average Time (HEAD)                      │  36.11ms │
│ Average Time (split-row-groups-by-range) │  37.48ms │
│ Queries Faster                           │        1 │
│ Queries Slower                           │        9 │
│ Queries with No Change                   │       12 │
│ Queries with Failure                     │        0 │
└──────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

Metric Value
Wall time 5.0s
Peak memory 1.1 GiB
Avg memory 514.8 MiB
CPU user 23.6s
CPU sys 1.6s
Peak spill 0 B

tpch — branch

Metric Value
Wall time 5.0s
Peak memory 1.2 GiB
Avg memory 713.4 MiB
CPU user 26.1s
CPU sys 1.9s
Peak spill 0 B


@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃             split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           5.52 / 6.02 ±0.81 / 7.64 ms │           5.58 / 6.00 ±0.81 / 7.62 ms │     no change │
│ QQuery 2  │        80.25 / 80.89 ±0.42 / 81.25 ms │        38.23 / 39.23 ±0.93 / 40.93 ms │ +2.06x faster │
│ QQuery 3  │        28.80 / 29.19 ±0.27 / 29.50 ms │        16.02 / 16.20 ±0.13 / 16.40 ms │ +1.80x faster │
│ QQuery 4  │     473.13 / 476.12 ±1.67 / 478.00 ms │     233.84 / 238.11 ±2.38 / 240.50 ms │ +2.00x faster │
│ QQuery 5  │        51.23 / 53.69 ±2.76 / 58.67 ms │        48.26 / 49.66 ±1.52 / 51.75 ms │ +1.08x faster │
│ QQuery 6  │        35.90 / 36.47 ±0.42 / 37.11 ms │        18.06 / 18.82 ±0.78 / 20.27 ms │ +1.94x faster │
│ QQuery 7  │        93.58 / 94.17 ±0.45 / 94.97 ms │        36.82 / 39.72 ±2.92 / 44.99 ms │ +2.37x faster │
│ QQuery 8  │        36.05 / 37.85 ±2.14 / 42.04 ms │        23.23 / 23.59 ±0.31 / 23.98 ms │ +1.60x faster │
│ QQuery 9  │        52.69 / 54.75 ±1.98 / 57.98 ms │        59.36 / 62.09 ±1.77 / 63.80 ms │  1.13x slower │
│ QQuery 10 │        62.62 / 63.21 ±0.48 / 64.05 ms │        35.13 / 35.77 ±0.43 / 36.40 ms │ +1.77x faster │
│ QQuery 11 │     286.99 / 291.46 ±2.24 / 292.79 ms │     123.97 / 128.40 ±5.72 / 139.67 ms │ +2.27x faster │
│ QQuery 12 │        27.85 / 28.24 ±0.40 / 29.00 ms │        13.12 / 13.78 ±1.20 / 16.18 ms │ +2.05x faster │
│ QQuery 13 │     117.96 / 118.60 ±0.74 / 119.58 ms │        54.81 / 55.90 ±1.80 / 59.49 ms │ +2.12x faster │
│ QQuery 14 │     410.12 / 415.22 ±5.23 / 421.93 ms │     232.13 / 239.57 ±5.15 / 247.16 ms │ +1.73x faster │
│ QQuery 15 │        57.74 / 57.99 ±0.15 / 58.16 ms │        16.42 / 17.18 ±1.21 / 19.59 ms │ +3.37x faster │
│ QQuery 16 │           6.63 / 6.80 ±0.13 / 7.00 ms │           6.91 / 7.02 ±0.18 / 7.39 ms │     no change │
│ QQuery 17 │        80.04 / 80.84 ±0.93 / 82.66 ms │        44.29 / 46.52 ±3.28 / 53.05 ms │ +1.74x faster │
│ QQuery 18 │     121.49 / 124.76 ±2.28 / 127.94 ms │        53.38 / 55.02 ±1.76 / 58.22 ms │ +2.27x faster │
│ QQuery 19 │        40.75 / 41.40 ±0.70 / 42.73 ms │        23.93 / 24.38 ±0.27 / 24.79 ms │ +1.70x faster │
│ QQuery 20 │        34.72 / 35.14 ±0.34 / 35.67 ms │        15.62 / 15.75 ±0.11 / 15.93 ms │ +2.23x faster │
│ QQuery 21 │        17.10 / 17.47 ±0.30 / 17.93 ms │        18.14 / 18.22 ±0.12 / 18.45 ms │     no change │
│ QQuery 22 │        61.14 / 63.04 ±1.68 / 66.16 ms │        59.76 / 60.33 ±0.83 / 61.96 ms │     no change │
│ QQuery 23 │     351.71 / 353.22 ±1.75 / 356.53 ms │     223.94 / 224.94 ±1.50 / 227.90 ms │ +1.57x faster │
│ QQuery 24 │     223.01 / 225.50 ±3.01 / 231.11 ms │     134.05 / 139.66 ±3.97 / 146.18 ms │ +1.61x faster │
│ QQuery 25 │     110.76 / 111.61 ±0.97 / 113.48 ms │        52.78 / 53.97 ±0.68 / 54.85 ms │ +2.07x faster │
│ QQuery 26 │        56.91 / 58.94 ±3.19 / 65.22 ms │        24.78 / 25.27 ±0.30 / 25.70 ms │ +2.33x faster │
│ QQuery 27 │           6.33 / 6.45 ±0.13 / 6.69 ms │           6.32 / 6.48 ±0.19 / 6.83 ms │     no change │
│ QQuery 28 │        56.73 / 61.87 ±2.85 / 64.90 ms │        54.94 / 55.84 ±0.94 / 57.56 ms │ +1.11x faster │
│ QQuery 29 │        97.10 / 98.35 ±0.87 / 99.48 ms │        45.37 / 46.73 ±1.44 / 49.46 ms │ +2.10x faster │
│ QQuery 30 │        31.86 / 33.27 ±1.76 / 36.45 ms │        25.65 / 25.99 ±0.29 / 26.40 ms │ +1.28x faster │
│ QQuery 31 │     111.02 / 111.79 ±0.92 / 113.60 ms │        70.76 / 72.23 ±2.04 / 76.24 ms │ +1.55x faster │
│ QQuery 32 │        20.24 / 20.50 ±0.30 / 21.07 ms │        14.61 / 14.92 ±0.23 / 15.32 ms │ +1.37x faster │
│ QQuery 33 │        37.48 / 37.83 ±0.37 / 38.49 ms │        30.09 / 30.87 ±0.52 / 31.58 ms │ +1.23x faster │
│ QQuery 34 │          9.75 / 9.97 ±0.22 / 10.30 ms │           8.38 / 9.45 ±0.56 / 9.98 ms │ +1.05x faster │
│ QQuery 35 │        71.43 / 72.57 ±0.76 / 73.66 ms │        38.61 / 39.49 ±0.81 / 40.91 ms │ +1.84x faster │
│ QQuery 36 │           5.82 / 5.97 ±0.17 / 6.29 ms │           5.95 / 6.15 ±0.22 / 6.56 ms │     no change │
│ QQuery 37 │           7.04 / 7.19 ±0.17 / 7.50 ms │           6.93 / 7.06 ±0.07 / 7.15 ms │     no change │
│ QQuery 38 │        62.48 / 63.22 ±0.64 / 64.04 ms │        27.94 / 28.56 ±0.48 / 29.06 ms │ +2.21x faster │
│ QQuery 39 │        85.59 / 86.01 ±0.24 / 86.29 ms │        83.87 / 85.64 ±2.07 / 89.49 ms │     no change │
│ QQuery 40 │        23.21 / 23.74 ±0.32 / 24.09 ms │        16.21 / 16.66 ±0.35 / 17.28 ms │ +1.42x faster │
│ QQuery 41 │        11.47 / 11.99 ±0.77 / 13.51 ms │        11.41 / 11.54 ±0.17 / 11.88 ms │     no change │
│ QQuery 42 │        23.17 / 23.51 ±0.32 / 24.03 ms │        14.20 / 14.39 ±0.12 / 14.56 ms │ +1.63x faster │
│ QQuery 43 │           4.87 / 4.93 ±0.08 / 5.08 ms │           4.89 / 5.00 ±0.16 / 5.31 ms │     no change │
│ QQuery 44 │           9.24 / 9.34 ±0.09 / 9.50 ms │           9.24 / 9.39 ±0.10 / 9.53 ms │     no change │
│ QQuery 45 │        38.26 / 38.73 ±0.34 / 39.20 ms │        14.65 / 15.66 ±1.13 / 17.86 ms │ +2.47x faster │
│ QQuery 46 │        11.61 / 11.96 ±0.32 / 12.54 ms │        11.42 / 11.81 ±0.44 / 12.64 ms │     no change │
│ QQuery 47 │     229.73 / 232.24 ±1.61 / 234.70 ms │     100.33 / 103.01 ±2.65 / 107.58 ms │ +2.25x faster │
│ QQuery 48 │       96.45 / 98.65 ±3.06 / 104.67 ms │        40.30 / 41.37 ±1.58 / 44.49 ms │ +2.38x faster │
│ QQuery 49 │        76.12 / 76.95 ±0.71 / 78.22 ms │        65.19 / 66.79 ±1.42 / 69.15 ms │ +1.15x faster │
│ QQuery 50 │        59.23 / 61.42 ±2.56 / 66.19 ms │        33.84 / 35.83 ±3.09 / 41.95 ms │ +1.71x faster │
│ QQuery 51 │     100.53 / 102.64 ±2.12 / 106.72 ms │        67.81 / 70.61 ±3.28 / 76.74 ms │ +1.45x faster │
│ QQuery 52 │        23.38 / 23.66 ±0.17 / 23.93 ms │        14.56 / 14.75 ±0.20 / 15.15 ms │ +1.60x faster │
│ QQuery 53 │        29.32 / 29.51 ±0.13 / 29.68 ms │        14.49 / 14.98 ±0.41 / 15.48 ms │ +1.97x faster │
│ QQuery 54 │        55.33 / 56.20 ±1.35 / 58.87 ms │        32.70 / 33.26 ±0.31 / 33.64 ms │ +1.69x faster │
│ QQuery 55 │        22.81 / 23.35 ±0.32 / 23.68 ms │        13.76 / 13.85 ±0.08 / 13.99 ms │ +1.69x faster │
│ QQuery 56 │        38.50 / 39.14 ±0.52 / 40.09 ms │        32.39 / 32.88 ±0.42 / 33.55 ms │ +1.19x faster │
│ QQuery 57 │     177.78 / 179.44 ±1.88 / 181.85 ms │        60.20 / 62.09 ±2.70 / 67.46 ms │ +2.89x faster │
│ QQuery 58 │     113.45 / 114.51 ±0.99 / 116.20 ms │        51.17 / 52.93 ±2.26 / 57.04 ms │ +2.16x faster │
│ QQuery 59 │     117.65 / 118.73 ±0.85 / 120.18 ms │        43.77 / 44.49 ±1.16 / 46.81 ms │ +2.67x faster │
│ QQuery 60 │        39.18 / 39.60 ±0.28 / 39.93 ms │        31.53 / 32.01 ±0.35 / 32.48 ms │ +1.24x faster │
│ QQuery 61 │        12.55 / 12.80 ±0.23 / 13.23 ms │        12.31 / 12.45 ±0.16 / 12.76 ms │     no change │
│ QQuery 62 │        46.28 / 46.57 ±0.41 / 47.38 ms │        13.19 / 14.69 ±2.54 / 19.76 ms │ +3.17x faster │
│ QQuery 63 │        29.41 / 29.61 ±0.17 / 29.82 ms │        14.33 / 14.54 ±0.14 / 14.70 ms │ +2.04x faster │
│ QQuery 64 │     406.08 / 410.02 ±2.98 / 413.62 ms │     233.83 / 241.89 ±5.60 / 251.09 ms │ +1.70x faster │
│ QQuery 65 │     150.79 / 153.75 ±2.57 / 157.41 ms │     159.93 / 162.62 ±2.07 / 166.23 ms │  1.06x slower │
│ QQuery 66 │        78.74 / 81.68 ±2.59 / 84.85 ms │        52.97 / 54.25 ±1.31 / 56.77 ms │ +1.51x faster │
│ QQuery 67 │     236.84 / 239.26 ±2.33 / 243.02 ms │     104.63 / 107.21 ±3.08 / 112.43 ms │ +2.23x faster │
│ QQuery 68 │        11.88 / 12.05 ±0.19 / 12.39 ms │        11.65 / 12.21 ±0.37 / 12.67 ms │     no change │
│ QQuery 69 │        56.65 / 56.97 ±0.18 / 57.16 ms │        32.59 / 35.44 ±4.36 / 44.06 ms │ +1.61x faster │
│ QQuery 70 │     105.20 / 108.08 ±2.11 / 111.22 ms │        68.18 / 72.71 ±4.61 / 81.41 ms │ +1.49x faster │
│ QQuery 71 │        35.12 / 35.82 ±0.46 / 36.45 ms │        27.90 / 28.81 ±0.49 / 29.34 ms │ +1.24x faster │
│ QQuery 72 │ 2134.13 / 2183.82 ±36.23 / 2243.82 ms │ 2046.51 / 2162.36 ±66.11 / 2219.23 ms │     no change │
│ QQuery 73 │          9.60 / 9.82 ±0.19 / 10.16 ms │           8.72 / 9.18 ±0.37 / 9.74 ms │ +1.07x faster │
│ QQuery 74 │     165.50 / 167.24 ±1.28 / 169.10 ms │        79.55 / 81.00 ±1.92 / 84.78 ms │ +2.06x faster │
│ QQuery 75 │     148.19 / 152.43 ±5.56 / 163.11 ms │     109.15 / 118.97 ±5.74 / 125.57 ms │ +1.28x faster │
│ QQuery 76 │        34.68 / 34.99 ±0.20 / 35.31 ms │        26.06 / 28.50 ±1.89 / 30.78 ms │ +1.23x faster │
│ QQuery 77 │        60.58 / 61.10 ±0.36 / 61.65 ms │        50.53 / 51.25 ±0.62 / 52.13 ms │ +1.19x faster │
│ QQuery 78 │     181.67 / 187.61 ±4.03 / 193.51 ms │        79.35 / 83.00 ±4.26 / 90.85 ms │ +2.26x faster │
│ QQuery 79 │        66.99 / 67.27 ±0.31 / 67.83 ms │        32.58 / 34.84 ±2.18 / 38.78 ms │ +1.93x faster │
│ QQuery 80 │      97.46 / 100.98 ±2.23 / 104.11 ms │        80.61 / 82.07 ±1.24 / 83.60 ms │ +1.23x faster │
│ QQuery 81 │        25.50 / 25.78 ±0.28 / 26.29 ms │        23.38 / 26.37 ±4.19 / 34.37 ms │     no change │
│ QQuery 82 │        16.16 / 16.40 ±0.20 / 16.69 ms │        14.58 / 14.67 ±0.11 / 14.86 ms │ +1.12x faster │
│ QQuery 83 │        39.89 / 40.07 ±0.12 / 40.24 ms │        29.42 / 30.00 ±0.32 / 30.35 ms │ +1.34x faster │
│ QQuery 84 │        30.05 / 30.25 ±0.15 / 30.44 ms │        20.86 / 21.13 ±0.26 / 21.52 ms │ +1.43x faster │
│ QQuery 85 │     106.69 / 110.31 ±4.98 / 119.75 ms │        48.80 / 52.98 ±5.43 / 63.68 ms │ +2.08x faster │
│ QQuery 86 │        24.53 / 25.00 ±0.47 / 25.71 ms │        10.13 / 10.39 ±0.22 / 10.80 ms │ +2.41x faster │
│ QQuery 87 │        62.94 / 63.73 ±0.59 / 64.32 ms │        28.98 / 29.48 ±0.31 / 29.92 ms │ +2.16x faster │
│ QQuery 88 │        63.59 / 64.57 ±1.24 / 67.01 ms │        66.13 / 66.74 ±0.37 / 67.21 ms │     no change │
│ QQuery 89 │        35.64 / 36.59 ±1.22 / 38.99 ms │        17.47 / 17.69 ±0.25 / 18.15 ms │ +2.07x faster │
│ QQuery 90 │        17.22 / 17.54 ±0.23 / 17.86 ms │        10.60 / 10.94 ±0.26 / 11.30 ms │ +1.60x faster │
│ QQuery 91 │        45.54 / 45.74 ±0.12 / 45.93 ms │        26.90 / 28.79 ±2.83 / 34.39 ms │ +1.59x faster │
│ QQuery 92 │        28.88 / 29.36 ±0.42 / 29.92 ms │        15.54 / 15.88 ±0.32 / 16.46 ms │ +1.85x faster │
│ QQuery 93 │        48.77 / 49.97 ±1.17 / 51.91 ms │        24.28 / 25.58 ±1.45 / 28.40 ms │ +1.95x faster │
│ QQuery 94 │        37.51 / 38.42 ±0.83 / 39.85 ms │        20.42 / 20.74 ±0.44 / 21.59 ms │ +1.85x faster │
│ QQuery 95 │        80.08 / 81.30 ±0.86 / 82.30 ms │        46.79 / 47.81 ±0.96 / 49.11 ms │ +1.70x faster │
│ QQuery 96 │        24.10 / 24.34 ±0.28 / 24.81 ms │        10.52 / 11.93 ±2.62 / 17.17 ms │ +2.04x faster │
│ QQuery 97 │        54.77 / 55.34 ±0.53 / 56.25 ms │        21.43 / 21.83 ±0.32 / 22.31 ms │ +2.53x faster │
│ QQuery 98 │        42.24 / 43.02 ±0.76 / 44.12 ms │        21.55 / 21.67 ±0.08 / 21.77 ms │ +1.99x faster │
│ QQuery 99 │        70.07 / 70.39 ±0.30 / 70.88 ms │        17.32 / 17.59 ±0.36 / 18.28 ms │ +4.00x faster │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 9917.84ms │
│ Total Time (split-row-groups-by-range)   │ 6688.04ms │
│ Average Time (HEAD)                      │  100.18ms │
│ Average Time (split-row-groups-by-range) │   67.56ms │
│ Queries Faster                           │        80 │
│ Queries Slower                           │         2 │
│ Queries with No Change                   │        17 │
│ Queries with Failure                     │         0 │
└──────────────────────────────────────────┴───────────┘
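
The Change column appears to be the ratio of the two mean times (the middle value of each `min / mean ±stddev / max` cell). The sketch below reproduces that labelling under the assumption of a ±5% noise band, which is consistent with the rows above but is not stated in the output; `describe_change` and its `noise` parameter are hypothetical names, not part of the benchmark runner.

```rust
/// Hypothetical re-creation of the "Change" label from two mean times.
/// `noise` is an assumed relative band (e.g. 0.05) inside which the
/// result is reported as "no change".
fn describe_change(base_ms: f64, branch_ms: f64, noise: f64) -> String {
    // Ratio below 1.0 means the branch took less time than the baseline.
    let ratio = branch_ms / base_ms;
    if (ratio - 1.0).abs() <= noise {
        "no change".to_string()
    } else if ratio < 1.0 {
        // Report speedups as "how many times faster".
        format!("+{:.2}x faster", 1.0 / ratio)
    } else {
        format!("{:.2}x slower", ratio)
    }
}
```

Applied to the summary totals (9917.84 ms on HEAD versus 6688.04 ms on the branch), the same ratio works out to roughly 1.48x faster for this run.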

Resource Usage

tpcds — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 50.0s |
| Peak memory | 2.3 GiB |
| Avg memory | 1.7 GiB |
| CPU user | 224.7s |
| CPU sys | 5.5s |
| Peak spill | 0 B |

tpcds — branch

| Metric | Value |
|---|---|
| Wall time | 35.0s |
| Peak memory | 2.7 GiB |
| Avg memory | 2.0 GiB |
| CPU user | 256.8s |
| CPU sys | 8.1s |
| Peak spill | 0 B |

File an issue against this benchmark runner

@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                  split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.22 / 4.00 ±5.46 / 14.93 ms │               1.21 / 3.96 ±5.41 / 14.79 ms │     no change │
│ QQuery 1  │             12.51 / 12.86 ±0.20 / 13.11 ms │             12.71 / 13.03 ±0.17 / 13.17 ms │     no change │
│ QQuery 2  │             36.43 / 36.70 ±0.23 / 36.98 ms │             36.30 / 36.68 ±0.32 / 37.08 ms │     no change │
│ QQuery 3  │             30.87 / 31.57 ±1.00 / 33.54 ms │             31.22 / 31.53 ±0.16 / 31.65 ms │     no change │
│ QQuery 4  │      1629.59 / 1702.27 ±40.81 / 1742.50 ms │      1634.24 / 1696.63 ±62.25 / 1804.01 ms │     no change │
│ QQuery 5  │      1753.74 / 1840.30 ±67.64 / 1939.93 ms │     1613.53 / 1761.80 ±146.71 / 2038.06 ms │     no change │
│ QQuery 6  │              1.27 / 6.52 ±10.12 / 26.75 ms │                1.25 / 1.40 ±0.23 / 1.86 ms │ +4.65x faster │
│ QQuery 7  │             13.93 / 14.10 ±0.13 / 14.27 ms │            13.91 / 22.94 ±17.77 / 58.49 ms │  1.63x slower │
│ QQuery 8  │      1994.09 / 2072.54 ±67.58 / 2167.90 ms │      2001.65 / 2050.86 ±57.88 / 2161.75 ms │     no change │
│ QQuery 9  │         479.39 / 510.06 ±36.89 / 581.08 ms │         478.07 / 497.78 ±17.46 / 525.30 ms │     no change │
│ QQuery 10 │             77.52 / 78.24 ±0.67 / 79.45 ms │             74.96 / 79.36 ±5.88 / 90.83 ms │     no change │
│ QQuery 11 │            91.15 / 97.19 ±8.59 / 114.17 ms │          89.38 / 101.43 ±22.13 / 145.66 ms │     no change │
│ QQuery 12 │     1628.59 / 1739.19 ±123.32 / 1965.08 ms │      1719.81 / 1810.68 ±89.23 / 1978.22 ms │     no change │
│ QQuery 13 │        586.67 / 780.39 ±123.61 / 883.65 ms │        446.29 / 647.79 ±182.79 / 905.79 ms │ +1.20x faster │
│ QQuery 14 │         547.93 / 572.17 ±18.40 / 600.63 ms │         560.69 / 580.83 ±20.36 / 619.46 ms │     no change │
│ QQuery 15 │      1907.77 / 1967.78 ±41.20 / 2034.82 ms │      1856.90 / 1913.87 ±67.36 / 2046.31 ms │     no change │
│ QQuery 16 │      4148.16 / 4287.78 ±88.46 / 4418.06 ms │     4160.15 / 4313.19 ±120.58 / 4476.57 ms │     no change │
│ QQuery 17 │     4156.27 / 4332.90 ±111.98 / 4498.45 ms │     4175.64 / 4343.54 ±131.32 / 4546.53 ms │     no change │
│ QQuery 18 │  17585.79 / 17900.29 ±340.13 / 18511.68 ms │  17987.41 / 18371.43 ±215.14 / 18635.11 ms │     no change │
│ QQuery 19 │             28.11 / 29.01 ±1.03 / 30.86 ms │             28.24 / 28.96 ±0.75 / 30.33 ms │     no change │
│ QQuery 20 │          516.60 / 521.93 ±5.83 / 533.30 ms │          515.85 / 518.97 ±2.45 / 522.79 ms │     no change │
│ QQuery 21 │          515.31 / 520.24 ±3.46 / 525.75 ms │          519.82 / 527.69 ±7.00 / 539.62 ms │     no change │
│ QQuery 22 │         981.90 / 992.88 ±9.32 / 1005.18 ms │       998.02 / 1012.97 ±10.64 / 1027.24 ms │     no change │
│ QQuery 23 │      3081.67 / 3106.31 ±19.79 / 3140.04 ms │      3038.67 / 3079.77 ±25.41 / 3109.19 ms │     no change │
│ QQuery 24 │            41.19 / 51.57 ±18.72 / 88.97 ms │            41.11 / 49.09 ±10.97 / 69.90 ms │     no change │
│ QQuery 25 │          111.51 / 112.23 ±0.88 / 113.90 ms │          112.68 / 114.89 ±2.24 / 118.87 ms │     no change │
│ QQuery 26 │             41.79 / 47.90 ±7.63 / 61.30 ms │             41.38 / 42.26 ±0.65 / 43.06 ms │ +1.13x faster │
│ QQuery 27 │          664.13 / 672.75 ±6.10 / 682.97 ms │          675.13 / 681.32 ±4.22 / 686.70 ms │     no change │
│ QQuery 28 │     3375.15 / 3627.57 ±174.19 / 3849.40 ms │      3479.37 / 3548.66 ±85.23 / 3709.29 ms │     no change │
│ QQuery 29 │             40.89 / 41.25 ±0.51 / 42.24 ms │            40.93 / 58.66 ±22.73 / 96.35 ms │  1.42x slower │
│ QQuery 30 │          555.25 / 565.69 ±8.77 / 581.00 ms │          560.50 / 574.54 ±9.00 / 587.20 ms │     no change │
│ QQuery 31 │          291.96 / 297.87 ±5.69 / 307.42 ms │         290.59 / 313.78 ±14.19 / 334.07 ms │  1.05x slower │
│ QQuery 32 │        964.25 / 997.41 ±37.46 / 1066.08 ms │        976.71 / 998.39 ±18.55 / 1030.75 ms │     no change │
│ QQuery 33 │ 24321.00 / 27588.14 ±2206.61 / 30034.90 ms │  26832.38 / 27813.23 ±605.67 / 28626.92 ms │     no change │
│ QQuery 34 │ 26209.84 / 28890.08 ±2333.84 / 32955.32 ms │ 26816.08 / 28953.45 ±1422.25 / 30716.81 ms │     no change │
│ QQuery 35 │      987.34 / 1158.36 ±180.33 / 1508.55 ms │      990.43 / 1109.75 ±171.23 / 1445.49 ms │     no change │
│ QQuery 36 │          159.54 / 173.39 ±8.52 / 185.64 ms │         155.25 / 187.01 ±27.98 / 234.46 ms │  1.08x slower │
│ QQuery 37 │             37.36 / 42.83 ±8.87 / 60.51 ms │            37.13 / 45.34 ±12.54 / 70.33 ms │  1.06x slower │
│ QQuery 38 │            45.36 / 52.81 ±13.89 / 80.56 ms │             43.72 / 45.43 ±1.31 / 47.05 ms │ +1.16x faster │
│ QQuery 39 │          184.86 / 190.26 ±3.25 / 194.92 ms │         166.31 / 188.29 ±12.74 / 204.02 ms │     no change │
│ QQuery 40 │             15.01 / 15.48 ±0.32 / 15.85 ms │             14.72 / 17.75 ±3.76 / 25.18 ms │  1.15x slower │
│ QQuery 41 │             13.98 / 14.23 ±0.13 / 14.33 ms │             13.77 / 14.08 ±0.34 / 14.68 ms │     no change │
│ QQuery 42 │             13.58 / 13.67 ±0.07 / 13.74 ms │             13.31 / 13.39 ±0.08 / 13.54 ms │     no change │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 107710.76ms │
│ Total Time (split-row-groups-by-range)   │ 108216.42ms │
│ Average Time (HEAD)                      │   2504.90ms │
│ Average Time (split-row-groups-by-range) │   2516.66ms │
│ Queries Faster                           │           4 │
│ Queries Slower                           │           6 │
│ Queries with No Change                   │          33 │
│ Queries with Failure                     │           0 │
└──────────────────────────────────────────┴─────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 540.1s |
| Peak memory | 12.0 GiB |
| Avg memory | 6.5 GiB |
| CPU user | 4820.5s |
| CPU sys | 322.2s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 545.1s |
| Peak memory | 12.0 GiB |
| Avg memory | 6.5 GiB |
| CPU user | 4841.7s |
| CPU sys | 326.6s |
| Peak spill | 0 B |

File an issue against this benchmark runner

… morsels

When a pop would leave the shared work queue empty, split the final byte
range into small morsels (halving down to a floor) and push the excess
back, so sibling streams that finish their pieces early steal a share of
the last piece instead of idling behind one straggler. Work items are
returned exactly as the planner sized them while the queue is deep.

The morsel floor is ~1MiB of data the scan actually reads: the file-range
floor is scaled by the fraction of file columns referenced by the
projection and filter, so narrow projections produce proportionally larger
byte ranges and the fixed per-piece open cost stays amortized.
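
The floor scaling described above can be sketched as follows. This assumes the fraction is taken by column count (every column weighted equally) and that the target is exactly 1 MiB of data read; `file_range_floor` and `TARGET_READ_BYTES` are hypothetical names for illustration, not identifiers from the PR.

```rust
/// Assumed target: about 1 MiB of data actually read per morsel.
const TARGET_READ_BYTES: u64 = 1024 * 1024;

/// Hypothetical helper: smallest file byte range worth splitting off, given
/// how many of the file's columns the projection and filter reference.
/// Reading fewer columns means a wider file range holds the same amount
/// of data that is actually read.
fn file_range_floor(referenced_columns: usize, total_columns: usize) -> u64 {
    let total = total_columns.max(1) as u64;
    // Clamp so the fraction is never zero and never above one.
    let referenced = (referenced_columns as u64).clamp(1, total);
    TARGET_READ_BYTES * total / referenced
}
```

For example, a query touching 2 of 10 columns would get a 5 MiB file-range floor under these assumptions, while a scan of all columns keeps the 1 MiB floor.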

TPC-DS SF=1 (single-row-group files): 4.4% faster overall (7.53s -> 7.21s),
28/99 queries faster, up to 1.14x, no meaningful regressions. TPC-H SF=1:
6.1% faster overall (1084ms -> 1022ms), 9/22 faster, 0 slower.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@Dandandan

Contributor Author

run benchmarks

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4862279847-792-g52v6 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (95185fa) to 3bb9314 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4862279847-794-dtdwp 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (95185fa) to 3bb9314 (merge-base) diff using: tpch
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4862279847-793-fw74w 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (95185fa) to 3bb9314 (merge-base) diff using: tpcds
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃      split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 38.25 / 39.32 ±1.12 / 40.86 ms │ 39.39 / 40.51 ±2.16 / 44.82 ms │     no change │
│ QQuery 2  │ 19.31 / 19.84 ±0.63 / 21.07 ms │ 20.77 / 21.25 ±0.44 / 22.05 ms │  1.07x slower │
│ QQuery 3  │ 30.97 / 31.97 ±1.25 / 34.37 ms │ 32.72 / 34.28 ±1.12 / 35.46 ms │  1.07x slower │
│ QQuery 4  │ 17.54 / 17.96 ±0.40 / 18.71 ms │ 17.77 / 17.96 ±0.15 / 18.15 ms │     no change │
│ QQuery 5  │ 40.41 / 40.96 ±0.44 / 41.41 ms │ 41.38 / 42.49 ±1.04 / 44.16 ms │     no change │
│ QQuery 6  │ 16.34 / 16.53 ±0.17 / 16.80 ms │ 16.54 / 16.92 ±0.34 / 17.52 ms │     no change │
│ QQuery 7  │ 46.17 / 47.31 ±0.75 / 48.47 ms │ 48.18 / 49.20 ±0.88 / 50.46 ms │     no change │
│ QQuery 8  │ 43.64 / 43.77 ±0.15 / 44.04 ms │ 51.58 / 51.99 ±0.25 / 52.22 ms │  1.19x slower │
│ QQuery 9  │ 49.46 / 50.49 ±0.77 / 51.72 ms │ 58.13 / 58.61 ±0.43 / 59.39 ms │  1.16x slower │
│ QQuery 10 │ 42.05 / 42.82 ±0.91 / 44.56 ms │ 47.78 / 48.49 ±0.64 / 49.55 ms │  1.13x slower │
│ QQuery 11 │ 13.33 / 13.55 ±0.22 / 13.89 ms │ 14.31 / 14.62 ±0.35 / 15.29 ms │  1.08x slower │
│ QQuery 12 │ 24.06 / 24.89 ±0.90 / 26.59 ms │ 25.58 / 26.08 ±0.68 / 27.39 ms │     no change │
│ QQuery 13 │ 32.02 / 34.31 ±1.31 / 36.07 ms │ 30.43 / 31.64 ±1.22 / 33.76 ms │ +1.08x faster │
│ QQuery 14 │ 23.71 / 23.78 ±0.05 / 23.86 ms │ 25.63 / 25.87 ±0.28 / 26.33 ms │  1.09x slower │
│ QQuery 15 │ 30.95 / 31.80 ±0.74 / 33.12 ms │ 32.78 / 33.10 ±0.26 / 33.55 ms │     no change │
│ QQuery 16 │ 13.91 / 14.03 ±0.10 / 14.19 ms │ 14.03 / 14.29 ±0.37 / 15.01 ms │     no change │
│ QQuery 17 │ 86.71 / 87.81 ±1.10 / 89.74 ms │ 89.74 / 92.02 ±2.10 / 95.25 ms │     no change │
│ QQuery 18 │ 64.91 / 67.00 ±1.95 / 70.36 ms │ 70.69 / 72.18 ±0.86 / 73.23 ms │  1.08x slower │
│ QQuery 19 │ 32.86 / 33.40 ±0.50 / 34.25 ms │ 39.88 / 40.15 ±0.16 / 40.34 ms │  1.20x slower │
│ QQuery 20 │ 34.24 / 34.61 ±0.32 / 35.16 ms │ 36.15 / 36.62 ±0.44 / 37.33 ms │  1.06x slower │
│ QQuery 21 │ 54.99 / 56.97 ±1.62 / 59.24 ms │ 56.79 / 59.27 ±2.22 / 63.40 ms │     no change │
│ QQuery 22 │ 13.86 / 14.42 ±0.64 / 15.65 ms │ 16.62 / 17.02 ±0.45 / 17.90 ms │  1.18x slower │
└───────────┴────────────────────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                        ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 787.55ms │
│ Total Time (split-row-groups-by-range)   │ 844.57ms │
│ Average Time (HEAD)                      │  35.80ms │
│ Average Time (split-row-groups-by-range) │  38.39ms │
│ Queries Faster                           │        1 │
│ Queries Slower                           │       11 │
│ Queries with No Change                   │       10 │
│ Queries with Failure                     │        0 │
└──────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 5.0s |
| Peak memory | 1.2 GiB |
| Avg memory | 497.3 MiB |
| CPU user | 23.3s |
| CPU sys | 1.7s |
| Peak spill | 0 B |

tpch — branch

| Metric | Value |
|---|---|
| Wall time | 5.0s |
| Peak memory | 1.3 GiB |
| Avg memory | 744.5 MiB |
| CPU user | 25.7s |
| CPU sys | 1.9s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃             split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           5.56 / 6.05 ±0.84 / 7.73 ms │           5.61 / 6.12 ±0.89 / 7.89 ms │     no change │
│ QQuery 2  │        81.57 / 82.12 ±0.36 / 82.57 ms │        37.89 / 38.74 ±1.05 / 40.78 ms │ +2.12x faster │
│ QQuery 3  │        29.92 / 30.06 ±0.12 / 30.20 ms │        16.35 / 16.47 ±0.09 / 16.64 ms │ +1.82x faster │
│ QQuery 4  │     492.53 / 499.60 ±3.88 / 504.28 ms │     232.35 / 238.57 ±3.94 / 242.71 ms │ +2.09x faster │
│ QQuery 5  │        52.43 / 52.75 ±0.17 / 52.93 ms │        47.70 / 50.66 ±2.36 / 54.08 ms │     no change │
│ QQuery 6  │        36.69 / 37.34 ±0.42 / 37.94 ms │        18.58 / 19.06 ±0.38 / 19.56 ms │ +1.96x faster │
│ QQuery 7  │        94.37 / 94.94 ±0.64 / 96.16 ms │        37.77 / 39.17 ±2.08 / 43.22 ms │ +2.42x faster │
│ QQuery 8  │        37.36 / 39.66 ±3.68 / 47.00 ms │        23.66 / 23.96 ±0.30 / 24.36 ms │ +1.65x faster │
│ QQuery 9  │        56.70 / 57.67 ±0.93 / 59.05 ms │        63.15 / 64.87 ±1.36 / 66.59 ms │  1.12x slower │
│ QQuery 10 │        63.38 / 63.79 ±0.43 / 64.62 ms │        35.92 / 36.49 ±0.38 / 37.06 ms │ +1.75x faster │
│ QQuery 11 │     308.14 / 311.11 ±2.75 / 315.73 ms │     124.57 / 126.20 ±0.88 / 127.15 ms │ +2.47x faster │
│ QQuery 12 │        28.47 / 29.11 ±0.61 / 30.03 ms │        13.12 / 13.27 ±0.18 / 13.62 ms │ +2.19x faster │
│ QQuery 13 │     120.85 / 123.21 ±3.59 / 130.32 ms │        55.69 / 56.66 ±1.78 / 60.22 ms │ +2.17x faster │
│ QQuery 14 │     416.52 / 424.89 ±6.39 / 435.00 ms │     238.34 / 241.47 ±2.89 / 245.51 ms │ +1.76x faster │
│ QQuery 15 │        58.75 / 59.45 ±0.77 / 60.94 ms │        16.78 / 16.95 ±0.13 / 17.14 ms │ +3.51x faster │
│ QQuery 16 │           6.72 / 6.86 ±0.16 / 7.16 ms │           7.03 / 7.17 ±0.20 / 7.55 ms │     no change │
│ QQuery 17 │        80.77 / 81.75 ±1.07 / 83.44 ms │        44.15 / 46.05 ±2.12 / 50.03 ms │ +1.78x faster │
│ QQuery 18 │     124.11 / 124.98 ±0.86 / 126.11 ms │        52.83 / 53.54 ±0.56 / 54.18 ms │ +2.33x faster │
│ QQuery 19 │        41.41 / 41.82 ±0.45 / 42.66 ms │        24.47 / 24.83 ±0.32 / 25.28 ms │ +1.68x faster │
│ QQuery 20 │        36.13 / 37.09 ±0.84 / 38.55 ms │        15.76 / 15.97 ±0.16 / 16.25 ms │ +2.32x faster │
│ QQuery 21 │        17.73 / 18.03 ±0.25 / 18.46 ms │        18.70 / 18.91 ±0.18 / 19.15 ms │     no change │
│ QQuery 22 │        62.60 / 63.18 ±0.53 / 64.08 ms │        60.32 / 61.30 ±1.27 / 63.78 ms │     no change │
│ QQuery 23 │     358.31 / 363.14 ±5.98 / 374.32 ms │     225.12 / 229.42 ±2.89 / 233.60 ms │ +1.58x faster │
│ QQuery 24 │     229.18 / 231.27 ±1.95 / 234.55 ms │     133.87 / 136.34 ±1.70 / 138.84 ms │ +1.70x faster │
│ QQuery 25 │     111.18 / 112.11 ±0.72 / 113.23 ms │        54.12 / 55.96 ±1.30 / 57.83 ms │ +2.00x faster │
│ QQuery 26 │        58.63 / 59.51 ±0.91 / 61.07 ms │        25.42 / 25.51 ±0.10 / 25.64 ms │ +2.33x faster │
│ QQuery 27 │           6.42 / 6.55 ±0.13 / 6.79 ms │           6.45 / 6.55 ±0.17 / 6.88 ms │     no change │
│ QQuery 28 │        61.63 / 62.51 ±0.92 / 64.27 ms │        55.33 / 56.46 ±0.63 / 57.09 ms │ +1.11x faster │
│ QQuery 29 │      98.09 / 102.44 ±5.44 / 112.87 ms │        47.55 / 49.00 ±1.60 / 51.95 ms │ +2.09x faster │
│ QQuery 30 │        32.71 / 32.93 ±0.22 / 33.25 ms │        25.87 / 26.93 ±1.18 / 28.88 ms │ +1.22x faster │
│ QQuery 31 │     112.14 / 113.07 ±0.94 / 114.28 ms │        72.41 / 72.75 ±0.40 / 73.51 ms │ +1.55x faster │
│ QQuery 32 │        20.36 / 22.04 ±2.34 / 26.67 ms │        15.29 / 15.43 ±0.11 / 15.57 ms │ +1.43x faster │
│ QQuery 33 │        38.59 / 39.85 ±1.67 / 43.15 ms │        29.27 / 31.37 ±1.11 / 32.15 ms │ +1.27x faster │
│ QQuery 34 │         9.90 / 10.28 ±0.26 / 10.57 ms │          8.76 / 9.36 ±0.50 / 10.08 ms │ +1.10x faster │
│ QQuery 35 │        73.19 / 73.91 ±0.69 / 75.17 ms │        39.45 / 39.89 ±0.52 / 40.81 ms │ +1.85x faster │
│ QQuery 36 │           5.95 / 6.11 ±0.16 / 6.40 ms │           6.00 / 6.17 ±0.16 / 6.49 ms │     no change │
│ QQuery 37 │           7.17 / 7.24 ±0.08 / 7.37 ms │           7.18 / 7.34 ±0.09 / 7.44 ms │     no change │
│ QQuery 38 │        63.08 / 64.54 ±2.18 / 68.87 ms │        28.84 / 29.51 ±0.70 / 30.86 ms │ +2.19x faster │
│ QQuery 39 │        86.63 / 87.30 ±0.68 / 88.50 ms │        84.16 / 84.69 ±0.51 / 85.56 ms │     no change │
│ QQuery 40 │        23.40 / 23.83 ±0.25 / 24.06 ms │        16.42 / 16.53 ±0.10 / 16.70 ms │ +1.44x faster │
│ QQuery 41 │        11.66 / 11.81 ±0.18 / 12.15 ms │        11.45 / 11.61 ±0.24 / 12.09 ms │     no change │
│ QQuery 42 │        23.90 / 24.42 ±0.41 / 24.95 ms │        14.24 / 15.43 ±1.50 / 18.27 ms │ +1.58x faster │
│ QQuery 43 │           4.85 / 4.99 ±0.15 / 5.24 ms │           4.76 / 4.92 ±0.14 / 5.18 ms │     no change │
│ QQuery 44 │           9.32 / 9.38 ±0.06 / 9.49 ms │           9.22 / 9.39 ±0.10 / 9.52 ms │     no change │
│ QQuery 45 │        38.35 / 40.01 ±1.77 / 43.30 ms │        14.59 / 15.14 ±0.43 / 15.70 ms │ +2.64x faster │
│ QQuery 46 │        11.87 / 12.17 ±0.23 / 12.48 ms │        11.66 / 12.11 ±0.33 / 12.48 ms │     no change │
│ QQuery 47 │     234.65 / 241.32 ±4.74 / 248.69 ms │     101.44 / 103.19 ±1.63 / 106.02 ms │ +2.34x faster │
│ QQuery 48 │        98.10 / 98.65 ±0.47 / 99.51 ms │        41.07 / 41.43 ±0.28 / 41.88 ms │ +2.38x faster │
│ QQuery 49 │        78.66 / 79.91 ±0.71 / 80.66 ms │        66.51 / 67.28 ±0.52 / 67.96 ms │ +1.19x faster │
│ QQuery 50 │        59.39 / 59.78 ±0.25 / 60.17 ms │        33.91 / 35.34 ±1.48 / 37.54 ms │ +1.69x faster │
│ QQuery 51 │      98.75 / 101.18 ±2.60 / 106.14 ms │        66.41 / 68.87 ±1.99 / 72.19 ms │ +1.47x faster │
│ QQuery 52 │        24.63 / 25.23 ±0.52 / 25.93 ms │        14.90 / 15.34 ±0.50 / 16.16 ms │ +1.64x faster │
│ QQuery 53 │        30.65 / 31.21 ±0.71 / 32.48 ms │        14.56 / 14.81 ±0.27 / 15.29 ms │ +2.11x faster │
│ QQuery 54 │        56.29 / 56.78 ±0.26 / 57.02 ms │        33.54 / 35.04 ±2.52 / 40.04 ms │ +1.62x faster │
│ QQuery 55 │        23.72 / 24.03 ±0.19 / 24.24 ms │        14.09 / 14.26 ±0.08 / 14.33 ms │ +1.69x faster │
│ QQuery 56 │        39.51 / 39.91 ±0.30 / 40.34 ms │        32.32 / 34.49 ±1.60 / 37.00 ms │ +1.16x faster │
│ QQuery 57 │     179.40 / 181.11 ±1.48 / 183.53 ms │        61.13 / 61.47 ±0.33 / 62.03 ms │ +2.95x faster │
│ QQuery 58 │     115.89 / 117.19 ±0.97 / 118.52 ms │        51.72 / 52.51 ±0.98 / 54.25 ms │ +2.23x faster │
│ QQuery 59 │     118.64 / 118.98 ±0.40 / 119.73 ms │        44.08 / 45.51 ±2.42 / 50.31 ms │ +2.61x faster │
│ QQuery 60 │        40.37 / 41.82 ±1.51 / 44.58 ms │        31.98 / 33.63 ±1.16 / 35.42 ms │ +1.24x faster │
│ QQuery 61 │        12.57 / 12.77 ±0.19 / 13.10 ms │        12.60 / 12.74 ±0.13 / 12.95 ms │     no change │
│ QQuery 62 │        46.38 / 46.72 ±0.21 / 46.98 ms │        13.30 / 13.43 ±0.09 / 13.53 ms │ +3.48x faster │
│ QQuery 63 │        30.09 / 30.30 ±0.18 / 30.63 ms │        14.48 / 14.76 ±0.15 / 14.88 ms │ +2.05x faster │
│ QQuery 64 │     409.41 / 413.73 ±2.54 / 417.19 ms │     240.44 / 244.53 ±2.91 / 248.19 ms │ +1.69x faster │
│ QQuery 65 │     149.87 / 152.76 ±3.38 / 158.84 ms │     151.99 / 157.91 ±4.17 / 164.79 ms │     no change │
│ QQuery 66 │        78.85 / 79.72 ±0.67 / 80.72 ms │        53.35 / 55.47 ±1.57 / 58.12 ms │ +1.44x faster │
│ QQuery 67 │     244.95 / 250.37 ±3.63 / 255.52 ms │     106.22 / 107.75 ±1.31 / 109.48 ms │ +2.32x faster │
│ QQuery 68 │        12.32 / 12.45 ±0.10 / 12.64 ms │        11.95 / 12.62 ±0.42 / 13.14 ms │     no change │
│ QQuery 69 │        58.26 / 59.04 ±0.87 / 60.65 ms │        33.53 / 33.88 ±0.42 / 34.61 ms │ +1.74x faster │
│ QQuery 70 │     105.70 / 107.07 ±1.41 / 109.56 ms │        69.56 / 71.61 ±1.82 / 74.76 ms │ +1.50x faster │
│ QQuery 71 │        36.25 / 39.23 ±2.86 / 42.98 ms │        28.75 / 30.49 ±1.52 / 32.65 ms │ +1.29x faster │
│ QQuery 72 │ 2163.56 / 2230.81 ±56.64 / 2327.80 ms │ 1966.99 / 2051.64 ±63.16 / 2157.87 ms │ +1.09x faster │
│ QQuery 73 │          9.65 / 9.86 ±0.21 / 10.23 ms │          8.40 / 9.60 ±0.76 / 10.38 ms │     no change │
│ QQuery 74 │     172.47 / 177.68 ±7.91 / 193.43 ms │        78.48 / 82.45 ±3.07 / 87.81 ms │ +2.16x faster │
│ QQuery 75 │     151.30 / 152.04 ±0.75 / 153.46 ms │     112.30 / 114.72 ±2.08 / 118.23 ms │ +1.33x faster │
│ QQuery 76 │        35.91 / 36.91 ±0.95 / 38.65 ms │        27.00 / 29.58 ±1.95 / 31.64 ms │ +1.25x faster │
│ QQuery 77 │        61.67 / 62.29 ±0.43 / 62.99 ms │        51.32 / 54.14 ±3.63 / 61.27 ms │ +1.15x faster │
│ QQuery 78 │     185.51 / 189.68 ±2.91 / 193.86 ms │        82.17 / 83.98 ±1.58 / 86.65 ms │ +2.26x faster │
│ QQuery 79 │        67.78 / 68.06 ±0.27 / 68.51 ms │        32.67 / 34.83 ±3.32 / 41.46 ms │ +1.95x faster │
│ QQuery 80 │      98.88 / 102.84 ±3.80 / 109.16 ms │        80.20 / 84.20 ±2.46 / 87.29 ms │ +1.22x faster │
│ QQuery 81 │        26.01 / 26.37 ±0.20 / 26.57 ms │        24.39 / 24.74 ±0.44 / 25.58 ms │ +1.07x faster │
│ QQuery 82 │        16.61 / 16.94 ±0.25 / 17.32 ms │        14.95 / 16.27 ±1.90 / 20.05 ms │     no change │
│ QQuery 83 │        40.80 / 41.19 ±0.30 / 41.71 ms │        29.83 / 30.57 ±0.48 / 31.27 ms │ +1.35x faster │
│ QQuery 84 │        30.12 / 30.32 ±0.15 / 30.56 ms │        20.81 / 21.57 ±0.72 / 22.46 ms │ +1.41x faster │
│ QQuery 85 │     109.36 / 111.81 ±2.41 / 115.94 ms │        49.67 / 52.05 ±3.58 / 59.18 ms │ +2.15x faster │
│ QQuery 86 │        25.60 / 25.87 ±0.22 / 26.19 ms │        10.45 / 10.67 ±0.20 / 10.96 ms │ +2.43x faster │
│ QQuery 87 │        62.96 / 63.91 ±0.71 / 65.02 ms │        28.39 / 30.23 ±2.10 / 34.29 ms │ +2.11x faster │
│ QQuery 88 │        64.53 / 65.86 ±1.79 / 69.27 ms │        66.95 / 68.40 ±2.02 / 72.30 ms │     no change │
│ QQuery 89 │        36.83 / 37.42 ±0.59 / 38.35 ms │        17.55 / 17.85 ±0.21 / 18.15 ms │ +2.10x faster │
│ QQuery 90 │        17.43 / 17.58 ±0.10 / 17.71 ms │        11.26 / 11.36 ±0.08 / 11.47 ms │ +1.55x faster │
│ QQuery 91 │        46.48 / 47.06 ±0.49 / 47.90 ms │        27.63 / 28.31 ±0.56 / 28.91 ms │ +1.66x faster │
│ QQuery 92 │        29.59 / 29.95 ±0.31 / 30.44 ms │        15.60 / 16.05 ±0.24 / 16.30 ms │ +1.87x faster │
│ QQuery 93 │        49.34 / 50.64 ±0.70 / 51.37 ms │        24.89 / 25.64 ±0.38 / 25.85 ms │ +1.98x faster │
│ QQuery 94 │        38.75 / 40.23 ±1.28 / 42.44 ms │        20.81 / 21.08 ±0.31 / 21.65 ms │ +1.91x faster │
│ QQuery 95 │        81.57 / 82.38 ±0.89 / 83.59 ms │        47.26 / 48.96 ±2.33 / 53.58 ms │ +1.68x faster │
│ QQuery 96 │        24.25 / 24.50 ±0.29 / 25.06 ms │        10.60 / 10.81 ±0.14 / 10.99 ms │ +2.27x faster │
│ QQuery 97 │        54.74 / 56.52 ±1.60 / 59.53 ms │        21.61 / 22.65 ±0.96 / 24.14 ms │ +2.50x faster │
│ QQuery 98 │        41.92 / 43.38 ±0.80 / 44.25 ms │        21.79 / 22.19 ±0.33 / 22.67 ms │ +1.96x faster │
│ QQuery 99 │        70.79 / 71.00 ±0.19 / 71.26 ms │        17.69 / 18.30 ±0.97 / 20.22 ms │ +3.88x faster │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 10141.21ms │
│ Total Time (split-row-groups-by-range)   │  6605.41ms │
│ Average Time (HEAD)                      │   102.44ms │
│ Average Time (split-row-groups-by-range) │    66.72ms │
│ Queries Faster                           │         79 │
│ Queries Slower                           │          1 │
│ Queries with No Change                   │         19 │
│ Queries with Failure                     │          0 │
└──────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 55.0s |
| Peak memory | 2.4 GiB |
| Avg memory | 1.6 GiB |
| CPU user | 228.2s |
| CPU sys | 6.0s |
| Peak spill | 0 B |

tpcds — branch

| Metric | Value |
|---|---|
| Wall time | 35.0s |
| Peak memory | 3.4 GiB |
| Avg memory | 2.2 GiB |
| CPU user | 257.4s |
| CPU sys | 8.3s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                  split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.20 / 4.05 ±5.55 / 15.16 ms │               1.21 / 4.01 ±5.47 / 14.96 ms │     no change │
│ QQuery 1  │             12.74 / 13.08 ±0.23 / 13.31 ms │             12.73 / 13.04 ±0.18 / 13.24 ms │     no change │
│ QQuery 2  │             35.84 / 36.35 ±0.42 / 36.92 ms │             36.16 / 36.39 ±0.24 / 36.82 ms │     no change │
│ QQuery 3  │             30.98 / 32.03 ±1.75 / 35.53 ms │             31.10 / 31.78 ±0.73 / 33.12 ms │     no change │
│ QQuery 4  │      1622.47 / 1687.55 ±51.18 / 1758.87 ms │      1711.47 / 1749.50 ±26.29 / 1782.73 ms │     no change │
│ QQuery 5  │     1625.84 / 1778.23 ±154.61 / 2042.21 ms │     1798.25 / 1982.75 ±138.88 / 2196.68 ms │  1.12x slower │
│ QQuery 6  │                1.30 / 1.46 ±0.24 / 1.94 ms │                1.26 / 1.42 ±0.25 / 1.91 ms │     no change │
│ QQuery 7  │             14.41 / 14.57 ±0.10 / 14.70 ms │             14.02 / 14.22 ±0.12 / 14.37 ms │     no change │
│ QQuery 8  │      2056.43 / 2110.97 ±55.39 / 2209.90 ms │      2039.48 / 2103.40 ±83.24 / 2266.79 ms │     no change │
│ QQuery 9  │         512.69 / 525.52 ±14.22 / 553.26 ms │         484.68 / 542.52 ±43.44 / 610.81 ms │     no change │
│ QQuery 10 │             76.43 / 79.42 ±1.92 / 81.42 ms │             78.25 / 79.84 ±1.57 / 82.02 ms │     no change │
│ QQuery 11 │             91.92 / 93.85 ±1.85 / 97.33 ms │          89.46 / 104.11 ±27.14 / 158.35 ms │  1.11x slower │
│ QQuery 12 │     1605.50 / 1791.41 ±109.77 / 1942.45 ms │     1706.68 / 1963.07 ±227.42 / 2246.63 ms │  1.10x slower │
│ QQuery 13 │         559.51 / 664.33 ±96.84 / 842.81 ms │        430.02 / 566.47 ±162.31 / 822.96 ms │ +1.17x faster │
│ QQuery 14 │         546.11 / 568.84 ±20.36 / 604.56 ms │          558.52 / 571.42 ±7.28 / 579.15 ms │     no change │
│ QQuery 15 │      1863.57 / 1929.98 ±53.40 / 2001.31 ms │      1928.91 / 2014.39 ±55.23 / 2079.41 ms │     no change │
│ QQuery 16 │     4131.34 / 4291.63 ±110.47 / 4439.07 ms │      4169.19 / 4275.75 ±81.02 / 4379.79 ms │     no change │
│ QQuery 17 │     4201.40 / 4394.02 ±151.87 / 4604.08 ms │      4168.44 / 4262.97 ±50.90 / 4310.90 ms │     no change │
│ QQuery 18 │  17683.52 / 18194.60 ±377.96 / 18731.28 ms │  17250.85 / 18528.10 ±702.71 / 19071.41 ms │     no change │
│ QQuery 19 │             28.26 / 34.69 ±5.83 / 42.47 ms │             28.68 / 29.96 ±1.68 / 33.24 ms │ +1.16x faster │
│ QQuery 20 │         514.99 / 523.74 ±11.32 / 544.58 ms │          517.48 / 523.58 ±4.25 / 530.35 ms │     no change │
│ QQuery 21 │          508.89 / 519.95 ±7.74 / 531.82 ms │          524.98 / 529.20 ±3.94 / 535.61 ms │     no change │
│ QQuery 22 │         987.48 / 996.39 ±5.31 / 1002.49 ms │          969.96 / 980.56 ±8.93 / 994.08 ms │     no change │
│ QQuery 23 │      3018.49 / 3064.33 ±29.51 / 3105.13 ms │      3174.33 / 3196.09 ±14.69 / 3210.08 ms │     no change │
│ QQuery 24 │             41.96 / 44.82 ±4.97 / 54.71 ms │             41.69 / 46.61 ±8.70 / 63.99 ms │     no change │
│ QQuery 25 │          111.51 / 115.38 ±4.37 / 123.75 ms │          113.15 / 116.02 ±3.09 / 121.91 ms │     no change │
│ QQuery 26 │             42.46 / 43.50 ±1.40 / 46.22 ms │             42.67 / 45.82 ±3.89 / 52.67 ms │  1.05x slower │
│ QQuery 27 │          663.42 / 677.12 ±9.41 / 688.08 ms │          677.51 / 684.35 ±5.74 / 694.33 ms │     no change │
│ QQuery 28 │      3518.03 / 3572.66 ±56.31 / 3659.61 ms │      3453.25 / 3531.10 ±46.81 / 3570.97 ms │     no change │
│ QQuery 29 │             40.62 / 40.85 ±0.17 / 41.08 ms │             41.19 / 47.97 ±8.88 / 63.74 ms │  1.17x slower │
│ QQuery 30 │         560.92 / 571.71 ±13.12 / 596.58 ms │         568.64 / 588.24 ±16.43 / 618.40 ms │     no change │
│ QQuery 31 │         291.74 / 309.36 ±11.04 / 320.55 ms │         296.80 / 308.88 ±11.75 / 330.78 ms │     no change │
│ QQuery 32 │        954.05 / 999.76 ±24.71 / 1028.74 ms │      1029.93 / 1079.02 ±35.13 / 1124.98 ms │  1.08x slower │
│ QQuery 33 │ 26481.00 / 28172.02 ±1020.13 / 29616.16 ms │ 25406.63 / 30902.41 ±4430.89 / 37459.15 ms │  1.10x slower │
│ QQuery 34 │ 26203.36 / 27329.54 ±1205.32 / 29267.27 ms │ 27747.16 / 29173.84 ±1346.46 / 31636.82 ms │  1.07x slower │
│ QQuery 35 │      978.73 / 1094.58 ±102.04 / 1283.31 ms │     1097.27 / 1201.71 ±155.20 / 1507.47 ms │  1.10x slower │
│ QQuery 36 │          158.32 / 170.69 ±9.45 / 185.10 ms │          168.88 / 174.58 ±5.11 / 183.75 ms │     no change │
│ QQuery 37 │             38.43 / 46.23 ±8.65 / 62.99 ms │           37.53 / 55.12 ±28.11 / 111.21 ms │  1.19x slower │
│ QQuery 38 │             41.80 / 43.35 ±1.08 / 45.04 ms │             40.66 / 43.95 ±2.23 / 46.74 ms │     no change │
│ QQuery 39 │          183.61 / 196.77 ±9.23 / 211.96 ms │         184.13 / 198.25 ±14.61 / 226.00 ms │     no change │
│ QQuery 40 │             14.70 / 15.53 ±0.76 / 16.73 ms │             14.57 / 14.90 ±0.28 / 15.34 ms │     no change │
│ QQuery 41 │             14.23 / 14.32 ±0.06 / 14.42 ms │             14.08 / 14.21 ±0.09 / 14.34 ms │     no change │
│ QQuery 42 │             13.57 / 15.49 ±3.36 / 22.21 ms │             13.46 / 15.00 ±2.68 / 20.36 ms │     no change │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 106824.67ms │
│ Total Time (split-row-groups-by-range)   │ 112346.51ms │
│ Average Time (HEAD)                      │   2484.29ms │
│ Average Time (split-row-groups-by-range) │   2612.71ms │
│ Queries Faster                           │           2 │
│ Queries Slower                           │          10 │
│ Queries with No Change                   │          31 │
│ Queries with Failure                     │           0 │
└──────────────────────────────────────────┴─────────────┘
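
For a quick cross-benchmark read of the three summary tables in this thread, the totals can be reduced to a single HEAD/branch ratio. The sketch below is only illustrative: `speedup`, `TOTALS`, and `report` are hypothetical names and not part of the benchmark runner; the numbers are copied from the "Total Time" rows above.

```rust
/// Ratio of the HEAD total to the branch total.
/// Values above 1.0 mean the branch is faster, below 1.0 slower.
fn speedup(head_ms: f64, branch_ms: f64) -> f64 {
    head_ms / branch_ms
}

/// (benchmark, Total Time (HEAD), Total Time (split-row-groups-by-range)) in ms,
/// copied from the summary tables in this thread.
const TOTALS: [(&str, f64, f64); 3] = [
    ("tpch_sf1", 787.55, 844.57),
    ("tpcds_sf1", 10141.21, 6605.41),
    ("clickbench_partitioned", 106824.67, 112346.51),
];

/// One formatted line per benchmark, e.g. "<name>: <ratio>x".
fn report() -> Vec<String> {
    TOTALS
        .iter()
        .map(|(name, head, branch)| format!("{name}: {:.2}x", speedup(*head, *branch)))
        .collect()
}
```

By this measure tpcds_sf1 comes out at roughly 1.5x, while tpch_sf1 and clickbench_partitioned land slightly below 1.0, which matches the "Queries Slower" counts in their tables.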

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 535.1s |
| Peak memory | 11.8 GiB |
| Avg memory | 6.6 GiB |
| CPU user | 4874.3s |
| CPU sys | 316.8s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 565.1s |
| Peak memory | 11.8 GiB |
| Avg memory | 6.5 GiB |
| CPU user | 4918.4s |
| CPU sys | 332.3s |
| Peak spill | 0 B |


…lits

Pieces of a byte-range-split file now share a per-file SharedFileState
(attached when the shared work queue is built): the first piece to finish
opening publishes its parsed metadata, specialized predicate/projection,
and the statistics/bloom/page-index pruned access plan; pieces that start
later skip all of that work and its I/O, going straight from file-level
pruning to building their decoder. Sharing is optimistic (no waiting):
pieces that start before anything is published open the file themselves,
exactly as before.
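
A minimal sketch of the optimistic, non-waiting publish described above, assuming a write-once slot per file. The field layout, the `OpenedFile` type, and `get_or_open` are hypothetical and chosen for illustration; the actual `SharedFileState` in this PR may be structured differently.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for what the first piece publishes
/// (parsed metadata, specialized predicate/projection, pruned access plan).
#[derive(Debug, PartialEq)]
struct OpenedFile {
    num_row_groups: usize,
}

/// Hypothetical per-file slot shared by all pieces of a byte-range-split file.
#[derive(Default)]
struct SharedFileState {
    opened: OnceLock<Arc<OpenedFile>>,
}

impl SharedFileState {
    /// Reuse the published state if there is one; otherwise open the file
    /// locally and try to publish the result. A piece never waits for
    /// another piece to finish opening.
    fn get_or_open(&self, open: impl FnOnce() -> OpenedFile) -> Arc<OpenedFile> {
        if let Some(published) = self.opened.get() {
            return Arc::clone(published);
        }
        let mine = Arc::new(open());
        // If another piece published in the meantime, `set` fails and this
        // piece simply keeps the copy it opened itself.
        let _ = self.opened.set(Arc::clone(&mine));
        mine
    }
}
```

With this shape, a piece that starts after the first publish skips `open` entirely, and pieces that race before any publish each pay for their own open, as before.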

To make the published plan reusable, the byte-range restriction moves from
row-group pruning time to stream-build time for shared files. That also
happens after the page index is loaded, so split boundaries now snap to
the page boundaries of each row group's largest column chunk: adjacent
pieces no longer both decode the page straddling their boundary.
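
The boundary snapping can be sketched as follows, under stated assumptions: the proportional row boundary is computed with integer arithmetic from the byte offset, then moved down to the first row of the page that contains it. All function names are hypothetical, `page_first_rows` stands for the first-row indexes of the largest column chunk's pages as read from the offset index, and snapping down (rather than to the nearest page) is an assumption of this sketch, not something the commit states.

```rust
use std::ops::Range;

/// Row index proportional to where `offset` falls inside the row group's bytes.
fn proportional_row(offset: u64, rg_start: u64, rg_len: u64, num_rows: u64) -> u64 {
    if rg_len == 0 {
        return 0;
    }
    let rel = offset.clamp(rg_start, rg_start + rg_len) - rg_start;
    // u128 keeps the multiplication from overflowing.
    ((rel as u128 * num_rows as u128) / rg_len as u128) as u64
}

/// Move `row` down to the first row of the page containing it.
/// `page_first_rows` must be sorted ascending; if it is empty, `row` is returned as is.
fn snap_to_page_start(row: u64, num_rows: u64, page_first_rows: &[u64]) -> u64 {
    if row >= num_rows {
        return num_rows;
    }
    let pages_at_or_before = page_first_rows.partition_point(|&first| first <= row);
    match pages_at_or_before.checked_sub(1) {
        Some(i) => page_first_rows[i],
        None => row,
    }
}

/// Rows of one row group assigned to a piece covering `byte_range`.
/// Both ends go through the same function, so adjacent pieces tile the row
/// group without overlap and without both decoding the straddling page.
fn rows_for_range(
    byte_range: Range<u64>,
    rg_start: u64,
    rg_len: u64,
    num_rows: u64,
    page_first_rows: &[u64],
) -> Range<u64> {
    let bound = |offset| {
        let row = proportional_row(offset, rg_start, rg_len, num_rows);
        snap_to_page_start(row, num_rows, page_first_rows)
    };
    bound(byte_range.start)..bound(byte_range.end)
}
```

For a row group of 1000 rows at bytes 100..1100 with pages starting at rows 0, 300, 600, and 900, the byte ranges 0..600 and 600..1200 map to rows 0..300 and 300..1000: the unsnapped boundary at row 500 moves to the page start at row 300, so only the second piece decodes that page.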

Tail morsel splitting is unchanged: unconditional ~1MiB-projected morsels
were re-benchmarked with the shared open state and are still slower
(TPC-H SF=1 -11%: residual per-piece decoder/filter setup outweighs the
balance gain), so morsels remain a tail-only mechanism.

TPC-DS SF=1 single-row-group: 42/99 queries 5-13% faster, total neutral
(dominated by one high-variance join query). TPC-H SF=1: 1.3% faster.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@Dandandan

Contributor Author

run benchmarks

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4864934104-803-bbpk6 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux


Comparing split-row-groups-by-range (46e73b2) to 3bb9314 (merge-base) diff using: tpcds
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4864934104-804-shzjs 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing split-row-groups-by-range (46e73b2) to 3bb9314 (merge-base) diff using: tpch
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4864934104-802-cdzzj 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing split-row-groups-by-range (46e73b2) to 3bb9314 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃      split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 38.34 / 40.11 ±1.92 / 43.00 ms │ 40.07 / 40.88 ±1.09 / 43.03 ms │     no change │
│ QQuery 2  │ 19.35 / 19.74 ±0.55 / 20.80 ms │ 18.80 / 19.47 ±0.61 / 20.32 ms │     no change │
│ QQuery 3  │ 30.76 / 31.66 ±1.10 / 33.79 ms │ 30.86 / 32.49 ±1.10 / 33.85 ms │     no change │
│ QQuery 4  │ 17.63 / 18.31 ±0.61 / 19.11 ms │ 17.56 / 18.20 ±0.50 / 18.64 ms │     no change │
│ QQuery 5  │ 40.42 / 41.09 ±0.53 / 41.83 ms │ 38.82 / 40.44 ±1.14 / 42.08 ms │     no change │
│ QQuery 6  │ 16.32 / 16.49 ±0.09 / 16.56 ms │ 18.37 / 18.56 ±0.10 / 18.67 ms │  1.13x slower │
│ QQuery 7  │ 45.85 / 47.35 ±0.85 / 48.21 ms │ 46.58 / 49.56 ±3.83 / 57.09 ms │     no change │
│ QQuery 8  │ 43.35 / 43.78 ±0.48 / 44.70 ms │ 49.62 / 49.89 ±0.32 / 50.40 ms │  1.14x slower │
│ QQuery 9  │ 49.63 / 50.49 ±0.51 / 51.00 ms │ 55.91 / 56.37 ±0.48 / 57.28 ms │  1.12x slower │
│ QQuery 10 │ 42.08 / 42.58 ±0.27 / 42.84 ms │ 42.42 / 43.18 ±0.56 / 43.97 ms │     no change │
│ QQuery 11 │ 13.59 / 13.70 ±0.15 / 14.00 ms │ 13.89 / 14.14 ±0.23 / 14.57 ms │     no change │
│ QQuery 12 │ 23.91 / 24.86 ±0.98 / 26.73 ms │ 25.56 / 26.50 ±1.18 / 28.75 ms │  1.07x slower │
│ QQuery 13 │ 32.12 / 34.29 ±1.45 / 35.90 ms │ 27.79 / 29.91 ±1.75 / 32.12 ms │ +1.15x faster │
│ QQuery 14 │ 23.82 / 24.14 ±0.36 / 24.84 ms │ 25.02 / 25.67 ±0.82 / 27.27 ms │  1.06x slower │
│ QQuery 15 │ 31.37 / 31.58 ±0.26 / 32.09 ms │ 34.58 / 34.95 ±0.58 / 36.10 ms │  1.11x slower │
│ QQuery 16 │ 14.06 / 14.36 ±0.20 / 14.57 ms │ 12.59 / 12.87 ±0.22 / 13.26 ms │ +1.12x faster │
│ QQuery 17 │ 87.64 / 88.37 ±0.49 / 89.16 ms │ 89.82 / 90.25 ±0.26 / 90.56 ms │     no change │
│ QQuery 18 │ 66.63 / 67.87 ±1.32 / 70.06 ms │ 65.92 / 69.02 ±1.95 / 71.74 ms │     no change │
│ QQuery 19 │ 32.87 / 33.18 ±0.44 / 34.03 ms │ 40.57 / 40.76 ±0.17 / 41.04 ms │  1.23x slower │
│ QQuery 20 │ 34.31 / 34.53 ±0.14 / 34.71 ms │ 34.14 / 34.66 ±0.58 / 35.54 ms │     no change │
│ QQuery 21 │ 56.05 / 58.19 ±1.95 / 61.85 ms │ 57.35 / 58.33 ±1.16 / 60.59 ms │     no change │
│ QQuery 22 │ 13.99 / 14.18 ±0.18 / 14.41 ms │ 12.89 / 13.06 ±0.24 / 13.54 ms │ +1.09x faster │
└───────────┴────────────────────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                        ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 790.87ms │
│ Total Time (split-row-groups-by-range)   │ 819.18ms │
│ Average Time (HEAD)                      │  35.95ms │
│ Average Time (split-row-groups-by-range) │  37.24ms │
│ Queries Faster                           │        3 │
│ Queries Slower                           │        7 │
│ Queries with No Change                   │       12 │
│ Queries with Failure                     │        0 │
└──────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

Metric Value
Wall time 5.0s
Peak memory 1.1 GiB
Avg memory 503.7 MiB
CPU user 23.4s
CPU sys 1.7s
Peak spill 0 B

tpch — branch

Metric Value
Wall time 5.0s
Peak memory 1.3 GiB
Avg memory 734.1 MiB
CPU user 26.0s
CPU sys 1.8s
Peak spill 0 B

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃             split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           5.83 / 6.27 ±0.81 / 7.90 ms │           6.03 / 6.50 ±0.80 / 8.09 ms │     no change │
│ QQuery 2  │        82.44 / 82.84 ±0.34 / 83.33 ms │        39.20 / 40.13 ±0.91 / 41.83 ms │ +2.06x faster │
│ QQuery 3  │        30.28 / 30.64 ±0.25 / 30.93 ms │        16.68 / 16.75 ±0.07 / 16.87 ms │ +1.83x faster │
│ QQuery 4  │     519.07 / 522.32 ±2.23 / 524.53 ms │     231.53 / 235.66 ±4.77 / 243.01 ms │ +2.22x faster │
│ QQuery 5  │        53.11 / 53.40 ±0.21 / 53.67 ms │        46.60 / 47.87 ±1.20 / 50.13 ms │ +1.12x faster │
│ QQuery 6  │        37.77 / 38.03 ±0.23 / 38.38 ms │        18.92 / 19.86 ±1.27 / 22.37 ms │ +1.91x faster │
│ QQuery 7  │       95.43 / 97.91 ±2.69 / 101.73 ms │        38.66 / 39.88 ±1.90 / 43.64 ms │ +2.45x faster │
│ QQuery 8  │        37.71 / 38.22 ±0.44 / 38.98 ms │        23.98 / 24.30 ±0.17 / 24.49 ms │ +1.57x faster │
│ QQuery 9  │        55.69 / 57.65 ±1.42 / 59.37 ms │        58.96 / 60.43 ±1.19 / 62.31 ms │     no change │
│ QQuery 10 │        63.84 / 64.37 ±0.35 / 64.80 ms │        35.04 / 36.20 ±1.27 / 38.59 ms │ +1.78x faster │
│ QQuery 11 │     332.72 / 334.25 ±1.90 / 337.89 ms │     122.68 / 128.38 ±5.26 / 134.97 ms │ +2.60x faster │
│ QQuery 12 │        29.79 / 30.03 ±0.22 / 30.32 ms │        13.82 / 13.96 ±0.15 / 14.24 ms │ +2.15x faster │
│ QQuery 13 │     121.54 / 123.22 ±1.20 / 124.95 ms │        55.79 / 56.30 ±0.51 / 57.26 ms │ +2.19x faster │
│ QQuery 14 │    419.32 / 430.08 ±14.61 / 458.54 ms │     240.91 / 244.95 ±2.46 / 247.82 ms │ +1.76x faster │
│ QQuery 15 │        62.26 / 62.85 ±0.57 / 63.83 ms │        18.07 / 18.40 ±0.25 / 18.68 ms │ +3.42x faster │
│ QQuery 16 │           7.25 / 7.34 ±0.07 / 7.46 ms │           7.31 / 7.42 ±0.10 / 7.60 ms │     no change │
│ QQuery 17 │        82.07 / 84.55 ±3.63 / 91.75 ms │        45.54 / 47.76 ±2.29 / 52.19 ms │ +1.77x faster │
│ QQuery 18 │     125.52 / 126.37 ±0.68 / 127.21 ms │        52.03 / 53.26 ±1.75 / 56.56 ms │ +2.37x faster │
│ QQuery 19 │        42.75 / 43.09 ±0.28 / 43.49 ms │        25.05 / 25.30 ±0.15 / 25.49 ms │ +1.70x faster │
│ QQuery 20 │        36.80 / 37.47 ±0.45 / 37.94 ms │        16.49 / 17.03 ±0.94 / 18.92 ms │ +2.20x faster │
│ QQuery 21 │        18.25 / 18.54 ±0.33 / 19.14 ms │        19.26 / 19.57 ±0.18 / 19.78 ms │  1.06x slower │
│ QQuery 22 │        65.93 / 66.72 ±0.52 / 67.56 ms │        63.79 / 65.49 ±1.53 / 67.92 ms │     no change │
│ QQuery 23 │     373.27 / 377.70 ±4.10 / 383.99 ms │     223.51 / 225.13 ±1.05 / 226.78 ms │ +1.68x faster │
│ QQuery 24 │     230.70 / 235.91 ±5.91 / 247.40 ms │     133.48 / 137.39 ±3.36 / 142.52 ms │ +1.72x faster │
│ QQuery 25 │     114.75 / 116.38 ±1.21 / 118.25 ms │        56.23 / 58.95 ±3.01 / 64.78 ms │ +1.97x faster │
│ QQuery 26 │        58.96 / 59.22 ±0.19 / 59.51 ms │        25.79 / 26.04 ±0.20 / 26.41 ms │ +2.27x faster │
│ QQuery 27 │           6.81 / 6.88 ±0.08 / 7.03 ms │           6.87 / 6.99 ±0.09 / 7.13 ms │     no change │
│ QQuery 28 │        58.58 / 60.59 ±2.39 / 64.50 ms │        55.22 / 57.67 ±2.21 / 61.07 ms │     no change │
│ QQuery 29 │      98.95 / 101.54 ±2.39 / 105.70 ms │        48.71 / 49.44 ±0.69 / 50.71 ms │ +2.05x faster │
│ QQuery 30 │        33.08 / 33.41 ±0.25 / 33.80 ms │        25.68 / 26.35 ±0.58 / 27.31 ms │ +1.27x faster │
│ QQuery 31 │     114.61 / 115.79 ±1.03 / 116.98 ms │        72.20 / 73.25 ±1.03 / 75.01 ms │ +1.58x faster │
│ QQuery 32 │        21.40 / 21.81 ±0.26 / 22.13 ms │        15.70 / 16.28 ±0.80 / 17.86 ms │ +1.34x faster │
│ QQuery 33 │        39.02 / 39.25 ±0.22 / 39.55 ms │        30.01 / 31.37 ±0.98 / 32.89 ms │ +1.25x faster │
│ QQuery 34 │        10.27 / 10.41 ±0.11 / 10.59 ms │          8.84 / 9.98 ±1.57 / 13.10 ms │     no change │
│ QQuery 35 │        74.71 / 76.16 ±1.43 / 78.86 ms │        39.31 / 39.57 ±0.19 / 39.79 ms │ +1.92x faster │
│ QQuery 36 │           6.12 / 6.25 ±0.11 / 6.45 ms │           6.25 / 6.35 ±0.11 / 6.54 ms │     no change │
│ QQuery 37 │           7.50 / 7.74 ±0.13 / 7.85 ms │           7.40 / 7.53 ±0.11 / 7.65 ms │     no change │
│ QQuery 38 │        67.12 / 67.92 ±0.78 / 69.39 ms │        27.43 / 28.38 ±1.15 / 30.53 ms │ +2.39x faster │
│ QQuery 39 │        90.92 / 91.60 ±0.66 / 92.80 ms │        91.60 / 92.85 ±1.10 / 94.23 ms │     no change │
│ QQuery 40 │        24.10 / 24.46 ±0.56 / 25.57 ms │        16.73 / 16.89 ±0.14 / 17.13 ms │ +1.45x faster │
│ QQuery 41 │        12.20 / 12.27 ±0.09 / 12.44 ms │        12.15 / 12.68 ±0.94 / 14.55 ms │     no change │
│ QQuery 42 │        24.91 / 25.44 ±0.92 / 27.27 ms │        14.85 / 15.01 ±0.22 / 15.45 ms │ +1.70x faster │
│ QQuery 43 │           5.32 / 5.40 ±0.09 / 5.57 ms │           5.10 / 5.24 ±0.10 / 5.41 ms │     no change │
│ QQuery 44 │        10.30 / 10.37 ±0.11 / 10.59 ms │          9.78 / 9.95 ±0.13 / 10.13 ms │     no change │
│ QQuery 45 │        42.76 / 44.08 ±0.94 / 45.09 ms │        15.62 / 15.96 ±0.39 / 16.71 ms │ +2.76x faster │
│ QQuery 46 │        12.24 / 12.51 ±0.21 / 12.73 ms │        10.96 / 11.61 ±0.86 / 13.30 ms │ +1.08x faster │
│ QQuery 47 │     256.81 / 261.54 ±3.72 / 268.14 ms │     104.95 / 109.26 ±3.80 / 116.02 ms │ +2.39x faster │
│ QQuery 48 │       98.18 / 99.66 ±1.04 / 101.16 ms │        41.43 / 41.80 ±0.37 / 42.45 ms │ +2.38x faster │
│ QQuery 49 │        78.08 / 78.51 ±0.44 / 79.19 ms │        63.92 / 66.48 ±2.07 / 68.80 ms │ +1.18x faster │
│ QQuery 50 │        59.86 / 60.74 ±0.97 / 62.32 ms │        34.75 / 35.80 ±0.98 / 37.40 ms │ +1.70x faster │
│ QQuery 51 │      99.59 / 101.74 ±2.11 / 105.14 ms │        69.38 / 70.33 ±0.68 / 71.31 ms │ +1.45x faster │
│ QQuery 52 │        24.74 / 25.00 ±0.20 / 25.29 ms │        14.86 / 16.20 ±2.21 / 20.57 ms │ +1.54x faster │
│ QQuery 53 │        30.36 / 30.58 ±0.16 / 30.85 ms │        14.98 / 15.26 ±0.17 / 15.45 ms │ +2.00x faster │
│ QQuery 54 │        57.33 / 60.19 ±3.51 / 66.81 ms │        33.42 / 33.70 ±0.20 / 33.91 ms │ +1.79x faster │
│ QQuery 55 │        24.35 / 24.90 ±0.35 / 25.36 ms │        14.25 / 15.14 ±1.50 / 18.14 ms │ +1.65x faster │
│ QQuery 56 │        40.14 / 40.91 ±0.66 / 42.08 ms │        32.70 / 33.96 ±1.36 / 36.60 ms │ +1.20x faster │
│ QQuery 57 │     182.16 / 183.52 ±1.30 / 185.57 ms │        62.20 / 63.68 ±0.97 / 64.74 ms │ +2.88x faster │
│ QQuery 58 │     116.23 / 117.60 ±1.01 / 119.19 ms │        52.63 / 53.74 ±1.73 / 57.13 ms │ +2.19x faster │
│ QQuery 59 │     119.57 / 120.45 ±0.60 / 121.18 ms │        45.25 / 46.55 ±1.61 / 49.71 ms │ +2.59x faster │
│ QQuery 60 │        40.75 / 41.20 ±0.25 / 41.42 ms │        32.40 / 34.22 ±1.10 / 35.57 ms │ +1.20x faster │
│ QQuery 61 │        13.01 / 13.12 ±0.07 / 13.23 ms │        12.83 / 12.95 ±0.09 / 13.07 ms │     no change │
│ QQuery 62 │        48.36 / 50.44 ±1.90 / 53.88 ms │        14.20 / 14.32 ±0.13 / 14.55 ms │ +3.52x faster │
│ QQuery 63 │        30.95 / 31.79 ±0.48 / 32.42 ms │        15.22 / 16.37 ±1.39 / 18.32 ms │ +1.94x faster │
│ QQuery 64 │     419.89 / 424.49 ±2.56 / 427.35 ms │     238.39 / 242.04 ±4.12 / 249.87 ms │ +1.75x faster │
│ QQuery 65 │     149.81 / 152.61 ±1.81 / 154.36 ms │     114.57 / 120.08 ±3.28 / 124.51 ms │ +1.27x faster │
│ QQuery 66 │        81.53 / 82.12 ±0.47 / 82.52 ms │        55.14 / 56.55 ±1.63 / 59.58 ms │ +1.45x faster │
│ QQuery 67 │     261.44 / 264.82 ±2.06 / 267.07 ms │     106.01 / 109.59 ±2.56 / 112.72 ms │ +2.42x faster │
│ QQuery 68 │        12.44 / 12.73 ±0.20 / 13.06 ms │        11.31 / 11.42 ±0.09 / 11.54 ms │ +1.11x faster │
│ QQuery 69 │        59.84 / 61.03 ±1.38 / 63.47 ms │        32.45 / 32.77 ±0.37 / 33.44 ms │ +1.86x faster │
│ QQuery 70 │     109.51 / 109.96 ±0.59 / 111.08 ms │        69.56 / 70.70 ±1.36 / 73.29 ms │ +1.56x faster │
│ QQuery 71 │        36.51 / 37.00 ±0.47 / 37.88 ms │        29.96 / 31.93 ±1.99 / 35.40 ms │ +1.16x faster │
│ QQuery 72 │ 2216.07 / 2323.85 ±82.94 / 2455.01 ms │ 2127.40 / 2173.48 ±32.06 / 2219.43 ms │ +1.07x faster │
│ QQuery 73 │        10.14 / 10.27 ±0.12 / 10.46 ms │           8.56 / 8.76 ±0.18 / 9.09 ms │ +1.17x faster │
│ QQuery 74 │     190.14 / 193.31 ±1.90 / 195.35 ms │        75.36 / 80.74 ±5.34 / 90.70 ms │ +2.39x faster │
│ QQuery 75 │     154.80 / 157.83 ±4.08 / 165.84 ms │     110.59 / 114.86 ±4.30 / 122.88 ms │ +1.37x faster │
│ QQuery 76 │        36.78 / 37.45 ±0.58 / 38.26 ms │        27.73 / 28.60 ±0.67 / 29.55 ms │ +1.31x faster │
│ QQuery 77 │        63.00 / 63.67 ±0.53 / 64.52 ms │        50.57 / 52.88 ±2.95 / 58.65 ms │ +1.20x faster │
│ QQuery 78 │     194.83 / 199.09 ±4.03 / 205.19 ms │        82.18 / 85.18 ±2.28 / 87.58 ms │ +2.34x faster │
│ QQuery 79 │        69.69 / 72.46 ±3.33 / 78.69 ms │        31.88 / 32.13 ±0.26 / 32.62 ms │ +2.26x faster │
│ QQuery 80 │     103.13 / 104.95 ±1.63 / 107.75 ms │        81.18 / 85.86 ±6.54 / 98.69 ms │ +1.22x faster │
│ QQuery 81 │        26.94 / 27.11 ±0.12 / 27.28 ms │        22.93 / 23.42 ±0.29 / 23.83 ms │ +1.16x faster │
│ QQuery 82 │        17.38 / 17.51 ±0.10 / 17.67 ms │        15.92 / 16.08 ±0.09 / 16.19 ms │ +1.09x faster │
│ QQuery 83 │        42.15 / 45.67 ±4.28 / 53.63 ms │        31.04 / 31.53 ±0.76 / 33.03 ms │ +1.45x faster │
│ QQuery 84 │        31.87 / 32.39 ±0.45 / 33.14 ms │        20.28 / 21.57 ±1.62 / 24.76 ms │ +1.50x faster │
│ QQuery 85 │     109.75 / 111.03 ±0.85 / 112.22 ms │        50.58 / 52.14 ±1.86 / 55.75 ms │ +2.13x faster │
│ QQuery 86 │        26.24 / 26.69 ±0.45 / 27.54 ms │        10.70 / 10.90 ±0.13 / 11.03 ms │ +2.45x faster │
│ QQuery 87 │        67.36 / 69.99 ±2.26 / 73.90 ms │        27.02 / 27.75 ±0.91 / 29.52 ms │ +2.52x faster │
│ QQuery 88 │        65.07 / 65.72 ±0.43 / 66.24 ms │        64.97 / 67.46 ±3.76 / 74.92 ms │     no change │
│ QQuery 89 │        37.34 / 37.74 ±0.28 / 38.08 ms │        17.84 / 18.11 ±0.27 / 18.56 ms │ +2.08x faster │
│ QQuery 90 │        18.07 / 18.21 ±0.13 / 18.45 ms │        11.00 / 11.87 ±1.54 / 14.95 ms │ +1.53x faster │
│ QQuery 91 │        47.77 / 50.09 ±3.08 / 56.01 ms │        27.85 / 28.31 ±0.50 / 29.27 ms │ +1.77x faster │
│ QQuery 92 │        31.51 / 32.49 ±0.62 / 33.29 ms │        16.18 / 17.04 ±1.31 / 19.64 ms │ +1.91x faster │
│ QQuery 93 │        52.18 / 52.84 ±0.40 / 53.28 ms │        25.94 / 26.86 ±0.72 / 27.95 ms │ +1.97x faster │
│ QQuery 94 │        39.68 / 40.48 ±0.57 / 41.33 ms │        20.95 / 22.27 ±1.07 / 23.52 ms │ +1.82x faster │
│ QQuery 95 │        84.51 / 87.35 ±3.68 / 94.49 ms │        47.53 / 50.03 ±2.53 / 53.60 ms │ +1.75x faster │
│ QQuery 96 │        25.16 / 25.31 ±0.17 / 25.58 ms │        10.92 / 11.06 ±0.15 / 11.32 ms │ +2.29x faster │
│ QQuery 97 │        56.79 / 57.34 ±0.55 / 58.32 ms │        22.71 / 24.18 ±0.96 / 25.72 ms │ +2.37x faster │
│ QQuery 98 │        44.93 / 45.29 ±0.26 / 45.65 ms │        21.79 / 22.72 ±0.88 / 23.77 ms │ +1.99x faster │
│ QQuery 99 │        71.89 / 73.80 ±2.87 / 79.50 ms │        18.71 / 18.78 ±0.05 / 18.83 ms │ +3.93x faster │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 10485.77ms │
│ Total Time (split-row-groups-by-range)   │  6725.66ms │
│ Average Time (HEAD)                      │   105.92ms │
│ Average Time (split-row-groups-by-range) │    67.94ms │
│ Queries Faster                           │         83 │
│ Queries Slower                           │          1 │
│ Queries with No Change                   │         15 │
│ Queries with Failure                     │          0 │
└──────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

Metric Value
Wall time 55.0s
Peak memory 2.1 GiB
Avg memory 1.4 GiB
CPU user 239.4s
CPU sys 6.0s
Peak spill 0 B

tpcds — branch

Metric Value
Wall time 35.0s
Peak memory 3.4 GiB
Avg memory 2.3 GiB
CPU user 255.2s
CPU sys 8.2s
Peak spill 0 B

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Comparing HEAD and split-row-groups-by-range
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                  split-row-groups-by-range ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.24 / 4.07 ±5.53 / 15.12 ms │               1.21 / 4.10 ±5.61 / 15.32 ms │     no change │
│ QQuery 1  │             13.12 / 13.41 ±0.19 / 13.67 ms │             12.77 / 13.22 ±0.27 / 13.50 ms │     no change │
│ QQuery 2  │             36.22 / 36.42 ±0.18 / 36.67 ms │             36.29 / 36.43 ±0.13 / 36.66 ms │     no change │
│ QQuery 3  │             30.98 / 31.40 ±0.45 / 32.25 ms │             31.27 / 31.51 ±0.29 / 32.06 ms │     no change │
│ QQuery 4  │      1670.42 / 1735.77 ±33.50 / 1765.15 ms │      1683.95 / 1755.68 ±52.14 / 1810.09 ms │     no change │
│ QQuery 5  │      1689.08 / 1741.82 ±56.29 / 1844.49 ms │      1807.71 / 1907.70 ±76.02 / 2010.76 ms │  1.10x slower │
│ QQuery 6  │                1.27 / 1.42 ±0.25 / 1.92 ms │                1.27 / 1.44 ±0.24 / 1.91 ms │     no change │
│ QQuery 7  │             14.66 / 14.75 ±0.11 / 14.94 ms │             14.00 / 14.30 ±0.17 / 14.45 ms │     no change │
│ QQuery 8  │      2076.07 / 2119.66 ±45.32 / 2207.12 ms │      2055.25 / 2117.78 ±55.31 / 2205.55 ms │     no change │
│ QQuery 9  │         494.49 / 521.40 ±21.04 / 551.35 ms │         488.96 / 522.65 ±20.19 / 550.02 ms │     no change │
│ QQuery 10 │           79.19 / 86.13 ±10.50 / 107.01 ms │             77.89 / 80.87 ±3.38 / 87.21 ms │ +1.07x faster │
│ QQuery 11 │             91.65 / 93.16 ±0.84 / 93.96 ms │             91.22 / 93.62 ±1.80 / 95.39 ms │     no change │
│ QQuery 12 │      1728.32 / 1791.12 ±37.91 / 1842.07 ms │      1692.19 / 1774.19 ±57.38 / 1843.75 ms │     no change │
│ QQuery 13 │         442.89 / 525.63 ±86.72 / 672.84 ms │      419.33 / 1493.88 ±709.30 / 2232.56 ms │  2.84x slower │
│ QQuery 14 │         542.66 / 560.17 ±16.29 / 582.00 ms │         567.78 / 585.31 ±16.37 / 616.09 ms │     no change │
│ QQuery 15 │      1975.14 / 2003.05 ±14.43 / 2014.22 ms │      1903.96 / 1980.82 ±60.18 / 2042.66 ms │     no change │
│ QQuery 16 │     4334.21 / 4530.43 ±166.25 / 4772.47 ms │     4261.39 / 4488.67 ±137.47 / 4632.46 ms │     no change │
│ QQuery 17 │     4308.29 / 4505.68 ±140.20 / 4666.11 ms │      4237.57 / 4416.50 ±91.89 / 4485.03 ms │     no change │
│ QQuery 18 │  17756.09 / 18806.16 ±864.49 / 19964.60 ms │  17750.45 / 18230.04 ±281.27 / 18485.74 ms │     no change │
│ QQuery 19 │            28.54 / 35.94 ±14.47 / 64.89 ms │             28.68 / 29.51 ±0.85 / 31.07 ms │ +1.22x faster │
│ QQuery 20 │          516.58 / 528.23 ±9.39 / 542.44 ms │          522.95 / 526.12 ±2.91 / 531.17 ms │     no change │
│ QQuery 21 │          512.30 / 519.16 ±5.76 / 527.36 ms │          516.72 / 525.09 ±6.64 / 534.01 ms │     no change │
│ QQuery 22 │        980.38 / 997.48 ±11.65 / 1016.77 ms │          977.25 / 984.98 ±6.42 / 993.03 ms │     no change │
│ QQuery 23 │      3056.47 / 3089.45 ±23.43 / 3113.40 ms │       3147.53 / 3160.63 ±8.60 / 3172.12 ms │     no change │
│ QQuery 24 │           41.58 / 64.90 ±39.44 / 143.29 ms │            42.59 / 56.22 ±12.82 / 79.06 ms │ +1.15x faster │
│ QQuery 25 │          111.97 / 113.36 ±1.09 / 115.16 ms │          111.41 / 115.69 ±4.93 / 124.00 ms │     no change │
│ QQuery 26 │             42.36 / 43.17 ±0.67 / 44.13 ms │             42.86 / 44.46 ±2.34 / 49.10 ms │     no change │
│ QQuery 27 │          670.79 / 679.46 ±5.68 / 687.85 ms │          673.31 / 679.87 ±5.69 / 689.93 ms │     no change │
│ QQuery 28 │     3466.38 / 3614.20 ±175.74 / 3959.21 ms │      3558.87 / 3636.33 ±46.51 / 3699.20 ms │     no change │
│ QQuery 29 │             40.45 / 41.17 ±0.78 / 42.66 ms │             41.15 / 41.69 ±0.63 / 42.87 ms │     no change │
│ QQuery 30 │         562.67 / 598.80 ±32.04 / 653.19 ms │         586.20 / 608.15 ±21.20 / 637.23 ms │     no change │
│ QQuery 31 │         289.47 / 310.39 ±21.30 / 350.71 ms │         306.12 / 321.83 ±16.47 / 353.73 ms │     no change │
│ QQuery 32 │      1010.13 / 1054.46 ±46.42 / 1115.14 ms │       979.19 / 1020.32 ±27.68 / 1052.13 ms │     no change │
│ QQuery 33 │ 26503.52 / 28458.25 ±1019.43 / 29228.44 ms │  26649.02 / 27789.96 ±686.46 / 28793.26 ms │     no change │
│ QQuery 34 │ 27425.59 / 29981.57 ±2283.62 / 33533.09 ms │ 27395.29 / 29098.43 ±1388.79 / 30361.33 ms │     no change │
│ QQuery 35 │      1083.89 / 1152.23 ±79.14 / 1305.73 ms │     1073.62 / 1182.90 ±115.60 / 1402.13 ms │     no change │
│ QQuery 36 │          169.08 / 172.75 ±3.09 / 176.69 ms │          164.37 / 173.03 ±6.25 / 180.14 ms │     no change │
│ QQuery 37 │           37.38 / 57.81 ±30.52 / 118.40 ms │            38.54 / 50.22 ±16.95 / 83.42 ms │ +1.15x faster │
│ QQuery 38 │             43.58 / 45.57 ±1.41 / 47.73 ms │             43.23 / 45.65 ±1.76 / 48.65 ms │     no change │
│ QQuery 39 │         193.96 / 210.60 ±23.64 / 257.11 ms │          176.51 / 190.57 ±7.10 / 195.28 ms │ +1.11x faster │
│ QQuery 40 │             14.79 / 15.38 ±0.45 / 16.15 ms │             14.98 / 15.54 ±0.62 / 16.61 ms │     no change │
│ QQuery 41 │             14.12 / 14.45 ±0.29 / 14.97 ms │             14.28 / 14.46 ±0.14 / 14.61 ms │     no change │
│ QQuery 42 │             13.65 / 13.71 ±0.05 / 13.80 ms │             13.78 / 13.98 ±0.19 / 14.32 ms │     no change │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 110934.13ms │
│ Total Time (split-row-groups-by-range)   │ 109874.29ms │
│ Average Time (HEAD)                      │   2579.86ms │
│ Average Time (split-row-groups-by-range) │   2555.22ms │
│ Queries Faster                           │           5 │
│ Queries Slower                           │           2 │
│ Queries with No Change                   │          36 │
│ Queries with Failure                     │           0 │
└──────────────────────────────────────────┴─────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric      | Value    |
|-------------|----------|
| Wall time   | 560.1s   |
| Peak memory | 12.3 GiB |
| Avg memory  | 6.5 GiB  |
| CPU user    | 4952.8s  |
| CPU sys     | 325.5s   |
| Peak spill  | 0 B      |

clickbench_partitioned — branch

| Metric      | Value    |
|-------------|----------|
| Wall time   | 550.1s   |
| Peak memory | 12.9 GiB |
| Avg memory  | 6.5 GiB  |
| CPU user    | 4997.2s  |
| CPU sys     | 329.3s   |
| Peak spill  | 0 B      |


@Dandandan changed the title from "perf: split row groups by file range for parallel single-row-group scans (morsel splitting)" to "perf: split row groups by file range (morsel splitting)" on Jul 2, 2026

@andygrove andygrove left a comment

Member


This is a nice speedup for files with a single large row group, and gating it behind split_row_groups_by_range is a clean way to make it opt-out.

One question about the sibling work-stealing path this builds on (SharedWorkSource / WorkSource::Shared). It matters for executors that run each output partition as an isolated task in a separate process, like Ballista and datafusion-distributed. Those consumers cannot poll all sibling partitions together, so a shared queue gets fully drained by whichever single task runs, and every task then scans the whole input. Today the only way to opt out of sibling sharing looks like the preserve_order and partitioned_by_file_group plan flags in create_sibling_state. Would you be open to also gating dynamic sibling work-stealing behind a session config, the way enable_dynamic_filter_pushdown gates that feature? That would give distributed executors a clean off switch.

It is worth noting that split_row_groups_by_range=false does not cover this on its own, since the shared queue still hands out whole row groups dynamically. So this is really a question about the underlying WorkSource::Shared mechanism rather than this PR specifically, which just extends it.

Happy to open a separate issue for the broader dynamic-scheduling question so this PR stays focused on the perf win. Does that split sound right to you?

This review was prepared with the help of an LLM.

@Dandandan

Contributor Author

Thanks for the review. Yeah, it would be good to gate it.

I am also still iterating on this PR (e.g. removing the overhead).

See also my other comment on your PR about Ballista.

pull Bot pushed a commit to buraksenn/datafusion that referenced this pull request Jul 2, 2026
apache#23294)

## Which issue does this PR close?

- Closes apache#23293.

## Rationale for this change

`FileStream` sibling work-stealing (`WorkSource::Shared`, added in
apache#21351 and extended by apache#23285) seeds one shared work queue from every
file group and lets whichever output partition goes idle first steal the
next unopened file (or byte-range morsel). This assumes all output
partitions of a scan are polled concurrently in one process.

Executors that run each output partition as an isolated task in a
separate process — Ballista and datafusion-distributed — never poll the
sibling partitions. The single polled partition drains the whole shared
queue and reads files belonging to other partitions, so every isolated
task reads the entire input and the scan output is inflated by the
partition count. This is a correctness bug for those executors, not just
a performance one.
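
A minimal sketch of the failure mode described above, using only the
standard library. It is a toy model, not DataFusion code: `SharedQueue`,
`seed_queue`, `run_isolated_task`, and `files_read_per_task` are
hypothetical names, and the model assumes each isolated process rebuilds
the plan and therefore seeds its own copy of the queue.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a work queue seeded from every file group.
type SharedQueue = Arc<Mutex<VecDeque<&'static str>>>;

/// Assumption of this model: every isolated process builds its own queue,
/// seeded with the files of all file groups.
fn seed_queue(file_groups: &[Vec<&'static str>]) -> SharedQueue {
    Arc::new(Mutex::new(
        file_groups.iter().flatten().copied().collect(),
    ))
}

/// Poll a single partition until the queue is empty. No sibling partition
/// is polled, so nothing else ever takes work from the queue.
fn run_isolated_task(queue: &SharedQueue) -> Vec<&'static str> {
    let mut read = Vec::new();
    loop {
        // Hold the lock only long enough to pop one item.
        let next = queue.lock().unwrap().pop_front();
        match next {
            Some(file) => read.push(file),
            None => break,
        }
    }
    read
}

/// Run one isolated task per file group and record what each one read.
fn files_read_per_task(
    file_groups: &[Vec<&'static str>],
) -> Vec<Vec<&'static str>> {
    file_groups
        .iter()
        .map(|_| run_isolated_task(&seed_queue(file_groups)))
        .collect()
}

fn main() {
    let file_groups = vec![vec!["a.parquet"], vec!["b.parquet"]];
    for (partition, files) in files_read_per_task(&file_groups).iter().enumerate() {
        println!("task for partition {partition} read {files:?}");
    }
}
```

In this model both tasks read both files, so the combined output contains
every file once per partition, which is the inflation described above.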

The existing escape hatches (`preserve_order`,
`partitioned_by_file_group`) are plan-level flags on `FileScanConfig`,
not something a distributed executor can set centrally through the
session config, and a plain repartitioned scan does not set
`partitioned_by_file_group`. There is no session-level off switch,
unlike `datafusion.optimizer.enable_dynamic_filter_pushdown`, which
exists precisely so consumers that cannot support runtime
cross-partition state can disable it.

## What changes are included in this PR?

- Add `datafusion.execution.enable_file_stream_work_stealing` (default
`true`). When `false`, `FileScanConfig::create_sibling_state` returns
`None`, so each partition falls back to `WorkSource::Local` and reads
only its own file group.
- Thread `&ConfigOptions` into `DataSource::create_sibling_state` so the
flag is read from the session config at `execute` time. As a session
config value it round-trips through `datafusion-proto` with no proto
schema change.
- Regenerate `configs.md` and add the setting to
`information_schema.slt`.
- Turn the previously `#[ignore]`d reproduction test into a passing
regression test that drives only partition 0 (as an isolated task does)
and asserts both behaviors: with the default (stealing on) partition 0
also reads partition 1's file, and with the flag off it reads only its
own.
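
The gating in the first bullet can be sketched as follows. This is a
simplified, standard-library-only model rather than the actual
implementation: `ExecutionOptions`, `work_source_for`, and the signatures
shown are illustrative assumptions, and the real shared state is shared
between partitions rather than copied.

```rust
use std::collections::VecDeque;

/// Hypothetical model of the two work sources named in the description.
#[derive(Debug, PartialEq)]
enum WorkSource {
    /// Only this partition's own file group.
    Local(VecDeque<String>),
    /// A queue seeded from every file group (copied here for simplicity).
    Shared(VecDeque<String>),
}

/// Simplified stand-in for the session option
/// `datafusion.execution.enable_file_stream_work_stealing`.
struct ExecutionOptions {
    enable_file_stream_work_stealing: bool,
}

/// Returns `None` when work stealing is disabled, mirroring the behavior
/// described for `create_sibling_state` (illustrative signature).
fn create_sibling_state(
    opts: &ExecutionOptions,
    file_groups: &[Vec<String>],
) -> Option<VecDeque<String>> {
    if !opts.enable_file_stream_work_stealing {
        return None;
    }
    Some(file_groups.iter().flatten().cloned().collect())
}

/// Pick the work source for one partition: shared if sibling state exists,
/// otherwise fall back to the partition's own file group.
fn work_source_for(
    partition: usize,
    opts: &ExecutionOptions,
    file_groups: &[Vec<String>],
) -> WorkSource {
    match create_sibling_state(opts, file_groups) {
        Some(shared) => WorkSource::Shared(shared),
        None => WorkSource::Local(file_groups[partition].iter().cloned().collect()),
    }
}

fn main() {
    let groups = vec![vec!["a.parquet".to_string()], vec!["b.parquet".to_string()]];
    for enabled in [true, false] {
        let opts = ExecutionOptions { enable_file_stream_work_stealing: enabled };
        println!("stealing={enabled}: {:?}", work_source_for(0, &opts, &groups));
    }
}
```

With the flag off, partition 0 in this model sees only its own file, which
matches the flag-off behavior that the regression test in the last bullet
asserts.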

## Are these changes tested?

Yes. `isolated_partition_respects_work_stealing_config` in
`datafusion/datasource/src/file_stream/mod.rs` covers both the default
(shared-queue) behavior and the flag-off behavior. The existing sibling
work-stealing tests continue to pass with the default.
`information_schema` sqllogictests pass with the new setting listed.

## Are there any user-facing changes?

A new session config,
`datafusion.execution.enable_file_stream_work_stealing` (default
`true`), so existing behavior is unchanged.
`DataSource::create_sibling_state` gains a `&ConfigOptions` parameter
(an API change for anyone implementing the `DataSource` trait directly).

Labels

- auto detected api change: Auto detected API change
- common: Related to common crate
- datasource: Changes to the datasource crate
- documentation: Improvements or additions to documentation
- proto: Related to proto crate
- sqllogictest: SQL Logic Tests (.slt)

3 participants