Skip to content

GH-50280: [C++] Implement VisitTwoBitRuns and VisitTwoSetBitRuns methods#50281

Open
HuaHuaY wants to merge 1 commit into
apache:mainfrom
HuaHuaY:visit_two_bit_runs
Open

GH-50280: [C++] Implement VisitTwoBitRuns and VisitTwoSetBitRuns methods#50281
HuaHuaY wants to merge 1 commit into
apache:mainfrom
HuaHuaY:visit_two_bit_runs

Conversation

@HuaHuaY

@HuaHuaY HuaHuaY commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Rationale for this change

We already have VisitTwoBitBlocks which can iterate over two bitmaps simultaneously. But there is no mirror code about BitRunReader and SetBitRunReader.

What changes are included in this PR?

Implement VisitTwoBitRuns and VisitTwoSetBitRuns methods and refactor VisitTwoBitBlocks in GroupedPivotAccumulator as examples.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@HuaHuaY HuaHuaY requested a review from pitrou as a code owner June 29, 2026 11:58
@github-actions

Copy link
Copy Markdown

⚠️ GitHub issue #50280 has been automatically assigned in GitHub to PR creator.

@pitrou

pitrou commented Jun 29, 2026

Copy link
Copy Markdown
Member

We have toy element-wise summation benchmarks in bit_block_counter_benchmark.cc (see BinaryBitBlockBenchmark), can you add one that uses these new primitives?

@pitrou

pitrou commented Jun 29, 2026

Copy link
Copy Markdown
Member

Also, VisitTwoBitBlocksVoid is used extensively for binary scalar kernels, see src/arrow/compute/kernels/codegen_internal.h.

@HuaHuaY HuaHuaY force-pushed the visit_two_bit_runs branch from 36c7481 to 51ee9fa Compare June 30, 2026 06:29
@HuaHuaY

HuaHuaY commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

I have renamed the benchmark file and added some cases use VisitXXXBitRuns. And I added two new methods VisitTwoBitRunsVoid and VisitTwoSetBitRunsVoid.

BTW I refactored the existed method VisitSetBitRunsVoid. I believe compiler will optimize the lambda package cost.

@HuaHuaY

HuaHuaY commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

Now benchmarks about new method VisitTwoBitRunsVoid are much slower in my own environment. It seems that llvm doesn't vectorize the sum operation in benchmark function. I will optimize the benchmark and push a new commit soon.

@HuaHuaY

HuaHuaY commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

@pitrou How to run benchmark in conbench-apache-arrow? Please help to call the bot.

@pitrou

pitrou commented Jun 30, 2026

Copy link
Copy Markdown
Member

@ursabot please benchmark lang=C++

@rok

rok commented Jun 30, 2026

Copy link
Copy Markdown
Member

Benchmark runs are scheduled for commit c0d6bf7. Watch https://buildkite.com/apache-arrow and https://conbench.arrow-dev.org for updates. A comment will be posted here when the runs are complete.

@conbench-apache-arrow

Copy link
Copy Markdown

Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit c0d6bf7.

There were 5 benchmark results indicating a performance regression:

The full Conbench report has more details.

@wgtmac

wgtmac commented Jun 30, 2026

Copy link
Copy Markdown
Member

Interesting that benchmark was run of behalf of @rok

@rok

rok commented Jun 30, 2026

Copy link
Copy Markdown
Member

@wgtmac That's a current workaround for not having a conbench user. :)

@HuaHuaY

HuaHuaY commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

I've compiled the test data into three tables. BinaryBitBlockCounterSum is faster than BinaryVisitTwoSetBitRunsSum when there are many null values, which I think is expected. In this case, merging the two bitmaps results in too many consecutive intervals, making it slower than judging multiple bits at once.

However, I was somewhat surprised that in x86 tests, BinaryBitBlockCounterSum achieved better performance when there were almost no null values. I don't understand why.

There were also some unexpected findings. Currently, VisitTwoSetBitRuns performs better than VisitTwoBitRuns. This is partly due to function implementation, and partly because SetBitRunReader performs better than BitRunReader. I will later adjust VisitTwoBitRuns to use SetBitRunReader (VisitTwoBitRuns can directly wrap VisitTwoSetBitRuns). I haven't yet figured out why SetBitRunReader is so much faster.

Based on these test results, I plan to revert the changes to cpp/src/arrow/compute/kernels/hash_aggregate_pivot.cc. However, I think that the real value of VisitTwoSetBitRuns lies in its ability to handle data in batches rather than row-by-row, allowing the use of vector-processing functions such as ::arrow::bit_util::SetBitsTo or ::arrow::internal::CountSetBits. Since I plan to use it in a future PR, I'd like to add benchmark scenarios that will showcase the advantages of VisitTwoSetBitRuns.

4ab7d672 / amd64-c6a-4xlarge-linux / c0d6bf7

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.59e+8 6.46e+8 1.39e+9 3.58e+9 4.90e+9 5.15e+9 5.17e+9
VisitBitRunsSum 3.10e+8 5.70e+8 2.07e+9 4.30e+9 4.68e+9 4.73e+9 4.73e+9
VisitSetBitRunsSum 3.59e+8 6.40e+8 2.20e+9 4.58e+9 4.88e+9 4.93e+9 4.94e+9
BinaryBitBlockCounterSum 1.79e+8 3.68e+8 7.40e+8 1.99e+9 3.69e+9 4.18e+9 4.22e+9
BinaryVisitTwoBitRunsSum 1.40e+8 2.42e+8 9.38e+8 2.61e+9 3.36e+9 3.49e+9 3.52e+9
BinaryVisitTwoSetBitRunsSum 1.78e+8 2.77e+8 1.03e+9 3.05e+9 3.94e+9 4.06e+9 4.08e+9

fd5a984c / amd64-m5-4xlarge-linux / c0d6bf7

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.00e+8 4.55e+8 9.33e+8 1.88e+9 2.26e+9 2.32e+9 2.32e+9
VisitBitRunsSum 2.07e+8 3.73e+8 1.15e+9 1.98e+9 2.17e+9 2.19e+9 2.19e+9
VisitSetBitRunsSum 2.34e+8 4.01e+8 1.15e+9 2.03e+9 2.24e+9 2.26e+9 2.26e+9
BinaryBitBlockCounterSum 1.20e+8 2.13e+8 4.06e+8 1.02e+9 1.79e+9 2.00e+9 2.01e+9
BinaryVisitTwoBitRunsSum 1.05e+8 1.77e+8 5.93e+8 1.34e+9 1.63e+9 1.67e+9 1.68e+9
BinaryVisitTwoSetBitRunsSum 1.16e+8 1.83e+8 6.26e+8 1.51e+9 1.87e+9 1.91e+9 1.91e+9

457fe991 / test-mac-arm / c0d6bf7

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.60e+8 7.22e+8 1.98e+9 5.40e+9 7.09e+9 7.29e+9 7.30e+9
VisitBitRunsSum 3.41e+8 6.44e+8 1.57e+9 2.33e+9 2.57e+9 2.59e+9 2.59e+9
VisitSetBitRunsSum 4.00e+8 6.36e+8 2.64e+9 9.07e+9 1.22e+10 1.29e+10 1.29e+10
BinaryBitBlockCounterSum 1.79e+8 4.07e+8 1.00e+9 2.52e+9 4.76e+9 5.38e+9 5.44e+9
BinaryVisitTwoBitRunsSum 1.75e+8 2.99e+8 1.12e+9 3.91e+9 5.64e+9 5.90e+9 5.93e+9
BinaryVisitTwoSetBitRunsSum 2.28e+8 3.17e+8 1.21e+9 4.14e+9 6.09e+9 6.40e+9 6.42e+9

@pitrou

pitrou commented Jun 30, 2026

Copy link
Copy Markdown
Member

I haven't yet figured out why SetBitRunReader is so much faster.

Because it's simpler :)

@HuaHuaY

HuaHuaY commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

With the help of AI, I investigated why VisitBitRuns is slower than VisitSetBitRuns by modifying benchmarks and examining the generated assembly code.

I concluded that the primary reason lies in the absence of the NOINLINE attribute on BitRunReader::NextRun; this results in a larger code size for VisitBitRuns compared to VisitSetBitRuns, making VisitBitRuns itself less likely to be inlined. However, adding the NOINLINE attribute to BitRunReader::NextRun actually degrades performance further, as it forces a non-inlined call to NextRun regardless of whether the bitmap value is true or false.

Additionally, BitRunReader contains some minor code duplication that can be optimized. I will upload the changes shortly.

@pitrou

pitrou commented Jul 1, 2026

Copy link
Copy Markdown
Member

I think NOINLINE was added at some point because it actually made some benchmarks better (on some compiler :-)). So let's be careful and check for regressions if we undo it.

Note that, ultimately, what matters is actual users of this function (for example in the compute library), not the bitmap traversal micro-benchmarks :-)

@HuaHuaY

HuaHuaY commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

Yeah, I was trying to figure out why BitRunReader's performance was inferior to SetBitRunReader, and removing NOINLINE from SetBitRunReader::NextRun wasn't what I needed.

I submitted a new commit that made some optimizations to BitRunReader:

  1. The assignment word_ = 0 only needs to occur when reading partial word.
  2. Remove redundant calculations of bit_util::IsMultipleOf64 and use word_ == 0 instead.
  3. Prioritize checking position_ + 128 <= length_ to avoid performing two boolean checks in the AdvanceUntilChange and LoadWord functions when null values ​​are rare.

what matters is actual users of this function (for example in the compute library), not the bitmap traversal micro-benchmarks :-)

As you said, user performance is more important than benchmark performance. These changes not only improved my local benchmark but also, I believe, semantically optimized the execution process. And I don't add ARROW_FORCE_INLINE to VisitBitRuns because I concern that ARROW_FORCE_INLINE might slow down user performance in certain scenarios.

@pitrou

pitrou commented Jul 1, 2026

Copy link
Copy Markdown
Member

@ursabot please benchmark lang=C++

@rok

rok commented Jul 1, 2026

Copy link
Copy Markdown
Member

Benchmark runs are scheduled for commit 3b5b2d0. Watch https://buildkite.com/apache-arrow and https://conbench.arrow-dev.org for updates. A comment will be posted here when the runs are complete.

@HuaHuaY

HuaHuaY commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

I added a new benchmark to count the number of valid int8 values ​​in a random struct<a: struct<b: int8>> array. Due to the use of CountSetBits batch interface, VisitTwoBitRunsVoid achieves better performance than VisitTwoBitBlocksVoid in most cases in my own environment.

parent_null_every value_null_every BitBlocks ns BitRuns ns vs block SetBitRuns ns vs block
2 2 5100500 3247773 0.64x 3272429 0.64x
2 8 4592000 3371095 0.73x 3245095 0.71x
2 64 4338500 3281286 0.76x 3272810 0.75x
8 2 4439125 3052261 0.69x 3103000 0.70x
8 8 2801040 3075136 1.10x 3090304 1.10x
8 64 2326967 3070955 1.32x 3171870 1.36x
64 2 4132647 813513 0.20x 848000 0.21x
64 8 2026114 835193 0.41x 823469 0.41x
64 64 1456122 789952 0.54x 842785 0.58x
512 2 3918444 100072 0.03x 98034 0.03x
4096 2 3878944 26525 0.01x 26539 0.01x
65536 65536 1052229 14449 0.01x 14641 0.01x

@conbench-apache-arrow

Copy link
Copy Markdown

Thanks for your patience. Conbench analyzed the 4 benchmarking runs that have been run so far on PR commit 3b5b2d0.

There were 3 benchmark results indicating a performance regression:

The full Conbench report has more details.

@HuaHuaY

HuaHuaY commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

These are the results from the last benchmark and this benchmark, grouped in pairs, with the latest results following. I think that after optimizing BitRunReader, the performance gap between VisitBitRuns and VisitSetBitRuns has indeed narrowed in most cases. @pitrou Please take a look.

4ab7d672 / amd64-c6a-4xlarge-linux / c0d6bf7

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.59e+8 6.46e+8 1.39e+9 3.58e+9 4.90e+9 5.15e+9 5.17e+9
VisitBitRunsSum 3.10e+8 5.70e+8 2.07e+9 4.30e+9 4.68e+9 4.73e+9 4.73e+9
VisitSetBitRunsSum 3.59e+8 6.40e+8 2.20e+9 4.58e+9 4.88e+9 4.93e+9 4.94e+9

d164bbbe / amd64-c6a-4xlarge-linux / 3b5b2d0

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.56e+8 6.37e+8 1.37e+9 3.55e+9 4.85e+9 5.09e+9 5.11e+9
VisitBitRunsSum 3.13e+8 5.75e+8 2.10e+9 4.38e+9 4.70e+9 4.76e+9 4.76e+9
VisitSetBitRunsSum 3.39e+8 6.07e+8 2.17e+9 4.50e+9 4.80e+9 4.86e+9 4.87e+9

fd5a984c / amd64-m5-4xlarge-linux / c0d6bf7

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.00e+8 4.55e+8 9.33e+8 1.88e+9 2.26e+9 2.32e+9 2.32e+9
VisitBitRunsSum 2.07e+8 3.73e+8 1.15e+9 1.98e+9 2.17e+9 2.19e+9 2.19e+9
VisitSetBitRunsSum 2.34e+8 4.01e+8 1.15e+9 2.03e+9 2.24e+9 2.26e+9 2.26e+9

e74c6927 / amd64-m5-4xlarge-linux / 3b5b2d0

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.00e+8 4.55e+8 9.34e+8 1.88e+9 2.26e+9 2.32e+9 2.32e+9
VisitBitRunsSum 2.18e+8 3.84e+8 1.15e+9 1.97e+9 2.15e+9 2.17e+9 2.17e+9
VisitSetBitRunsSum 2.30e+8 3.99e+8 1.15e+9 2.03e+9 2.24e+9 2.26e+9 2.26e+9

457fe991 / test-mac-arm / c0d6bf7

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.60e+8 7.22e+8 1.98e+9 5.40e+9 7.09e+9 7.29e+9 7.30e+9
VisitBitRunsSum 3.41e+8 6.44e+8 1.57e+9 2.33e+9 2.57e+9 2.59e+9 2.59e+9
VisitSetBitRunsSum 4.00e+8 6.36e+8 2.64e+9 9.07e+9 1.22e+10 1.29e+10 1.29e+10

6701543e / test-mac-arm / 3b5b2d0

Benchmark 2 8 64 512 4096 32768 65536
BitBlockCounterSum 2.60e+8 7.16e+8 1.90e+9 5.33e+9 7.03e+9 7.27e+9 7.29e+9
VisitBitRunsSum 3.58e+8 5.84e+8 2.27e+9 8.53e+9 1.23e+10 1.31e+10 1.31e+10
VisitSetBitRunsSum 4.07e+8 6.49e+8 2.59e+9 9.00e+9 1.22e+10 1.28e+10 1.29e+10

@pitrou

pitrou commented Jul 1, 2026

Copy link
Copy Markdown
Member

The benchmarks above don't seem to show significant wins from the BitRunReader improvements. Actually, there is a number of small-ish regressions.

@HuaHuaY

HuaHuaY commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

@ursabot please benchmark lang=C++

@rok

rok commented Jul 1, 2026

Copy link
Copy Markdown
Member

Benchmark runs are scheduled for commit b026a7d. Watch https://buildkite.com/apache-arrow and https://conbench.arrow-dev.org for updates. A comment will be posted here when the runs are complete.

@HuaHuaY

HuaHuaY commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

I've only modified BitRunReader and not modified SetBitRunReader. And I hope to narrow the performance gap between BitRunReader and SetBitRunReader. Therefore, I think the change in the performance ratio of VisitBitRunsSum and VisitSetBitRunsSum is more indicative of the effectiveness of the modifications. I've reorganized the data and listed the ratio differences in the last column.

However, even if the bitmap benchmark results improve, it doesn't mean it won't negatively impact other benchmark cases. I looked at the three test cases in the benchmark report above. ArrayArrayKernel seems unrelated to the path I modified. I couldn't reproduce the remaining two performance regressions in my own x86 environment. I rerun the benchmark, and if it still shows a performance regression, I plan to revert the BitRunReader changes. And optimizing BitRunReader is not the original purpose of this PR.

Group 1

A: 4ab7d672 / amd64-c6a-4xlarge-linux / c0d6bf7
B: d164bbbe / amd64-c6a-4xlarge-linux / 3b5b2d0
Ratio: VisitBitRunsSum / VisitSetBitRunsSum

params A VisitBitRunsSum A VisitSetBitRunsSum A ratio B VisitBitRunsSum B VisitSetBitRunsSum B ratio Δ ratio (B-A)
2 3.10e+8 3.59e+8 86.43% 3.13e+8 3.39e+8 92.32% +5.89%
8 5.70e+8 6.40e+8 89.03% 5.75e+8 6.07e+8 94.71% +5.68%
64 2.07e+9 2.20e+9 94.15% 2.10e+9 2.17e+9 96.72% +2.57%
512 4.30e+9 4.58e+9 93.84% 4.38e+9 4.50e+9 97.29% +3.45%
4096 4.68e+9 4.88e+9 95.82% 4.70e+9 4.80e+9 97.97% +2.14%
32768 4.73e+9 4.93e+9 95.79% 4.76e+9 4.86e+9 98.09% +2.30%
65536 4.73e+9 4.94e+9 95.84% 4.76e+9 4.87e+9 97.68% +1.84%

Group 2

A: fd5a984c / amd64-m5-4xlarge-linux / c0d6bf7
B: e74c6927 / amd64-m5-4xlarge-linux / 3b5b2d0
Ratio: VisitBitRunsSum / VisitSetBitRunsSum

params A VisitBitRunsSum A VisitSetBitRunsSum A ratio B VisitBitRunsSum B VisitSetBitRunsSum B ratio Δ ratio (B-A)
2 2.07e+8 2.34e+8 88.71% 2.18e+8 2.30e+8 94.94% +6.24%
8 3.73e+8 4.01e+8 92.97% 3.84e+8 3.99e+8 96.08% +3.11%
64 1.15e+9 1.15e+9 100.13% 1.15e+9 1.15e+9 100.49% +0.35%
512 1.98e+9 2.03e+9 97.34% 1.97e+9 2.03e+9 97.15% -0.19%
4096 2.17e+9 2.24e+9 97.06% 2.15e+9 2.24e+9 96.04% -1.02%
32768 2.19e+9 2.26e+9 96.93% 2.17e+9 2.26e+9 95.94% -0.99%
65536 2.19e+9 2.26e+9 96.94% 2.17e+9 2.26e+9 95.92% -1.02%

Group 3

A: 457fe991 / test-mac-arm / c0d6bf7
B: 6701543e / test-mac-arm / 3b5b2d0
Ratio: VisitBitRunsSum / VisitSetBitRunsSum

params A VisitBitRunsSum A VisitSetBitRunsSum A ratio B VisitBitRunsSum B VisitSetBitRunsSum B ratio Δ ratio (B-A)
2 3.41e+8 4.00e+8 85.26% 3.58e+8 4.07e+8 88.12% +2.86%
8 6.44e+8 6.36e+8 101.30% 5.84e+8 6.49e+8 89.95% -11.35%
64 1.57e+9 2.64e+9 59.46% 2.27e+9 2.59e+9 87.57% +28.10%
512 2.33e+9 9.07e+9 25.71% 8.53e+9 9.00e+9 94.73% +69.02%
4096 2.57e+9 1.22e+10 21.10% 1.23e+10 1.22e+10 100.99% +79.88%
32768 2.59e+9 1.29e+10 20.14% 1.31e+10 1.28e+10 101.80% +81.65%
65536 2.59e+9 1.29e+10 20.12% 1.31e+10 1.29e+10 101.76% +81.64%

@conbench-apache-arrow

Copy link
Copy Markdown

Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit b026a7d.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

@HuaHuaY

HuaHuaY commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

@ursabot please benchmark lang=C++

@rok

rok commented Jul 2, 2026

Copy link
Copy Markdown
Member

Commit b026a7d already has scheduled benchmark runs.

@HuaHuaY HuaHuaY force-pushed the visit_two_bit_runs branch from b026a7d to 41af9ca Compare July 2, 2026 02:33
@HuaHuaY

HuaHuaY commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

@ursabot please benchmark lang=C++

@rok

rok commented Jul 2, 2026

Copy link
Copy Markdown
Member

Benchmark runs are scheduled for commit 41af9ca. Watch https://buildkite.com/apache-arrow and https://conbench.arrow-dev.org for updates. A comment will be posted here when the runs are complete.

@HuaHuaY

HuaHuaY commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

A performance regression has indeed occurred on amd64-c6a-4xlarge-linux. I will revert related changes soon.

@HuaHuaY HuaHuaY force-pushed the visit_two_bit_runs branch from 41af9ca to 4f4b336 Compare July 2, 2026 07:10
@HuaHuaY

HuaHuaY commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Hi @pitrou, I have reverted all code changes on existing paths; the rest are new code and should not affect existing benchmarks. Please take a look again.

@conbench-apache-arrow

Copy link
Copy Markdown

Thanks for your patience. Conbench analyzed the 4 benchmarking runs that have been run so far on PR commit 41af9ca.

There were 18 benchmark results indicating a performance regression:

The full Conbench report has more details.

@pitrou pitrou left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update!


namespace arrow::internal {

struct NestedBitmapTraversalBenchmark {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this benchmark is very realistic, because if we were implementing something like this we'd be doing it for an arbitrary nesting level (not 3), and we would probably not use the bitmap visitors for it.

(perhaps we'd BitmapAnd the validity bitmaps in fixed-size chunks and call CountSetBits on the latter?)

@HuaHuaY HuaHuaY Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case I provided here isn't quite accurate. I think a more accurate use case would be using the type struct<a: list<int8>>. I encountered this during a recursive type parsing process. When encountering list<XXX>, it combines the accumulated bitmap passed from the outer layer with the bitmap of the list to determine the valid rows of the XXX array. Its use case should be similar to VisitTwoBitBlocks, the key point being that sometimes we don't need BitmapAnd to generate a temporary bitmap. Compared to VisitTwoBitBlocks, it simply changes the data input from row-by-row to a batch interface. I plan to modify this benchmark later. If you have any better suggestions, please let me know.

thread_pool_test.cc)

add_arrow_benchmark(bit_block_counter_benchmark)
add_arrow_benchmark(bitmap_traversal_benchmark)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why you're renaming the benchmark executable, but I'm afraid it's gonna break the benchmark history by making it unrelated in the benchmark database. @rok Am I right?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so. If you have a good reason to and you don't care about breaking history that's ok. If not I'd suggest to keep the current name.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get it. I will revert the name change.

ASSERT_THAT(runs, ElementsAreArray({SetBitRun{0, 2}, SetBitRun{3, 2}}));
}

TEST_F(TestSetBitRunReader, VisitTwoBitRunsVoid) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the VisitTwoBitRunsVoid tests be folded into the VisitTwoBitRuns tests as they use the exact same data? It will make the whole test file more readable and maintainable.

@github-actions github-actions Bot added awaiting committer review Awaiting committer review awaiting review Awaiting review awaiting changes Awaiting changes and removed awaiting review Awaiting review awaiting committer review Awaiting committer review labels Jul 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants