[common] Accelerate endsWith and like '%x' with reverse btree global index#8371
ArnavBalyan wants to merge 1 commit into
cc @JingsongLi @leaves12138 thanks! :)
    String v = "row" + rnd.nextInt(1_000_000) + suffixes[rnd.nextInt(suffixes.length)];
    data.add(Pair.of(BinaryString.fromString(v), (long) i));
}
data.sort((a, b) -> cmp.compare(a.getKey(), b.getKey()));
[P1] Please cover the production build path before registering this index type. This test writes a valid reverse-btree only because it sorts with the reversed-key comparator here. The real create_global_index paths do not do that: reverse-btree is not in the Spark/Flink SortedIndexTopoBuilder support lists, so it falls through to the default/generic builders, and those pass rows to BTreeIndexWriter in scan order; even SortedGlobalIndexBuilder sorts by the original index field, not the reversed bytes. Since SstFileWriter/BTreeIndexWriter require keys to be monotonically increasing in the writer comparator, an index built through the normal procedure can produce an incorrectly ordered SST and wrong suffix lookups. Please wire reverse-btree into a builder that sorts by ReversedKeySerializer ordering and add an integration test for the procedure path.
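To illustrate the ordering concern, here is a minimal standalone sketch. It does not use any Paimon classes: `reversedBytes`, `REVERSED_ORDER`, and `isMonotonic` are hypothetical helpers, and it assumes the reversed key is simply the UTF-8 bytes of the value in reverse order, compared as unsigned bytes. The actual `ReversedKeySerializer` may differ in detail. The point is only that keys sorted by the original field are generally not increasing under a reversed-key comparator.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ReversedKeyOrderSketch {

    // Hypothetical stand-in for the reversed key: UTF-8 bytes in reverse order.
    static byte[] reversedBytes(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        byte[] r = new byte[b.length];
        for (int i = 0; i < b.length; i++) {
            r[i] = b[b.length - 1 - i];
        }
        return r;
    }

    // Assumed writer ordering: unsigned lexicographic comparison of reversed bytes.
    static final Comparator<String> REVERSED_ORDER =
            (a, b) -> Arrays.compareUnsigned(reversedBytes(a), reversedBytes(b));

    // True if every key is >= its predecessor under REVERSED_ORDER,
    // which is what a sorted-file writer would require of its input.
    static boolean isMonotonic(List<String> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (REVERSED_ORDER.compare(keys.get(i - 1), keys.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys =
                new ArrayList<>(Arrays.asList("row2.csv", "row3.log", "row1.log"));

        // Sorting by the original field does not help: the suffix dominates
        // the reversed ordering, so "row2.csv" sorts after "row3.log".
        keys.sort(Comparator.naturalOrder());
        System.out.println(isMonotonic(keys)); // false

        // Sorting by the reversed-key comparator satisfies the writer's contract.
        keys.sort(REVERSED_ORDER);
        System.out.println(isMonotonic(keys)); // true
    }
}
```

An integration test for the procedure path could assert the same property on the keys actually handed to the writer, rather than on data pre-sorted inside the test.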
Purpose:
like queries through the existing prefix scan, and uses min/max pruning instead of a full scan.
Tests