Flatter representation of deeply nested `AND` / `OR` Expressions

### Is your feature request related to a problem or challenge?

DataFusion represents arbitrary binary Operator trees like 
```sql
(a OR (b OR (c OR (d OR ...)))
```

Using a binary tree structure called `BinaryExpr`:

https://github.com/apache/datafusion/blob/2da88876f6e764ca3052c92b9d1783929d13d7bf/datafusion/expr/src/expr.rs#L770-L779

There are also many places in the codebase where recursive algorithms are used to walk over Expr trees, 

An example of this is `Treenode::apply`: https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/common/src/tree_node.rs#L196-L209


The stack depth is a function of the nesting depth of the expression, which means that

> [!IMPORTANT]
> In some cases processing deeply nested expressions can lean to a stack overflow (and thus a process `abort`)

This shows up in practice from Machine-generated SQL/filters with thousands of OR/AND terms (e.g. IN lists rewritten to OR, or large predicate pushdown

This representation has served us quite well, but we do have several mitigations

1. [`recursive` crate](https://crates.io/crates/recursive) + recursive_protection feature (segmented stack growth)

The primary mitigation. Recursive TreeNode drivers are annotated `#[cfg_attr(feature = "recursive_protection", recursive::recursive)]`, which uses the recursive crate  to grow the stack on the heap when it runs low, instead of overflowing. See the TreeNode walkers (6 sites — apply/transform_down/transform_up/rewrite/etc.): https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/common/src/tree_node.rs#L196-L209

2. SQL parser recursion limit (bounded recursion → error, not crash)

The [sqlparser crate ](https://crates.io/crates/sqlparser) itself caps nesting depth (default 50) and returns RecursionLimitExceeded rather than overflowing the stack.

3. Explicit-stack (iterative) SQL→Expr conversion
- See issue #1444

We also hit this a bunch when converting a [sqlparser](https://crates.io/crates/sqlparser) binary-operator tree into `Expr` and have a stack based conversion approach (see https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/sql/src/expr/mod.rs#L76-L120)

4. Out-of-line match arms in the expression simplifier

Large match arms are extracted into separate functions specifically to shrink per-frame stack usage and avoid overflow during simplification.

- https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs#L2335-L2341 


Recently people seem to be hitting recursion limits again and trying to put in patches, such as 
* https://github.com/apache/datafusion/pull/23198  from @mdashti
* https://github.com/apache/datafusion/pull/22943 from @nathanb

### Describe the solution you'd like

I would like a more fool proof way to avoid stack overflow issues



### Describe alternatives you've considered

## Option 1: n-ary expressions

The approach taken by DuckDb Mysql and Postgres is to represent `AND` and `OR` as n-ary rather than binary (2)

So that would be something like this (could also have an enum for And/Or)
```rust
enum ConjunctionType { And, Or }
/// AND/OR of some number of expressions
struct Conjunction {
  conjunction_type: ConjunctionType,
  /// Logically `exprs[0] AND (exprs[1] AND (exprs[2] AND ...`
  exprs : Vec<Expr>.
}
```

And then  
```rust
/// extend the Expr enum
enum Expr { 
  ...
  // AND
  Conjunction(Conjunction),
 ...
}
```

### Examples:

DuckDB uses ConjunctionExpression: https://github.com/duckdb/duckdb/blob/b8a06e4a22672e254cd0baa68a3dbed2eb51c56e/src/include/duckdb/parser/expression/conjunction_expression.hpp#L17-L27

Postrgres uses BoolExpr https://github.com/postgres/postgres/blob/REL_17_0/src/include/nodes/primnodes.h#L929-L942

Mysql uses [`Item_cond_and` / `Item_cond_or`](https://github.com/mysql/mysql-server/blob/mysql-8.0.40/sql/item_cmpfunc.h#L2711-L2768)  (based on [Item_cond base which stores args as a  List<Item> list](https://github.com/mysql/mysql-server/blob/mysql-8.0.40/sql/item_cmpfunc.h#L2421-L2447);

### Cons
This would be a major downstream API change 
Also unless we removed Operator::And / Operator::Or we would potentially have multiple ways to express the same logical expression


## Keep Binary Expr, have some sort of "balance" operation
Another approach taken by spark is to use Binary expressions: https://github.com/apache/spark/blob/fa33ea000a0bda9e5a3fa1af98e8e85b8cc5e4d4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L828

[`buildBalancedPredicate`](https://github.com/apache/spark/blob/fa33ea000a0bda9e5a3fa1af98e8e85b8cc5e4d4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L160-L184) rebuilds a predicate list into a height-balanced tree in 
 - https://github.com/apache/spark/pull/31724

### Additional context

Recent PRs 
* https://github.com/apache/datafusion/pull/23198  from @mdashti
* https://github.com/apache/datafusion/pull/22943 from @nathanb

🎯 Umbrella / strategic (open)

- #16788 — the epic to explore iterative rewrites / Box-ing / n-ary to remove the need for #[recursive] and regain the 1–2% planning overhead. 
- #14857 
- #19787 
- #14118 

🐛 Open stack-overflow bugs (motivating cases)

- #22229 
- #22936
- #8900 
- #23246

📜 Closed — the history that motivated the current mitigations

- #1444 
- #1434 
- #9375 
- #11102 
- #16030
- #4066
- #3968
- #23056
- #9573
- #9373
- #15087
- TPC-DS planning overflows: #6277, #6264, #6040, #4786, #4065, #8837

	/// Binary expression for [`Expr::BinaryExpr`]
	#[derive(Clone, PartialEq, Eq, PartialOrd, Hash, Debug)]
	pub struct BinaryExpr {
	/// Left-hand side of the expression
	pub left: Box<Expr>,
	/// The comparison operator
	pub op: Operator,
	/// Right-hand side of the expression
	pub right: Box<Expr>,
	}

	fn apply<'n, F: FnMut(&'n Self) -> Result<TreeNodeRecursion>>(
	&'n self,
	mut f: F,
	) -> Result<TreeNodeRecursion> {
	#[cfg_attr(feature = "recursive_protection", recursive::recursive)]
	fn apply_impl<'n, N: TreeNode, F: FnMut(&'n N) -> Result<TreeNodeRecursion>>(
	node: &'n N,
	f: &mut F,
	) -> Result<TreeNodeRecursion> {
	f(node)?.visit_children(\|\| node.apply_children(\|c\| apply_impl(c, f)))
	}

	apply_impl(self, &mut f)
	}

	fn apply<'n, F: FnMut(&'n Self) -> Result<TreeNodeRecursion>>(
	&'n self,
	mut f: F,
	) -> Result<TreeNodeRecursion> {
	#[cfg_attr(feature = "recursive_protection", recursive::recursive)]
	fn apply_impl<'n, N: TreeNode, F: FnMut(&'n N) -> Result<TreeNodeRecursion>>(
	node: &'n N,
	f: &mut F,
	) -> Result<TreeNodeRecursion> {
	f(node)?.visit_children(\|\| node.apply_children(\|c\| apply_impl(c, f)))
	}

	apply_impl(self, &mut f)
	}

	) -> Result<Expr> {
	enum StackEntry {
	SQLExpr(Box<SQLExpr>),
	Operator(BinaryOperator),
	}

	// Virtual stack machine to convert SQLExpr to Expr
	// This allows visiting the expr tree in a depth-first manner which
	// produces expressions in postfix notations, i.e. `a + b` => `a b +`.
	// See https://github.com/apache/datafusion/issues/1444
	let mut stack = vec![StackEntry::SQLExpr(Box::new(sql))];
	let mut eval_stack = vec![];

	while let Some(entry) = stack.pop() {
	match entry {
	StackEntry::SQLExpr(sql_expr) => {
	match *sql_expr {
	SQLExpr::BinaryOp { left, op, right } => {
	// Note the order that we push the entries to the stack
	// is important. We want to visit the left node first.
	stack.push(StackEntry::Operator(op));
	stack.push(StackEntry::SQLExpr(right));
	stack.push(StackEntry::SQLExpr(left));
	}
	_ => {
	let expr = self.sql_expr_to_logical_expr_internal(
	*sql_expr,
	schema,
	planner_context,
	)?;
	eval_stack.push(expr);
	}
	}
	}
	StackEntry::Operator(op) => {
	let right = eval_stack.pop().unwrap();
	let left = eval_stack.pop().unwrap();
	let expr = self.build_logical_expr(op, left, right, schema)?;
	eval_stack.push(expr);
	}
	}
	}

	assert_eq!(1, eval_stack.len());
	let expr = eval_stack.pop().unwrap();

	// Move this function body out of the large match branch avoid stack overflow
	fn simplify_right_is_one_case(
	info: &SimplifyContext,
	left: Box<Expr>,
	op: &Operator,
	right: &Expr,
	) -> Result<Transformed<Expr>> {

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Flatter representation of deeply nested `AND` / `OR` Expressions #23264

Is your feature request related to a problem or challenge?

Describe the solution you'd like

Describe alternatives you've considered

Option 1: n-ary expressions

Examples:

Cons

Keep Binary Expr, have some sort of "balance" operation

Additional context

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Flatter representation of deeply nested AND / OR Expressions #23264

Description

Is your feature request related to a problem or challenge?

Describe the solution you'd like

Describe alternatives you've considered

Option 1: n-ary expressions

Examples:

Cons

Keep Binary Expr, have some sort of "balance" operation

Additional context

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Flatter representation of deeply nested `AND` / `OR` Expressions #23264