You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The stack depth is a function of the nesting depth of the expression, which means that
Important
In some cases processing deeply nested expressions can lean to a stack overflow (and thus a process abort)
This shows up in practice from Machine-generated SQL/filters with thousands of OR/AND terms (e.g. IN lists rewritten to OR, or large predicate pushdown
This representation has served us quite well, but we do have several mitigations
The primary mitigation. Recursive TreeNode drivers are annotated #[cfg_attr(feature = "recursive_protection", recursive::recursive)], which uses the recursive crate to grow the stack on the heap when it runs low, instead of overflowing. See the TreeNode walkers (6 sites — apply/transform_down/transform_up/rewrite/etc.):
I would like a more fool proof way to avoid stack overflow issues
Describe alternatives you've considered
Option 1: n-ary expressions
The approach taken by DuckDb Mysql and Postgres is to represent AND and OR as n-ary rather than binary (2)
So that would be something like this (could also have an enum for And/Or)
enumConjunctionType{And,Or}/// AND/OR of some number of expressionsstructConjunction{conjunction_type:ConjunctionType,/// Logically `exprs[0] AND (exprs[1] AND (exprs[2] AND ...`exprs:Vec<Expr>.}
And then
/// extend the Expr enumenumExpr{
...
// ANDConjunction(Conjunction),
...
}
This would be a major downstream API change
Also unless we removed Operator::And / Operator::Or we would potentially have multiple ways to express the same logical expression
Keep Binary Expr, have some sort of "balance" operation
Is your feature request related to a problem or challenge?
DataFusion represents arbitrary binary Operator trees like
Using a binary tree structure called
BinaryExpr:datafusion/datafusion/expr/src/expr.rs
Lines 770 to 779 in 2da8887
There are also many places in the codebase where recursive algorithms are used to walk over Expr trees,
An example of this is
Treenode::apply:datafusion/datafusion/common/src/tree_node.rs
Lines 196 to 209 in 0838a4d
The stack depth is a function of the nesting depth of the expression, which means that
Important
In some cases processing deeply nested expressions can lean to a stack overflow (and thus a process
abort)This shows up in practice from Machine-generated SQL/filters with thousands of OR/AND terms (e.g. IN lists rewritten to OR, or large predicate pushdown
This representation has served us quite well, but we do have several mitigations
recursivecrate + recursive_protection feature (segmented stack growth)The primary mitigation. Recursive TreeNode drivers are annotated
#[cfg_attr(feature = "recursive_protection", recursive::recursive)], which uses the recursive crate to grow the stack on the heap when it runs low, instead of overflowing. See the TreeNode walkers (6 sites — apply/transform_down/transform_up/rewrite/etc.):datafusion/datafusion/common/src/tree_node.rs
Lines 196 to 209 in 0838a4d
The sqlparser crate itself caps nesting depth (default 50) and returns RecursionLimitExceeded rather than overflowing the stack.
We also hit this a bunch when converting a sqlparser binary-operator tree into
Exprand have a stack based conversion approach (seedatafusion/datafusion/sql/src/expr/mod.rs
Lines 76 to 120 in 0838a4d
Large match arms are extracted into separate functions specifically to shrink per-frame stack usage and avoid overflow during simplification.
datafusion/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs
Lines 2335 to 2341 in 0838a4d
Recently people seem to be hitting recursion limits again and trying to put in patches, such as
BinaryExprchain iteratively to avoid a stack overflow #23198 from @mdashtiDescribe the solution you'd like
I would like a more fool proof way to avoid stack overflow issues
Describe alternatives you've considered
Option 1: n-ary expressions
The approach taken by DuckDb Mysql and Postgres is to represent
ANDandORas n-ary rather than binary (2)So that would be something like this (could also have an enum for And/Or)
And then
Examples:
DuckDB uses ConjunctionExpression: https://github.com/duckdb/duckdb/blob/b8a06e4a22672e254cd0baa68a3dbed2eb51c56e/src/include/duckdb/parser/expression/conjunction_expression.hpp#L17-L27
Postrgres uses BoolExpr https://github.com/postgres/postgres/blob/REL_17_0/src/include/nodes/primnodes.h#L929-L942
Mysql uses
Item_cond_and/Item_cond_or(based on Item_cond base which stores args as a List list;Cons
This would be a major downstream API change
Also unless we removed Operator::And / Operator::Or we would potentially have multiple ways to express the same logical expression
Keep Binary Expr, have some sort of "balance" operation
Another approach taken by spark is to use Binary expressions: https://github.com/apache/spark/blob/fa33ea000a0bda9e5a3fa1af98e8e85b8cc5e4d4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L828
buildBalancedPredicaterebuilds a predicate list into a height-balanced tree inAdditional context
Recent PRs
BinaryExprchain iteratively to avoid a stack overflow #23198 from @mdashti🎯 Umbrella / strategic (open)
LogicalPlan::using_columns()#14118🐛 Open stack-overflow bugs (motivating cases)
WHEREclause overflow stack inselect_to_planvia derivedexpr::clone#22936StackGuardracesrecursive's process-global stack-size minimum across threads #23246📜 Closed — the history that motivated the current mitigations
ORlist overflows the stack #9375Expr::to_bytescan produce output that hitsExpr::from_bytesrecursion limit #3968PhysicalPlanNode::try_from_physical_plan#15087