Skip to content

Parser: fix exponential parse time on expression-named function arguments#2392

Merged
iffyio merged 1 commit into
apache:mainfrom
firebolt-analytics:upstream-fix-function-arg-exponential
Jul 2, 2026
Merged

Parser: fix exponential parse time on expression-named function arguments#2392
iffyio merged 1 commit into
apache:mainfrom
firebolt-analytics:upstream-fix-function-arg-exponential

Conversation

@moshap-firebolt

Copy link
Copy Markdown
Contributor

On dialects with expression-named function arguments
(supports_named_fn_args_with_expr_name, e.g. PostgreSQL and MSSQL),
parse_function_args speculatively parses the whole argument expression to
detect the name => value form, then on failure rewinds and re-parses the
same expression on the unnamed path. Because CAST and CASE fall back to
function-call parsing when their reserved-word form fails, both passes recurse
through the remaining chain, so on inputs like CAST(CASE (CAST(CASE (... work
doubles per level. Parsing the leading expression once and then checking for a
named-argument operator removes the redundant traversal; the resulting AST is
identical on valid SQL.

Measured on PostgreSqlDialect with with_recursion_limit(256), release
build. Input: SELECT + CAST(CASE ( repeated N times + ) repeated 3N:

N Before After
10 1.8 s 781 us
15 >30 s 967 us
20 >30 s 1.6 ms
25 >30 s 2.3 ms

A wildcard can never be a named-argument name, so it is routed straight to the
unnamed path, preserving prior behavior. This also makes the "reserved keyword
as a bare function argument" error consistent across dialects (SELECT MAX(interval) now reports the same position regardless of named-argument
support); test_reserved_keywords_for_identifiers is updated accordingly.

Regression test in tests/sqlparser_common.rs runs a 30-level chain with a 5 s
timeout (hits the timeout pre-fix, finishes in well under a millisecond
post-fix); bench under sqlparser_bench.

@iffyio iffyio left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @moshap-firebolt!

@iffyio iffyio added this pull request to the merge queue Jul 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to a conflict with the base branch Jul 1, 2026
@iffyio

iffyio commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

@moshap-firebolt could you take a look at the conflict on the branch?

…ents

On dialects with expression-named function arguments
(`supports_named_fn_args_with_expr_name`, e.g. PostgreSQL and MSSQL),
`parse_function_args` speculatively parses the whole argument expression to
detect the `name => value` form, then on failure rewinds and re-parses the
same expression on the unnamed path. Because CAST and CASE fall back to
function-call parsing when their reserved-word form fails, both passes recurse
through the remaining chain, so on inputs like `CAST(CASE (CAST(CASE (...` work
doubles per level. Parsing the leading expression once and then checking for a
named-argument operator removes the redundant traversal; the resulting AST is
identical on valid SQL.

Measured on `PostgreSqlDialect` with `with_recursion_limit(256)`, release
build. Input: `SELECT ` + `CAST(CASE (` repeated N times + `)` repeated 3N:

| N  | Before | After  |
|----|--------|--------|
| 10 | 1.8 s  | 781 us |
| 15 | >30 s  | 967 us |
| 20 | >30 s  | 1.6 ms |
| 25 | >30 s  | 2.3 ms |

A wildcard can never be a named-argument name, so it is routed straight to the
unnamed path, preserving prior behavior. This also makes the "reserved keyword
as a bare function argument" error consistent across dialects (`SELECT
MAX(interval)` now reports the same position regardless of named-argument
support); `test_reserved_keywords_for_identifiers` is updated accordingly.

Regression test in `tests/sqlparser_common.rs` runs a 30-level chain with a 5 s
timeout (hits the timeout pre-fix, finishes in well under a millisecond
post-fix); bench under `sqlparser_bench`.
@moshap-firebolt moshap-firebolt force-pushed the upstream-fix-function-arg-exponential branch from 3ed0dac to 5e5fb1c Compare July 1, 2026 20:07
@moshap-firebolt

Copy link
Copy Markdown
Contributor Author

@moshap-firebolt could you take a look at the conflict on the branch?

@iffyio - done!

@iffyio iffyio added this pull request to the merge queue Jul 2, 2026
Merged via the queue into apache:main with commit be95cf6 Jul 2, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants