Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JS: Migrate to shared data flow library (targeting main!) 🚀 #18467

Open
wants to merge 548 commits into
base: main
Choose a base branch
from

Conversation

asgerf
Copy link
Contributor

@asgerf asgerf commented Jan 10, 2025

This merges the feature branch js/shared-dataflow-branch into main, replacing JavaScript's data flow library with the shared data flow library (instantiated for JavaScript).

Data flow queries are now written against the same interface as other languages (ConfigSig style). All queries and documentation has been updated accordingly. There's a migration guide included in this PR to assist external users in migrating their queries.

In terms of analysis quality, there are some general benefits from performing the switch:

  • Side-effects on function parameters are tracked back to the caller more reliably.
  • Path explanations will usually be better at following the flow without "jumps". Previously, path explanations had a tendency to skip over function bodies or from a store to a read of a property.

It also comes with a lot of other semi-related changes, which perhaps seem orthogonal, but were necessary to enable a smooth transition. There's too much to mention in full, so just a few highlights:

  • Support for flow summaries in library models. The library models for arrays, maps, sets, and promises have been converted to flow summaries.
  • Switch to use-use flow based on the shared SSA library, to prevent backwards "time travel" flow after a side effect.
  • Array indices are now tracked more precisely.
  • More precise handling of rest parameters and spread arguments ("..." syntax), as well as calls to Function.prototype.apply, and uses of the arguments array. These "dynamic" argument-passing steps are now much better at preserving argument indices to avoid mixing up different arguments and parameters. Super-calls through default constructors are also handled more precisely, since they desugar down to rest parameters and spread arguments.

See the list of pull requests and their descriptions for more details.

TaintedUrlSuffix is currently only used in TaintTracking configs meaning it is already propagated
by taint steps. The inclusion of these taint steps here however meant that implicit reads could appear prior to any of these steps.

This was is problematic for PropRead steps as an expression like x[0] could spuriously read from array element 1 via the path:

x [element 1]
x [empty access path] (after implicit read)
x[0] (taint step through PropRead)
A URL of form https://example.com?evil#bar will contain '?evil' after splitting out the '#' suffix, and vice versa.
Preserving tainted-url-suffix into array element 0 seemed like a good idea, but didn't work out so well.
JS: Fix handling of constant array index reads, and fix the fallout
Previously only reflected XSS used shared barrier guards.
JS: Fix bug causing re-evaluation of cached barriers
- Paths are now relative to the test case, not the qlpack
- Paths going through an implicit reads have changed slightly
We need these to return the dominator instead of declaring it in the parameter list, so that we can use it directly to fulfill part of the signature for the SSA library.

We can't rewrite it with an inline predicate since the SSA module calls with a transitive closure '*', which does not permit inline predicates.
JS: Remove with statement comment
…public

This exposes the predicates publicly, but will be hidden again in the next commit.
Indentation will be fixed in next commit
Only formatting changes
asgerf and others added 23 commits January 8, 2025 09:12
Co-authored-by: Erik Krogh Kristensen <[email protected]>
JS: Add migration guide and change note
This isn't going to become a taint step, the workaround is the permanent solution
The test case actually has the correct result now
There is an issue for tracking this. It's not a small fix.
This requires changes to the shared data flow library, not something we should track with a TODO in the JS codebase
This is useful info, but not something that can be fixed locally in this query, so a TODO comment isn't helping
We have an issue for fixing the underlying problem
The RHS of an assignment actually has a post-update node now
It is not subsumed by the other case, both cases are needed
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@asgerf asgerf marked this pull request as ready for review January 15, 2025 12:29
@asgerf asgerf requested a review from a team as a code owner January 15, 2025 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants