X Tutup
The Wayback Machine - https://web.archive.org/web/20250521214756/https://github.com/github/codeql/pull/7968
Skip to content

JS: Taint analysis for win paths #7968

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed

JS: Taint analysis for win paths #7968

wants to merge 9 commits into from

Conversation

kaeluka
Copy link
Contributor

@kaeluka kaeluka commented Feb 11, 2022

This contributes support for tracking win32 paths in tainted path queries.

The main difference is, that you can now extend query labels by extending the Platform class (as in TaintedWindowsPaths.ql) and path will then be tracked using 8 labels, instead of four.

By default, there's the four labels as before:

  1. normalized-relative-posix-path
  2. normalized-absolute-posix-path
  3. raw-relative-posix-path
  4. raw-absolute-posix-path

When extending the Platform class by a win32 variant, you'll additionally get:

  1. normalized-relative-win32-path
  2. normalized-absolute-win32-path
  3. raw-relative-win32-path
  4. raw-absolute-win32-path.

I also refactored the isAdditionalTaintStep predicate that would have — IMHO — become too large after this change by pulling out much of the logic into an abstract class and extensions thereof.

I ran an experiment, and looked at it with @esbena looking over my shoulder. It's not affecting the results of pre-existing queries. The results it finds for the new queries lgtm (I've looked at a subset of results from various projects).

@github-actions github-actions bot added the JS label Feb 11, 2022
@kaeluka kaeluka changed the title JS: Taint analysis for win paths JS: Taint analysis for win paths [DRAFT] Feb 11, 2022
@kaeluka kaeluka changed the title JS: Taint analysis for win paths [DRAFT] JS: Taint analysis for win paths Mar 4, 2022
@kaeluka kaeluka marked this pull request as ready for review March 4, 2022 09:00
@kaeluka kaeluka requested a review from a team as a code owner March 4, 2022 09:00
@kaeluka
Copy link
Contributor Author

kaeluka commented Mar 4, 2022

@erik-krogh has asked me to split the first commit into 3:

  1. with the refactoring into a module
  2. adding the platform class that creates new paths
  3. adding the logic

Comment on lines 207 to 216
/**
* A call that preserves taint without changing the flow label.
*/
class PreservingPathCall extends DataFlow::CallNode {
DataFlow::Node input;
DataFlow::Node output;
/** A call to the path.join method. */
class JoinPathCall extends PathTransformationNode {
JoinPathCall() {
this = NodeJSLib::Path::moduleMember("join").getACall() and
input = this.getAnArgument() and
output = this
}

PreservingPathCall() {
this =
NodeJSLib::Path::moduleMember(["dirname", "toNamespacedPath", "parse", "format"]).getACall() and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by what you've done with this class.

It seems to me that you've changed the behavior of the first case (pasted below) in this "refactoring" commit.

NodeJSLib::Path::moduleMember(["dirname", "toNamespacedPath", "parse", "format"]).getACall() and
input = this.getAnArgument() and
output = this

This class was used in the isPosixPathStep predicate to just forward a step while preserving the flow-labels.

It seems to me that you've changed it.

The "dirname" and "format" case is in the PreservingPathCall class, with the same behavior.
The "toNamespacedPath" case is in ToNamespacedPathCall, with the same behavior.

But the "parse" case is in a new class ParseCall with different behavior.
You've added some property checks based on ["root", "dir"]

I don't think that check works, but that's another matter. (CallNode::getALocalUse() can never be a PropRead).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this reply took so long, I got really bogged down with other work and then time off.

What I tried to do here was to track through parse more precisely (it gives you an object which contains the root, dir, extension, baseName of a path). In principle, we may add more precise tracking of those individual properties — but that's neither a refactoring, nor what this PR really needs to be adding.

As the PR big enough on its own, I've amend the refactoring PR to remove the class ParseCall altogether (https://github.com/github/codeql/compare/7193fd691a391fb2f27db708b7877c648f797589..8b817fe555a0fdee7caed60ba44be1fc11ac39b7) and moved parse back to its friends dirname and format.

Thank you 👍

exists(RegExpSequence seq | seq.getNumChild() = 2 and literal.getRoot() = seq |
seq.getChild(0) instanceof RegExpCaret and
seq.getChild(1) = term
class ToNamespacedPathCall extends PathTransformationNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be part of PreservingPathCall (like it was before).

output = this.getCallback(1).getParameter(1)
module PathTransformations {
/** An abstract description of what kind of paths a call may return. */
abstract class PathTransformationNode extends DataFlow::CallNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
abstract class PathTransformationNode extends DataFlow::CallNode {
abstract class PathTransformationCall extends DataFlow::CallNode {

?

Comment on lines +1006 to +1010
predicate isPathStep(
DataFlow::Node src, DataFlow::Node dst, Path::PathLabel srclabel, Path::PathLabel dstlabel
) {
exists(PathTransformationNode ptn | ptn.step(srclabel, src, dstlabel, dst))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should just inline this predicate into the place it's used.
With your refactoring the predicate has become a predicate for taint-steps that originate from some call, which is not reflected in the name/qldoc.

bindingset[s]
predicate isDrivePath(string s) { s.regexpMatch("^[a-zA-Z]:\\\\*") }

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
/** DEPRECATED: Use `PathLabel`. */
abstract deprecated class PosixLabel extends PathLabel {
PosixLabel() { platform = platformPosix() }
}
/**

@@ -33,7 +36,38 @@ module TaintedPath {
*/
abstract class BarrierGuardNode extends DataFlow::LabeledBarrierGuardNode { }

module Label {
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
/** DEPRECATED: Use `Path`. */
deprecated module Label = Path;
/**

@@ -80,499 +132,596 @@ module TaintedPath {
predicate isAbsolute() { relativeness = "absolute" }

/** Gets the path label with normalized flag set to true. */
PosixPath toNormalized() {
PathLabel toNormalized() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of these utility predicates should preserve the platform
(add this.getPlatform() = result.getPlatform() to toNormalized, toNonNormalized, toAbsolute, toRelative`).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your NormalizingPathCall doesn't preserve the platform due to you not preserving the platform in these utility predicates.
(The rest look OK).

Comment on lines +18 to +22
import javascript
import semmle.javascript.security.dataflow.TaintedPathQuery
import semmle.javascript.security.dataflow.TaintedPathCustomizations
import DataFlow::PathGraph
import semmle.javascript.frameworks.NodeJSLib
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import javascript
import semmle.javascript.security.dataflow.TaintedPathQuery
import semmle.javascript.security.dataflow.TaintedPathCustomizations
import DataFlow::PathGraph
import semmle.javascript.frameworks.NodeJSLib
import javascript
import semmle.javascript.security.dataflow.TaintedPathQuery
import DataFlow::PathGraph

Note to self: add this as an example to the ql/redundant-import ticket.

Comment on lines +23 to +24

class Win32Platform extends Path::Platform {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class Win32Platform extends Path::Platform {
/** Materializes the Win32 platform, which enables the Win32 tainted-path steps */
class Win32Platform extends Path::Platform {

@@ -0,0 +1 @@
Security/CWE-022/TaintedWindowsPath.ql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like a test of which taint-flows are only an issue on Windows.

I've written it for you here:

import javascript
import semmle.javascript.security.dataflow.TaintedPathQuery

// TODO: Move this to TaintedWindowsPathQuery.qll?
class Win32Platform extends Path::Platform {
  Win32Platform() { this = Path::platformWin32() }

  override string getSep() { result = "\\" }
}

from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where
  cfg.hasFlowPath(source, sink) and
  sink.getFlowLabel().(Path::PathLabel).getPlatform() = Path::platformWin32() and
  not sink.getFlowLabel().(Path::PathLabel).getPlatform() = Path::platformPosix()
select source, sink

It detects that express.js:27 is only an issue on windows (as expected).
But it doesn't flag express.js:35. I haven't looked into why.

Comment on lines +10 to +19

app.get("/some/path/2", function (req, res) {
const p = path.join('.', req.query.filename);
if (!p.startsWith('..')) {
req.files.foo.mv(p); // OK (but not on windows)
res.status(200);
} else {
res.status(400);
}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this test to the Windows-path test folder?
With the test I suggested we can actually test that req.files.foo.mv(p) is only flagged by the windows query.

@kaeluka kaeluka marked this pull request as draft May 9, 2022 12:42
@kaeluka
Copy link
Contributor Author

kaeluka commented Jul 27, 2022

Talked this over with Calum; I'll drop this for the time being.

@kaeluka kaeluka closed this Jul 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
X Tutup