The Wayback Machine - https://web.archive.org/web/20260107032941/https://github.com/github/codeql/issues/8568

How can I "sanitize" paths where a variable is passed to a sanitization function, but the path doesn't contain the result of the sanitization function? #8568

@gsingh93


Update:

I'd like SsaDefinition and the dominates predicate to be better documented with some actual examples. See #8568 (comment) for some more context.


Original Post:

I would like to remove (sanitize) a path if the variable that reaches the sink was passed to the sanitization function at some point before reaching the sink. Here's an example of some C code I want to analyze:

void sink(int);
void sanitize(int);

void foo(int x) {
  sink(x);
}

void test(int x) {
  // Alert
  sink(x);

  // Alert
  foo(x);

  sanitize(x);

  // No alert
  sink(x);

  // No alert
  foo(x);
}

If I use a simple TaintTracking::Configuration which tracks flows from the parameter x to the argument to sink(x), I get four paths even if I use some type of sanitizer like this:

  override predicate isSanitizer(DataFlow::Node sanitizer) {
    exists(FunctionCall c | c.getTarget().hasName("sanitize") |
      c.getAnArgument() = sanitizer.asExpr()
    )
  }

It makes sense why this doesn't work: each path goes directly from the parameter x to the argument of sink or foo. There are no paths that go from the parameter x, to sanitize, and then to the sink (that would be the case if the example had sink(sanitize(x)), but in this case sanitize does not return a value).
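The two shapes contrasted above can be sketched in plain C. This is only an illustration: `sanitize_value`, `shape_a`, `shape_b`, and `last_sunk` are hypothetical names that do not appear in the code being analyzed, and the assert-on-bad-input body of `sanitize` is an assumption, since only its declaration is given.

```c
#include <assert.h>

/* Hypothetical: records the last value that reached the sink,
   so the effect of each shape is observable. */
static int last_sunk = -1;

static void sink(int x) { last_sunk = x; }

/* Shape A (hypothetical): the sanitizer returns a cleaned value.
   The value flows x -> sanitize_value(x) -> sink, so the call
   sits on the data-flow path itself. */
static int sanitize_value(int x) { return x < 0 ? 0 : x; }

/* Shape B (as declared in the example): the sanitizer returns nothing.
   Assumed here to be a validator that stops the program on bad input. */
static void sanitize(int x) { assert(x >= 0); }

void shape_a(int x) {
  sink(sanitize_value(x));
}

void shape_b(int x) {
  sanitize(x);
  sink(x); /* x reaches sink directly; nothing flows out of sanitize */
}
```

In shape A, a sanitizer node on the returned value blocks the path. In shape B, the argument of `sanitize` and the argument of `sink` are separate uses of `x`, so the second use is only "after" the first in control-flow terms, not in data-flow terms.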

I can partly work around this for cases where sanitize and sink are called in the same function, like this:

  override predicate isSink(DataFlow::Node sink) {
    // The node is an argument to a call to sink(...)
    exists(FunctionCall c | c.getTarget().hasName("sink") | c.getAnArgument() = sink.asExpr()) and
    // ...and no call to sanitize(...) on the same variable precedes it
    // in the control flow of the enclosing function
    not exists(FunctionCall c |
      c.getTarget().hasName("sanitize") and
      sink.asExpr().getAPredecessor+() = c and
      c.getAnArgument().(VariableAccess).getTarget() = sink.asExpr().(VariableAccess).getTarget()
    )
  }

With this I only get three paths instead of four, but I still don't get the desired two paths because it doesn't handle the general case where the sink and sanitizer are in two different functions.

Is there any way to solve this? Does CodeQL store any paths between uses of a variable, in addition to the path from the definition of the variable to each use of it?
