Python: promote log injection by yoff · Pull Request #7735 · github/codeql

yoff · 2022-01-25T10:07:13Z

This PR promotes the log injection (CWE-117) query out of experimental.

Some concerns:

- The models of loggers, particularly for Django, are a bit curated. In general, should we do way more than the contribution when we promote?
- The sanitization seems very specific. I added the string const compare sanitizer, but there are probably other ways to sanitize as well. Should we have a concept for sanitizers similar to escaping, or just reuse escaping?

- move from custom concept `LogOutput` to standard concept `Logging`
- remove `Log.qll` from experimental frameworks
- fold models into standard models (naively for now)
- stdlib:
  - make Logger module public
  - broaden definition of instance
  - add `extra` keyword as possible source
- flask: add app.logger as logger instance
- django: add `django.utils.log.request_logger` as logger instance (should we add the rest?)
- remove LogOutput from experimental concepts

- modernize the sanitizer, but do not make it less specific

should we have the `lgtm,codescanning` handshake or not?

I set the category to newQuery since that is what users will see. When we have tags, it would be nice to tag it as a query promotion.
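As a rough illustration of why the `extra` keyword mentioned in the change list above matters, here is a minimal sketch using only the standard `logging` module. It assumes a formatter that references a custom field; the field name `user`, the logger name `extra_demo`, and the payload are made up for the example and are not taken from the PR.

```python
import io
import logging

# Capture log output in memory so it can be inspected.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
# The formatter references a custom attribute, `user`, supplied via `extra`.
handler.setFormatter(logging.Formatter("%(levelname)s user=%(user)s %(message)s"))

logger = logging.getLogger("extra_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo isolated from the root logger

# Hypothetical attacker-controlled value containing a line break.
untrusted = "bob\nINFO user=admin password reset"
logger.info("login attempt", extra={"user": untrusted})

# The single logging call produced two lines, the second one forged.
output_lines = buffer.getvalue().splitlines()
```

With a formatter like this, a value passed through `extra` reaches the log output just as the message itself does, which is presumably why the keyword is worth tracking.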

RasmusWL

`python/ql/src/Security/CWE-117/LogInjection.ql` needs its metadata updated, at least to include a security severity. I haven't looked at this query before; do you feel that `@precision high` is reasonable? For `@security-severity`, we might just copy the one from Java (which is the same as the one for JS).

Looking up those `@security-severity` scores, I noticed that both queries had `@precision medium`. That got me thinking: in the example below, neither `username` nor `foo` can contain newlines, but the code would probably still cause an alert 😐 so we need to be mindful about the precision we set here.

@app.route("/users/<username>")
def show_user(username):
    foo = request.args.get("foo", 10)
    LOGGER.debug(f"showing user {username} with foo={foo}")

nitpick: QLDoc for classes doesn't follow our style guide (must start with A or An)
nitpick: It would be very nice to have documentation links.

(I've made a suggestion for Django to fix the two nitpicks above)

For the qhelp, I see that it is in large part just a copy of the qhelp from the java query. I personally have a mild preference for the JS version (since it better highlights the problems of HTML). Have you made a consideration between the two?

python/ql/lib/semmle/python/frameworks/Django.qll

RasmusWL · 2022-01-31T10:22:08Z

python/ql/lib/semmle/python/frameworks/Stdlib.qll

      |
-        this = Logger::instance().getMember(method).getACall()
-        or
-        this = API::moduleImport("logging").getMember(method).getACall()


I don't agree with this removal. I see that `API::moduleImport("logging")` is now modeled as a `LoggerInstance`... but that is not true; these are helper methods defined at the module level.

Is there value in modelling instances specifically? Or could we just rename instance to something like loggingMethodProvider?

Reverted to previous modelling approach as discussed offline.
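To make the distinction in this thread concrete, here is a small Python sketch (not part of the PR) of the three ways of logging that the model has to cover: the module-level helper functions, a logger returned by `logging.getLogger`, and the `logging.root` object. The logger name `app` and the messages are arbitrary.

```python
import io
import logging

# Route everything that reaches the root logger into an in-memory buffer.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(name)s:%(message)s"))
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# 1. Module-level helper: a plain function defined on the `logging` module.
logging.info("via module helper")

# 2. Method on a Logger instance obtained from `logging.getLogger`.
logging.getLogger("app").info("via named logger")

# 3. Method on the root logger object exposed as `logging.root`.
logging.root.info("via logging.root")

# The module offers similar functions but is not itself a Logger instance.
module_is_logger = isinstance(logging, logging.Logger)
root_is_logger = isinstance(logging.root, logging.Logger)

logged_lines = buffer.getvalue().splitlines()
```

All three calls write to the log, but only the last two go through a `logging.Logger` object, which matches the objection that the module should not be modeled as an instance.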

python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll

RasmusWL · 2022-01-31T10:51:17Z

python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll

+  class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode {
+    ReplaceLineBreaksSanitizer() {
+      this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and
+      this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"]
+    }
+  }


This sanitizer check seems a bit weak. What if it only did text.replace("\r\n", "\n")? I guess it's not trivial to write a sanitizer that knows we've eliminated both, so maybe this is ok? (what are your thoughts on this)

I have had the same concerns from the beginning.

I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
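The weakness discussed here can be sketched in plain Python. Both helpers below contain a `replace` call whose first argument is `"\r\n"` or `"\n"`, so both would match the sanitizer pattern shown in the diff, yet only one of them actually removes the line breaks. The function names and the payload are hypothetical.

```python
def normalize_only(text: str) -> str:
    # First argument is "\r\n", so this matches the sanitizer pattern,
    # but it only normalizes line endings: "\n" is still present afterwards.
    return text.replace("\r\n", "\n")


def strip_line_breaks(text: str) -> str:
    # Removes both "\r\n" and "\n". A lone "\r" would still survive;
    # handling that is left out of this sketch.
    return text.replace("\r\n", "").replace("\n", "")


# Hypothetical attacker-controlled value that tries to forge a log entry.
payload = "alice\r\nINFO:root:admin logged in"

normalized = normalize_only(payload)
stripped = strip_line_breaks(payload)

still_multiline = "\n" in normalized   # the forged entry stays on its own line
single_line = "\n" not in stripped     # the forged entry is folded into one line
```

Telling these two cases apart would require the analysis to track which characters have been removed and what they were replaced with, which is the dilemma raised in this thread.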

RasmusWL · 2022-01-31T10:54:05Z

python/ql/lib/semmle/python/frameworks/Stdlib.qll

+  module Logger {
+    /**
+     * An instance of `logging.Logger`. Extend this class to model new instances.
+     * Most major frameworks will provide a logger instance as a class attribute.
+     */
+    abstract class LoggerInstance extends API::Node {
+      override string toString() { result = "logger" }
+    }
+
+    /** Gets a reference to the `logging.Logger` class or any subclass. */
+    API::Node subclassRef() {
+      result = API::moduleImport("logging").getMember("Logger").getASubclass*()
+    }
+
+    /** Gets a reference to an instance of `logging.Logger` or any subclass. */
+    API::Node instance() {
+      result instanceof LoggerInstance
+      or
+      result = subclassRef().getReturn()
+      or
+      result = API::moduleImport("logging")
+      or
+      result = API::moduleImport("logging").getMember("root")
+      or
+      result = API::moduleImport("logging").getMember("getLogger").getReturn()
+    }
+  }


I see that I created this non-standard setup with just using API::Node, but now that we're exposing it more publicly, we need to rewrite this to use the standard library modeling setup with InstanceSource and type-tracking (which has a snippet, so it's quite easy).

Looking over the code again, I don't quite know why I did this in the first place. LoggerLogCall can easily be written with the standard approach together with DataFlow::MethodCallNode 🤷

Surely it is nicer if we can get away with using an `API::Node`?

Moved to standard setup as discussed offline.

yoff · 2022-02-01T08:22:12Z

> For the qhelp, I see that it is in large part just a copy of the qhelp from the java query. I personally have a mild preference for the JS version (since it better highlights the problems of HTML). Have you made a consideration between the two?

I did not even consider that we would have such options, I just checked that it looked reasonable :-) Thanks for the heads-up!

Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>

Given both the original FP score and our concerns regarding sanitizers, `@precision medium`, which is aligned with other languages, feels appropriate.

(combining the Java version and the JS version)

RasmusWL

nice updates so far 👍

python/ql/src/Security/CWE-117/LogInjection.ql

RasmusWL

Two minor things, then we should be good to go 👍

python/ql/lib/semmle/python/frameworks/Stdlib.qll

RasmusWL · 2022-02-08T14:17:08Z

python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll

+  class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode {
+    ReplaceLineBreaksSanitizer() {
+      this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and
+      this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"]
+    }
+  }


I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.

Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>

yoff · 2022-02-14T10:40:04Z

> I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.

I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR..

RasmusWL

> I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
>
> I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR..

Thanks for writing this 👍 I think it's very nice that the next person thinking about this code can just read the comment explaining we thought about this 👍 I have a small suggestion for writing it, but otherwise LGTM 👍

python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll

…omizations.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>

Trailing whitespace is a bit too easy with the ```suggestions through the UI :|

yoff · 2022-02-15T10:36:15Z

Thanks for the auto format fix :-) (it confused me for a bit that I could not get it to change the file locally 😁)

RasmusWL · 2022-02-17T09:17:48Z

The failing Python 3 Language tests seem to have been flaky, so I've rerun them. 🤞

RasmusWL · 2022-02-17T09:50:03Z

Tests are all good, ~~so shipit~~

Actually, we should do a performance test for this 😳

RasmusWL · 2022-02-23T15:21:08Z

Performance does look fine, so in it goes 👍

yoff requested a review from a team as a code owner January 25, 2022 10:07

github-actions bot added documentation Python labels Jan 25, 2022

yoff added 4 commits January 31, 2022 11:27

python: Add standard customization setup

8b5114d

- modernize the sanitizer, but do not make it less specific

python: Add change note

bf1145e

should we have the `lgtm,codescanning` handshake or not?

python: modern change note

9d41666

I set the category to newQuery since that is what users will see. When we have tags, it would be nice to tag it as a query promotion.

yoff force-pushed the python/promote-log-injection branch from 80da48e to 9d41666 January 31, 2022 10:28

RasmusWL requested changes Jan 31, 2022

yoff and others added 5 commits February 1, 2022 10:04

Apply suggestions from code review

c03f89d

Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>

python: "command" -> "log"

7511b33

python: drop precision and add severity score

26befeb

Given both the original FP score and our concerns regarding sanitizers, `@precision medium`, which is aligned with other languages, feels appropriate.

python: rewrite qhelp overview

ecea392

(combining the Java version and the JS version)

python: provide links for Flask

119a7e4

RasmusWL reviewed Feb 1, 2022

yoff added 2 commits February 1, 2022 13:31

python: use standard InstanceSource construction

c587084

python: update change note

bec8c0d

yoff requested a review from RasmusWL February 1, 2022 12:42

python: logging.root is not a call

448e078

RasmusWL requested changes Feb 8, 2022

yoff and others added 2 commits February 9, 2022 09:22

Update python/ql/lib/semmle/python/frameworks/Stdlib.qll

f21ac04

Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>

python: add apologetic comment

bd14ade

yoff requested a review from RasmusWL February 14, 2022 10:38

RasmusWL reviewed Feb 14, 2022

yoff and others added 2 commits February 14, 2022 16:07

Update python/ql/lib/semmle/python/security/dataflow/LogInjectionCust…

62598c0

…omizations.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>

Update python/ql/lib/semmle/python/security/dataflow/LogInjectionCust…

3a995ec

…omizations.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>

yoff requested a review from RasmusWL February 14, 2022 15:08

Python: Autoformat

62d4bb5

Trailing whitespace is a bit too easy with the ```suggestions through the UI :|

RasmusWL approved these changes Feb 15, 2022

Merge branch 'main' into python/promote-log-injection

b59ab7f

RasmusWL merged commit aeba497 into github:main Feb 23, 2022
