Conversation
- move from custom concept `LogOutput` to standard concept `Logging`
- remove `Log.qll` from experimental frameworks
- fold models into standard models (naively for now)
- stdlib:
- make Logger module public
- broaden definition of instance
- add `extra` keyword as possible source
- flak: add app.logger as logger instance
- django: `add django.utils.log.request_logger` as logger instance
(should we add the rest?)
- remove LogOutput from experimental concepts
- modernize the sanitizer, but do not make it less specific
should we have the `lgtm,codescanning` handshake or not?
I set the category to newQuery since that is what users will see. When we have tags, it would be nice to tag it as a query promotion.
80da48e to
9d41666
Compare
RasmusWL
left a comment
There was a problem hiding this comment.
python/ql/src/Security/CWE-117/LogInjection.ql needs to have meta-data updated, to include security severity at least. I haven't been involved in looking at this query before, do you feel that @precision high is reasonable? For security-severity, we might just copy the one from java (which is the same as the one for JS)
Looking up those @security-severity scores, I noticed that both queires had @precision medium. That got me thinking, in the example below, neither username nor foo can contain newlines, but would probably end up causing an alert 😐 so we need to be mindful about the precision we set here.
@app.route("/users/<username>")
def show_user(username):
foo = request.args.get("foo", 10)
LOGGER.debug(f"showing user {username} with foo={foo}")nitpick: QLDoc for classes doesn't follow our style-guide (must start with A or An)
nitpick: Would be very nice with documentations links.
(I've made a suggestion for Django to fix the two nitpicks above)
For the qhelp, I see that it is in large part just a copy of the qhelp from the java query. I personally have a mild preference for the JS version (since it better highlights the problems of HTML). Have you made a consideration between the two?
| | | ||
| this = Logger::instance().getMember(method).getACall() | ||
| or | ||
| this = API::moduleImport("logging").getMember(method).getACall() |
There was a problem hiding this comment.
I don't agree about this removal. I see that API::moduleImport("logging") is now modeled as a LoggerInstance... but that is not true, these are helper methods defined on the module leve.
There was a problem hiding this comment.
Is there value in modelling instances specifically? Or could we just rename instance to something like loggingMethodProvider?
There was a problem hiding this comment.
Reverted to previous modelling approach as discussed offline.
python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll
Outdated
Show resolved
Hide resolved
| class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode { | ||
| ReplaceLineBreaksSanitizer() { | ||
| this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and | ||
| this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"] | ||
| } | ||
| } |
There was a problem hiding this comment.
This sanitizer check seems a bit weak. What if it only did text.replace("\r\n", "\n")? I guess it's not trivial to write a sanitizer that knows we've eliminated both, so maybe this is ok? (what are your thoughts on this)
There was a problem hiding this comment.
I have had the same concerns from the beginning.
There was a problem hiding this comment.
I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
| module Logger { | ||
| /** | ||
| * An instance of `logging.Logger`. Extend this class to model new instances. | ||
| * Most major frameworks will provide a logger instance as a class attribute. | ||
| */ | ||
| abstract class LoggerInstance extends API::Node { | ||
| override string toString() { result = "logger" } | ||
| } | ||
|
|
||
| /** Gets a reference to the `logging.Logger` class or any subclass. */ | ||
| API::Node subclassRef() { | ||
| result = API::moduleImport("logging").getMember("Logger").getASubclass*() | ||
| } | ||
|
|
||
| /** Gets a reference to an instance of `logging.Logger` or any subclass. */ | ||
| API::Node instance() { | ||
| result instanceof LoggerInstance | ||
| or | ||
| result = subclassRef().getReturn() | ||
| or | ||
| result = API::moduleImport("logging") | ||
| or | ||
| result = API::moduleImport("logging").getMember("root") | ||
| or | ||
| result = API::moduleImport("logging").getMember("getLogger").getReturn() | ||
| } | ||
| } |
There was a problem hiding this comment.
I see that I created this non-standard setup with just using API::Node, but now that we're exposing it more publicly, we need to rewrite this to use the standard library modeling setup with InstanceSource and type-tracking (which has a snippet, so it's quite easy).
Looking over the code again, I don't quite know why I did this in the first place. LoggerLogCall can easily be written with the standard approach together with DataFlow::MethodCallNode 🤷
There was a problem hiding this comment.
Surely it is nicer, if we can get away with using an API:Node?
There was a problem hiding this comment.
Moved to standard setup as discussed offline.
I did not even consider that we would have such options, I just checked that it looked reasonable :-) Thanks for the heads-up! |
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Given both the original FP score and our concerns regarding sanitizers, `@precision medium`, which is aligned with other languages, feels appropriate.
(combining the Java version and the JS version)
RasmusWL
left a comment
There was a problem hiding this comment.
Two minor things, then we should be good to go 👍
| class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode { | ||
| ReplaceLineBreaksSanitizer() { | ||
| this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and | ||
| this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"] | ||
| } | ||
| } |
There was a problem hiding this comment.
I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR.. |
RasmusWL
left a comment
There was a problem hiding this comment.
I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR..
Thanks for writing this 👍 I think it's very nice that the next person thinking about this code can just read the comment explaining we thought about this 👍 I have a small suggestion for writing it, but otherwise LGTM 👍
python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll
Outdated
Show resolved
Hide resolved
python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll
Outdated
Show resolved
Hide resolved
…omizations.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
…omizations.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Trailing whitespace is a bit too easy with the ```suggestions through the UI :|
|
Thanks for the auto format fix :-) (it confused me for a bit that I could not get it to change the file locally 😁) |
|
Python 3 Language tests failed seems to have been flaky, so I've rerun them. 🤞 |
|
Tests are all good, Actually, we should do a performance test for this 😳 |
|
Performance does looks fine, so in it goes 👍 |
This PR promotes the log injection (CWE-117) query out of experimental.
Some concerns: