X Tutup
The Wayback Machine - https://web.archive.org/web/20250522184213/https://github.com/github/codeql/pull/7933
Skip to content

C++: Improve cpp/system-data-exposure #7933

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Mar 2, 2022
Merged

C++: Improve cpp/system-data-exposure #7933

merged 12 commits into from
Mar 2, 2022

Conversation

geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Feb 10, 2022

Improve cpp/system-data-exposure (CWE-497):

  • convert it to use taint tracking rather than an ad-hoc def-use thing.
  • make it a path problem
  • filter sources that are common false positives
    • for getenv we now try to guess whether the particular environment variable is sensitive
    • sysconf is excluded because all the data it can provide seems fairly benign
  • use RemoteFlowSinkFunction for sinks
    • notably this does not include output to stdout or stderr, thus the SAMATE test is no longer passed - but I don't think a reasonable more accurate version of this query can ever pass that test case.

Results are much more accurate (and much fewer). Its an open question whether they're good enough to warrant @precision high. I'm not sure I can go much further on precision. I could probably restore a few of the flows we've lost (i.e. find more results) though.

LGTM run: https://lgtm.com/query/8145818204639191058/ https://lgtm.com/query/2373242062587923859/ https://lgtm.com/query/1615647564237672449/

  • its difficult to be sure, but all results appear worthy of investigation. There aren't a huge number.

@geoffw0 geoffw0 added the C++ label Feb 10, 2022
@geoffw0 geoffw0 requested a review from a team as a code owner February 10, 2022 18:12
@@ -67,6 +67,6 @@ void CWE535_Info_Exposure_Shell_Error__w32_char_01_bad()
printLine("Unable to login.");
}
/* FLAW: Write sensitive data to stderr */
fprintf(stderr, "User attempted access with password: %s\n", password);
fprintf(stderr, "User attempted access with password: %s\n", password); // [NOT DETECTED]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate, but I'm a bit surprised we don't treat fprintf arguments as a sink. It might be worth a separate query for specifically the password-to-console case.

Copy link
Contributor Author

@geoffw0 geoffw0 Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I see what you're saying. Flagging writes to stdout / stderr is very noisy but we could write a special case for the very most sensitive information (literally just passwords) being output in such a way. It would catch the SAMATE results but might or might not find much in the real world. I'll try a prototype...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick prototype: https://lgtm.com/query/7564791297006366678/

Results are thin but do exist. I agree this should be a separate query.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 24, 2022

This PR should be ready to merge. There will be some follow-up work:

  • create the separate query suggested by @rdmarsh2
  • review precision (probably after a larger test run)

@MathiasVP
Copy link
Contributor

This PR should be ready to merge. There will be some follow-up work:

Is there a DCA run somewhere? I'd be happy to merge it if that looks good!

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 28, 2022

DCA results: it appears this query cpp/system-data-exposure is missing from results because Security/CWE/CWE-497/ExposedSystemData.ql is listed in exclude-slow-queries.yml. I don't see any reason for it excluded in its current form. I'll remove it from that list and re-run DCA.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Mar 1, 2022

New DCA results (now including this query!): performance of the new query is about average and overall analysis time is barely impacted, though there is some non-tiny wobble in some individual projects. No changes to results.

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@MathiasVP
Copy link
Contributor

It looks like we need an internal PR to sync the test changes, right?

@geoffw0
Copy link
Contributor Author

geoffw0 commented Mar 1, 2022

Ah, I hadn't noticed that! I'll put one together in about an hour...

@geoffw0 geoffw0 added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Mar 1, 2022
@geoffw0
Copy link
Contributor Author

geoffw0 commented Mar 1, 2022

Internal PR made, ready to merge (together).

@MathiasVP MathiasVP merged commit 3681a1b into github:main Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ depends on internal PR This PR should only be merged in sync with an internal Semmle PR documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
X Tutup