Collaborator

@jeongsoolee09 jeongsoolee09 commented Aug 1, 2025

Description

This PR implements the Statements package.

Change request type

  • Release or process automation (GitHub workflows, internal scripts)
  • Internal documentation
  • External documentation
  • Query files (.ql, .qll, .qls or unit tests)
  • External scripts (analysis report or other code shipped as part of a release)

Rules with added or modified queries

  • No rules added
  • Queries have been added for the following rules:
    • RULE-9-4-2
    • RULE-9-5-1
    • RULE-9-5-2
  • Queries have been modified for the following rules:
    • rule number here

Release change checklist

A change note (development_handbook.md#change-notes) is required for any pull request which modifies:

  • The structure or layout of the release artifacts.
  • The evaluation performance (memory, execution time) of an existing query.
  • The results of an existing query in any circumstance.

If you are only adding new rule queries, a change note is not required.

Author: Is a change note required?

  • Yes
  • No

🚨🚨🚨
Reviewer: Confirm that format of shared queries (not the .qll file, the
.ql file that imports it) is valid by running them within VS Code.

  • Confirmed

Reviewer: Confirm that either a change note is not required or the change note is required and has been added.

  • Confirmed

Query development review checklist

For PRs that add new queries or modify existing queries, the following checklist should be completed by both the author and reviewer:

Author

  • Have all the relevant rule package description files been checked in?
  • Have you verified that the metadata properties of each new query is set appropriately?
  • Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
  • Are the alert messages properly formatted and consistent with the style guide?
  • Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
    As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
  • Does the query have an appropriate level of in-query comments/documentation?
  • Have you considered/identified possible edge cases?
  • Does the query not reinvent features in the standard library?
  • Can the query be simplified further (not golfed!)

Reviewer

  • Have all the relevant rule package description files been checked in?
  • Have you verified that the metadata properties of each new query is set appropriately?
  • Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
  • Are the alert messages properly formatted and consistent with the style guide?
  • Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
    As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
  • Does the query have an appropriate level of in-query comments/documentation?
  • Have you considered/identified possible edge cases?
  • Does the query not reinvent features in the standard library?
  • Can the query be simplified further (not golfed!)

Collaborator

@MichaelRFairhurst MichaelRFairhurst left a comment

This is really coming along and looking really good!!

* to a non-const reference variable (thus constituting a `T` -> `&T` conversion.), i.e.
* initialization and assignment.
*/
/*
Collaborator

Simple comment formatting, unnecessary split

Collaborator Author

Good call. The intention was to split the documentation and the meta-level comment (explaining how this predicate came to be). But like you said it can be disconnected easily, so I'll merge the meta-level comment into the docstring first.

Collaborator Author

Addressed in c8c0770.

Collaborator Author

@jeongsoolee09 jeongsoolee09 Oct 8, 2025

Somehow this change didn't make it into c8c0770; it landed in a more recent commit.

predicate loopVariableAssignedToNonConstPointerOrReferenceType(
ForStmt forLoop, VariableAccess loopVariableAccessInCondition
) {
exists(Expr assignmentRhs, DerivedType targetType |
Collaborator

We likely want to test that this works for an int * const x:

#include <iostream>

void f(int * const x) {
    (*x)++;
}

int main() {
    for (int i = 0; i < 10; ++i) {
        f(&i);
        std::cout << i << std::endl;
    }
}

I believe what will happen is that int * const x will be a DerivedType of type SpecifiedType with a const specifier. A SpecifiedType is not instanceof PointerType or instanceof ReferenceType and so this predicate will not hold, even though the value of i is modifiable within f.

You may also have problems with typedefs, such as typedef int *int_ptr_t for the same reason.

The solution here I believe will be to call .getUnderlyingType(). Another option frequently used for this is .stripSpecifiers(). Each of these will remove the const and resolve the typedef. I think .stripSpecifiers() may remove the const in const int*, though, which would make it unsuitable here.
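The concern above can be reproduced in plain C++. The following is only an illustrative sketch: the function names and the `int_ptr_t` alias are invented for this example and are not part of the PR. It shows that a top-level const on the pointer, or a typedef hiding the pointer, still lets the callee change the loop counter.

```cpp
// Hypothetical alias, mirroring the typedef case mentioned above.
typedef int *int_ptr_t;

// The pointer itself is const, but the int it points to is not.
void bump_through_const_pointer(int *const x) { (*x)++; }

// The pointer type is hidden behind a typedef.
void bump_through_typedef(int_ptr_t x) { (*x)++; }

// Counts loop iterations when the callee mutates the counter via `int *const`.
int iterations_with_const_pointer() {
  int iterations = 0;
  for (int i = 0; i < 10; ++i) {
    bump_through_const_pointer(&i); // i is advanced a second time here
    ++iterations;
  }
  return iterations;
}

// Same loop, but the pointer parameter is spelled through the typedef.
int iterations_with_typedef_pointer() {
  int iterations = 0;
  for (int i = 0; i < 10; ++i) {
    bump_through_typedef(&i); // i is advanced a second time here
    ++iterations;
  }
  return iterations;
}
```

Both loops finish in fewer than ten iterations because the counter is modified inside the body, which is the behaviour the predicate is meant to catch.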

Collaborator Author

@jeongsoolee09 jeongsoolee09 Sep 24, 2025

You're right; the predicate does not catch this example. 🤔 I guess a clever use of isDeeplyConst, isDeeplyConstBelow, or both will do the trick.

Collaborator

Forgetting to handle typedefs or meaningless consts is a very common bug. But you'll (mostly) get in the habit soon enough of always calling one of these four member predicates on the Types you handle in your queries:

  • getUnderlyingType()
  • resolveTypedefs()
  • stripSpecifiers()
  • stripTopLevelSpecifiers()

Each one does subtly different things.

In this case, I believe the fix is to do:

  exists(..., Type targetType, DerivedType strippedType |
    isAssignment(assignmentRhs, targetType, _) and
    strippedType = targetType.stripTopLevelSpecifiers() and
    not strippedType.getBaseType().isConst() and
    (
      strippedType instanceof PointerType or
      strippedType instanceof ReferenceType
    )

The documentation for stripTopLevelSpecifiers says:

Get this type after any top-level specifiers and typedefs have been stripped.

For example, starting with const i64* const, this predicate will return const i64*.

which is actually wrong, as it ignores the fact that i64 is a TypedefType, so it will actually result in const long long*. Which is what you want!

The TLDR of the other options:

  • getUnderlyingType() -- resolves TypedefTypes and Decltypes, but won't drop the outer specifier in const i64* const. Stops at the first non-TypedefType/non-Decltype.
  • stripType() -- resolves all typedefs and decltypes and removes all const/volatile specifiers recursively all the way down the type chain -- not what you want.
  • resolveTypedefs -- resolves all typedefs and decltypes all the way down the type chain without removing const or volatile specifiers. That would handle typedefs but not int const *.

Note that these predicates can have no result. Only a limited set of types are in the database, and these operations just assume that the type you want is one of those types. resolveTypedefs is also bugged and doesn't recurse into ArrayType.
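As a rough C++ analogy for the check suggested above (strip the top-level specifiers, then ask whether the base type is const), the sketch below uses standard type traits. The trait name `can_mutate_target` is hypothetical, and this only mirrors the idea; it is not how the CodeQL predicates are implemented.

```cpp
#include <type_traits>

// Hypothetical trait: true when a parameter of type T lets the callee modify
// the object that the pointer or reference refers to.
template <typename T>
struct can_mutate_target {
  // Analogous to stripping top-level specifiers: `int *const` -> `int *`.
  // Typedefs need no extra step because C++ aliases are transparent to traits.
  using stripped = typename std::remove_cv<T>::type;

  static constexpr bool value =
      (std::is_pointer<stripped>::value &&
       !std::is_const<typename std::remove_pointer<stripped>::type>::value) ||
      (std::is_reference<T>::value &&
       !std::is_const<typename std::remove_reference<T>::type>::value);
};

typedef int *int_ptr_t; // hypothetical alias from the discussion above

static_assert(can_mutate_target<int *>::value, "plain pointer");
static_assert(can_mutate_target<int *const>::value, "const pointer, mutable data");
static_assert(can_mutate_target<int_ptr_t>::value, "typedef'd pointer");
static_assert(can_mutate_target<const int_ptr_t>::value, "const applies to the pointer");
static_assert(!can_mutate_target<const int *>::value, "pointer to const data");
static_assert(!can_mutate_target<const int *const>::value, "const all the way");
static_assert(can_mutate_target<int &>::value, "non-const reference");
static_assert(!can_mutate_target<const int &>::value, "const reference");
static_assert(!can_mutate_target<int>::value, "by value");
```

The `const int_ptr_t` case is the one that tends to surprise: the const binds to the pointer, not to the int behind it.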

Collaborator Author

Thank you for the detailed breakdown of the related predicates. What I want to express here is definitely "The type we get after we strip all the typedefs and the specifiers is const". I've come to believe stripTopLevelSpecifiers is the one I should use, and swapped the portion with your suggestion.

I also patched an equivalent part in loopVariablePassedAsArgumentToNonConstReferenceParameter, in 7d5f08b.

loopCounterType = forLoopCondition.getLoopCounter().getType() and
loopBoundType = forLoopCondition.getLoopBound().getType()
|
loopCounterType.getSize() < loopBoundType.getSize()
Collaborator

Two missed cases here:

  • Mixing signed/unsigned types: they may have the same size, but they hold different ranges.
  • The type and runtime value may lead to different conclusions.

I think you may be able to get away with upperBound(loopCounter) < upperBound(loopBound). That would handle signedness, constants (like x < 10ull), and dynamic ranges (like unsigned long long bound = 10; ... x < bound).
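The first missed case can be seen directly in C++. This is a minimal sketch with a hypothetical helper name; it compares type ranges only, so it says nothing about the runtime values that a range analysis such as upperBound would also consider.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper: true when the largest value of Counter is smaller than
// the largest value of Bound, i.e. the counter type cannot reach every bound.
template <typename Counter, typename Bound>
constexpr bool counter_max_below_bound_max() {
  static_assert(std::is_integral<Counter>::value &&
                    std::is_integral<Bound>::value,
                "integer types only");
  // max() is never negative, so widening to unsigned long long is lossless.
  return static_cast<unsigned long long>(std::numeric_limits<Counter>::max()) <
         static_cast<unsigned long long>(std::numeric_limits<Bound>::max());
}

// Same size, different ranges: a size comparison would miss this pair.
static_assert(sizeof(int) == sizeof(unsigned int), "same storage size");
static_assert(counter_max_below_bound_max<int, unsigned int>(),
              "int cannot represent every unsigned int bound");
static_assert(!counter_max_below_bound_max<unsigned int, int>(),
              "unsigned int covers every non-negative int bound");
static_assert(counter_max_below_bound_max<short, long long>(),
              "narrower counter than bound");
```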

Collaborator

Also, I almost forgot:

Another trap is that when using upperBound(e) / lowerBound(e) you usually want upperBound(e.getFullyConverted()), because conversions on e will change the bound.
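To illustrate why the converted expression matters, here is a small C++ sketch with hypothetical function names. When an int is compared with an unsigned int, the usual arithmetic conversions turn the int operand into unsigned int first, so the value that takes part in the comparison is not the value as written.

```cpp
// What `counter < bound` does implicitly for (int, unsigned int) operands:
// the int operand is converted to unsigned int before the comparison.
bool less_after_conversion(int counter, unsigned int bound) {
  return static_cast<unsigned int>(counter) < bound;
}

// The comparison on the mathematical values, for contrast.
bool less_on_mathematical_values(int counter, unsigned int bound) {
  return counter < 0 || static_cast<unsigned int>(counter) < bound;
}
```

For a counter of -1 and a bound of 1 the two functions disagree, which is the kind of difference that bounds computed on the unconverted expression would not reflect.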

Collaborator Author

This eliminated a lot of false positives where the counter variable is int and the loop bound is size_t. Thank you!

Collaborator Author

Changed the upperBound(loopCounter.getFullyConverted()) to typeUpperBound(loopCounter.getType()). As typeUpperBound resolves references, I didn't have to use getBaseType() on it.

We are interested if the underlying *data* can be
mutated, not the pointer itself. Also, the surface
type may be a typedef, so resolve that as well.
Both `TLoopBoundIsMutatedVariableAccess` and `TLoopStepIsMutatedVariableAccess`
transitively rely on `valueToUpdate`, which overapproximates by looking at the
types alone. Therefore we'd like to drop the confidence slightly in reporting
the expression where the expression *might* have been changed.
@jeongsoolee09 jeongsoolee09 marked this pull request as ready for review October 8, 2025 23:56
@Copilot Copilot AI review requested due to automatic review settings October 8, 2025 23:56
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements the "Statements" package for the MISRA C++-2023 coding standards, adding three new query rules for analyzing statement structures in C++ code.

  • Added rule implementations for RULE-9-4-2, RULE-9-5-1, and RULE-9-5-2
  • Added comprehensive test files with both compliant and non-compliant examples
  • Created supporting library code for analyzing increment operations and loop conditions

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| rule_packages/cpp/Statements.json | Package configuration defining metadata and properties for the three new statement rules |
| cpp/misra/src/rules/RULE-9-4-2/AppropriateStructureOfSwitchStatement.ql | Query implementation to check proper switch statement structure |
| cpp/misra/src/rules/RULE-9-5-1/LegacyForStatementsShouldBeSimple.ql | Query implementation to enforce simple legacy for-loop patterns |
| cpp/misra/src/rules/RULE-9-5-2/ForRangeInitializerAtMostOneFunctionCall.ql | Query implementation to limit function calls in range-based for initializers |
| cpp/misra/test/rules/RULE-9-/ | Test files and expected results for all three rules |
| cpp/common/src/codingstandards/cpp/exclusions/cpp/Statements.qll | Auto-generated exclusion metadata for the new package |
| cpp/common/src/codingstandards/cpp/exclusions/cpp/RuleMetadata.qll | Updated metadata registry to include Statements package |
| cpp/common/src/codingstandards/cpp/ast/Increment.qll | New library for analyzing increment/decrement operations |
| cpp/common/src/codingstandards/cpp/Loops.qll | Extended loop analysis with LegacyForLoopCondition class |

Comments suppressed due to low confidence (1)

rule_packages/cpp/Statements.json:1

  • Fixed typo 'that that' should be 'that'.
{


Collaborator

@MichaelRFairhurst MichaelRFairhurst left a comment

This is looking really good. Every time I go through the code again, I'm really impressed with the overall organization and clarity. Nicely done!

Let me know if these next couple suggestions are unclear, we're so close! :)

exists(Expr loopCounterExpr |
loopCounterExpr = this.getAnOperand() and
loopBound = this.getAnOperand() and
loopCounter = loopCounterExpr.getAChild*() and
Collaborator

getAChild*() is the right tool here, but it must be coupled with an allow-list or we'll have FNs, because it's still casting a very very wide net.

We should ensure (either here, or reported as an error in the query) that loopCounterExpr is not any arbitrary type of expression.

The following should be non-compliant:

for (int i = 0; f(i) < 10; ++i) {}
for (int i = 0; i * i < 10; ++i) {}
for (int i = 0; i + f() < 10; ++i) {}
for (int i = 0; (i > other_var) < 1; ++i) {}
// etc

Basically, we probably just want an allow-list where every expr from loopCounterExpr to loopCounter is either loopCounter itself or an addition/subtraction with only constant values on one side and an allow-listed expression on the other.

for (int i = 0; i + 10 < 20; ++i) {} // OK, `i` is allowed and `ALLOWED + 10` is allowed
for (int i = 0; 10 - i < 20; ++i) {} // OK, `i` is allowed and  `10 - ALLOWED` is allowed
for (int i = 0; -i < 20; ++i) {} // OK, `i` is allowed and `-ALLOWED` is allowed
for (int i = 0; -i + 10 < 20; ++i) {} // OK, `i` is allowed, `-ALLOWED` is allowed, and `ALLOWED + 10` is allowed
for (int i = 0; (i + 5) + 3 < 20; ++i) {} // OK, `i` is allowed, `ALLOWED + 5`, and `ALLOWED + 3` is allowed
for (int i = 0; i + (5 + 3) < 20; ++i) {} // OK, `i` is allowed, `ALLOWED + (5 + 3)` is allowed

for (int i = 0; i + i < 20; ++i) {} // BAD, `i` is allowed but `ALLOWED + ALLOWED` is not allowed
for (int i = 0; i + j < 20; ++j) {} // BAD, 'j' is not allowed
for (int i = 0; (i + 10) + (i + 10) < 20; ++i) {} // BAD, `i` and `ALLOWED + 10` is allowed, but `ALLOWED + ALLOWED` is not allowed

Hopefully that mostly makes sense.
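The allow-list described above is a small recursive definition, which can be sketched in C++ over a toy expression tree. Everything here (the `Expr` struct, the `Kind` enum, the helper names) is hypothetical and only stands in for the CodeQL AST; it is meant to pin down the intended accept/reject behaviour, not to suggest an implementation.

```cpp
#include <memory>
#include <utility>

// Hypothetical miniature AST; not the CodeQL data model.
enum class Kind { Counter, Constant, OtherVariable, Call, Negate, Add, Subtract, Multiply };

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  Kind kind;
  ExprPtr lhs; // operand of Negate, argument of Call, or left operand
  ExprPtr rhs; // right operand of a binary expression, otherwise null
};

ExprPtr leaf(Kind k) { return std::make_shared<Expr>(Expr{k, nullptr, nullptr}); }
ExprPtr unary(Kind k, ExprPtr operand) {
  return std::make_shared<Expr>(Expr{k, std::move(operand), nullptr});
}
ExprPtr binary(Kind k, ExprPtr l, ExprPtr r) {
  return std::make_shared<Expr>(Expr{k, std::move(l), std::move(r)});
}

// True when the expression is built from constants only, e.g. `5 + 3`.
bool isConstantOnly(const ExprPtr &e) {
  switch (e->kind) {
  case Kind::Constant:
    return true;
  case Kind::Negate:
    return isConstantOnly(e->lhs);
  case Kind::Add:
  case Kind::Subtract:
  case Kind::Multiply:
    return isConstantOnly(e->lhs) && isConstantOnly(e->rhs);
  default:
    return false;
  }
}

// True for the counter itself, `-ALLOWED`, `ALLOWED +/- const`, and
// `const +/- ALLOWED`. Everything else (calls, `i * i`, `i + j`) is rejected.
bool isAllowed(const ExprPtr &e) {
  switch (e->kind) {
  case Kind::Counter:
    return true;
  case Kind::Negate:
    return isAllowed(e->lhs);
  case Kind::Add:
  case Kind::Subtract:
    return (isAllowed(e->lhs) && isConstantOnly(e->rhs)) ||
           (isConstantOnly(e->lhs) && isAllowed(e->rhs));
  default:
    return false;
  }
}
```

With this definition `i + 10`, `10 - i`, and `(i + 5) + 3` are accepted, while `i + i`, `i + j`, `f(i)`, and `i * i` are rejected, matching the OK/BAD lists above.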

Collaborator Author

I understand what you mean. I guess what you suggest is to define an allowed class of ASTs where:

  • The base case is a VariableAccess or a unary operator application on it; and
  • The recursive case is an application of addition or subtraction to the allowed class.

However, I think even these are non-compliant since the rules dictate that (paraphrased by me):

The comparison between the loop-counter and the loop-bound is done in the condition by only using a relational operator

which is preceded by (also paraphrased and highlighted by me):

The only thing the init-statement does is declaring and initializing a loop-counter of integer type

From these two, we can infer that the rule implicitly wants (1) the loop counter to be a variable, and (2) the loop condition to be a direct comparison between the loop counter variable and some loop-bound expression. This is reflected in the alert message we emit in case of a violation of this sub-rule: "The [loop condition] does not determine termination based only on a comparison against the value of the counter variable".

Then, one might ask, why do we distinguish loopCounterExpr from loopCounter in the characteristic predicate? That's for internal purposes only; we somehow need to distinguish which of the operands has loopCounter in it, so we dig into one of the expressions to find a variable whose target is the initialized counter variable. I'd say it's unfortunate but necessary.

Then, a follow-up objection might be: why don't we demand that one of the operands of the relational operator is a variable access in the first place? That's because this is library code, and we don't want to tailor it specifically to this rule. 9-5-1 wants one of the compared operands to be an access to the initialized variable, but other rules might not, and in those cases this class (LegacyForLoopCondition) can be used in other ways.

Collaborator

OK, I thought that this code was intended to handle the following case, based on real code examples in OpenPilot:

for (int i = 0; i + 10 < j; ++i) {}

I believe we have leeway to interpret 9-5-1 as either accepting or rejecting this. It's trivially clear that i + 10 < j is equivalent to i < j - 10 (barring overflow). It's fair to say that the MISRA team doesn't think of every edge case and that we have some discretion here. It's also fair to say that this violates the plain text of the rule.

I'm happy with us rejecting the above case if you think so. However, I don't think the query currently rejects it because of the .getAChild*(). Quickly looking for calls to getLoopCounter, I don't see anything that later checks that getLoopCounter() is a direct operand of the LegacyForLoopCondition.

I believe that 9-5-1 will also currently not flag

for (int i = 0; f(i) < j; ++i) {}

which is clearly intended to be non-compliant.

So we should add tests for these cases. We should test that the f(i) < j case is flagged as non-compliant, and we should add a test for i + 10 < j with a comment about how we came to decide that it should or shouldn't be considered compliant.

My preference would be to remove getAChild*(), and either replace it with getAnOperand() or a predicate that finds a restricted set of operations. Currently LegacyForLoopCondition is only used here, and we can document its restrictions. If you would rather leave it as is and add checks in 9-5-1 that getLoopCounter() = loopCondition.getAnOperand(), that is reasonable too. But getAChild*() casts a very wide net that might not be expected in future queries, so I wouldn't say we need to have that behavior now.
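On the i + 10 < j case discussed above: assuming no integer overflow, moving the constant across the relational operator gives i < j - 10, so the two loops run the same number of times. A quick C++ sketch with hypothetical function names:

```cpp
// Loop written with the offset on the counter side of the comparison.
int iterations_offset_on_counter(int j) {
  int iterations = 0;
  for (int i = 0; i + 10 < j; ++i) {
    ++iterations;
  }
  return iterations;
}

// Equivalent loop with the offset moved to the bound side.
int iterations_offset_on_bound(int j) {
  int iterations = 0;
  for (int i = 0; i < j - 10; ++i) {
    ++iterations;
  }
  return iterations;
}
```

Only the second form compares the bare counter against a bound expression, which is why the first form is debatable under the plain text of the rule even though the behaviour is the same.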

Collaborator Author

Okay, I believe it is reasonable to do so. Thank you for your second opinion!

This refined definition can handle more cases than the previous one
that only looked into the loop body, and better matches the description
in the comment above.
This is to cover the cases where the pointers are
constant but the data behind it can be mutated
through it.
Collaborator

@MichaelRFairhurst MichaelRFairhurst left a comment

Great, I believe we should just add tests for these two cases and ensure they pass, and then I think we're good!

for (int i = 0; f(i) < 10; ++i) {}
// and
void reject(int &args...) {}
void accept(const int &args...) {}
for(int i = 0; i < 10; ++i) reject(i); // non compliant
for(int i = 0; i < 10; ++i) accept(i); // compliant

/* A function call where the argument is passed as varargs */
call.getTarget().getNumberOfParameters() <= i and
/* The rule states that the type should match the "adjusted" type of the argument */
targetType = loopVariableAccessInCondition.getFullyConverted().getType()
Collaborator

I believe this is incorrect. Take the following example:

01  void f(int &args...);
02  void g(int args...);
03  int main() {
04    for (int i = 0; i < 10; ++i) {
05      f(i);
06      g(i);
07    }
08  }

In this example, we want to report f(i) as non-compliant while g(i) is OK. This code is looking to see if i is converted to a const or a non-const reference type. However, it's looking at the wrong i.

loopVariableAccessInCondition refers to the i on line 4. But what we want to see is whether the i's on lines 5 and 6 are converted to const int& or int&.

We should add tests for this case as well.
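A possible shape for such a test, as a hedged C++ sketch: the names `reject` and `accept` follow the earlier suggestion in this thread, and the counting wrappers are hypothetical. Note that in a declaration like `void f(int &args...)`, `args` is an ordinary named reference parameter followed by an ellipsis (the same as `int &args, ...`), so the counter binds to the reference parameter at the call site.

```cpp
// Non-const reference parameter followed by an ellipsis: the callee can
// modify the loop counter through `first`.
void reject(int &first, ...) { ++first; }

// Const reference parameter followed by an ellipsis: the counter is read-only.
void accept(const int &first, ...) { (void)first; }

// Expected to be flagged: the counter is passed to a non-const reference.
int iterations_with_reject() {
  int iterations = 0;
  for (int i = 0; i < 10; ++i) {
    reject(i); // non-compliant: i is advanced inside the call
    ++iterations;
  }
  return iterations;
}

// Expected to be accepted: the counter is passed to a const reference.
int iterations_with_accept() {
  int iterations = 0;
  for (int i = 0; i < 10; ++i) {
    accept(i); // compliant: i cannot change through a const reference
    ++iterations;
  }
  return iterations;
}
```

The relevant conversion is the one applied to the argument at each call site, not to the access in the loop condition.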
