Clear collection reassignment#16549
Conversation
…clear() instead of reassigning a new empty collection, which is more efficient as it avoids unnecessary deallocation and reallocation.
…ng results earlier
|
r? @dswij rustbot has assigned @dswij. Use Why was this reviewer chosen?The reviewer was selected based on:
|
This comment has been minimized.
This comment has been minimized.
|
Lintcheck changes for 2a5e173
This comment will be updated if you push new changes |
There was a problem hiding this comment.
This is a good first step, but this requires changes:
- Since this lint only triggers on call to
something::new(), it should probably be part of theclippy_lints::methodsmodule. - You should add tests showing that the lint will not trigger if the code belongs to a macro (this will probably be taken care of automatically once integrated into the
methodsmodule). - You should also add more complex tests such as
*v = Vec::new();, as it should be replaced by(*v).clear();or, even better,v.clear();instead.
We usually avoid comparing strings in Clippy. Method names are compared using symbols (sym::new()), and types names usually use diagnostic items.
r? samueltardieu
@rustbot author
| fn is_collection_new_call(cx: &LateContext<'_>, func: &Expr<'_>) -> bool { | ||
| if let ExprKind::Path(QPath::TypeRelative(ty, method)) = func.kind | ||
| && method.ident.name.as_str() == "new" | ||
| { | ||
| let ty = cx.typeck_results().node_type(ty.hir_id); | ||
| let type_name = ty.to_string(); | ||
|
|
||
| // Check if it's one of the standard collections | ||
| // Type names come in the format "std::vec::Vec<T>" or "std::collections::HashMap<K, V>" | ||
| type_name.contains("::Vec<") | ||
| || type_name.contains("::HashMap<") | ||
| || type_name.contains("::HashSet<") | ||
| || type_name.contains("::VecDeque<") | ||
| || type_name.contains("::BTreeMap<") | ||
| || type_name.contains("::BTreeSet<") | ||
| || type_name.contains("::BinaryHeap<") | ||
| || type_name.contains("::LinkedList<") | ||
| } else { | ||
| false | ||
| } | ||
| } |
There was a problem hiding this comment.
This is not the proper way of checking for the type, we don't use string matching in Clippy as this is not robust. You should look at the diagnostic items page in the compiler dev guide, and use functions from clippy_utils to check for them.
| /// vec.push(1); | ||
| /// vec.clear(); | ||
| /// ``` | ||
| #[clippy::version = "1.XX.0"] |
There was a problem hiding this comment.
| #[clippy::version = "1.XX.0"] | |
| #[clippy::version = "1.95.0"] |
| /// ```no_run | ||
| /// let mut vec = Vec::new(); | ||
| /// vec.push(1); | ||
| /// vec = Vec::new(); |
There was a problem hiding this comment.
You should insert a f(&vec) in the middle, otherwise it makes little sense to push an element and erase the collection afterwards without using it.
|
Reminder, once the PR becomes ready for a review, use |
|
Note that you can run the tests including the internal ones by using |
|
@samueltardieu Thanks for the review. I will work on these changes:
I will need a bit of time to study the methods module structure |
- Use diagnostic items instead of string matching for type checking - Move lint to methods module as requested - Add macro handling via in_external_macro check - Support dereference patterns (*v = Vec::new() → v.clear()) - Change applicability to MaybeIncorrect - Add comprehensive tests for all collection types - Fix version to 1.95.0 Addresses all review comments from samueltardieu on PR rust-lang#16549
This comment has been minimized.
This comment has been minimized.
|
|
||
| # Remove jujutsu directory from search tools | ||
| .jj | ||
| .DS_Store |
There was a problem hiding this comment.
This doesn't belong to this PR.
There was a problem hiding this comment.
This file should not have been added
| @@ -0,0 +1,152 @@ | |||
| #![warn(clippy::clear_instead_of_new)] | |||
| #![allow(unused)] | |||
There was a problem hiding this comment.
This should not be needed:
| #![allow(unused)] |
| #![allow(clippy::clear_instead_of_new)] | ||
| #![allow(unused, clippy::useless_vec)] |
There was a problem hiding this comment.
| #![allow(clippy::clear_instead_of_new)] | |
| #![allow(unused, clippy::useless_vec)] | |
| #![allow(clippy::clear_instead_of_new, clippy::useless_vec)] |
| @@ -1,3 +1,4 @@ | |||
| #![allow(clippy::clear_instead_of_new)] | |||
There was a problem hiding this comment.
Move it after the #![warn(…)]
| #![allow(clippy::clear_instead_of_new)] | ||
| #![allow(clippy::useless_vec, clippy::manual_repeat_n)] |
| } | ||
| } | ||
|
|
||
| #[allow(clippy::clear_instead_of_new)] |
There was a problem hiding this comment.
Please you #[expect] instead.
| let is_new_call = if let ExprKind::Call(func, args) = value.kind | ||
| && args.is_empty() | ||
| && let ExprKind::Path(QPath::TypeRelative(ty_path, method)) = func.kind | ||
| && method.ident.name == sym::new |
There was a problem hiding this comment.
This will have been checked in the methods module check_expr() function already.
| // Get the type definition, peeling any references | ||
| let ty = ty.peel_refs(); | ||
|
|
||
| // Check if it's one of the standard library collections | ||
| // We check for types that have a .clear() method | ||
| if let ty::Adt(adt, _) = ty.kind() { | ||
| let def_id = adt.did(); | ||
|
|
||
| // Use diagnostic items to identify the type | ||
| // This is the proper way to check types in Clippy, not string matching | ||
| cx.tcx.is_diagnostic_item(sym::Vec, def_id) | ||
| || cx.tcx.is_diagnostic_item(sym::HashMap, def_id) | ||
| || cx.tcx.is_diagnostic_item(sym::HashSet, def_id) | ||
| || cx.tcx.is_diagnostic_item(sym::VecDeque, def_id) | ||
| || cx.tcx.is_diagnostic_item(sym::BTreeMap, def_id) | ||
| || cx.tcx.is_diagnostic_item(sym::BTreeSet, def_id) | ||
| || cx.tcx.is_diagnostic_item(sym::BinaryHeap, def_id) | ||
| || cx.tcx.is_diagnostic_item(sym::LinkedList, def_id) | ||
| } else { | ||
| false | ||
| } |
There was a problem hiding this comment.
You can do this more efficiently with something like:
| // Get the type definition, peeling any references | |
| let ty = ty.peel_refs(); | |
| // Check if it's one of the standard library collections | |
| // We check for types that have a .clear() method | |
| if let ty::Adt(adt, _) = ty.kind() { | |
| let def_id = adt.did(); | |
| // Use diagnostic items to identify the type | |
| // This is the proper way to check types in Clippy, not string matching | |
| cx.tcx.is_diagnostic_item(sym::Vec, def_id) | |
| || cx.tcx.is_diagnostic_item(sym::HashMap, def_id) | |
| || cx.tcx.is_diagnostic_item(sym::HashSet, def_id) | |
| || cx.tcx.is_diagnostic_item(sym::VecDeque, def_id) | |
| || cx.tcx.is_diagnostic_item(sym::BTreeMap, def_id) | |
| || cx.tcx.is_diagnostic_item(sym::BTreeSet, def_id) | |
| || cx.tcx.is_diagnostic_item(sym::BinaryHeap, def_id) | |
| || cx.tcx.is_diagnostic_item(sym::LinkedList, def_id) | |
| } else { | |
| false | |
| } | |
| matches!(ty.peel_refs().opt_diag_name(), Some(sym::HashMap | sym::HashSet | …)) |
There was a problem hiding this comment.
Changes to this file don't seem to belong to this PR.
| fn remove_by_id(&mut self, id: HirId) { | ||
| if let Some(param) = self.get_by_id_mut(id) { | ||
| param.uses = Vec::new(); | ||
| param.uses.clear(); |
There was a problem hiding this comment.
Could you split this into a separate commit that gets applied first (so that the tests always pass)?
| @@ -1,3 +1,4 @@ | |||
| #![allow(clippy::clear_instead_of_new)] | |||
- Use diagnostic items instead of string matching for type checking - Move lint to methods module as requested - Add macro handling via in_external_macro check - Support dereference patterns (*v = Vec::new() → v.clear()) - Use get_diagnostic_name for efficient compiler query - Change applicability to MaybeIncorrect - Add comprehensive tests for all collection types Addresses all review comments from samueltardieu on PR rust-lang#16549
|
|
|
All the requested changes have been addressed:
@rustbot ready |
|
@samueltardieu Thank you for the review. I'll make the changes accordingly |
In which case can the applicability be incorrect? |
The applicability can be incorrect in uninitialized variables, different capacity semantics For eg:- Lint suggests: v.clear() Vec::new() drops old vec, creates new one with capacity 0 |
It would be best for the lint not to trigger at all in this case (see
This is a real one. Maybe an extra note could be added to the diagnostic saying that using |
Add clear_instead_of_new lint
What does this PR do?
Adds a new perf lint that detects when a collection is reassigned to a new empty collection instead of calling .clear().
Why?
Reassigning a collection with collection = CollectionType::new() is inefficient because it:-
Advantage
Using .clear() is more efficient as it:-
What collections are supported?
Example
Before:
After:
Fixes #16520
changelog: [
clear_instead_of_new]: new lint to suggest using .clear() instead of reassigning empty collections