Issue #19617: added Checks to cover special characters OpenJDK Style §2 - Java Source Files #19715
Anushreebasics wants to merge 1 commit into checkstyle:master
22a933c to 936f6fc
@romani @vivek-0509 please review
@vivek-0509 please review
@romani please review
// violation above 'Consider using special escape sequence.'

private final String escapedLetter = "\u0041";
// violation above 'Unicode escape(s) usage should be avoided.'
Not sure what is wrong with this; it is a valid ASCII file.
The file being ASCII-encoded is not the issue here. The OpenJDK rule in section 2.1 also forbids escaped Unicode sequences in Java source, so "\u0041" is still a violation even though the file itself contains only ASCII bytes. I added a clarifying comment to make that intent explicit.
Please point to where in the JDK spec it is written that escapes are forbidden.
Thanks. Per the Java Language Specification (https://docs.oracle.com/javase/specs/jls/se17/html/jls-3.html#jls-3.3), Unicode escapes are part of the language's lexical rules and are permitted; the JLS only documents how they are processed and the cases that are compile-time errors (e.g. \u000A inside a literal). The rule I added is an OpenJDK code-style guideline to avoid escaped characters in source for readability/portability, not a language prohibition. If you'd like, I can add a link in the PR to the OpenJDK coding-style page that recommends avoiding escaped Unicode in source.
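To illustrate the JLS point above, here is a minimal sketch (the class name `UnicodeEscapeDemo` is made up for this example and is not part of the PR). It assumes only that the compiler translates Unicode escapes before tokenizing, so an escaped letter and the plain letter produce the same string; the escape is legal Java, and whether to flag it is purely a style decision.

```java
public class UnicodeEscapeDemo {

    // Written with a Unicode escape, as in the test input under review.
    static final String ESCAPED = "\u0041";

    // Written as the plain character.
    static final String LITERAL = "A";

    // True when both spellings compile to the same string value.
    public static boolean sameString() {
        return ESCAPED.equals(LITERAL);
    }

    public static void main(String[] args) {
        System.out.println(sameString());
    }
}
```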
I can add a link in the PR to the OpenJDK coding-style page that recommends avoiding escaped Unicode in source
yes.
Note that this implies that other white space characters (in, for instance, string and character literals) must be written in escaped form.
\', \", \\, \t, \b, \r, \f, and \n should be preferred over corresponding octal (e.g. \047) or Unicode (e.g. \u0027) escaped characters.
So only a very specific set of Unicode escapes is forbidden. Please look at the Google style config; it has the same or a similar rule.
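The quoted guideline can be sketched as a small lookup. This is a hypothetical helper (`EscapePreference` and `preferredFor` are names invented for illustration, and the regular expression is not the pattern used in the PR's IllegalTokenText configuration): given one octal or Unicode escape as it appears in source text, it returns the special escape the guideline prefers, or `null` when the guideline lists no replacement.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapePreference {

    // Code point -> special escape named in the quoted guideline.
    private static final Map<Integer, String> PREFERRED = new HashMap<>();

    static {
        PREFERRED.put(0x08, "\\b");
        PREFERRED.put(0x09, "\\t");
        PREFERRED.put(0x0A, "\\n");
        PREFERRED.put(0x0C, "\\f");
        PREFERRED.put(0x0D, "\\r");
        PREFERRED.put(0x22, "\\\"");
        PREFERRED.put(0x27, "\\'");
        PREFERRED.put(0x5C, "\\\\");
    }

    // A backslash followed by either an octal escape (one to three digits)
    // or a unicode escape (one or more 'u', then four hex digits).
    private static final Pattern ESCAPE =
        Pattern.compile("\\\\(?:([0-3]?[0-7]{1,2})|u+([0-9a-fA-F]{4}))");

    // Returns the preferred special escape, or null if none applies.
    public static String preferredFor(String escapeText) {
        Matcher matcher = ESCAPE.matcher(escapeText);
        if (!matcher.matches()) {
            return null;
        }
        int codePoint = matcher.group(1) != null
            ? Integer.parseInt(matcher.group(1), 8)
            : Integer.parseInt(matcher.group(2), 16);
        return PREFERRED.get(codePoint);
    }
}
```

Under this sketch, the octal escape for code 047 and the Unicode escape for code 0027 both map to the single-quote escape, while the Unicode escape for the letter A maps to nothing, which matches the reviewer's point that only a specific set of escapes has a preferred replacement.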
7b72fd8 to 02a55cb
@romani please review
private final String escapedTab = "\011";
// violation above 'Consider using special escape sequence.'

private final String escapedLetter = "\u0041";
// ASCII bytes, but the unicode escape below is still a violation.
// violation above 'Unicode escape(s) usage should be avoided.'

}
Please add a few more examples of escaped Unicode characters to the Correct file, to show that not all Unicode escapes are forbidden.
private final String escapedLetter = "\u0041";
// ASCII bytes, but the unicode escape below is still a violation.
// violation above 'Unicode escape(s) usage should be avoided.'
private final String escapedLetter = "\u0041";
// ASCII bytes, but the unicode escape below is still a violation.
// violation above 'Unicode escape(s) usage should be avoided.'
The '// violation above' comment should come right after the Java code, and the message should be different. Please investigate why the test is not failing.
Add exactly these tests.
Add all of this to the test code.
c5f4575 to 4cf0f87
@romani please review
private final char formFeed = '\f';
private final char carriageReturn = '\r';
private final char newLine = '\n';

// violation above 'special escape sequence'

private final char newLineOctal = '\012';
// violation above 'special escape sequence'
Please add an extra comment explaining what should be used instead.
All escape codes referenced in the JDK style guide should be covered by tests.
90c20b0 to d97208d
@romani please review
fixes #19617
Summary
Added special-character checks to OpenJDK style config: [openjdk_checks.xml:66]
  Added IllegalTokenText with the OpenJDK escape-preference pattern
  Added AvoidEscapedUnicodeCharacters
Updated OpenJDK style coverage page so mapping is accurate: [openjdk_style.xml:141]
  Section 2 now explicitly includes charset=US-ASCII
  Section 2.1 now references IllegalTokenText and AvoidEscapedUnicodeCharacters
Added new OpenJDK Chapter 2 integration tests:
  [SpecialCharactersTest.java:1]
  [InputSpecialCharactersValid.java:1]
  [InputSpecialCharactersInvalid.java:1]
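As a rough illustration of what the charset=US-ASCII requirement means for source text, here is a minimal sketch. The class `AsciiSourceCheck` is hypothetical and is not how Checkstyle applies the charset setting; it only assumes the standard `CharsetEncoder.canEncode` call. It shows why a file can pass the ASCII requirement while still containing Unicode escapes, which is the distinction discussed in the review.

```java
import java.nio.charset.StandardCharsets;

public class AsciiSourceCheck {

    // True when every character of the text can be encoded as US-ASCII.
    public static boolean isAscii(String sourceText) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(sourceText);
    }

    public static void main(String[] args) {
        // Source text that spells the accented letter as an escape: ASCII only.
        String escapedInSource = "String s = \"caf\\u00e9\";";
        // Source text that contains the accented letter itself: not ASCII.
        String rawInSource = "String s = \"caf\u00e9\";";

        System.out.println(isAscii(escapedInSource));
        System.out.println(isAscii(rawInSource));
    }
}
```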