Regular Expressions for Source Code Analysis
Some codebases are stunningly large. Because there's no way to read over every line of code, I use complex search queries to look for interesting, potentially vulnerable code.
I recently started to keep track of some of the better queries I've written while doing penetration tests at work so that I could publish them here.
Just to clarify a couple confusing columns:
vuln | lang | noise | type | files | query |
---|---|---|---|---|---|
XSS | | low | regex | (backend) |
<(a|span|p|img|button|script|style|input|form|i|b|u|iframe|body|base|select|label|div|td|tr|table) (.*?href|.*?src|onload|id|title)=[``'"].*?[{\+``"'] |
XSS | | med | regex | (backend) |
<a href=['"].*?[{\+"'] |
SQLi | | med | regex | (backend) |
(((SELECT|DELETE)[^\S].*?[^\S]FROM[^\S]|INSERT[^\S].*?[^\S]INTO))(.*?[^\S]WHERE.*?(=|IS))? |
Secrets | | med | regex | |
(secret|pass(word)?|API_?KEY)\s*?=[^=]?\s*?("(.*?)"|'(.*?)')? |
Cmd Inj | | med | regex | |
(Process\.Start|[^\S]system|execve|(\.getRuntime\(\)|runtime|[^\S]rt).exec)\(("([^"]*?)"|.*?)\) |
XSS | ASP.NET | high | regex | *.ts |
<[\S ]{1,5}>?.*(\$\{.*?\}) |
XSS | ASP.NET | high | regex | *.cs |
<(a|span|p|img|button|script|style|input|form|i|b|u)( |>).*(\{.*?\}) |
XSS | ASP.NET | med | regex | *.cs |
(\.Append\()?.*?(<(a|span|p|img|button|script|style|input|form|i|b|u)( |>)).*?([^\$]?".?\+.?[^\$]?") |
SQLi | ASP.NET | high | regex | *.cs |
(ExecuteSQLCommand) |
XXE | ASP.NET | high | regex | *.cs |
XmlDocument\( |
rXSS | Java | med | simple | *.java, *.jsp |
<%=*request.getParameter*%> |
rXSS | Java | low | regex | |
<%=((?!Constant|getContext|Globals|Sanitize|StringEscapeUtils|new (Long|Integer|Boolean)|application\.getAttribute).)*?%> |
SQLi | Java | med | regex | *.java, *.jsp |
^(?!\s*(\/\/|\<logic|\*|\{|logger|whereTo|The|\(like|\/\*|if|boolean|Voter|see\:|return|\<bean|bp\.set|Action|public\ |\<input|\<label|\<\%|2\.|private)).+where.* |
CRLFi | Java | high | simple | *.java, *.jsp |
*.setHeader(* |
Double Formatting | Python | low | regex | *.py |
[^%\S]f'.*%[a-zA-Z] |
This table is pretty small right now. As I do more penetration tests at work over this summer, I will continue to update this table.
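
For anyone who would rather script these searches than paste them into an editor, here is a minimal sketch of running a few rows from the table over a source tree. The `Query` class, the `QUERIES` list and the `scan` function are hypothetical names made up for this example. It also assumes that a "simple" query is a literal string in which `*` acts as a wildcard, and that the regex rows behave the same under Python's `re` module as in whatever search tool they were written for.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatch
from pathlib import Path


@dataclass(frozen=True)
class Query:
    vuln: str
    noise: str    # "low", "med" or "high", copied from the table
    kind: str     # "regex" or "simple"
    files: tuple  # glob patterns; an empty tuple means "any file"
    text: str     # the query as written in the table

    def to_regex(self):
        if self.kind == "regex":
            return re.compile(self.text)
        # Assumption: in a "simple" query, `*` is a wildcard and
        # everything else is matched literally.
        parts = (re.escape(part) for part in self.text.split("*"))
        return re.compile(".*?".join(parts))


# Three rows taken from the table above.
QUERIES = [
    Query("XXE", "high", "regex", ("*.cs",), r"XmlDocument\("),
    Query("CRLFi", "high", "simple", ("*.java", "*.jsp"), "*.setHeader(*"),
    Query("Secrets", "med", "regex", (),
          r"""(secret|pass(word)?|API_?KEY)\s*?=[^=]?\s*?("(.*?)"|'(.*?)')?"""),
]


def scan(root, queries=QUERIES):
    """Yield (vuln, noise, path, line_number, line) for each matching line."""
    compiled = [(q, q.to_regex()) for q in queries]
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        lines = path.read_text(errors="ignore").splitlines()
        for q, rx in compiled:
            # Skip files that fall outside the query's "files" column.
            if q.files and not any(fnmatch(path.name, g) for g in q.files):
                continue
            for number, line in enumerate(lines, 1):
                if rx.search(line):
                    yield q.vuln, q.noise, path, number, line.strip()
```

Calling `scan(".")` from the root of a checkout and printing each tuple gives a rough list of lines to review. Sorting the results by the `noise` value is one way to look at the quieter queries first.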