Set-based Pattern Matching Example
Posted by rcbarnett on January 02, 2008.
ModSecurity 2.5 introduces two new operators (@pm and @pmFromFile) which implement set-based pattern matching by using the Aho-Corasick algorithm. Set-based matching is much quicker than using regular expressions. For users who are concerned with performance (that is, who want to limit the latency a legitimate client sees), set-based pattern matching is a great enhancement. If rules are written properly, you can achieve the same level of security with these new operators while decreasing the time it takes to complete the check.
The key is to make sure that the set-based patterns (plain text strings) are critical to the success of the attack. So, when performing technical vulnerability research, you must first search for all of the necessary conditions for an attack to succeed. You start by sending attacks that trigger the vulnerability remotely, then vary all the “interesting-looking” parts of the attack. Changes are made one at a time, in steps, keeping careful notes. (Strings, flags, length values, banners, version numbers, character encoding, white space… the list goes on. All are good things to vary.) If the attack succeeds even when a particular variable is set to a random value, that variable is not important for the signature or rule creation. Eventually you can identify the complete set of variables that are important to the attack’s success, and arrive at a set of criteria that must be collectively satisfied for any attack to succeed. If there are multiple distinct attack vectors, you must perform this analysis on each one separately.
Given a set of criteria that must be satisfied for an attack to succeed, it is possible to describe rule logic that has virtually zero false negatives. That is, an attack simply cannot succeed unless the HTTP request has exactly the characteristics that the rule is looking for. Once you have identified these necessary components, they can then be used as the input strings to the set-based matching operators.
While set-based matching is very fast, on its own it lacks some of the logic needed to validate the attack. For this reason, a good approach is to combine set-based matching with regular expression rules by chaining the individual rules together. Essentially, the 1st part of the chained rule uses the set-based matching operator as a pre-qualifier to very quickly check whether the transaction data has a high likelihood of matching. If the set-based matching portion matches, then the 2nd part of the chained rule (which uses the standard regular expression strings) is executed. The end result of this configuration is that for normal, non-malicious users, the latency of running all of the ModSecurity inspection rules will be decreased.
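To make the pre-qualifier idea concrete, here is a minimal Python sketch. It is not ModSecurity's implementation: the AhoCorasick class, the KEYWORDS list, the CONFIRM_RX expression and the inspect function are all made up for illustration, and the regular expression is a heavily simplified stand-in for the Core Rule. It only shows the shape of the technique: one pass over the input with a keyword automaton, and the regular expression runs only when a keyword was seen.

```python
import re
from collections import deque


class AhoCorasick:
    """Minimal Aho-Corasick automaton for a set of plain-text keywords."""

    def __init__(self, keywords):
        self.goto = [{}]   # per-state transitions: character -> next state
        self.fail = [0]    # per-state failure link
        self.out = [[]]    # per-state list of keywords that end here
        for word in keywords:
            self._insert(word)
        self._build_failure_links()

    def _insert(self, word):
        state = 0
        for ch in word:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].append(word)

    def _build_failure_links(self):
        # Breadth-first walk; depth-1 states keep their failure link at the root.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit keywords that end at the failure target.
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def first_match(self, text):
        """Return the first keyword found in text, or None."""
        state = 0
        for ch in text:
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            if self.out[state]:
                return self.out[state][0]
        return None


# Hypothetical, shortened keyword list and confirmation regex.
KEYWORDS = ["select", "mysql.user", "charindex", "@@spid"]
PREFILTER = AhoCorasick(KEYWORDS)
CONFIRM_RX = re.compile(
    r"\b(?:select\b.{0,40}\b(?:substring|ascii|user)|mysql\.user|charindex)\b|@@spid\b"
)


def inspect(value):
    """Return the matched signature text, or None if the value looks clean."""
    value = value.lower()
    if PREFILTER.first_match(value) is None:
        return None                      # fast path: no keyword, skip the regex
    match = CONFIRM_RX.search(value)     # slow path: confirm with the regex
    return match.group(0) if match else None
```

A value such as "aaa" never reaches the regular expression, while "please select a color" passes the keyword check but is then rejected by the regular expression, which is the division of labour the chained rule relies on.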
Let's take a look at this Blind SQL Injection rule from the Core Rules -
SecRule REQUEST_FILENAME|ARGS|ARGS_NAMES|REQUEST_HEADERS|XML:/*|!REQUEST_HEADERS:Referer
"(?:\b(?:(?:s(?:ys\.(?:user_(?:(?:t(?:ab(?:_column|le)|rigger)|object|view)s|c(?:onstraints
|atalog))|all_tables|tab)|elect\b.{0,40}\b(?:substring|ascii|user))|m(?:sys(?:(?:queri|ac)e|
relationship|column|object)s|ysql\.user)|c(?:onstraint_type|harindex)|waitfor\b\W*?\bdelay|
attnotnull)\b|(?:locate|instr)\W+\()|\@\@spid\b)" \
"capture,t:htmlEntityDecode,t:lowercase,t:replaceComments,ctl:auditLogParts=+E,log,auditlog,
msg:'Blind SQL Injection Attack. Matched signature <%{TX.0}>',id:'950007',severity:'2'"
We can now update this rule to become a chained rule and use the @pm operator to run some pre-qualifier checks -
SecRule REQUEST_FILENAME|ARGS|ARGS_NAMES|REQUEST_HEADERS|XML:/*|!REQUEST_HEADERS:Referer
"@pm sys.user_triggers sys.user_objects @@spid msysaces instr sys.user_views sys.tab
charindex sys.user_catalog constraint_type locate select msysobjects attnotnull sys.user_tables
sys.user_tab_columns sys.user_constraints mysql.user sys.all_tables msysrelationships
msyscolumns msysqueries" \
"chain,t:htmlEntityDecode,t:lowercase,t:replaceComments,ctl:auditLogParts=+E,log,auditlog,
msg:'Blind SQL Injection Attack. Matched signature <%{TX.0}>',id:'950007',severity:'2'"
SecRule REQUEST_FILENAME|ARGS|ARGS_NAMES|REQUEST_HEADERS|XML:/*|!REQUEST_HEADERS:Referer
"(?:\b(?:(?:s(?:ys\.(?:user_(?:(?:t(?:ab(?:_column|le)|rigger)|object|view)s|c(?:onstraints
|atalog))|all_tables|tab)|elect\b.{0,40}\b(?:substring|ascii|user))|m(?:sys(?:(?:queri|ac)e|
relationship|column|object)s|ysql\.user)|c(?:onstraint_type|harindex)|attnotnull)\b|(?:locate|
instr)\W+\()|\@\@spid\b)" "capture,t:htmlEntityDecode,t:lowercase,t:replaceComments"
Now, let's test out the new rules to see what the processing time is for each of these rules if the request is normal. First let's look at what the time is for the normal Core Rule -
Executing operator "rx" with param "(?:\\b(?:(?:s(?:ys\\.(?:user_(?:(?:t(?:ab(?:_column|le)|rigger)|obj
ect|view)s|c(?:onstraints|atalog))|all_tables|tab)|elect\\b.{0,40}\\b(?:substring|ascii|user))|m(?:sys(
?:(?:queri|ac)e|relationship|column|object)s|ysql\\.user)|c(?:onstraint_type|harindex)|waitfor\\b\\W*?\
\bdelay|attnotnull)\\b|(?:locate|instr)\\W+\\()|\\@\\@spid\\b)" against ARGS:LoginEmail.
Target value: "aaa"
Operator completed in 14 usec.
Notice that it took approximately 14 usec for this optimized regular expression rule to run. Now, let's contrast this with the same rule running with the @pm operator -
Executing operator "pm" with param "sys.user_triggers sys.user_objects @@spid msysaces instr sys.user_views sys.tab charindex sys.user_catalog constraint_type locate select msysobjects attnotnull sys.user_tables sys.user_tab_columns sys.user_constraints mysql.user sys.all_tables msysrelationships msyscolumns msysqueries" against ARGS:LoginEmail.
Target value: "aaa"
Operator completed in 9 usec.
As you can see, the processing time was decreased down to just 9 usec! This may not seem like much, however keep in mind that this is just for one rule. The overall effect of using the set-based pattern matching operators will become apparent when you are using a larger number of rules. Keep an eye out for updates to the Core Rules as they will be changing in the future to better leverage these new operators.
Posted by rcbarnett at 09:41 PM
Using Transactional Variables Instead of SecRuleRemoveById
Posted by rcbarnett on December 04, 2007.
Using SecRuleRemoveById to handle false positives
The SecRuleRemoveById directive is most often used when ModSecurity users are trying to deal with a false positive situation. Used on its own, it is a global directive that disables a previously specified rule based on its rule id number. While users can technically take this approach and just use SecRuleRemoveById on its own, we caution against this. Just because a rule triggered a false positive match does not mean that the only recourse is to disable the rule entirely! Remember, the rule was created to address a specific security issue, so every effort should be made to disable a rule or make an exception only in specific cases.
Limitations of SecRuleRemoveById
The problem is that SecRuleRemoveById is somewhat limited in its capabilities for selectively disabling rules. One of the common methods of attempting to selectively disable a ModSecurity rule is to nest the SecRuleRemoveById directive inside of an Apache scope directive (such as Location) like this -
<Location /path/to/foo.php>
SecRuleRemoveById 950009
</Location>
There currently aren't many other options for using SecRuleRemoveById to disable a rule other than triggering on URI location as shown above. A similar issue was identified with other global directives and was addressed in ModSecurity 2.0 by making it possible to update these settings on a per rule basis by using the "ctl:" action. In future versions of ModSecurity we will implement a "ctl:RemoveById" action to handle this. In the meantime, however, what else can a user do to selectively disable rules based on arbitrary request data?
Using Transactional Variables (TX)
The approach that I am going to discuss is meant as an example only and its usage should be fully considered prior to implementation. I believe that the TX variable is not currently being widely used by ModSecurity users. This may be caused by two main reasons - 1) The Core Rules don't use them, and 2) We don't have proper "use-case" documentation showing how you might use it more effectively. It is with the latter issue that I hope this post will help.
Transaction variables are really cool and Ivan explained their general usage in a SecurityFocus interview. Here are the relevant sections -
The addition of custom variables in ModSecurity v2.0 (along with a number of related improvements) marks a shift toward providing a generic tool that you can use in almost any way you like. Variables can be created using variable expansion or regular expression sub-expressions. Special supports exists for counters, which can be created, incremented, and decremented. They can also be configured to expire or decrease in value over time. With all these changes ModSecurity essentially now provides a very simple programming language designed to deal with HTTP. The ModSecurity Rule Language simply grew organically over time within the constraints of the Apache configuration file.
In practical terms, the addition of variables allows you to move from the "all-or-nothing" type of rules (where a rule can only issue warnings or reject transactions) to a more sensible anomaly-based approach. This increases your options substantially. The all-or-nothing approach works well when you want to prevent exploitation of known problems or enforce positive security, but it does not work equally well for negative security style detection. For the latter it is much better to establish a per-transaction anomaly score and have a multitude of rules that will contribute to it. Then, at the end of your rule set, you can simply test the anomaly score and decide what to do with the transaction: reject it if the score is too large or just issue a warning for a significant but not too large value.
What I am about to show is an implementation of this concept.
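Before getting to the ModSecurity configuration, the scoring idea from the interview can be sketched in a few lines of Python. This is only an illustration: the rule names, weights and thresholds below are invented, and a real rule set would tune them carefully.

```python
import re
from typing import Callable, NamedTuple


class ScoringRule(NamedTuple):
    name: str
    weight: int                      # how much a match adds to the score
    matches: Callable[[str], bool]   # predicate over the (lowercased) value


# Hypothetical rules: each one contributes to the score instead of blocking.
RULES = [
    ScoringRule("sql keyword", 3, lambda v: re.search(r"\bselect\b", v) is not None),
    ScoringRule("quote character", 2, lambda v: "'" in v),
    ScoringRule("overlong value", 1, lambda v: len(v) > 100),
]

WARN_AT = 2     # hypothetical threshold: log a warning
REJECT_AT = 5   # hypothetical threshold: reject the transaction


def anomaly_score(value):
    """Sum the weights of every rule that matches the value."""
    lowered = value.lower()
    return sum(rule.weight for rule in RULES if rule.matches(lowered))


def decide(value):
    """Test the score once, at the end, and pick an outcome."""
    score = anomaly_score(value)
    if score >= REJECT_AT:
        return "reject"
    if score >= WARN_AT:
        return "warn"
    return "pass"
```

A name such as O'Brien only earns a warning here, while a value that contains both a quote and a SQL keyword crosses the reject threshold. No single rule makes the decision on its own.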
Enabling/Disabling rules using TX variables
The first step in this process is to update your modsecurity_crs_15_customrules.conf file to specify which rules will be active. If you aren't familiar with the modsecurity_crs_15_customrules.conf file and its usage, please see this prior Blog post. The following entry uses the SecAction directive to set two different TX variables -
# Set the enforce variable to 0 to disable and 1 to enable
# Rule ID 950002 is for "System Command Access"
SecAction "phase:1,pass,nolog,setvar:tx.ruleid_950002_enforced=1, \
setvar:tx.ruleid_950002_matched=0"
As the comment text indicates, you can quickly toggle whether or not this rule is active by changing the tx.ruleid_950002_enforced variable to 0. With this directive, every request will have these two TX variables initially set. If you have ever seen any of those nature shows on television where the researchers capture an animal, tag it and then release it back into the wild, we are essentially doing the same thing. We are just "tagging" the current request with some data that will be updated and/or evaluated by later rules.
Altering the Core Rules
The next step in this process is to update the individual Core Rules files to edit the rules so that instead of applying a disruptive action (such as deny), they will only set a new TX variable upon a match. The idea is to decouple the evaluation of the attack pattern in the transaction from the disruptive action application (which will happen in the next step). Here is an example from the modsecurity_crs_40_generic_attacks.conf file for the command access rule -
#
# Command access
#
SecRule REQUEST_FILENAME "\b(?:n(?:map|et|c)|w(?:guest|sh)|cmd(?:32)?|telnet|rcmd|ftp)\.exe\b" \
"capture,t:htmlEntityDecode,t:lowercase,log,pass,id:'12345',msg:'System Command Access. \
Matched signature <%{TX.0}>',setvar:tx.ruleid_950002_matched=1"
Now, if an inbound request matched this rule, then the tx variable called "ruleid_950002_matched" will be set to "1". This updates the original setting of this variable from the SecAction in the modsecurity_crs_15_customrules.conf file. This rule will also log the detection of this rule to the error_log file.
Evaluating the TX variables for blocking
The next step is to add a new rule to your modsecurity_crs_60_customrules.conf file to actually implement the blocking aspect of this process -
SecRule TX:RULEID_950002_ENFORCED "@eq 1" "chain,t:none,ctl:auditLogParts=+E,deny,log, \
auditlog,status:501,msg:'System Command Access. Matched signature <%{TX.0}>',id:'950002',severity:'2'"
SecRule TX:RULEID_950002_MATCHED "@eq 1"
The above example is a chained rule set where the first line checks to see if this rule should even be evaluated. If the tx value is set to 1 (meaning yes, we are evaluating this rule) then it will go to the 2nd part of the chained rule and check to see if the matched TX value is 1 (meaning that the inbound request matched the RegEx check from the modsecurity_crs_40_generic_attacks.conf file). If both of these TX values return true then the entire chained rule matches, the actions on the 1st line are triggered and the request is denied. Here is what the short error_log message would look like -
[Sat Jun 23 18:04:54 2007] [error] [client 192.168.1.103] ModSecurity: Access denied with code 501 (phase 2). \
Operator EQ match: 1. [id "950002"] [msg "System Command Access. Matched signature"] [severity "CRITICAL"] \
[hostname "www.example.com"] [uri "/bin/ftp.exe"] [unique_id "@D6NJMCoD4QAABSNAoMAAAAA"]
What does this approach do for you?
At this point, you may be asking "Ok, how are these rules any different from the Core Rules? Didn't you just make the rules more complex?" It is true that functionally speaking, these new rules work exactly the same as the current Core Rule ID 950002. If a client sent a request containing one of those OS commands, it would be blocked by either rule set.
Advantages of this approach
The advantage of using this approach is that you now have extended flexibility to decide under what circumstances a rule will be evaluated, and under what circumstances an exception can be made to a rule.
1) You could disable rules in phase:1. With the current approach of SecRuleRemoveById being used inside Apache scope directives, you could only run within phase:2 or beyond. With this approach, you could easily create a rule that runs in phase:1 and evaluates some variable (perhaps a remote IP or something) and then just sets "setvar:tx.ruleid_950002_enforced=0" and it will disable that rule.
2) Besides just deciding on whether or not the rule itself will be evaluated, you could also selectively decide if an inbound request matches the rule or not. Let's say that you keep having a false positive on rule ID 950002 when a client uses a specific web client (user-agent string). You could then easily add a rule to your modsecurity_crs_60_customrules.conf file to check for this user-agent string value and then use "setvar:tx.ruleid_950002_matched=0" to set the TX variable back to 0 even if the rule had matched in the modsecurity_crs_40_generic_attacks.conf file :) Here is an example rule you would place in the *60* file before the blocking rule -
SecRule REQUEST_HEADERS:User-Agent "^Browser_1234$" \
"phase:2,log,t:none,id:'123456',setvar:tx.ruleid_950002_matched=0"
As you can see, using this approach you have much more flexibility to determine when and where you want to implement an exception to a rule and you can then use "setvar" to easily change the TX variables. This provides you with many more options than using a global directive.
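If it helps to see the whole flow in one place, here is a rough Python model of the four steps described in this post: initialise the flags, detect, apply exceptions, then decide. It is only a sketch of the logic, not of how ModSecurity works internally. The evaluate function and its return values are invented for illustration, while the flag names and regular expressions are copied from the rules above.

```python
import re

# Detection pattern copied from the command access rule shown earlier.
COMMAND_RX = re.compile(
    r"\b(?:n(?:map|et|c)|w(?:guest|sh)|cmd(?:32)?|telnet|rcmd|ftp)\.exe\b"
)
# Exception pattern copied from the User-Agent rule shown earlier.
EXCEPTION_UA_RX = re.compile(r"^Browser_1234$")


def evaluate(filename, user_agent, enforced=1):
    """Return "deny" or "pass" for one request (hypothetical helper)."""
    # Step 1 (the *15* file): tag the transaction with both flags.
    tx = {"ruleid_950002_enforced": enforced, "ruleid_950002_matched": 0}

    # Step 2 (the *40* file): detection only records the match.
    if COMMAND_RX.search(filename.lower()):
        tx["ruleid_950002_matched"] = 1

    # Step 3 (the *60* file, before blocking): exceptions reset the flag.
    if EXCEPTION_UA_RX.search(user_agent):
        tx["ruleid_950002_matched"] = 0

    # Step 4 (the *60* file): block only if enforced AND matched.
    if tx["ruleid_950002_enforced"] == 1 and tx["ruleid_950002_matched"] == 1:
        return "deny"
    return "pass"
```

Because detection and blocking are separate steps, either flag can be changed by any rule that runs in between, based on whatever request data you care about.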
Disadvantages of this approach
1) This approach pretty much goes against the recommendations that I have been promoting previously about trying to limit editing of the Core Rules themselves.
2) This approach also introduces more directives than would normally be present in your configurations. As we have stated in many previous posts, the more rules you have, the higher the impact on performance. This means that users who are concerned with performance may not want to use this approach.
Remember, however, that I said that the purpose of this post is simply to present an alternative approach to evaluating requests and to show a use-case example of using TX variables.
Posted by rcbarnett at 07:18 PM
Virtual Patching During Incident Response: United Nations Defacement
Posted by rcbarnett on August 27, 2007.
Virtual Patching is a policy for a web application firewall (in this case ModSecurity) that is able to identify attempts to exploit a specific Website vulnerability. ModSecurity analyzes transactions and intercepts attacks in transit, so malicious traffic never reaches the target Website. The end result is that even if a vulnerability still exists within the application’s source code, the virtual patch will protect against clients attempting to exploit it.
Virtual Patching is an extremely valuable technique that can be used to provide immediate protection against identified vulnerabilities. The trick here, however, is that you first must identify them! You can’t really create a patch if you don’t know what the problem is. There are six main processes that may yield vulnerability information that you can then take action on by virtually patching them:
- Vendor contact (e.g. pre-warning)
- Public disclosure
- Bug report to the development team
- Vulnerability assessment (internal or external)
- Code review
- Security incident
All of these scenarios are somewhat similar in that they all provide vulnerability information in reports of some sort. The only exception is the last one, a Security Incident. This is a unique situation in that there are no ifs, ands or buts involved in the discussions as to whether or not you need to respond to this issue. Any sort of lead time that you may have been counting on for a normal patching process, source code fix or system outage is suddenly thrown out the window as proper Incident Response steps require you to act immediately. This is where Virtual Patching can prove to be invaluable.
Want a real life example?
In case you missed it, the United Nations (UN) website was recently defaced by a defacement trio known as "KEREM125 M0STED AND GSY." They defaced the site by adding html text to the speeches page. An archived screen shot is located here - notice the text under the "Latest speeches" window. And then here on the specific speech page.
While the details of the specific attack vector can not be 100% confirmed, it is suspected that the attackers used an SQL Injection vulnerability. A software developer named Giorgio Maone chronicled this incident on his Blog site. Maone partly deduced that SQL Injection was the likely attack vector by the missing apostrophe/single-quote in the word "dont" in the defacement text. Single-quotes are normally a key component of creating proper SQL query syntax and it is assumed that attempting to include it in the text would have complicated the SQL Injection attack. Maone also showed that the "statID" parameter for the statments_full.asp page is the most likely candidate for the attack as this URL – "http://www.un.org/apps/news/infocus/sgspeeches/statments_full.asp?statID=105'" – reveals the following DB error message:
ADODB.Recordset.1 error '80004005'
SQLState: 37000
Native Error Code: 8180
SQLState: 37000
Native Error Code: 105
[MERANT][ODBC SQL Server Driver][SQL Server]Unclosed quotation mark before the character string ''.
[MERANT][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared.
/apps/news/infocus/sgspeeches/statments_full.asp, line 26
UN Incident Response Issues
Once the defacement was identified, the main UN site was taken offline and the following message was presented to clients:
This site will be temporarily unavailable due to scheduled maintenance.
After a number of hours, the site came back online. Unfortunately, the vulnerability had not been patched, as the same error message could still be generated. It seemed that there was some sort of basic filter in place that attempted to filter out the single-quote character; however, this is not a sufficient fix as there are SQL Injection queries that do not rely on this character. There is also the possibility of bypassing this filter by using the char() function. A short time later, the entire site was offline and presented clients with this message:
The UN website is undergoing urgent maintenance and is currently unavailable. Please check back in a short while.
When the site did come back online the Speeches section was not available and the same old vulnerability was still present...
What was happening?
We can only speculate at this point as to what was happening behind closed doors in the UN Incident Response team meetings, however they obviously had difficulty with addressing the standard "Eradication Phase" of the issue. When the choice is either being totally offline while waiting for the source code to be fixed vs. putting the site back online and monitoring the logs more closely for issues, the latter will always win out.
How a Virtual Patch could have helped
Looking at the URL again, we can narrow down the issue to the statments_full.asp application and specifically to the statID parameter. Looking at the normal, expected values associated with the statID parameter you can see that the data should only be digits. The following ModSecurity Virtual Patch could have been used to fix this issue by implementing a positive security ruleset:
<Location /apps/news/infocus/sgspeeches/statments_full.asp>
SecRule &ARGS "!@eq 1"
SecRule ARGS_NAMES "!^statid$"
SecRule ARGS:statID "!^\d{1,3}$"
</Location>
This rule uses the normal Apache Location directive as a container for the ModSecurity rules. Inside this location, we are enforcing the following three rules:
- The statments_full.asp page will only accept 1 argument
- The name of the argument must be statID (note that the rule uses all lowercase as there are certain transformation functions being applied to normalize traffic)
- The value of statID can only be 1 to 3 numeric digits
If this rule were in place, the example URL provided above would have been denied with this alert message:
[Wed Jun 13 01:06:37 2007] [error] [client 192.168.15.1] ModSecurity: Access denied with code 403 (phase 2). Match of "rx ^\\\\d{1,3}$" against "ARGS:statID" required. [file "/usr/local/apache/conf/rules/modsecurity_crs_15_customrules.conf"] [line "4"] [hostname "www.un.org"] [uri "/apps/news/infocus/sgspeeches/statments_full.asp?statID=105'"] [unique_id "lCFILsCoD4QAABWcDp4AAAAD"]
This Virtual Patch would have provided instant protection against this issue until the actual source code could be updated or fixed.
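For readers who think more easily in code than in rules, the three positive security checks can be modelled in Python as below. This is only a sketch of the logic under a few assumptions: the violations function and its messages are invented, the query string is parsed with Python's standard library rather than by ModSecurity, and the argument name is lowercased by hand to mimic the normalization mentioned in the list above.

```python
import re
from urllib.parse import parse_qsl

# 1 to 3 ASCII digits, nothing else.
STATID_RX = re.compile(r"\d{1,3}", re.ASCII)


def violations(query_string):
    """Return a list of policy violations; an empty list means the request is allowed."""
    args = parse_qsl(query_string, keep_blank_values=True)
    problems = []

    # Check 1: the page accepts exactly one argument.
    if len(args) != 1:
        problems.append("expected exactly one argument, got %d" % len(args))

    for name, value in args:
        # Check 2: the argument must be named statID (compared in lowercase).
        if name.lower() != "statid":
            problems.append("unexpected argument name: %r" % name)
        # Check 3: the value must be 1 to 3 digits.
        elif not STATID_RX.fullmatch(value):
            problems.append("invalid statID value: %r" % value)
    return problems
```

With this model, "statID=105" produces no violations, while the example URL's "statID=105'" fails the value check, which matches the alert shown above.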
Posted by rcbarnett at 04:31 PM
On Your Marks, Get Set, Go: Vulnerability Mitigation Race
Posted by rcbarnett on July 27, 2007.
In many ways vulnerability remediation is like a Track and Field race and the firing of the starter's pistol is the public vulnerability announcement. The goal of the race is to be the first one to either exploit or patch a vulnerability. The participants in the race may include: 1) Organizations running the vulnerable application, 2) Attackers looking to exploit the vulnerability manually, or 3) The odds-on favorite to win the race, an automated worm program. Organizations looking to mitigate or patch their systems are the long shots to win this race. Let's look at a breakdown of the challenges that organizations face:
Not Hearing the Starter's Pistol
Unfortunately, many organizations don't realize that they are even in a race! This can be attributed to poor monitoring of vulnerability alerts. If you aren't signed up on your Vendor's mail-list or you don't have someone checking out US-CERT or the SANS Internet Storm Center (ISC) daily then you are immediately giving the attackers a 50 yard lead in this race...
What Do You Mean We Don't Have The Baton?
If you are running in a relay race, you need to have a baton to pass to each member of your team. In this case, the baton is the vendor's security patch. You might be ready, willing and able to start the patching process, however if the vendor doesn't release the patch, you can't really start the race then can you?
Not Getting A Clean Handoff
Each leg of the relay could be thought of as a step in the patching process such as installation on a test host, then pushing the patch out to development, then regression testing and finally out to production. As each phase completes its tasks, it then needs to notify the next group and "hand off the baton" so they can move forward with testing. If this doesn't happen, then the patch will never make it to the finish line - which is when the patches are applied to production hosts. I can't tell you how many times I have seen customers who have patches that make it through one or two phases but then just seem to fall off the priority list.
Getting Disqualified
In a relay race, if you step outside of your lane, you can be disqualified. Similarly, if a security patch ever causes any sort of disruption to normal service then the patch is usually not applied. If there are problems during regression testing, then odds are that the security patch will not make it to the finish line. In the end, functionality will always trump security.
Let's Not Pull A Hamstring
Many organizations want to minimize being disqualified so they take a rather slow, methodical approach to the race and decide just to walk it. These are the organizations who only have quarterly downtime for patching. These companies may get a ribbon for participation but they will never win the race.
Don't Have A Lane To Run In
What happens if you are not able to apply any patches at all to your web application? Two valid scenarios may be companies who have outsourced the development of their web application and/or who are using an older version of a COTS product where the vendor is no longer providing patches. What options are left for these companies to compete in this race?
Evening The Odds
So, where does that leave us then? Is there anything that organizations can do to even the playing field in this race? The answer is yes. Virtual Patching can help by providing immediate mitigation of the vulnerability. If an organization implements a Virtual Patch on a web application firewall, it acts as a stop-gap measure to prevent remote exploitation of the vulnerability until the actual patch is applied. Using the relay race analogy again, this would be like forcing the attackers to run a steeplechase type of race where there are water pits and 10 ft. tall hurdles in their lane while you are allowed to run a normal race without any obstacles. In this type of scenario, you have a much better chance of beating the attackers to the finish line and protecting your web applications.
Virtual Patching Webcast
If you would like to know more about Virtual Patching, Ivan Ristic and I will be jointly presenting a webcast on this topic very soon -
Date: Wednesday, August 8, 2007
Time: 8AM, Pacific DT
Registration Link
Posted by rcbarnett at 05:25 PM
ScallyWhack: ModSecurity Rules Package to Deal with Trac Comment Spam
Posted by ivanr on June 29, 2007.
Michael Renzmann wrote to the ModSecurity mailing list recently announcing project ScallyWhack. It's a set of rules specially designed to detect comment spam against Trac installations. This is interesting for several reasons. It's a project with potential to be very useful for many people running Trac, it appears to be well thought out and well designed, but it is also the first independent project to focus on writing rules to fit a purpose. First of many I hope!
Posted by ivanr at 09:01 AM
Optimizing Regular Expressions
Posted by ofer on June 12, 2007.
As many of you have noticed, the Core Rule Set contains very complex regular expressions. For example:
(?:\b(?:(?:s(?:elect\b(?:.{1,100}?\b(?:(?:length|count|top)\b.{1,100}
?\bfrom|from\b.{1,100}?\bwhere)|.*?\b(?:d(?:ump\b.*\bfrom|ata_type)|
(?:to_(?:numbe|cha)|inst)r))|p_(?:(?:addextendedpro|sqlexe)c|...
These regular expressions are assembled from a list of simpler regular expressions for efficiency reasons. A single optimized regular expression test takes much less time than a series of simpler regular expression tests. The downside is readability and ease of editing. A future version of ModSecurity will overcome this limitation, but meanwhile, in order to optimize performance you have to think about optimization yourself.
Manual assembly and optimization is both hard and error prone, so for the Core Rule Set we use a clever Perl Module: Regexp::Assemble. As the name suggests, Regexp::Assemble knows how to assemble a number of regular expressions into one optimized regular expression.
Since Regexp::Assemble is not a program, but rather a Perl module, you will need some glue code to use it. The following instructions will help you if you are not a Perl wizard.
If you don't have Perl, you will need to install it. The easiest Perl distribution to install, especially if you use Windows, is ActivePerl.
Now install Regexp::Assemble. If you used ActivePerl, you can use the following command:
ppm install regexp-assemble
If you use another Perl distribution, you will need to download the module and use the normal Perl module installation procedure as outlined in the README file.
Once you have Perl and Regexp::Assemble installed, all you need is this little script:
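#!/usr/local/bin/perl
# Read one regular expression per line and print a single combined expression.
use strict;
use warnings;
use Regexp::Assemble;

my $ra = Regexp::Assemble->new;
while (my $line = <>)
{
    chomp $line;             # drop the trailing newline
    next if $line eq '';     # skip blank lines
    $ra->add($line);
}
print $ra->as_string() . "\n";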
The script will take either standard input or an input file with each line containing a regular expression and print out the optimized expression:
regexp_builder.pl simple_regexps.txt > optimized_regexp.txt
On a Unix system you might need to change the first line to point to the local Perl interpreter. On Windows you may need to precede the script name with the command 'perl'.
And if all this is too complex, you can just download the pre-compiled version for Windows.
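To get a feel for what the assembly step does, here is a rough Python sketch of the underlying idea. It is not Regexp::Assemble and it is far less capable: it only handles literal strings (not arbitrary regular expressions), and the build_trie, trie_to_regex and assemble names are made up for this example. It merges shared prefixes through a trie so that the common parts of the alternatives are written once.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the empty-string key marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return trie


def trie_to_regex(node):
    """Turn a trie node into a regex fragment that shares common prefixes."""
    if "" in node and len(node) == 1:
        return ""                       # leaf: nothing more to match
    optional = "" in node               # a word may end here and also continue
    alternatives = [
        re.escape(ch) + trie_to_regex(node[ch])
        for ch in sorted(key for key in node if key != "")
    ]
    if len(alternatives) == 1 and not optional:
        return alternatives[0]          # single path: no group needed
    group = "(?:" + "|".join(alternatives) + ")"
    return group + "?" if optional else group


def assemble(words):
    """Combine literal strings into one prefix-merged regular expression."""
    return trie_to_regex(build_trie(words))


# Usage: three literals that share prefixes become one expression.
pattern = assemble(["select", "sys.tab", "sys.all_tables"])
combined = re.compile(pattern)
```

Printing pattern shows the same nested, prefix-sharing shape as the Core Rule Set expressions, only on a much smaller scale.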
Posted by ofer at 08:02 PM
ModSecurity Rule for Full-width/Half-width Unicode Evasion Detection
Posted by rcbarnett on May 23, 2007.
You have probably heard it by now, but US-CERT released a Vulnerability Note last week entitled "HTTP content scanning systems full-width/half-width Unicode encoding bypass." The short of it is that many HTTP content scanning systems (think IDS/IPS/WAFs) may not be able to properly decode data that is encoded using Unicode full-width/half-width encoding thus allowing a possible evasion issue for malicious traffic.
This is yet another Impedance Mismatch issue where one host may interpret data a certain way while another interprets it differently. In this case, security devices that use decoding functions may not properly decode the data and therefore cannot apply certain signatures. If this is the case, then you would have a false negative if the request was malicious and the destination host is able to decode the data and process it.
The $1,000,000 question here is does this issue affect ModSecurity? Yes. The %u syntax is a Microsoft-specific extension. While ModSecurity does not decode such encodings by default (meaning you have to explicitly address the issue in your rules if you need this feature) you can choose to decode them using the urlDecodeUni transformation function. In the current version of ModSecurity this transformation function cannot deal with the above-mentioned evasion technique. (On the positive side, the transformation function behaves exactly as documented in the reference manual.) Fortunately, it is quite easy to create a ModSecurity rule that can identify and block any use of this type of encoding. Here is the example rule that you can use:
# Disallow use of full-width unicode
SecRule REQUEST_FILENAME|ARGS|ARGS_NAMES|REQUEST_HEADERS|XML:/*|!REQUEST_HEADERS:Referer \
"\%u[fF]{2}[0-9a-fA-F]{2}" \
"t:none,deny,log,auditlog,status:400,msg:'Unicode Full/Half Width Abuse Attack Attempt',id:'950116',severity:'4'"
This rule is also included in the latest development release of the Core Rules.
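To see what the pattern in this rule does and does not catch, here is a minimal Python sketch. It only exercises the regular expression against a few hypothetical request values; it is not how ModSecurity evaluates the rule, and the function name is made up for illustration.

```python
import re

# Same pattern as the rule above. The backslash before "%" is dropped
# because Python's re module does not need it.
FULL_WIDTH = re.compile(r"%u[fF]{2}[0-9a-fA-F]{2}")

def uses_full_width_encoding(raw_value: str) -> bool:
    """Return True if the raw (undecoded) value contains a %uFFxx sequence."""
    return FULL_WIDTH.search(raw_value) is not None

# Hypothetical sample values, not taken from real traffic.
samples = [
    "q=%uFF1Cscript%uFF1E",  # full-width '<' and '>' (U+FF1C, U+FF1E)
    "q=%u0041",              # ordinary %u escape for 'A'
    "q=hello%20world",       # plain URL encoding
]
for value in samples:
    print(value, uses_full_width_encoding(value))
```

Note that the check runs on the raw value (the rule uses t:none), so it flags the encoding itself rather than trying to decode it.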
Posted by rcbarnett at 01:45 AM
Regular Expression Development Tools
Posted by ofer on March 29, 2007.
Since ModSecurity is based on regular expressions, a lot of rule creation requires developing and testing regular expressions. Therefore I looked for a tool that can be used to test regular expressions for validity and accuracy before using the regular expression in a ModSecurity rule. I found two free tools that let you do that:
- The Regex Coach is simple and powerful. You simply type your expression in the upper box and the text to match in the bottom one, and any matches found are highlighted in the text. In between the boxes you can control the regular expression flags such as “ignore case” or “global match”. The Regex Coach does not stop there: it provides insight into the regular expression matching process by showing a tree view of the regular expression and letting you follow the matching process step by step.
- Expresso - Unfortunately The Regex Coach chokes on the regular expressions we use in ModSecurity Core Rule Set. So I searched and found an alternative that works fine with our regular expressions: Expresso. While free, it is not your typical open source software. Apart from using .NET framework, it politely asks for a (free) registration and generally seems to move away from free. It is also more complex and while very strong on peripheral features such as a library of regular expressions and saving your test work in a project file, it actually knows less about regular expressions. But it works with complex ones.
Posted by ofer at 04:20 PM
2.X/1.X Rule differences for identifying missing/empty headers and variables
Posted by rcbarnett on March 22, 2007.
There are certain scenarios where you might want to create white-listed ModSecurity rulesets which enforce that certain headers/variables are both present and not empty. This Blog entry highlights an important difference between the 1.X and 2.X ModSecurity Rules Languages in this regard.
ModSecurity 2.X Rules
There are some good examples of this in the Core Rules file modsecurity_crs_21_protocol_anomalies.conf. Take for example the following entries -
SecRule &REQUEST_HEADERS:Host "@eq 0" "skip:1,log,auditlog,msg:'Request Missing a Host Header',id:'960008',severity:'4'"
SecRule REQUEST_HEADERS:Host "^$" "log,auditlog,msg:'Request Missing a Host Header',id:'960008',severity:'4'"
The 1st rule uses the "&" operator to enable counting the number of variables in a collection. So, this 1st rule will identify if the Host header is not present in the request. The 2nd rule uses the RegEx "^$" that will match if the Host header is present, but is empty.
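The distinction between the two rules can be sketched in a few lines of Python. This is only an illustration of the logic (count the header first, then test its value); the function name and the list-of-pairs header representation are assumptions for the example, not anything ModSecurity exposes.

```python
def classify_host_header(headers):
    """Classify the Host header as 'missing', 'empty' or 'present'.

    `headers` is a list of (name, value) pairs, which allows duplicates.
    """
    values = [value for name, value in headers if name.lower() == "host"]
    if len(values) == 0:
        # Same idea as: SecRule &REQUEST_HEADERS:Host "@eq 0"
        return "missing"
    if any(value == "" for value in values):
        # Same idea as: SecRule REQUEST_HEADERS:Host "^$"
        return "empty"
    return "present"

print(classify_host_header([("User-Agent", "curl")]))
print(classify_host_header([("Host", "")]))
print(classify_host_header([("Host", "www.example.com")]))
```

Keeping the two checks separate is what lets you later relax one of them (for example, tolerate empty headers) without losing the other.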
ModSecurity 1.X Rules
The older ModSecurity 1.X rules language operated a bit differently. If you wanted to use a similar rule as the one above, you could use this rule -
SecFilterSelective HTTP_Host "^$"
This rule means two different things -
1) The Host header is missing and/or
2) The Host header is empty
Which is better?
On the surface, you might think "The 1.X rules way is better since you only need 1 rule..." however you need to realize that anytime you have rules or directives that implicitly enforce certain capabilities you run the risk of having false positives as it could match things that you didn't want them to. For instance, what if you have a situation where certain web clients (such as mobile devices) legitimately include some headers, however they are empty? Do you want to automatically block these clients? With the ModSecurity 1.X Rule Language, you would have to remove the entire rule. With the ModSecurity 2.X Rule Language, however, you are able to create rules to more accurately apply the logic that you desire.
Posted by rcbarnett at 01:52 PM
Handling False Positives and Creating Custom Rules
Posted by rcbarnett on February 16, 2007.
It is inevitable; you will run into some False Positive hits when using web application firewalls. This is not something that is unique to ModSecurity. All web application firewalls will generate false positives from time to time. The following information will help to guide you through the process of identifying, fixing, implementing and testing new custom rules to address false positives.
Every rule set can have false positives in new environments
False Positives happen with ModSecurity + the Core Rules mainly as a by-product of the fact that the rules are "generic" in nature. There is no way to know exactly what web application is going to be run behind it. That is why the Core Rules are geared towards blocking the known bad stuff and forcing some HTTP compliance. This catches the vast majority of attacks.
Use DetectionOnly mode
Any new installation should initially use the log-only Rule Set version or, if no such version is available, set ModSecurity to detection only using the SecRuleEngine DetectionOnly command. After running ModSecurity in detection-only mode for a while, review the events generated and decide if any modification to the rule set should be made before moving to protection mode.
Don't be too hasty to remove a rule
Just because a particular rule is generating a false positive on your site does not mean that you should remove the rule entirely. Remember, these rules were created for a reason. They are intended to block a known attack. By removing this rule completely, you might expose your website to the very attack that the rule was created for. This would be the dreaded False Negative.
ModSecurity rules are open source
Thankfully, since ModSecurity’s rules are open source, this allows you the capability to see exactly what the rule is matching on and also allows you to create your own rules. With closed-source rules, you can not verify what it is looking for so you really have no other option but to remove the offending rule.
The logs are your friend
In order to verify if you indeed have a false positive, you need to review your logs. This means that you need to look in the audit_log file first to see what the ModSecurity message states. It will provide information as to which rule triggered. This same information is also available within the error_log file. The last place to look, and actually the best source of information, is the modsec_debug.log file. This file can show everything that ModSecurity is doing, especially if you turn up the SecDebugLogLevel to 9. Keep in mind, however, that increasing the verboseness of the debug log does impact performance. While increasing the verboseness for all traffic is usually not feasible, what you can do is to create a new rule that uses the “ctl” action to turn up the debugloglevel selectively. For instance, if you identify a False Positive from only one specific user, you could add in a rule such as this:
SecRule REMOTE_ADDR "^192\.168\.10\.69$" phase:1,log,pass,ctl:debugLogLevel=9
This will set the debugLogLevel to 9 only for requests coming from that specific source IP address. Perhaps that still generates a bit too much traffic. You could tighten this down a bit to increase the logging only for the specific file or argument that is causing the false positive:
SecRule REQUEST_URI "^/path/to/script\.pl$" phase:1,log,pass,ctl:debugLogLevel=9
or
SecRule ARGS:variablename "something" phase:1,pass,ctl:debugLogLevel=9
Now that you have verbose information in the debug log file, you can review it to ensure that you understand what portion of the request was being inspected when the specific rule triggered, and you can also view the payload after all of the transformation functions have been applied.
Try to avoid altering the Core Rules
In general, it is recommended that you try to limit your alteration of the Core Rules as much as possible. The more you alter the rule files, the less likely it will be that you will want to upgrade to the newer releases since you would have to recreate your customizations. What we recommend is that you try to contain your changes to your own custom rules file(s) that are particular to your site. This is where you would want to add new signatures and to also create rules to exclude False Positives from the normal Core Rules files. There are two main ways to integrate your custom rules so that they work with the Core Rules.
1. Adding new white-listing rules
If you need to add new white-listing rules so that you can, for instance, allow a specific client IP address to pass through all of the ModSecurity rules, you should place this type of rule after the modsecurity_crs_10_config.conf file but BEFORE the other Core Rules. This is accomplished by creating a new rule file called modsecurity_crs_15_customrules.conf and placing it in the same directory as the other Core Rules. This is assuming you are using the Apache Include directive to call up the Core Rules like this –
<IfModule security2_module>
Include conf/rules/*.conf
</IfModule>
By naming your file with the “_15_” string in it, it will be called up just after the config file. This will ensure that your new white-list rule will be executed early and you can then use such actions as allow and ctl:ruleEngine=Off to allow the request through the remainder of the rules.
2. Adding new negative policy rules
If you need to add new negative policy rules, such as when you need to update a Core Rule that is causing a false positive, you should add these rules to a new rule file that comes AFTER all of the other Core Rules. Call this new file something like modsecurity_crs_60_customrules.conf. Just make sure that the number in the filename is higher than that of any other rules file so it is read last. The rationale for placing these types of rules after the other rules is that you can then match up these new replacement rules with corresponding SecRuleRemoveById directives that will then disable the specific Core Rule(s) that are causing False Positives. It is important to note that you need to use SecRuleRemoveById AFTER ModSecurity has knowledge of the Rule ID you are actually removing. If you were to place this directive in the modsecurity_crs_15_customrules.conf file, it would not work correctly as the rule ID you are specifying does not exist yet. That is why this directive should be called up in your custom rules file that comes at the end. Using this method allows you to turn off rules without having to actually go into the Core Rules files and comment out or update specific rules.
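The file-naming scheme relies on load order. Assuming Apache expands the wildcard Include in alphabetical order (an assumption worth checking against your Apache version), the following Python sketch shows where the two custom files land relative to the Core Rules files named in this post. It only sorts names; it does not read any configuration.

```python
# File names taken from this post; the list is deliberately unsorted.
rule_files = [
    "modsecurity_crs_40_generic_attacks.conf",
    "modsecurity_crs_60_customrules.conf",
    "modsecurity_crs_10_config.conf",
    "modsecurity_crs_21_protocol_anomalies.conf",
    "modsecurity_crs_15_customrules.conf",
]

# Assumption: a wildcard Include processes files in alphabetical order,
# which plain string sorting reproduces for these names.
load_order = sorted(rule_files)

for position, name in enumerate(load_order, start=1):
    print(position, name)
```

With this ordering the “_15_” file is read right after the config file, and the “_60_” file is read last, after the rule IDs it removes have been defined.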
Fixing the false positive
OK, so now you have identified the specific Core Rule that is causing the false positive. Let’s say that the rule that is causing a false positive is the following one in the modsecurity_crs_40_generic_attacks.conf file –
# XSS
SecRule REQUEST_FILENAME|ARGS|ARGS_NAMES|REQUEST_HEADERS "(?:\b(?:on(?:(?:mo(?:use(?:o(?:ver|ut)|down|move|up)|ve)|key(?:press|down|up)|c(?:hange|lick)|s(?:elec|ubmi)t|(?:un)?load|dragdrop|resize|focus|blur)\b\W*?=|abort\b)|(?:l(?:owsrc\b\W*?\b(?:(?:java|vb)script|shell)|ivescript)|(?:href|url)\b\W*?\b(?:(?:java|vb)script|shell)|mocha):|type\b\W*?\b(?:text\b(?:\W*?\b(?:j(?:ava)?|ecma)script\b| [vbscript])|application\b\W*?\bx-(?:java|vb)script\b)|s(?:(?:tyle\b\W*=.*\bexpression\b\W*|ettimeout\b\W*?)\(|rc\b\W*?\b(?:(?:java|vb)script|shell|http):)|(?:c(?:opyparentfolder|reatetextrange)|get(?:special|parent)folder|background-image:|@import)\b|a(?:ctivexobject\b|lert\b\W*?\())|<(?:(?:body\b.*?\b(?:backgroun|onloa)d|input\b.*?\\btype\b\W*?\bimage)\b|!\[CDATA\[|script|meta)|.(?:(?:execscrip|addimpor)t|(?:fromcharcod|cooki)e|innerhtml)\b)" \
"log,id:950004,severity:2,msg:'Cross-site Scripting (XSS) Attack'"
Your next step is to just copy and paste it into the new modsecurity_crs_60_customrules.conf file. Let’s assume that the false positive hit with this rule is when it is inspecting a specific portion of your Cookie header called Foo. The Cookie data is included within the REQUEST_HEADERS variable. You now need to make a few edits to the rule to remove the false hit. The relevant updates are the additional entries in the variable list and the new id value -
# XSS
SecRule REQUEST_FILENAME|ARGS|ARGS_NAMES|REQUEST_HEADERS|!REQUEST_HEADERS:Cookie|REQUEST_COOKIES|REQUEST_COOKIES_NAMES|!REQUEST_COOKIES_NAMES:/^Foo$/ "(?:\b(?:on(?:(?:mo(?:use(?:o(?:ver|ut)|down|move|up)|ve)|key(?:press|down|up)|c(?:hange|lick)|s(?:elec|ubmi)t|(?:un)?load|dragdrop|resize|focus|blur)\b\W*?=|abort\b)|(?:l(?:owsrc\b\W*?\b(?:(?:java|vb)script|shell)|ivescript)|(?:href|url)\b\W*?\b(?:(?:java|vb)script|shell)|mocha):|type\b\W*?\b(?:text\b(?:\W*?\b(?:j(?:ava)?|ecma)script\b| [vbscript])|application\b\W*?\bx-(?:java|vb)script\b)|s(?:(?:tyle\b\W*=.*\bexpression\b\W*|ettimeout\b\W*?)\(|rc\b\W*?\b(?:(?:java|vb)script|shell|http):)|(?:c(?:opyparentfolder|reatetextrange)|get(?:special|parent)folder|background-image:|@import)\b|a(?:ctivexobject\b|lert\b\W*?\())|<(?:(?:body\b.*?\b(?:backgroun|onloa)d|input\b.*?\\btype\b\W*?\bimage)\b|!\[CDATA\[|script|meta)|.(?:(?:execscrip|addimpor)t|(?:fromcharcod|cooki)e|innerhtml)\b)" \
"log,id:1,severity:2,msg:'Cross-site Scripting (XSS) Attack'"
This updated rule is doing three things –
1. We are using the exclamation point character to create an inverted rule meaning do NOT inspect the REQUEST_HEADERS variable whose name is Cookie. The problem here is that this variable location is too generic/broad and we are only interested in excluding one specific Cookie location from this check and not the entire Cookie value. We don’t want to allow other possible XSS attack vectors within the Cookie value.
2. Since we still want to inspect the Cookie values, we have now opted to include additional Cookie variables that were not present before. We can now add both the REQUEST_COOKIES and REQUEST_COOKIES_NAMES variables to the check. We are then finally using another inverted rule to exclude checking this rule against any Cookie whose name is exactly “Foo.” This is accomplished by using a regular expression argument to the REQUEST_COOKIES_NAMES variable.
3. Finally, we are also updating the “id” meta-data action by changing to a new number that represents a custom rule range. The range 1-99999 is reserved for your internal use.
The last thing to do is to use SecRuleRemoveById to disable the Core Rule that was causing the problem –
SecRuleRemoveById 950004
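The intended effect of the updated variable list (inspect every cookie except the one named exactly Foo) can be sketched in Python. This is a simplified model, not ModSecurity's implementation: the cookie parser is naive, the function names are hypothetical, and the `<script` pattern is only a stand-in for the much larger Core Rule regular expression.

```python
import re

EXCLUDED_NAME = re.compile(r"^Foo$")                # same selector as /^Foo$/
SIMPLE_XSS = re.compile(r"<script", re.IGNORECASE)  # stand-in, NOT the Core Rule regex

def parse_cookie_header(header):
    """Naively split a Cookie header into a {name: value} dict."""
    cookies = {}
    for part in header.split(";"):
        name, _, value = part.strip().partition("=")
        if name:
            cookies[name] = value
    return cookies

def flagged_cookies(header):
    """Return names of cookies whose value looks like XSS, skipping 'Foo'."""
    cookies = parse_cookie_header(header)
    return [
        name
        for name, value in cookies.items()
        if not EXCLUDED_NAME.search(name) and SIMPLE_XSS.search(value)
    ]

print(flagged_cookies("Foo=<script>x; sid=abc"))
print(flagged_cookies("Foo=ok; sid=<SCRIPT>"))
```

Because the selector is anchored with ^ and $, a cookie named Foobar is still inspected; only the exact name Foo is skipped.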
Testing the new rules
The final step is to actually test out your new configs and verify that the old rule is not executing and the new rule is not triggering a false positive hit. The easiest method to use is to just resend the previously offending request to the web server and then monitor the audit_log file to see if the request becomes blocked or if the ModSecurity message is generated.
Easy Implementation of new Core Rules
With this type of methodology, you can create custom exclusions and fix false positives while still allowing for easy updating of the Core Rules themselves. What we don’t want is for ModSecurity users to alter the Core Rules files so extensively for their environment that they avoid upgrading when new Core Rules releases are available, for fear of having to re-implement all of their custom configs. With this scenario, you can download new Core Rules versions as they are released and then just copy over your custom rule files and you are ready to go!
Posted by rcbarnett at 06:17 PM
Key Advantages of the Core Rule Set
Posted by ofer on January 02, 2007.
Following a question on the core rule set on the ModSecurity mailing list, I would like to list some of the key properties of the core rule set. The focus of the core rule set is to be a "rule set" rather than a set of rules and the properties below are all derived from that:
Performance - The core rule set is optimized for performance. The amount and content of the rules used predominantly determines the performance impact of ModSecurity, so the performance optimization of the rule set is very important.
Quality - While there will always be false positives, and the core rule set is young, we spend a lot of time trying to make the core rule set better. Some of the things we do are:
- Regression tests - we have a regression test, so every new version we ship is tested to ensure it does not break anything. Actually every report of a false positive, once solved, gets into the regression test.
- Real traffic testing - we are continuously converting gigabytes of cap files into tests and sending them through ModSecurity to detect potential false positives. I think you could see the result in version 1.3.2, which includes many fixes based on these tests.
Generic Detection - The core rule set is tuned to detect generic attacks and does not include specific rules for known vulnerabilities. Due to this feature the core rule set has better performance, is more "plug and play" and requires fewer updates. If you want to patch known vulnerabilities you may look for rules from gotroot or convert snort rules such as those at bleeding threats, but you must select only the rules that apply to you, otherwise performance may suffer.
Event Information - Each rule in the core rule set has a unique ID and a textual message. In the future we are going to add classification using a new tag action in ModSecurity, as well as longer information regarding each rule using comments in the files themselves.
Plug and Play - We try to make the rule set as plug and play as possible. Since its performance is good and it employs generic detection, and since the number of false positives is getting lower all the time, the core rule set can be installed as is with little twisting and tweaking.
To get deeper into the core rule set you may want to read the presentation I gave about it in a recent OWASP chapter meeting.
Posted by ofer at 05:46 PM
Using ModSecurity 2 Collections in Rules
Posted by ofer on December 28, 2006.
A recent posting on the ModSecurity mailing list by K.C. Li is a very good excuse to discuss some major changes between ModSecurity version 1 and 2 and how they influence rule writing. K.C. used the following rule in ModSecurity v1:
SecFilterSelective ARGS "(^|[^_])(comments?|story)=.*(href|http)"
This rule searched for the values "href" or "http" in a bunch of different parameters: story, _story, comment, comments, _comment and _comments. The rule replaces 6 rules, each one specific to a parameter. While the rule is very effective, as K.C. writes, it suffers from the following shortcomings:
- It only detects these parameters if they appear first.
- It searches for "href" and "http" everywhere, spanning to fields beyond the specific ones searched.
- It might find href and http as part of longer words and not as separate tokens.
The rule can be corrected like this:
SecFilterSelective ARGS "(?:^|\&)_?(?:comments?|story)=[^\&]*\b(?:href|http)\b"
By adding checks for a "&" prior to the parameter name and ensuring "&" does not exist between the parameter name and the keyword, we make sure that we capture the parameters in any location in the request string and that the tokens are part of the value for this parameter only. The meta character "\b" is a regular expression meta character that matches a word boundary, ensuring that "href" and "http" are tokens. The construct "?:" at the beginning of each parenthesized group is a performance optimization which prevents the group from capturing the value, a side effect that is not needed unless we use the capture action.
But all this becomes very complex.
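The difference between the two expressions is easy to check outside of ModSecurity. The Python sketch below runs both patterns against a few made-up query strings (the host name and parameter values are hypothetical). It only compares the regular expressions and says nothing about how ModSecurity 1.x builds the ARGS string.

```python
import re

# Patterns copied from the two rules above. "\&" is written as "&",
# which is equivalent in Python's re module.
ORIGINAL = re.compile(r"(^|[^_])(comments?|story)=.*(href|http)")
CORRECTED = re.compile(r"(?:^|&)_?(?:comments?|story)=[^&]*\b(?:href|http)\b")

samples = [
    "story=see http://spam.example&id=5",  # keyword inside the story value
    "story=hello&other=http",              # keyword in a different parameter
    "story=httpfoo",                       # keyword only as part of a longer word
]
for qs in samples:
    print(
        f"{qs!r}: original={bool(ORIGINAL.search(qs))} "
        f"corrected={bool(CORRECTED.search(qs))}"
    )
```

The second and third samples are the interesting ones: the original pattern matches both, while the corrected pattern rejects them because "[^&]*" stops at the parameter boundary and "\b" requires a whole token.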
While in ModSecurity 1.x the ARGS location is simply a string that represented either QUERY_STRING or POST_PAYLOAD, in ModSecurity 2 ARGS is a collection that enables searching in individual parameters. Collections are fundamental in ModSecurity 2 and I suggest reading the relevant section in ModSecurity 2 reference guide.
You can still use the location QUERY_STRING|REQUEST_BODY to rewrite the rule for ModSecurity 2.0 but using the ARGS collection will make the rule much simpler. Using a regular expression to select the elements of the collection tested, the following rule will do the same in ModSecurity 2:
SecRule ARGS:'/(?:^|^_)(?:comments?|story)$/' "\b(?:href|http)\b"
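As a rough model of what the collection-based rule does, the Python sketch below first selects parameters by name and then tests only the selected values. It uses the standard library's parse_qsl to split a query string; the function name and sample inputs are hypothetical, and real ModSecurity parsing and transformations are not reproduced here.

```python
import re
from urllib.parse import parse_qsl

# Name selector and value pattern copied from the rule above.
NAME_SELECTOR = re.compile(r"(?:^|^_)(?:comments?|story)$")
VALUE_PATTERN = re.compile(r"\b(?:href|http)\b")

def matching_args(query_string):
    """Return (name, value) pairs whose name is selected and whose value matches."""
    args = parse_qsl(query_string, keep_blank_values=True)
    return [
        (name, value)
        for name, value in args
        if NAME_SELECTOR.search(name) and VALUE_PATTERN.search(value)
    ]

print(matching_args("id=1&_story=visit+http://spam.example"))
print(matching_args("story=hello&other=http"))
print(matching_args("mystory=http"))
```

Since each parameter is examined on its own, there is no need for the "&" bookkeeping that the single-string version required.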
The other rules that K.C. uses can also use collections and be converted to a single ModSecurity V2 rule. Instead of:
SecFilterSelective HTTP_x-aaaaaaaaa|HTTP_XAAAAAAAAA ".+"
SecFilterSelective HTTP_x-aaaaaaaaaaa|HTTP_XAAAAAAAAAAA ".+"
SecFilterSelective HTTP_x-aaaaaaaaaaaa|HTTP_X_AAAAAAAAAAAA ".+"
You can simply write:
SecRule "&REQUEST_HEADERS:'/^(?i)x[-_]a{9,12}$/'" "@gt 0"
This rule uses the "&" construct to count the number of elements in a collection, or a subset if a regular expression is used to select elements from the collection.
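The counting idea can be illustrated with a short Python sketch. The "(?i)" flag from the rule is expressed with re.IGNORECASE here, because recent Python versions reject an inline global flag that is not at the very start of the pattern. The header lists and the function name are made up for the example.

```python
import re

# Selector from the rule above, with (?i) replaced by re.IGNORECASE.
HEADER_SELECTOR = re.compile(r"^x[-_]a{9,12}$", re.IGNORECASE)

def count_matching_headers(headers):
    """Count header names selected by the regex (the '&' construct)."""
    return sum(1 for name, _value in headers if HEADER_SELECTOR.search(name))

# Hypothetical requests; "A" * 9 avoids miscounting the repeated letters.
suspicious = [("Host", "www.example.com"), ("X-" + "A" * 9, "1")]
normal = [("Host", "www.example.com"), ("X-Forwarded-For", "10.0.0.1")]

print(count_matching_headers(suspicious) > 0)  # the "@gt 0" test
print(count_matching_headers(normal) > 0)
```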
To complement the discussion, a word about actions in ModSecurity 2. Just as in ModSecurity 1, there is no need to explicitly state actions in each rule and the actions listed in SecDefaultAction will be used. However, due to the bigger role that actions now have in ModSecurity, it is advisable to add them to each rule. Especially important are:
- The phase action. As ModSecurity 2 now has five processing phases, specifying the phase becomes very important.
- Meta information actions. As ModSecurity rules and rule sets are becoming bigger, it is important to maintain the meta information. And as stated in the manual it is recommended to use only numbers between 1 and 99999 for internally developed rule IDs
- Anti-evasion transformation functions are not implicit in ModSecurity 2.0; they should be set explicitly in either a SecDefaultAction directive or in the action list for the rule. In this case I would use lowercase and urlDecodeUni.
So K.C.'s rules become:
SecRule "ARGS:'/(?:^|^_)(?:comments?|story)$/'" "\b(?:href|http)\b" \
"deny,log,status:403,phase:2,t:lowercase,t:urlDecodeUni,id:90004,severity:2,msg:'Comment Spam'"
SecRule "&REQUEST_HEADERS:'/^(?i)x[-_]a{9,12}$/'" "@gt 0" \
"deny,log,status:403,phase:2,t:lowercase,id:90005,severity:2,msg:'Comment Spam'"
Posted by ofer at 10:46 AM
Why So Many Events?
Posted by ofer on November 30, 2006.
When you start using ModSecurity 2.0 with the Core Rule Set, you may notice that you get (too) many events. There are two common areas in the Core Rule Set that cause a lot of events: search engine detections and missing HTTP headers.
File "modsecurity_crs_55_marketing.conf" includes rules to detect access by Google, Yahoo and MSN. These rules tend to generate a large number of events. These events are interesting from the marketing point of view, but are not very important from the security point of view. Also, admittedly, neither the audit log nor the ModSecurity console displays those events in a manner suitable for presenting to marketing guys. So, if those events bother you, you may consider removing this file.
On the other hand, the 2nd source of events, missing HTTP headers, provides a good indication of malicious requests. This is the reason that the Core Rule Set checks that a request has "host", "user-agent" and "accept" headers and blocks the request otherwise. In many systems there are valid requests that do not have those headers. These are usually generated by some automation tool used by the system. A good example is monitoring tools that periodically check that a site is alive and kicking. Such monitoring tools often issue simple and non-standard HTTP requests. Therefore we would not want to remove the missing HTTP headers rules, but rather create specific exceptions for the valid request source. In many cases this would be an exception based on a source IP.
In the next blog entry I will cover techniques to create exceptions in ModSecurity.
Posted by ofer at 10:59 PM