If I understand correctly, the expression .ht* in the following code will match everything that starts with .ht, so my .ht_lalala is safe.
<Files ".ht*">
Require all denied
</Files>
But what about the next one?
(^\.ht|~$|back|BACK|backup|BACKUP$)
Is it correct for matching the files .htaccess, back, backup and BACKUP? Or would the next one be better instead?
(^\.ht*|back*|BACK*$)
What I'd like to understand is what ~$ actually means in my code. I don't know where I saw it, but I have it in my code, and now I doubt that it's correct. Maybe it was meant to be something like (^.ht|~$), just for one group.
I know the basics of regex: what ^ and $ are, and that * means zero or more of the previous token. But ~ doesn't make sense inside the pattern, unless it's just a plain character that does nothing but match ~. I've read the Apache docs. I guess FilesMatch and DirectoryMatch are better for multiple matches; however, regular expressions can also be used in the Files and Directory directives with the addition of the ~ character, as the docs examples show.
<Files ~ "\.(gif|jpe?g|png)$">
#...
</Files>
And well, what I want exactly is to know how to match different files or directories.
One more thing, should I escape the .? Because the default httpd.conf doesn't do so. Or is it just different for httpd.conf and .htaccess (which doesn't make sense to me)?
CodePudding user response:
<Files ".ht*">
In this context, .ht* is not a regular expression (regex). It is a "wild-card string", where ? matches any single character, and * matches any sequence of characters. (Whilst .ht* is also a valid regex, as a regex it would match differently: any single character, followed by h, followed by zero or more t.)
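To make the difference concrete, here is a minimal Python sketch (not Apache itself). It assumes that fnmatch.fnmatchcase is a reasonable stand-in for Apache's wild-card matching and that Python's re module agrees with PCRE for a pattern this simple. The file names are made up for illustration.

```python
import re
from fnmatch import fnmatchcase

PATTERN = ".ht*"

# Hypothetical file names, chosen only for illustration.
names = [".htaccess", ".ht_lalala", "what.txt", "index.html"]

# Read as a wild-card string: a literal ".ht" followed by any characters.
wildcard_hits = [n for n in names if fnmatchcase(n, PATTERN)]

# Read as an unanchored regex: any character, then "h", then zero or more "t".
# "what.txt" matches via "wh", and "index.html" matches via ".ht".
regex_hits = [n for n in names if re.search(PATTERN, n)]

print("wild-card:", wildcard_hits)
print("regex:    ", regex_hits)
```

Read as a wild-card string, the pattern selects only the two names that begin with .ht. Read as a regex, it selects all four.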
But what about the next one?
(^\.ht|~$|back|BACK|backup|BACKUP$)
This is a regex. (It cannot be used in the <Files> directive as you have written above without enabling regex pattern matching with the ~ argument, as you have used later.)
In this regex, ~$ matches any string that ends with a literal ~ (tilde character). This is sometimes used to mark backup files.
It also matches:
- Any string that starts with .ht (which naturally includes .htaccess).
- Any string that contains back or BACK or backup (matching backup is obviously redundant).
- Any string that ends with BACKUP (also redundant, since BACK already matches it).
Consequently, this does not look like it's doing quite what you think it's doing.
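A quick way to see the over-matching is to run the pattern against a few names. The sketch below uses Python's re module as a stand-in for PCRE (an assumption that should hold for these simple patterns). The "tightened" pattern is a hypothetical alternative, assuming the intent is to match only files named exactly back, backup, BACK or BACKUP, plus anything starting with .ht or ending with ~. The file names are invented.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"(^\.ht|~$|back|BACK|backup|BACKUP$)")

# Hypothetical tighter variant: the backup names must match the whole file name.
tightened = re.compile(r"(^\.ht|~$|^(back(up)?|BACK(UP)?)$)")

# Invented file names for illustration.
names = [".htaccess", "notes.txt~", "backup", "BACKUP", "feedback.html", "index.html"]

original_hits = [n for n in names if original.search(n)]
tightened_hits = [n for n in names if tightened.search(n)]

print("original: ", original_hits)
print("tightened:", tightened_hits)
```

The original pattern also catches feedback.html, because "back" may appear anywhere in the name. The tightened variant does not.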
Or would the next one be better instead?
(^\.ht*|back*|BACK*$)
Whilst this is a valid regex, you appear to have slipped back into "wild-card" syntax. Bear in mind that in regex, the * quantifier matches the previous token 0 or more times. It does not match "any characters", as it does in wild-card pattern matching.
This still matches ".htaccess", but only because the pattern is not anchored at the end: ^\.ht* matches the leading .ht and the rest of the name is ignored. For example, ^\.ht*$ (with an end-of-string anchor) would not match ".htaccess"; it would only match .h, .ht, .htt and so on.
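The effect of the * quantifier and the end anchor can be checked with a short Python sketch. Python's re module is assumed to behave like PCRE here, and the sample names are invented.

```python
import re

unanchored = re.compile(r"(^\.ht*|back*|BACK*$)")
anchored = re.compile(r"^\.ht*$")

# "t*" means zero or more "t", so the anchored pattern matches
# only ".h", ".ht", ".htt", ... as complete names.
anchored_results = {
    n: bool(anchored.search(n)) for n in [".h", ".ht", ".httt", ".htaccess"]
}

# Without an end anchor, the leading ".ht" of ".htaccess" is enough.
# "back*" is "bac" plus zero or more "k", so it also matches inside "bacon.txt".
unanchored_results = {
    n: bool(unanchored.search(n)) for n in [".htaccess", "bacon.txt", "index.html"]
}

print(anchored_results)
print(unanchored_results)
```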
<Files ~ "\.(gif|jpe?g|png)$">
With the Files directive, the ~ argument enables regex pattern matching. (As you've stated.) This is quite different from when ~ is used inside the regex pattern itself.
One more thing, should I escape the .? Because the default httpd.conf doesn't do so. Or is it just different for httpd.conf and .htaccess (which doesn't make sense to me)?
I think you're mixing things up. In your first example, it's not a regex, it's a "wild-card" pattern (as stated above). In this context, the . must not be backslash-escaped. It matches a literal . (dot). The . carries no special meaning here. The . should only be escaped if you need to match a literal dot in a regular expression.
For example, the following are equivalent:
# Wild-card string match
<Files ".ht*">
and
# Regex pattern match
<Files ~ "^\.ht">
(However, it is preferable to use FilesMatch instead of Files ~ to avoid any confusion. FilesMatch is "newer" syntax.)
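As a rough check of that equivalence, and of why the dot needs escaping in the regex form, here is a small Python sketch. It assumes fnmatch.fnmatchcase approximates Apache's wild-card matching and that re agrees with PCRE for these patterns. The file names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical file names for illustration.
names = [".htaccess", ".htpasswd", ".ht_lalala", "xhtml.txt", "index.html"]

# Wild-card form: the dot is literal and must not be escaped.
wildcard = [n for n in names if fnmatchcase(n, ".ht*")]

# Regex form with the dot escaped: matches a literal dot at the start.
escaped = [n for n in names if re.search(r"^\.ht", n)]

# Regex form with the dot left unescaped: "." now matches any character.
unescaped = [n for n in names if re.search(r"^.ht", n)]

print("wild-card:", wildcard)
print("escaped:  ", escaped)
print("unescaped:", unescaped)
```

The wild-card and escaped-regex forms select the same names, while the unescaped regex also picks up xhtml.txt.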
There is no difference between httpd.conf and .htaccess in this regard.
CodePudding user response:
When in doubt, RTFM.
~ enables regex. Without it, you just get access to wildcards ? and *.
As far as I know Apache uses the PCRE flavor of regex.
So once you've enabled regex via ~ then use https://regex101.com/r/lPkMHK/1 to test the behavior of the regex you've written.
