Penetration testing is incomplete without information gathering, because the information collected is what narrows down the attack vectors. In web penetration testing, ethical hackers need to understand the layout of the target site in order to enumerate and attack the web server or application. This includes the backend web content as well as the links that form the frontend of the web host. Collecting and validating all of this information manually is a tedious job, but there are tools that automate the process: web fuzzers such as wfuzz, and traditional web path scanners such as Dirsearch, collect the desired content or links from target web services.
This article is about Dirsearch, a command-line tool that helps penetration testers discover hidden files and directories on a target web server. The tool ships with a long predefined wordlist, covering many common file extensions, in the Dirsearch source package. It uses a brute-force technique, requesting each candidate directory, sub-directory, and file on the target server. Common extensions covered by the bundled wordlist include .php, .htm, .xml, .tar, .zip, .txt, and .pdf. The tool can also be told to ignore specific files or directories during a scan.
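The wordlist-plus-extensions idea described above can be illustrated with a minimal Python sketch. It assumes a %EXT% placeholder convention in wordlist entries, similar to the one used in Dirsearch's bundled wordlist; the function name expand_wordlist and the sample entries are hypothetical and are not taken from the Dirsearch source.

```python
def expand_wordlist(entries, extensions):
    """Expand wordlist entries into candidate paths.

    Assumption: an entry containing the %EXT% placeholder is expanded once
    per extension; every other entry is kept unchanged.
    """
    candidates = []
    for entry in entries:
        entry = entry.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comment lines
        if "%EXT%" in entry:
            for ext in extensions:
                # Accept both "php" and ".php" as extension spellings.
                candidates.append(entry.replace("%EXT%", ext.lstrip(".")))
        else:
            candidates.append(entry)
    return candidates


# Hypothetical sample entries, for illustration only.
sample_wordlist = ["admin/", "index.%EXT%", "backup.%EXT%", "robots.txt"]
paths = expand_wordlist(sample_wordlist, ["php", "zip"])
print(paths)
```

Each resulting path is one request the scanner would send, which is why the number of extensions multiplies the size of a scan.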
Installing Dirsearch is a fairly simple process.
1) Download the source code from GitHub using the following command.
git clone https://github.com/maurosoria/dirsearch.git
2) Navigate to the dirsearch directory, which contains the requirements.txt file.
3) Finally, install the dependencies listed in requirements.txt using the following pip3 command.
pip3 install -r requirements.txt
How Dirsearch Works
Dirsearch has a rich help menu that can be explored using the following command.
python3 dirsearch.py --help
The help menu can be divided into five sections. The first section is an options menu showing the parameters needed to write and execute commands.
The remaining sections group the available settings. The dictionary settings list the flags that refine the basic scan functionality of Dirsearch.
The connection settings include options to customize the communication between Dirsearch and the target web server.
The general settings mostly cover optimization parameters.
The last section of the help menu lists the formats in which scan results can be stored. By default, the results are written to a plain-text file.
Dirsearch Scan Examples
The following command shows the basic format for scanning a target web server without any filters.
python3 dirsearch.py -u <target web server>
The command directs the tool to request every path in its wordlist against the target web host, looking for directories, sub-directories, and files. We can run the command against a demo web server (webscantest.com) to observe the scanning process.
python3 dirsearch.py -u https://webscantest.com
The initial screen shows important information about the scan, such as the wordlist size, the target file extensions, the number of threads, the request method, the error log file, and the output file location.
In the next phase, the tool starts the brute-force process and reports the following data for each hit.
(i) The paths of the directories, sub-directories, and files discovered during the scan.
(ii) The HTTP status code returned for each discovered path.
(iii) The size of the response for each directory, sub-directory, and file.
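The scan phase described above can be sketched in Python. This is an illustration of the general technique, not Dirsearch's actual implementation: FAKE_SITE and fake_fetch are hypothetical stand-ins for a real web server so that the example runs without network access, and the thread count is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the target server: path -> (status code, body).
FAKE_SITE = {
    "/admin/": (403, b"Forbidden"),
    "/index.php": (200, b"<html>home</html>"),
    "/robots.txt": (200, b"User-agent: *"),
}


def fake_fetch(path):
    """Return (status, body) for a path; unknown paths answer 404."""
    return FAKE_SITE.get(path, (404, b""))


def scan(paths, fetch, threads=4):
    """Request every candidate path and keep the ones that are not 404."""

    def probe(path):
        status, body = fetch(path)
        return path, status, len(body)

    # Several worker threads send requests in parallel.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(probe, paths))  # map preserves input order
    return [result for result in results if result[1] != 404]


found = scan(["/admin/", "/index.php", "/missing.zip", "/robots.txt"], fake_fetch)
for path, status, size in found:
    print(f"{status} {size:>5}B {path}")
```

In a real scan the fetch function would issue an HTTP request; keeping it as a parameter makes the loop easy to test offline.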
Filtering the results is a handy feature of Dirsearch. We can use different parameters to include or exclude certain file types or directories during the scan. For example, we can tell the scanner to exclude files with the .php extension using the following command.
python3 dirsearch.py -u http://testphp.vulnweb.com -X .php
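The effect of excluding an extension can be sketched as a filter over candidate paths. This is a minimal illustration under the assumption that filtering happens on the candidate list before any request is sent; the article does not state how Dirsearch implements it internally, and the function name exclude_extensions and the sample paths are hypothetical.

```python
def exclude_extensions(paths, excluded):
    """Drop candidate paths whose file extension is in the excluded list."""
    blocked = {ext.lower().lstrip(".") for ext in excluded}
    kept = []
    for path in paths:
        # Look only at the last path component, ignoring a trailing slash.
        name = path.rstrip("/").rsplit("/", 1)[-1]
        ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if ext not in blocked:
            kept.append(path)
    return kept


# Hypothetical candidate paths, for illustration only.
candidates = ["login.php", "admin/", "backup.zip", "inc/config.PHP", "notes.txt"]
remaining = exclude_extensions(candidates, [".php"])
print(remaining)
```

Directories and files with other extensions pass through untouched, while the comparison is case-insensitive so that config.PHP is treated like config.php.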
Similarly, we can narrow a scan to a specific file type using the suffix parameter. The --suffix parameter appends the given suffix to the wordlist entries, so passing .php limits the scan to candidate paths ending in .php.
python3 dirsearch.py -u http://testphp.vulnweb.com --suffix .php
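When scans like the ones above are run repeatedly, it can help to assemble the command line in a small script. The following sketch only uses the flags shown in this article (-u, -X, and --suffix); the helper name build_dirsearch_command and its parameters are hypothetical, and it assumes dirsearch.py is in the current directory.

```python
import shlex


def build_dirsearch_command(url, exclude_ext=None, suffix=None, script="dirsearch.py"):
    """Build the argument list for a Dirsearch run (flags taken from this article)."""
    command = ["python3", script, "-u", url]
    if exclude_ext:
        command += ["-X", exclude_ext]
    if suffix:
        command += ["--suffix", suffix]
    return command


cmd = build_dirsearch_command("http://testphp.vulnweb.com", suffix=".php")
# shlex.join renders the list as a copy-pasteable shell command (Python 3.8+).
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) to launch the scan, which should only be done against targets you are authorized to test.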
Dirsearch makes complex content discovery easier for penetration testers. Deep scanning is a powerful feature of the tool, and its ability to discover directories, sub-directories, and files with any known extension makes it a strong choice for serious penetration testing projects. The main drawback of Dirsearch is that the brute-force process is slow and can take a long time against large web servers.