Advanced Multi-Threaded File Search and Filter Utility in C++
Build a high-performance, multi-threaded C++ utility to recursively search through directories for files matching specific content and metadata filters, then output a summary report. This mini-project will test your architectural design, concurrency control, file I/O, and advanced C++ skills.
Challenge prompt
Create a C++ program that recursively searches a given directory for files containing a specified keyword within their content. Your utility must support filtering results by file extension(s), minimum and maximum file size, and last modified date range. Implement efficient multi-threading to utilize all available CPU cores for scanning files concurrently. At the end, generate a summary report listing all matched files with their path, size, last modified timestamp, and a snippet of the matched content surrounding the keyword in each file. Handle errors gracefully and optimize for large directory structures with potentially thousands of files.
Guidance
- Use std::filesystem for directory traversal and metadata extraction.
- Use a thread pool or std::async, with proper synchronization, to parallelize file reading and filtering.
- To extract a snippet around the keyword, read only the part of the file near the first match instead of loading entire files into memory.
- Design the data structures that aggregate results from different threads carefully, so that they are free of race conditions.
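The partial-read advice above can be sketched as follows. This is a minimal illustration rather than a reference solution: the function name `findSnippet`, the `context` and `chunkSize` parameters, and their defaults are all assumptions made for this example. It reads the file in fixed-size chunks and carries a short tail between chunks so that a match straddling a chunk boundary is still found.

```cpp
#include <algorithm>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>

// Hypothetical helper (not part of the starter code): scans a file chunk by
// chunk and returns the text around the first occurrence of `keyword`, or
// std::nullopt if the file is unreadable or contains no match.
std::optional<std::string> findSnippet(const std::filesystem::path& filePath,
                                       const std::string& keyword,
                                       std::size_t context = 30,
                                       std::size_t chunkSize = 64 * 1024) {
    if (keyword.empty()) return std::nullopt;
    std::ifstream in(filePath, std::ios::binary);
    if (!in) return std::nullopt;

    // Tail kept between chunks: long enough to catch a match that spans a
    // chunk boundary and to supply the leading context for it.
    const std::size_t overlap = keyword.size() - 1 + context;
    std::string window;                  // carried tail + newly read data
    std::string chunk(chunkSize, '\0');  // reusable read buffer

    // Appends the next chunk to `window`; returns false once no bytes arrive.
    auto readChunk = [&]() {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        const auto got = static_cast<std::size_t>(in.gcount());
        window.append(chunk, 0, got);
        return got > 0;
    };

    while (readChunk()) {
        const std::size_t pos = window.find(keyword);
        if (pos != std::string::npos) {
            // Read a little further if the trailing context is not loaded yet.
            const std::size_t wantEnd = pos + keyword.size() + context;
            while (window.size() < wantEnd && readChunk()) {}
            const std::size_t start = pos > context ? pos - context : 0;
            return window.substr(start, std::min(window.size(), wantEnd) - start);
        }
        // No match so far: drop everything except the overlap tail.
        if (window.size() > overlap) window.erase(0, window.size() - overlap);
    }
    return std::nullopt;
}
```

Memory use stays bounded by roughly `chunkSize + overlap` bytes per file regardless of file size. The snippet is returned as raw bytes, so a real report would probably want to replace newlines and non-printable characters before printing it.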
Hints
- Consider using std::mutex, std::lock_guard, or concurrent queues for thread-safe result storage.
- Minimize disk I/O by filtering on metadata before reading file content wherever possible.
- Split directory traversal and file reading/filtering into separate phases to improve concurrency and error isolation.
Starter code
#include <cstdint>
#include <filesystem>
#include <iostream>
#include <vector>
#include <string>
#include <mutex>
#include <thread>
#include <future>

// Define a struct to hold file match information
struct FileMatch {
    std::filesystem::path filePath;
    std::uintmax_t fileSize;
    std::filesystem::file_time_type lastModified;
    std::string snippet;  // text surrounding the keyword match
};

// Function declarations
std::vector<std::filesystem::path> recursiveFileSearch(const std::filesystem::path& dir);
bool fileContainsKeyword(const std::filesystem::path& filePath, const std::string& keyword, std::string& snippet);

int main() {
    // TODO: Implement argument parsing, threading logic, filtering, and reporting
    std::cout << "Implement the multi-threaded file search and filter utility here." << std::endl;
    return 0;
}
Expected output
Summary Report:
Matched Files: 3
1. /path/to/file1.txt | Size: 2048 bytes | Modified: 2024-05-20 15:32 | Snippet: "...keyword example inside file1..."
2. /path/to/file2.cpp | Size: 4096 bytes | Modified: 2024-05-18 09:12 | Snippet: "...code snippet with keyword..."
3. /path/to/notes.md | Size: 1024 bytes | Modified: 2024-05-22 11:03 | Snippet: "...documentation mentioning keyword..."
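A report line in the format shown above could be produced along these lines. This is a sketch under assumptions: the helper names `toTimeT`, `formatTimestampUtc`, and `formatReportLine` are invented for this example, timestamps are printed in UTC (the sample output does not say which time zone it uses), and the C++17 clock conversion is a common approximation rather than an exact mapping.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <filesystem>
#include <iomanip>
#include <sstream>
#include <string>

// Approximate C++17 conversion from the filesystem clock to time_t: it
// samples both clocks "now" and shifts by the difference. C++20 provides
// std::chrono::clock_cast for an exact conversion.
std::time_t toTimeT(std::filesystem::file_time_type ft) {
    using namespace std::chrono;
    const auto sys = time_point_cast<system_clock::duration>(
        ft - std::filesystem::file_time_type::clock::now() + system_clock::now());
    return system_clock::to_time_t(sys);
}

// Formats a timestamp as "YYYY-MM-DD HH:MM" in UTC (an assumption).
// std::gmtime returns a pointer to shared static storage, so this is meant
// for the single-threaded reporting phase, not for worker threads.
std::string formatTimestampUtc(std::time_t t) {
    std::ostringstream out;
    if (const std::tm* utc = std::gmtime(&t)) {
        out << std::put_time(utc, "%Y-%m-%d %H:%M");
    }
    return out.str();
}

// Builds one numbered line of the summary report.
std::string formatReportLine(std::size_t index,
                             const std::filesystem::path& path,
                             std::uintmax_t size,
                             std::time_t modified,
                             const std::string& snippet) {
    std::ostringstream out;
    out << index << ". " << path.string()
        << " | Size: " << size << " bytes"
        << " | Modified: " << formatTimestampUtc(modified)
        << " | Snippet: \"..." << snippet << "...\"";
    return out.str();
}
```

With the starter code's `FileMatch`, a call might look like `formatReportLine(i + 1, m.filePath, m.fileSize, toTimeT(m.lastModified), m.snippet)`.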