Gumaa, 11mo ago

Excluding certain directories when using "walk"

Hello, I'm writing a script that should crawl through a directory and search for a given text inside its files. I'm having trouble understanding the documentation about the walk options, specifically the skip option. It says:
skip: RegExp[] = undefined

List of regular expression patterns used to filter entries. If specified, entries matching the patterns specified by this option are excluded.
But this is enigmatic: what exactly is being checked? The name of a file? The path of the file? An absolute path? A relative path? It does not say. In case you are wondering, I'm doing something like this:
const iter = walkSync(projectFolder, {
  skip: [
    /node_modules/,
    /.git/,
    /static/,
    /lang/,
  ],
});

for (const entry of iter) {
  if (entry.isFile) {
    const content = Deno.readTextFileSync(entry.path);
    checkContent(content);
  }
}
So I just provided the names of the folders that I want to ignore. But I'm not sure whether this is the correct approach, or whether it might ignore some extra files that I did not intend to ignore.
1 Reply
Deno AI Helper, 11mo ago
The skip option of the walk function in Deno's standard library takes regular expressions and excludes every file or directory whose path matches one of them. The patterns are matched against the path of each entry, but the documentation does not state explicitly whether that path is absolute or relative.

Your patterns (/node_modules/, /.git/, /static/, and /lang/) are unanchored, so they match any path that contains those strings anywhere. This can exclude more files or directories than you intend. For example, a file named node_modules_example.txt in a different directory would also be skipped, because its name contains the string node_modules. Note also that the dot in /.git/ is unescaped and matches any character, so that pattern also matches a name such as digit; write /\.git/ to match a literal dot.

To skip only the specific directories, make the regular expressions more specific, for example by requiring a path separator on either side of the directory name, or structure your project so that naming conflicts do not occur. Since the documentation does not settle the question, you can experiment: use console.log to print entry.path for each entry, see whether the paths are absolute or relative, and adjust your regular expressions accordingly.
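As a rough illustration of the "more specific" patterns suggested above, here is a minimal sketch. It assumes, as the answer states, that each pattern is tested against a path string like the one in entry.path. The helpers dirPattern and isSkipped and the sample paths are made up for this example and are not part of Deno's standard library.

```typescript
// Hypothetical helper (not part of Deno's standard library): builds a pattern
// that matches `name` only when it is a complete path segment.
function dirPattern(name: string): RegExp {
  // Escape regex metacharacters so the "." in ".git" is treated literally.
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Require a separator ("/" or "\") or the start/end of the string on both sides.
  return new RegExp(`(^|[\\\\/])${escaped}([\\\\/]|$)`);
}

// The four directory names from the question, turned into segment-anchored patterns.
const skip: RegExp[] = ["node_modules", ".git", "static", "lang"].map(dirPattern);

// Hypothetical helper: true when any pattern matches the given path string.
function isSkipped(path: string, patterns: RegExp[]): boolean {
  return patterns.some((pattern) => pattern.test(path));
}

// Made-up sample paths, used only to see which ones the patterns would exclude.
const samples: string[] = [
  "project/node_modules/pkg/index.js",
  "project/src/node_modules_example.txt",
  "project/.git/config",
  "project/src/digit.ts",
  "project/src/language.ts",
  "C:\\project\\lang\\en.json",
];

for (const path of samples) {
  console.log(`${isSkipped(path, skip) ? "skip" : "keep"}  ${path}`);
}
```

An array built this way could be passed as the skip option in place of the original four patterns. Because the separators are part of each pattern, it should behave the same whether the paths turn out to be absolute or relative. A segment match does not distinguish files from directories, so a file named exactly lang would be excluded as well.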