
> You can't ignore any file that may be interesting to someone, which is basically all of them.

ripgrep already does ignore certain files! Specifically, it ignores binary and hidden files, and files matched by your .gitignore. This is part of what makes ripgrep's default output so much more "readable" than GNU grep's: its few default filters mean that it doesn't match against e.g. the contents of your .git/ directory, whereas a recursive GNU grep (grep -r) does show those matches.

> Could the_mitsuhiko, the creator of flask, provide such a list for flask?

That wasn't what I was suggesting. I'm saying that each repo owner should be maintaining such a list for their own repo, custom-tailored to it. Just like every repo owner manages their repo's .gitignore, .gitattributes, .dockerignore, etc.

And specifically, that list shouldn't be a set of things that are always filtered out of all searches, but rather a set of patterns that apply abstract purpose-type tags to files, à la .gitattributes, such that it becomes much easier (or even the default) to filter those files in or out of your searches.

Given a patterns file that would look like:

    *.test.js purpose=test
    tests/**  purpose=test
    apps/*/tests/** purpose=test

...ripgrep and other search tools could then see those files as having that purpose-tag "test" attached; and you'd be able to filter files in/out by their purpose-tag, just like you can filter files in/out by their filetype with ripgrep's -t/-T switches.
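
To make that concrete, here's a rough sketch of how a tool might read such a patterns file and filter on the tags. None of this exists in ripgrep; the function names are made up for illustration, and I'm assuming Python's fnmatch globbing, which is looser than gitignore-style matching (its `*` also crosses `/`).

```python
from fnmatch import fnmatchcase

# Hypothetical patterns file contents, in the format sketched above.
PATTERNS = """\
*.test.js purpose=test
tests/**  purpose=test
apps/*/tests/** purpose=test
"""


def parse_patterns(text):
    """Return a list of (glob, key, value) rules, one per non-blank line."""
    rules = []
    for line in text.splitlines():
        if not line.strip():
            continue
        glob, attr = line.split()          # e.g. "tests/**", "purpose=test"
        key, value = attr.split("=", 1)
        rules.append((glob, key, value))
    return rules


def tags_for(path, rules):
    """Collect every (key, value) tag whose glob matches the path."""
    # Assumption: fnmatch semantics, where '*' matches across '/' too.
    return {(key, value) for glob, key, value in rules if fnmatchcase(path, glob)}


def without_tag(paths, rules, key, value):
    """Keep only paths that do NOT carry the given tag (analogous to -T)."""
    return [p for p in paths if (key, value) not in tags_for(p, rules)]


rules = parse_patterns(PATTERNS)
paths = [
    "src/widget.js",
    "src/widget.test.js",
    "tests/unit/test_a.py",
    "apps/web/tests/e2e.js",
]
print(without_tag(paths, rules, "purpose", "test"))
```

A real implementation would presumably want gitignore-compatible glob semantics and some precedence rule for conflicting patterns; this sketch just takes the union of all matching tags.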

(You could go wild from there, if you like, and let this patterns file match not just files but lines inside files. For example, doc-comment lines (roughly: lines with syntax X in files of type Y) could be tagged as having purpose=docs.)

And I would then propose, on top of that, some higher-level behavior — not necessarily implemented in rg(1) itself — that could look at a file of "filter-context specifications" like this:

    default: code
    code: +purpose=impl -purpose=test -purpose=examples -purpose=docs

...and then, when you do a search, it would use that `default` search-scope if not otherwise specified; or use another scope if you name it. So given the config above, by default you'd be searching only code; but you could instead search your docs scope, or your examples scope, or an implicit "all" scope.
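
Here's a minimal sketch of how that scope resolution might work. The semantics are my assumption, since the format above is only a proposal: `+tag` means a file must carry at least one of the included tags, `-tag` means a file carrying it is dropped, and "all" is a built-in scope with no constraints. The function names are hypothetical.

```python
# Hypothetical scope file contents, in the format sketched above.
SCOPES = """\
default: code
code: +purpose=impl -purpose=test -purpose=examples -purpose=docs
"""


def parse_scopes(text):
    """Map each scope name to its list of whitespace-separated terms."""
    scopes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, spec = line.partition(":")
        scopes[name.strip()] = spec.split()
    return scopes


def resolve(scopes, name=None):
    """Return (include, exclude) tag sets for a scope.

    With no name, follow the `default` entry. The implicit "all" scope
    (an assumption here) has no constraints at all.
    """
    if name is None:
        name = scopes["default"][0]
    if name == "all":
        return set(), set()
    include, exclude = set(), set()
    for term in scopes[name]:
        if term.startswith("+"):
            include.add(term[1:])
        elif term.startswith("-"):
            exclude.add(term[1:])
        else:
            raise ValueError(f"term must start with + or -: {term!r}")
    return include, exclude


def in_scope(file_tags, include, exclude):
    """True if the file has no excluded tag and, when any includes are
    given, at least one included tag."""
    if file_tags & exclude:
        return False
    return not include or bool(file_tags & include)


scopes = parse_scopes(SCOPES)
include, exclude = resolve(scopes)  # falls back to the "code" scope
print(in_scope({"purpose=impl"}, include, exclude))
print(in_scope({"purpose=impl", "purpose=test"}, include, exclude))
```

Under these assumed semantics an untagged file falls outside the "code" scope, so a real tool would need to decide whether untagged files count as impl by default.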

The big win with those purpose-tags and scopes would come if they had conventional or formal names (e.g. URNs in an RDF namespace), such that different repos could agree on their usage.

Then tools that do code-search across many different repos, like GitHub, could layer UI on top of these purpose-tags and search-scopes discovered from the repo, to enable the repo to be searched using this explicitly-encoded abstract understanding of the purpose/function of the different parts of the repo. (Picture the GitHub search autocomplete for a search 'foo' giving several drop-down options like "Search 'foo' in [code] of this repo", "Search 'foo' in [docs] of this repo", etc. Then, after you get to the search page, a set of checkboxes to refine your search from the original scope into a custom scope, by including/excluding arbitrary purpose-tags.)
