You should checkout powershell; it supports converting CSV into in-memory structured data and then you can run regular powershell queries on that data:
$> csvData = @"
Name,Department,Salary
John Doe,IT,60000
Jane Smith,Finance,75000
Alice Johnson,HR,65000
Bob Anderson,IT,71000
"@;
$> csvData
| ConvertFrom-Csv
| Select Name, Salary
| Sort Salary -Descending
Name Salary
---- ------
Jane Smith 75000
Bob Anderson 71000
Alice Johnson 65000
John Doe 60000
You can also then convert the results back into CSV by piping into ConvertTo-Csv
/tmp/> "Name,Department,Salary
::: John Doe,IT,60000
::: Jane Smith,Finance,75000
::: Alice Johnson,HR,65000
::: Bob Anderson,IT,71000" |
::: from csv |
::: select Name Salary |
::: sort-by -r Salary
╭───┬───────────────┬────────╮
│ # │ Name │ Salary │
├───┼───────────────┼────────┤
│ 0 │ Jane Smith │ 75000 │
│ 1 │ Bob Anderson │ 71000 │
│ 2 │ Alice Johnson │ 65000 │
│ 3 │ John Doe │ 60000 │
╰───┴───────────────┴────────╯
for something a bit more robust, check out DuckDB. It's a library you can embed, use it to run SQL on local files as well as connect to databases, do joins, analytics, etc.
Agreed. The article mentioned duckdb and I'm her to thumbs-up the use of DuckDB wholeheartedly. If you like the world of public CSV files as data sources that you can query or cross-query, duckdb is the tool for you. Just follow the demo on the duckdb website and you'll be wow'd for sure.
I use both, and I have found it helpful to have nushell around when munging csv and parquet files, even when working with duckdb - I find it quicker to ask for the first few rows of a thing or do simple stuff with it, then deeper analysis with duckdb.
- Send the csv data from stdin (using echo and referred to in the command by -)
- Refer to the data in the query by stdin. You may also use the _t_N syntax (first table is _t_1, then _t_2, etc.), or the file name itself before the .csv extension if we were using files.
- Pipe the output to the table command for formatting.
- Also, the shape of the result is printed to stderr (the (4, 2) below).
$ echo 'Name,Department,Salary
John Doe,IT,60000
Jane Smith,Finance,75000
Alice Johnson,HR,65000
Bob Anderson,IT,71000' |
qsv sqlp - 'SELECT Name, Salary FROM stdin ORDER BY Salary DESC' |
qsv table
(4, 2)
Name Salary
Jane Smith 75000
Bob Anderson 71000
Alice Johnson 65000
John Doe 60000