mirror of
https://github.com/dathere/100.dathere.com.git
synced 2025-12-18 16:19:26 +00:00
Compare commits
No commits in common. "7bea3dd0d2446750df01a655c0a03ee3360b53aa" and "eca830a9ad9e46c1190f917b0a68404d6e29db6b" have entirely different histories.
7bea3dd0d2
...
eca830a9ad
6 changed files with 9 additions and 286 deletions
13
README.md
@@ -12,7 +12,7 @@ This codebase includes source code for "100 exercises with qsv" found at [100.da
 Ensure you are using one of the following OS/software:

-- Windows Subsystem for Linux 2 (not Windows so that the bash kernel can work) running Ubuntu
+- Windows Subsystem for Linux 2 (not Windows) running Ubuntu
 - macOS
 - Linux

@@ -32,12 +32,17 @@ git clone https://github.com/dathere/100.dathere.com.git
 ```

 3. Change your directory into this folder `book`.
-4. Run `uv venv --python 3.11`, this should generate a `.venv` folder.
-5. On macOS and Linux run `source .venv/bin/activate`
+4. Run `uv venv`, this should generate a `.venv` folder.
+5.
+
+   - On macOS and Linux
+     - Run `source .venv/bin/activate`
+   - On Windows
+     - Run `.venv\Scripts\activate`
+
 6. Run `uv pip install -r requirements-local.txt`.
 7. Run `uv pip install -e ./bash_kernel` and then `python -m bash_kernel.install` to install the Bash kernel.
-8. You may need to add qsv to your `PATH` first. Then, run `jb build . --all` to build the book or save a `.md`, `.ipynb`, or `.yml` file in VS Code for the Run on Save extension to run relevant commands.
+8. You may need to add qsv to your `PATH` first. Then, run `jb build .` to build the book or save a `.md`, `.ipynb`, or `.yml` file in VS Code for the Run on Save extension to run relevant commands.
 9. Serve the build locally. For example using VS Code, click on `_build/html/index.html` and click Open with Live Server which should launch a local build of the website and should reload within a few seconds each time you save a `.md` or `.yml` file in VS Code (you may need to refresh the page once Run on Save is done each time). You may need to navigate to the proper URL such as http://localhost:5500/\_build/html/.

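
For reference, the macOS/Linux path through the updated README steps 4 to 8 (`uv venv` without `--python 3.11`, and `jb build .` without `--all`) can be sketched as a small script. This is only a sketch: the `run` helper, the `setup_book` function, and the `DRY_RUN` variable are hypothetical names that are not part of the repository, and the script assumes it is started from inside the cloned folder. By default it only prints the commands instead of executing them.

```bash
# Dry-run helper (hypothetical, not part of the repository):
# print each command unless DRY_RUN=0, in which case execute it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Commands taken from README steps 4 to 8 (macOS and Linux variant).
setup_book() {
  run uv venv                                   # step 4: creates .venv
  run source .venv/bin/activate                 # step 5: activate the environment
  run uv pip install -r requirements-local.txt  # step 6: install dependencies
  run uv pip install -e ./bash_kernel           # step 7: install the Bash kernel package
  run python -m bash_kernel.install             # step 7: register the kernel
  run jb build .                                # step 8: build the book (qsv must be on PATH)
}

setup_book
```

Setting `DRY_RUN=0` before calling `setup_book` would execute the commands for real; on Windows the activation step differs, as the README notes.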
2
_toc.yml
@@ -14,6 +14,4 @@ chapters:
     title: "Lesson 2: Piping commands"
   - file: lessons/3/index
     title: "Lesson 3: qsv and JSON"
-  - file: lessons/4/index
-    title: "Lesson 4: Running Polars SQL queries with qsv"
   - file: appendix
@@ -1,6 +0,0 @@
id,primary_color,secondary_color,length,air_conditioner,amenities
001,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
002,black,red,full,true,"wheelchair ramp, tissue boxes, USB ports"
003,white,blue,half,true,"wheelchair ramp, tissue boxes"
004,orange,blue,full,false,"wheelchair ramp, tissue boxes, USB ports"
005,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
@@ -1,127 +0,0 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 4: Running Polars SQL queries with qsv\n",
    "\n",
    "1. Display all of the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qsv sqlp --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Display the first 2 rows of data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. Display all the bus IDs with their lengths and whether they have air conditioning. Then render this output with `qsv table`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4. Display all bus IDs which have air conditioning. Output the data in JSON format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "5. Display all bus IDs which have cup holders. Output the data in JSONL format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "6. Get the count of all buses where the primary color is either black or white."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Bash",
   "language": "bash",
   "name": "bash"
  },
  "language_info": {
   "codemirror_mode": "shell",
   "file_extension": ".sh",
   "mimetype": "text/x-sh",
   "name": "bash"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
@@ -1,147 +0,0 @@
---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Bash
  language: bash
  name: bash
---

# Lesson 4: Running Polars SQL queries with qsv

The [Polars](https://pola.rs) library is used by qsv to enhance data engineering capabilities. One of the multiple benefits that Polars provides for qsv is the ability to run Polars SQL queries with qsv's `sqlp` command.

```{code-cell}
:tags: ["scroll-output"]
qsv sqlp -h
```

There are plenty of example queries you can copy for your usage in the help message of `qsv sqlp` above.

Note that when you run a query, you may get the shape of the output data from standard error (`stderr`) after the output. For example you may see `(5, 6)` after the output representing 5 rows and 6 columns.

We'll hide the shape from the output by adding the `-q` or `--quiet` flag in the exercises.
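
The split between data on standard output and the shape on standard error can be illustrated without qsv at all. The sketch below uses a hypothetical stand-in function, `fake_query`, that writes two CSV lines to `stdout` and a shape to `stderr`; it is not qsv and only mimics the behavior described above.

```bash
# Hypothetical stand-in for a command that prints data to stdout
# and a shape such as (1, 2) to stderr. This is NOT qsv itself.
fake_query() {
  echo "id,length"     # data goes to stdout
  echo "1,full"
  echo "(1, 2)" >&2    # shape goes to stderr
}

# Capture only stdout; stderr is discarded with 2>/dev/null.
data_only=$(fake_query 2>/dev/null)

# Capture only stderr: point stderr at the capture first, then discard stdout.
shape_only=$(fake_query 2>&1 >/dev/null)

echo "$data_only"
echo "shape: $shape_only"
```

Since a pipe carries only `stdout`, a shape written to `stderr` does not reach the next command in a pipeline, though it can still appear in the terminal or notebook output.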

## Exercise 4: Running Polars SQL queries with qsv

[Binder](https://mybinder.org/v2/gh/dathere/100.dathere.com/main?labpath=lessons%2F4%2Fexercise.ipynb)

Use `qsv sqlp` and its options to complete each of the following tasks on the `buses.csv` file (assume the headers are included in the output, otherwise you may usually pipe the output into `qsv behead` if needed):

1. Display all of the data.
2. Display the first 2 rows of data.
3. Display all the bus IDs with their lengths and whether they have air conditioning. Then render this output with `qsv table`.
4. Display all bus IDs which have air conditioning. Output the data in JSON format.
5. Display all bus IDs which have cup holders. Output the data in JSONL format.
6. Get the count of all buses where the primary color is either black or white.

> Here we show the usage text of `qsv sqlp` for your reference. Solve this exercise using [Thebe](exercises-setup:thebe), [Binder](exercises-setup:binder) or [locally](exercises-setup:local).

```{code-cell}
:tags: ["scroll-output"]
qsv sqlp --help
```

::::{admonition} Solution for task 1
:class: dropdown seealso

```bash
qsv sqlp buses.csv 'SELECT * FROM buses' -q
```

You can also replace `buses` with `_t_1` as per the help message.

The output should be:

```csv
id,primary_color,secondary_color,length,air_conditioner,amenities
1,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
2,black,red,full,true,"wheelchair ramp, tissue boxes, USB ports"
3,white,blue,half,true,"wheelchair ramp, tissue boxes"
4,orange,blue,full,false,"wheelchair ramp, tissue boxes, USB ports"
5,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
```

::::

::::{admonition} Solution for task 2
:class: dropdown seealso

```bash
qsv sqlp buses.csv 'SELECT * FROM buses LIMIT 2' -q
```

```csv
id,primary_color,secondary_color,length,air_conditioner,amenities
1,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
2,black,red,full,true,"wheelchair ramp, tissue boxes, USB ports"
```

::::

::::{admonition} Solution for task 3
:class: dropdown seealso

```bash
qsv sqlp buses.csv 'SELECT id,length,air_conditioner FROM buses' -q | qsv table
```

```
id  length  air_conditioner
1   full    true
2   full    true
3   half    true
4   full    false
5   full    true
```

::::

::::{admonition} Solution for task 4
:class: dropdown seealso

```bash
qsv sqlp buses.csv "SELECT id FROM buses WHERE air_conditioner = 'true'" --format json -q
```

```json
[{"id":1},{"id":2},{"id":3},{"id":5}]
```

::::

::::{admonition} Solution for task 5
:class: dropdown seealso

```bash
qsv sqlp buses.csv "SELECT id FROM buses WHERE amenities ILIKE '%cup holders%'" --format jsonl -q
```

```json
{"id":1}
{"id":5}
```

::::
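
As a rough cross-check of task 5 that does not use qsv, the same rows can be found with plain text tools. This is a sketch under assumptions: it recreates `buses.csv` in a temporary directory, uses `grep -i` as a stand-in for the case-insensitive `ILIKE` match, and relies on the ID being the first comma-separated field. It is not a general CSV parser.

```bash
# Recreate the lesson's buses.csv in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/buses.csv" <<'EOF'
id,primary_color,secondary_color,length,air_conditioner,amenities
001,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
002,black,red,full,true,"wheelchair ramp, tissue boxes, USB ports"
003,white,blue,half,true,"wheelchair ramp, tissue boxes"
004,orange,blue,full,false,"wheelchair ramp, tissue boxes, USB ports"
005,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
EOF

# grep -i matches "cup holders" regardless of case anywhere on the line;
# cut keeps only the first comma-separated field (the id).
ids=$(grep -i 'cup holders' "$tmpdir/buses.csv" | cut -d, -f1)
echo "$ids"

rm -rf "$tmpdir"
```

The IDs come out as `001` and `005` because these tools treat the file as plain text, whereas the `qsv sqlp` output shown in the solutions displays them as `1` and `5`.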

::::{admonition} Solution for task 6
:class: dropdown seealso

```bash
qsv sqlp buses.csv "SELECT COUNT(*) FROM buses WHERE primary_color = 'black' OR primary_color = 'white'" -q
```

```csv
len
4
```

Notice the output is a table with a single column named `len` and a single record with the count of `4`. How can we get just the count `4` as the output?

One way is to pipe the command into `qsv behead`. Another way may be to not get the count within the SQL query but rather pipe the output into `qsv count`. There are often many ways to solve the same problem with qsv!

::::
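
The count of `4` in task 6 can also be sanity-checked outside qsv. The sketch below swaps in `awk` for the SQL query and recreates `buses.csv` in a temporary directory. It assumes a simple comma split is good enough, which holds here only because `primary_color` is the second field and comes before the quoted `amenities` field; `awk -F,` does not understand CSV quoting in general.

```bash
# Recreate the lesson's buses.csv in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/buses.csv" <<'EOF'
id,primary_color,secondary_color,length,air_conditioner,amenities
001,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
002,black,red,full,true,"wheelchair ramp, tissue boxes, USB ports"
003,white,blue,half,true,"wheelchair ramp, tissue boxes"
004,orange,blue,full,false,"wheelchair ramp, tissue boxes, USB ports"
005,black,blue,full,true,"wheelchair ramp, tissue boxes, cup holders, USB ports"
EOF

# Skip the header row (NR > 1) and count rows whose second field
# is exactly "black" or "white". "n + 0" prints 0 if nothing matched.
count=$(awk -F, 'NR > 1 && ($2 == "black" || $2 == "white") { n++ } END { print n + 0 }' "$tmpdir/buses.csv")
echo "$count"

rm -rf "$tmpdir"
```

Three buses are black and one is white, so the script prints `4`, matching the `len` value above.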

Binary file not shown.
Before: Size: 118 KiB