feat: initial commit for 100.dathere.com and first exercise

rzmk 2024-05-29 09:03:38 -04:00
commit 86f90af434
35 changed files with 860 additions and 0 deletions

59
.github/workflows/publish.yml vendored Normal file
View file

@ -0,0 +1,59 @@
name: deploy-book

# Run this when the main branch changes
on:
  push:
    branches:
      - main
    # If your git repository has the Jupyter Book within some-subfolder next to
    # unrelated files, you can make this run only if a file within that specific
    # folder has been modified.
    #
    # paths:
    # - some-subfolder/**

# This job installs dependencies, builds the book, and pushes it to `gh-pages`
jobs:
  deploy-book:
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    steps:
      - uses: actions/checkout@v3

      # Install dependencies
      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: 3.11

      - name: Install dependencies
        run: |
          python -m bash_kernel.install

      # (optional) Cache your executed notebooks between runs
      # if you have config:
      # execute:
      #   execute_notebooks: cache
      # - name: cache executed notebooks
      #   uses: actions/cache@v3
      #   with:
      #     path: _build/.jupyter_cache
      #     key: jupyter-book-cache-${{ hashFiles('requirements.txt') }}

      # Build the book
      - name: Build the book
        run: |
          jupyter-book build .

      # Upload the book's HTML as an artifact
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v2
        with:
          path: "_build/html"

      # Deploy the book's HTML to GitHub Pages
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v2

3
.gitignore vendored Normal file
View file

@ -0,0 +1,3 @@
.venv
_build
.ipynb_checkpoints

3
.vscode/extensions.json vendored Normal file
View file

@ -0,0 +1,3 @@
{
"recommendations": ["emeraldwalk.runonsave", "ritwickdey.liveserver", "ms-toolsai.jupyter", "ms-toolsai.vscode-jupyter-cell-tags"]
}

19
.vscode/settings.json vendored Normal file
View file

@ -0,0 +1,19 @@
{
"emeraldwalk.runonsave": {
"shell": "/bin/bash",
"commands": [
{
"match": ".md$",
"cmd": "source .venv/bin/activate && jb clean . && jb build ."
},
{
"match": ".yml$",
"cmd": "source .venv/bin/activate && jb clean . && jb build ."
},
{
"match": ".ipynb$",
"cmd": "source .venv/bin/activate && jb clean . && jb build ."
}
]
}
}

53
README.md Normal file
View file

@ -0,0 +1,53 @@
# 100.dathere.com
**Try out available exercises:** [100.dathere.com](https://100.dathere.com)
This codebase includes source code for "100 exercises with qsv" found at [100.dathere.com](https://100.dathere.com).
![100.dathere.com preview](media/100.dathere.com-preview.png)
## How to run the Jupyter Book locally

Ensure you are using one of the following operating systems:

- Windows Subsystem for Linux 2 running Ubuntu (not native Windows)
- macOS
- Linux

0. Install the prerequisites:
   - [Git](https://git-scm.com/)
   - [Visual Studio Code](https://code.visualstudio.com/) - Code editor
     - [Live Server extension](https://marketplace.visualstudio.com/items?itemName=ritwickdey.LiveServer) - Local server extension (to view the Jupyter Book locally with hot reload)
     - [Run on Save extension](https://marketplace.visualstudio.com/items?itemName=emeraldwalk.RunOnSave) - Allows for auto-build on save for the file types specified in [.vscode/settings.json](.vscode/settings.json)
   - [Python](https://python.org/)
   - [uv](https://github.com/astral-sh/uv) - Python package manager
1. Clone the repository to your local device using [Git](https://git-scm.com/):
   ```bash
   git clone https://github.com/dathere/100.dathere.com.git
   ```
2. Change your directory into the cloned `100.dathere.com` folder.
3. Run `uv venv`, which should generate a `.venv` folder.
4. Activate the virtual environment:
   - On macOS and Linux, run `source .venv/bin/activate`
   - On Windows, run `.venv\Scripts\activate`
5. Run `uv pip install -r requirements-local.txt`.
6. Run `uv pip install -e ./bash_kernel` and then `python -m bash_kernel.install` to install the Bash kernel.
7. Run `jb build .` to build the book, or save a `.md`, `.ipynb`, or `.yml` file in VS Code so that the Run on Save extension runs the relevant commands.
8. Right-click `_build/html/index.html` and click **Open with Live Server**. This should launch a local build of the website, which should reload within a few seconds each time you save a `.md` or `.yml` file in VS Code. You may need to refresh the page once Run on Save finishes.
![Live Server example](media/live-server-example.png)
---
© Copyright [datHere](https://dathere.com)
![datHere logo dark](media/datHere-logo.png#gh-dark-mode-only)
![datHere logo light](media/datHere-logo.png#gh-light-mode-only)

47
_config.yml Normal file
View file

@ -0,0 +1,47 @@
# Book settings
# Learn more at https://jupyterbook.org/customize/config.html

title: 100 exercises with qsv
author: "<a href='https://dathere.com'>datHere</a> and <a href='https://github.com/dathere/100.dathere.com/graphs/contributors'>contributors</a>."
logo: media/qsv-logo.png
exclude_patterns:
  [
    "README.md",
    ".venv",
    ".vscode",
    ".gitignore",
    "postBuild",
    "requirements*.txt",
    "bash_kernel"
  ]

# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
  execute_notebooks: force

# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
html:
  use_repository_button: true
  favicon: media/qsv-logo-dark-icon.png
  extra_footer: |
    <p>
    © Copyright datHere
    </p>

# Information about where the book exists on the web
repository:
  url: https://github.com/dathere/100.dathere.com # Online location of your book
  branch: main # Which branch of the repository should be used when creating links (optional)

launch_buttons:
  notebook_interface: jupyterlab
  binderhub_url: https://mybinder.org

sphinx:
  config:
    html_show_copyright: false
    html_theme_options:
      logo:
        image_light: media/qsv-logo.png
        image_dark: media/qsv-logo-dark.png
    language: en

14
_toc.yml Normal file
View file

@ -0,0 +1,14 @@
# Table of contents
# Learn more at https://jupyterbook.org/customize/toc.html

format: jb-book
root: intro
title: Introduction
chapters:
  - file: exercises-setup
  - file: notes
    title: Getting started
    sections:
      - file: lessons/0/notes
        title: Exploring qsv help messages and syntax
  - file: appendix

55
appendix.md Normal file
View file

@ -0,0 +1,55 @@
---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---
# Appendix
Here you may find conceptual content related to the exercises in the book.
## qsv version
qsv is released in multiple versions, and the installed version may differ from system to system. Here we run [a command](https://github.com/jqnatividad/qsv/blob/master/docs/PERFORMANCE.md#version-details) to show which version of qsv this book is using, along with other information:
```bash
qsv --version
```
```{code-cell} python3
:tags: [remove-input, scroll-output]
result = !qsv --version
print(result.n)
```
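
Outside a notebook, the same check can be scripted. The following is a minimal sketch, assuming only that `qsv --version` prints its version text to standard output. The helper name `qsv_version` is hypothetical, and it returns `None` when the executable cannot be started.

```python
import subprocess


def qsv_version(executable="qsv"):
    """Return the text printed by `<executable> --version`, or None if it is not found."""
    try:
        completed = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # The executable is not installed or not reachable from PATH.
        return None
    return completed.stdout.strip()


if __name__ == "__main__":
    version = qsv_version()
    print(version if version is not None else "qsv was not found on PATH")
```

The output depends on the qsv binary installed on your system, so no expected output is shown here.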
## qsv release assets
The table below maps the qsv release files for an arbitrary version X.Y.Z to the platforms they may run on.
```{table} qsv release assets (vX.Y.Z)
:name: qsv-release-assets
| File | Platform |
| ----------------------------------------- | ------------------------- |
| `qsv-X.Y.Z-x86_64-pc-windows-msvc.zip` | 64-bit MSVC (Windows 10+) |
| `qsv-X.Y.Z.msi` | Windows |
| `qsv-X.Y.Z-x86_64-pc-windows-gnu.zip` | 64-bit MinGW (Windows 10+) |
| `qsv-X.Y.Z-i686-pc-windows-msvc.zip` | 32-bit MSVC (Windows 10+) |
| `qsv-X.Y.Z-aarch64-apple-darwin.zip` | ARM64 macOS (11.0+, Big Sur+) |
| `qsv-X.Y.Z-x86_64-apple-darwin.zip` | 64-bit macOS (10.12+, Sierra+) |
| `qsv-X.Y.Z-aarch64-unknown-linux-gnu.zip` | ARM64 Linux (kernel 4.1, glibc 2.17+) |
| `qsv-X.Y.Z-x86_64-unknown-linux-gnu.zip` | 64-bit Linux (kernel 3.2+, glibc 2.17+) |
| `qsv-X.Y.Z-x86_64-unknown-linux-musl.zip` | 64-bit Linux with musl 1.2.3 |
| `qsv-X.Y.Z-i686-unknown-linux-gnu.zip` | 32-bit Linux (kernel 3.2+, glibc 2.17+) |
```
:::{note}
The listed OS/architecture are primarily based on [information from "The rustc book"](https://doc.rust-lang.org/nightly/rustc/platform-support.html).
:::
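
As an illustration of how the table can be used, here is a hedged sketch that picks a release file name for a given platform. The function name `asset_name` and the `TARGETS` mapping are hypothetical, the mapping covers only a few rows of the table, and the `(system, machine)` pairs are assumptions about what Python's `platform.system()` and `platform.machine()` typically report.

```python
import platform

# Assumed (system, machine) pairs mapped to target names taken from the table above.
TARGETS = {
    ("Windows", "AMD64"): "x86_64-pc-windows-msvc",
    ("Darwin", "arm64"): "aarch64-apple-darwin",
    ("Darwin", "x86_64"): "x86_64-apple-darwin",
    ("Linux", "aarch64"): "aarch64-unknown-linux-gnu",
    ("Linux", "x86_64"): "x86_64-unknown-linux-gnu",
}


def asset_name(version, system=None, machine=None):
    """Return a release file name for the platform, or None if it is not in TARGETS."""
    system = system or platform.system()
    machine = machine or platform.machine()
    target = TARGETS.get((system, machine))
    if target is None:
        return None
    return f"qsv-{version}-{target}.zip"


print(asset_name("X.Y.Z", "Linux", "x86_64"))  # qsv-X.Y.Z-x86_64-unknown-linux-gnu.zip
```

Calling `asset_name("X.Y.Z")` without the last two arguments uses the values detected for the current machine, which may not appear in this partial mapping.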

5
bash_kernel/.gitignore vendored Normal file
View file

@ -0,0 +1,5 @@
__pycache__
*.pyc
build/
dist/
MANIFEST

14
bash_kernel/LICENSE Normal file
View file

@ -0,0 +1,14 @@
Copyright (c) 2015, Thomas Kluyver and contributors
All rights reserved.
BSD 3-clause license:
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

22
bash_kernel/README.rst Normal file
View file

@ -0,0 +1,22 @@
A simple IPython kernel for bash

This requires IPython 3.

To install::

    pip install bash_kernel
    python -m bash_kernel.install

To use it, run one of:

.. code:: shell

    ipython notebook
    # In the notebook interface, select Bash from the 'New' menu
    ipython qtconsole --kernel bash
    ipython console --kernel bash

For details of how this works, see the Jupyter docs on `wrapper kernels
<http://jupyter-client.readthedocs.org/en/latest/wrapperkernels.html>`_, and
Pexpect's docs on the `replwrap module
<http://pexpect.readthedocs.org/en/latest/api/replwrap.html>`_

View file

@ -0,0 +1,3 @@
"""A bash kernel for Jupyter"""
__version__ = '0.4.1'

View file

@ -0,0 +1,3 @@
from ipykernel.kernelapp import IPKernelApp
from .kernel import BashKernel
IPKernelApp.launch_instance(kernel_class=BashKernel)

View file

@ -0,0 +1,48 @@
import base64
import imghdr
import os

#from IPython.

_TEXT_SAVED_IMAGE = "bash_kernel: saved image data to:"

image_setup_cmd = """
display () {
    TMPFILE=$(mktemp ${TMPDIR-/tmp}/bash_kernel.XXXXXXXXXX)
    cat > $TMPFILE
    echo "%s $TMPFILE" >&2
}
""" % _TEXT_SAVED_IMAGE

def display_data_for_image(filename):
    with open(filename, 'rb') as f:
        image = f.read()
    os.unlink(filename)

    image_type = imghdr.what(None, image)
    if image_type is None:
        raise ValueError("Not a valid image: %s" % image)

    image_data = base64.b64encode(image).decode('ascii')
    content = {
        'data': {
            'image/' + image_type: image_data
        },
        'metadata': {}
    }
    return content


def extract_image_filenames(output):
    output_lines = []
    image_filenames = []

    for line in output.split("\n"):
        if line.startswith(_TEXT_SAVED_IMAGE):
            filename = line.rstrip().split(": ")[-1]
            image_filenames.append(filename)
        else:
            output_lines.append(line)

    output = "\n".join(output_lines)
    return image_filenames, output

View file

@ -0,0 +1,47 @@
import json
import os
import sys
import getopt

from jupyter_client.kernelspec import KernelSpecManager
from IPython.utils.tempdir import TemporaryDirectory

kernel_json = {"argv":[sys.executable,"-m","bash_kernel", "-f", "{connection_file}"],
 "display_name":"Bash",
 "language":"bash",
 "codemirror_mode":"shell",
 "env":{"PS1": "$"}
}

def install_my_kernel_spec(user=True, prefix=None):
    with TemporaryDirectory() as td:
        os.chmod(td, 0o755) # Starts off as 700, not user readable
        with open(os.path.join(td, 'kernel.json'), 'w') as f:
            json.dump(kernel_json, f, sort_keys=True)
        # TODO: Copy resources once they're specified

        print('Installing IPython kernel spec')
        KernelSpecManager().install_kernel_spec(td, 'bash', user=user, replace=True, prefix=prefix)

def _is_root():
    try:
        return os.geteuid() == 0
    except AttributeError:
        return False # assume not an admin on non-Unix platforms

def main(argv=[]):
    prefix = None
    user = not _is_root()

    opts, _ = getopt.getopt(argv[1:], '', ['user', 'prefix='])
    for k, v in opts:
        if k == '--user':
            user = True
        elif k == '--prefix':
            prefix = v
            user = False

    install_my_kernel_spec(user=user, prefix=prefix)

if __name__ == '__main__':
    main(argv=sys.argv)

View file

@ -0,0 +1,154 @@
from ipykernel.kernelbase import Kernel
from pexpect import replwrap, EOF

from subprocess import check_output
from os import unlink

import base64
import imghdr
import re
import signal
import urllib

__version__ = '0.2'

version_pat = re.compile(r'version (\d+(\.\d+)+)')

from .images import (
    extract_image_filenames, display_data_for_image, image_setup_cmd
)


class BashKernel(Kernel):
    implementation = 'bash_kernel'
    implementation_version = __version__

    @property
    def language_version(self):
        m = version_pat.search(self.banner)
        return m.group(1)

    _banner = None

    @property
    def banner(self):
        if self._banner is None:
            self._banner = check_output(['bash', '--version']).decode('utf-8')
        return self._banner

    language_info = {'name': 'bash',
                     'codemirror_mode': 'shell',
                     'mimetype': 'text/x-sh',
                     'file_extension': '.sh'}

    def __init__(self, **kwargs):
        Kernel.__init__(self, **kwargs)
        self._start_bash()

    def _start_bash(self):
        # Signal handlers are inherited by forked processes, and we can't easily
        # reset it from the subprocess. Since kernelapp ignores SIGINT except in
        # message handlers, we need to temporarily reset the SIGINT handler here
        # so that bash and its children are interruptible.
        sig = signal.signal(signal.SIGINT, signal.SIG_DFL)
        try:
            self.bashwrapper = replwrap.bash()
        finally:
            signal.signal(signal.SIGINT, sig)

        # Register Bash function to write image data to temporary file
        self.bashwrapper.run_command(image_setup_cmd)

    def do_execute(self, code, silent, store_history=True,
                   user_expressions=None, allow_stdin=False):
        if not code.strip():
            return {'status': 'ok', 'execution_count': self.execution_count,
                    'payload': [], 'user_expressions': {}}

        interrupted = False
        try:
            output = self.bashwrapper.run_command(code.rstrip(), timeout=None)
        except KeyboardInterrupt:
            self.bashwrapper.child.sendintr()
            interrupted = True
            self.bashwrapper._expect_prompt()
            output = self.bashwrapper.child.before
        except EOF:
            output = self.bashwrapper.child.before + 'Restarting Bash'
            self._start_bash()

        if not silent:
            image_filenames, output = extract_image_filenames(output)

            # Send standard output
            stream_content = {'name': 'stdout', 'text': output}
            self.send_response(self.iopub_socket, 'stream', stream_content)

            # Send images, if any
            for filename in image_filenames:
                try:
                    data = display_data_for_image(filename)
                except ValueError as e:
                    message = {'name': 'stdout', 'text': str(e)}
                    self.send_response(self.iopub_socket, 'stream', message)
                else:
                    self.send_response(self.iopub_socket, 'display_data', data)

        if interrupted:
            return {'status': 'abort', 'execution_count': self.execution_count}

        try:
            exitcode = int(self.bashwrapper.run_command('echo $?').rstrip())
        except Exception:
            exitcode = 1

        if exitcode and not (code.rstrip().endswith("-h") or code.rstrip().endswith("--help")):
            error_content = {'execution_count': self.execution_count,
                             'ename': '', 'evalue': str(exitcode), 'traceback': []}

            self.send_response(self.iopub_socket, 'error', error_content)
            error_content['status'] = 'error'
            return error_content
        else:
            return {'status': 'ok', 'execution_count': self.execution_count,
                    'payload': [], 'user_expressions': {}}

    def do_complete(self, code, cursor_pos):
        code = code[:cursor_pos]
        default = {'matches': [], 'cursor_start': 0,
                   'cursor_end': cursor_pos, 'metadata': dict(),
                   'status': 'ok'}

        if not code or code[-1] == ' ':
            return default

        tokens = code.replace(';', ' ').split()
        if not tokens:
            return default

        matches = []
        token = tokens[-1]
        start = cursor_pos - len(token)

        if token[0] == '$':
            # complete variables
            cmd = 'compgen -A arrayvar -A export -A variable %s' % token[1:] # strip leading $
            output = self.bashwrapper.run_command(cmd).rstrip()
            completions = set(output.split())
            # append matches including leading $
            matches.extend(['$'+c for c in completions])
        else:
            # complete functions and builtins
            cmd = 'compgen -cdfa %s' % token
            output = self.bashwrapper.run_command(cmd).rstrip()
            matches.extend(output.split())

        if not matches:
            return default
        matches = [m for m in matches if m.startswith(token)]

        return {'matches': sorted(matches), 'cursor_start': start,
                'cursor_end': cursor_pos, 'metadata': dict(),
                'status': 'ok'}

11
bash_kernel/flit.ini Normal file
View file

@ -0,0 +1,11 @@
[metadata]
module = bash_kernel
author = Thomas Kluyver
author-email = thomas@kluyver.me.uk
home-page = https://github.com/takluyver/bash_kernel
requires = pexpect (>=3.3)
description-file = README.rst
classifiers = Framework :: IPython
    License :: OSI Approved :: BSD License
    Programming Language :: Python :: 3
    Topic :: System :: Shells

View file

@ -0,0 +1,21 @@
[build-system]
requires = ["flit_core >=3.2,<4"]
build-backend = "flit_core.buildapi"
[project]
name = "bash_kernel"
authors = [
{name = "Thomas Kluyver", email = "thomas@kluyver.me.uk"},
]
readme = "README.rst"
dependencies = ["pexpect (>=4.0)", "ipykernel"]
classifiers = [
"Framework :: Jupyter",
"License :: OSI Approved :: BSD License",
"Programming Language :: Python :: 3",
"Topic :: System :: Shells",
]
dynamic = ["version", "description"]
[project.urls]
Source = "https://github.com/takluyver/bash_kernel"

72
exercises-setup.md Normal file
View file

@ -0,0 +1,72 @@
## How to set up the exercises
### Choose an approach
Each exercise should have a launch button that opens an online JupyterLab environment, which may take a few seconds to launch completely. You may launch an exercise by clicking the following button when it appears in a lesson:
![Binder](https://mybinder.org/badge_logo.svg)
However, you may choose to run the exercises on your system. The rest of this page is an overview of how to load the exercises and how to install qsv locally.
### 1. Download and extract the exercises
1. [Click here to download the `100.dathere.com.zip` file](https://github.com/dathere/100.dathere.com/archive/refs/heads/main.zip).
2. Unzip `100.dathere.com.zip`. You may delete everything except the `lessons` folder.
As you follow along with a lesson page on `100.dathere.com`, once an exercise appears then you may change directory into the relevant folder (e.g., `cd lessons/0` for the first exercise). We recommend you ignore the `notes.md` file in each lesson folder as the lesson is rendered as intended on `100.dathere.com` and the file may also contain the solution to the exercise.
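
If you'd rather script the extraction, the following is a minimal sketch that keeps only the `lessons` folder from the downloaded archive. It assumes the archive contains a single top-level folder with `lessons` directly inside it, and the helper name `extract_lessons` is hypothetical.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_lessons(zip_path, dest_dir):
    """Extract only the files under `<top-level folder>/lessons/` into dest_dir."""
    dest = Path(dest_dir)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            # Keep only file entries shaped like "<top-level folder>/lessons/...".
            if info.is_dir() or len(parts) < 3 or parts[1] != "lessons":
                continue
            # Skip entries that try to escape the destination folder.
            if ".." in parts:
                continue
            target = dest.joinpath(*parts[1:])
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            extracted.append(target)
    return extracted
```

For example, `extract_lessons("100.dathere.com.zip", ".")` would create a `lessons` folder in the current directory if the archive matches the assumed layout.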
### 2. Set up qsv
:::{note}
If you already have qsv installed on your system and accessible from `PATH` then you may [skip to step 3](#optional-set-up-qsv-bash-completions).
:::
#### Download and extract qsv
You may download qsv as an executable file which you may run in a terminal like other commands. There are [multiple ways](https://github.com/jqnatividad/qsv#installation-options) to download qsv and multiple versions of qsv.
Here's one way to download the latest version (arbitrarily represented as version `X.Y.Z`): go to the latest release on GitHub at [https://github.com/jqnatividad/qsv/releases/latest](https://github.com/jqnatividad/qsv/releases/latest). Under the Assets section of the release you'll find many files, so choose the one that matches your operating system and system architecture.
Here are the files we suggest if you're unsure:
| OS | Suggested file(s) |
| ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Windows | Run the `qsv-X.Y.Z.msi` installer and go through its installation process. |
| macOS | Depending on your architecture, choose between `qsv-X.Y.Z-aarch64-apple-darwin.zip` and `qsv-X.Y.Z-x86_64-apple-darwin.zip` |
| Linux | `qsv-X.Y.Z-x86_64-unknown-linux-gnu.zip` (the `musl.zip` may not have all of qsv's capabilities available but may have more compatibility with various Linux distros) |
:::{seealso}
If you'd like to know more about each file, see the [table for qsv release assets](qsv-release-assets).
:::
If you have a `.zip` file downloaded then make sure to unzip it and locate the `qsv` file within it (or `qsv.exe` for Windows). You may start using qsv with that file right away if you'd like!
#### Add qsv to your PATH
To ensure you may access qsv from your terminal without having to specify a path, you'll need to add qsv to your PATH.
If you used the `qsv-X.Y.Z.msi` installer for Windows then this should already have been done for you. You may verify this by opening a terminal (Windows Terminal, Command Prompt, Git Bash, PowerShell, etc.) and running `qsv`, which should output the list of available commands.
For macOS and Linux there are various ways to add qsv to the PATH. One way is moving the `qsv` binary file to `/usr/local/bin`, which you may do by changing your directory to where `qsv` is located and running:
```bash
sudo mv qsv /usr/local/bin
```
You may need to restart your terminal. Try running `qsv --list`, which should output the list of available commands.
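
To see why the location of the `qsv` file matters, here is a small sketch of how a command name is looked up in the directories listed in `PATH`. The helper names `path_entries` and `is_on_path` are hypothetical; `shutil.which` is Python's standard-library lookup.

```python
import os
import shutil


def path_entries(path_value=None):
    """Split a PATH-style string into its directories, in search order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [entry for entry in path_value.split(os.pathsep) if entry]


def is_on_path(command="qsv"):
    """Return True if the command can be found in one of the PATH directories."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    for directory in path_entries():
        print(directory)
    print("qsv found on PATH:", is_on_path("qsv"))
```

If `/usr/local/bin` appears in the printed directories and `qsv` was moved there, the last line should report that qsv was found.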
### 3 (Optional). Set up qsv bash completions
Bash completions allow you to press the tab key at certain locations while typing a qsv command to get suggestions (completions) so you may view available commands, subcommands, and options within your terminal (assuming you're using a compatible terminal such as Git Bash on Windows).
![qsv bash completions example](media/qsv-completions-demo.gif)
You can download the current bash completions file from qsv's source code at [`contrib/bashly/completions.bash`](https://github.com/jqnatividad/qsv/blob/master/contrib/bashly/completions.bash). Run `source completions.bash` from the folder containing the file to enable the completions in your current terminal instance. To enable them every time you launch a bash terminal, move the file to your home directory (`~/completions.bash`) and add the line `source ~/completions.bash` to the `.bashrc` file in your home directory (`~/.bashrc`), creating that file if it does not exist.
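
Editing `~/.bashrc` by hand works fine. As an alternative, here is a hedged sketch that appends the `source` line only when it is not already present, so running it twice does not duplicate the line. The helper name `ensure_source_line` is hypothetical, and it assumes the completions file was moved to `~/completions.bash`.

```python
from pathlib import Path


def ensure_source_line(bashrc_path, completions_path="~/completions.bash"):
    """Append `source <completions_path>` to bashrc_path unless it is already there.

    Returns True if the file was changed and False otherwise.
    """
    line = f"source {completions_path}"
    bashrc = Path(bashrc_path).expanduser()
    existing = bashrc.read_text() if bashrc.exists() else ""
    if line in existing.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with bashrc.open("a") as handle:
        handle.write(f"{prefix}{line}\n")
    return True
```

A call such as `ensure_source_line("~/.bashrc")` would update your own configuration, so try it on a copy of the file first.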
## Recap
If you chose to do a local installation, then by now you should have the following available on your system:
- The `lessons` folder
- qsv (available from your PATH)
- qsv bash completions (optional)

35
intro.md Normal file
View file

@ -0,0 +1,35 @@
```{admonition} This site is a work in progress.
:class: important
You may report errors on [the issues page](https://github.com/dathere/100.dathere.com/issues) for this book's GitHub repository.
```
# 100 exercises with qsv
Welcome to 100 exercises with qsv!
In this book you may learn how to solve various data engineering issues with [qsv](https://github.com/jqnatividad/qsv).
qsv is a **command-line tool** built with Rust for spreadsheet data wrangling (CSV, Excel, etc.) that can handle large files at relatively fast speeds. With [50+ commands](https://github.com/jqnatividad/qsv?tab=readme-ov-file#available-commands) (when all features are enabled), there are plenty of use cases qsv can handle.
If you're unfamiliar with qsv then don't worry. The initial exercises are intended for beginners who haven't tried qsv yet. We'll explore multiple features qsv has to offer while solving problems you may find in real-world scenarios. You may learn to use qsv interactively by practicing exercises that resolve data wrangling issues.
```{admonition} Check out qsv pro!
:class: dropdown seealso
qsv pro (preview) is a **cross-platform desktop app** for spreadsheet data wrangling and it uses qsv for many of its core features. Check out [qsvpro.dathere.com](https://qsvpro.dathere.com) to download and try it out!
![qsv pro (preview) screenshot](media/qsv-pro-feature.png)
```
## How to engage with the book
At the end of each lesson there may be an exercise that you may complete on your computer. Each lesson usually follows this outline:
- Lesson content
- Exercise instructions (may also include instructions on how to verify your solution)
- Solution (and optional hints, though try the exercise first before viewing any hints)
You don't need to follow all the lessons in order nor do you need to complete all of them. Feel free to skip around to lessons you find intriguing.
Click **Next** at the bottom of this page to go to the next page.

25
lessons/0/exercise.ipynb Normal file
View file

@ -0,0 +1,25 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercise 0: Total rows"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

4
lessons/0/fruits.csv Normal file
View file

@ -0,0 +1,4 @@
fruit,price
apple,2.50
banana,3.00
carrot,1.50

113
lessons/0/notes.md Normal file
View file

@ -0,0 +1,113 @@
---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Bash
  language: bash
  name: bash
---
# Getting help
## Listing all commands
This may be your first time using qsv, so let's see what qsv has to offer. In your terminal run qsv with the `--list` flag:
```{code-cell}
:tags: ["scroll-output"]
qsv --list
```
Here we see a list of commands and a brief description of each.[^1]
## Viewing a command's help message
You may view a command's help message by running:
```bash
qsv <command> --help
```
For example, you may run the following to get the help message for the `headers` command:
```{code-cell}
:tags: ["scroll-output"]
qsv headers --help
```
Usually you'll find a similar structure for other qsv commands:
- Description about the command
- More details
- Examples and/or a link to them
- Usage format
- Subcommands[^2]
- Arguments
- Options (flags)
## Displaying headers of a CSV
Let's try viewing the headers in the `fruits.csv` file located in `lessons/0`. Based on the command format in the "Usage" section of the help message for `qsv headers`, we'll run:
```{code-cell}
qsv headers fruits.csv
```
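
Conceptually, the headers are the values in the first row of the CSV file. The following Python sketch illustrates that idea with the standard `csv` module. It is not how qsv is implemented, and the function name `headers` and the inlined copy of `fruits.csv` are for illustration only.

```python
import csv
import io

# Inlined copy of lessons/0/fruits.csv so the sketch runs on its own.
FRUITS_CSV = """fruit,price
apple,2.50
banana,3.00
carrot,1.50
"""


def headers(csv_text):
    """Return the first row of the CSV text as a list of column names."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


print(headers(FRUITS_CSV))  # ['fruit', 'price']
```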
## Recap
In this lesson we've covered how to:
- List all available qsv commands with `qsv --list`
- View the help message for an individual command with `qsv <command> --help`
- Interpret the parts of a command help message
- Run a command on an arbitrary CSV file, getting the headers with `qsv headers <filepath>`
Now it's your turn to take on the first exercise.
## Exercise 0: Total rows
Using a qsv command, get the total number of rows that are in the `fruits.csv` file.
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dathere/100.dathere.com/main?labpath=lessons%2F0%2Fexercise.ipynb)
:::{hint}
:class: dropdown
The `count` command may be useful for this exercise. Make sure to learn how `qsv count` determines the row count in order to complete this exercise as intended.
:::
::::{admonition} Solution
:class: dropdown seealso
As with other solutions you may see in the upcoming exercises, there may be many ways to solve an exercise with qsv. A solution could be running the command:
```bash
qsv count fruits.csv --no-headers
```
And the output should be:
```bash
4
```
:::{admonition} Why not 3?
:class: dropdown hint
The exercise requires finding the **total number of rows** in `fruits.csv`. As described in the help message for `qsv count` (you may run `qsv count -h` to get the help message):
<q>Note that the **count will not include the header row (unless `--no-headers` is given)**.</q>
If you run `qsv count fruits.csv` then in your terminal you should see `3` as the output. Running it again this time with the `--no-headers` flag (or `-n` for short), you get the correct number of total rows `4` which includes the header row (which is the first row in the CSV file).
It may sound unusual that by using the `--no-headers` flag, the header row gets included in the row count. You may share any ideas for improvements to qsv on [qsv's GitHub discussions](https://github.com/jqnatividad/qsv/discussions).
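
The difference between the two counts can be sketched in a few lines of Python. This only illustrates the counting rule quoted above and is not qsv's implementation; the function name `count_rows` and its `no_headers` parameter are hypothetical stand-ins for `qsv count` and its `--no-headers` flag.

```python
import csv
import io

# Inlined copy of lessons/0/fruits.csv so the sketch runs on its own.
FRUITS_CSV = """fruit,price
apple,2.50
banana,3.00
carrot,1.50
"""


def count_rows(csv_text, no_headers=False):
    """Count rows, treating the first row as a header unless no_headers is True."""
    total = sum(1 for _ in csv.reader(io.StringIO(csv_text)))
    if no_headers:
        return total
    return max(total - 1, 0)


print(count_rows(FRUITS_CSV))                   # 3
print(count_rows(FRUITS_CSV, no_headers=True))  # 4
```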
:::
::::
[^1]: Not all 50+ commands may be listed using `qsv --list` since features may be disabled for a given qsv binary file (e.g., OS compatibility for certain commands).
[^2]: In this case `qsv headers` does not have any subcommands.

Binary file not shown. Size: 187 KiB

BIN
media/datHere-logo-dark.png Normal file

Binary file not shown. Size: 5.4 KiB

BIN
media/datHere-logo.png Normal file

Binary file not shown. Size: 5.2 KiB

Binary file not shown. Size: 97 KiB

Binary file not shown. Size: 701 KiB

Binary file not shown. Size: 2.8 KiB

BIN
media/qsv-logo-dark.png Normal file

Binary file not shown. Size: 5.7 KiB

BIN
media/qsv-logo.png Normal file

Binary file not shown. Size: 5.9 KiB

BIN
media/qsv-pro-feature.png Normal file

Binary file not shown. Size: 440 KiB

14
notes.md Normal file
View file

@ -0,0 +1,14 @@
# Getting started
This section contains introductory lessons covering concepts commonly found in qsv and command-line usage. If you're new to qsv or want to follow the exercises in order, continue on to the first lesson.
:::{admonition} Reminder for each exercise
:class: important
If you're running exercises locally then make sure you change your directory (`cd`) into the relevant directory in the `lessons` folder for each exercise. For example for the first exercise you may run this command from the root folder:
```bash
cd lessons/0
```
:::

14
postBuild Normal file
View file

@ -0,0 +1,14 @@
#!/bin/bash
set -ex
pip install bash_kernel
python -m bash_kernel.install
mkdir path_files
export PATH=/home/jovyan/path_files:$PATH
cd path_files
curl -LO https://github.com/jqnatividad/qsv/releases/download/0.128.0/qsv-0.128.0-x86_64-unknown-linux-gnu.zip
unzip qsv-0.128.0-x86_64-unknown-linux-gnu.zip
rm qsv-0.128.0-x86_64-unknown-linux-gnu.zip
cd ..

2
requirements-local.txt Normal file
View file

@ -0,0 +1,2 @@
jupyter-book
jupyterlab