To simplify the process of defining batches, we are considering the following batch configuration design:
batches:
- name: {document_text}
folder: batches/{document_text | normalize_key}
variables:
- id: document_text
path: datasets/proverbs.txt
The author can specify batch variables to load from specific paths such as a text file or spreadsheet in order to loop through different values of an input variable.
If loading your batch variable from a text file, we will skip lines starting with the hash # character so that you easily comment out specific values.
Defining a batch variable was already supported in crosscompute 0.8 but we are moving the batch variable configuration to a separate section called batches
and expanding functionality to support filters such as normalize_key
and spreadsheet paths for batch variable combinations (see example below).
batches:
- name: {author_name} - {document_text}
folder: batches/{author_name}/{document_text}
variables:
- id: author_name
path: datasets/quotes.csv
- id: document_text
path: datasets/quotes.csv
Here is the full configuration for the paint-letters widget example from crosscompute-examples:
---
# version of crosscompute
crosscompute: 0.9.0
# name of your automation
name: Paint Letters
# version of your automation
version: 0.0.1
# input configuration
input:
# input variables
# - id to use when referencing your variable in the template
# - view to use when rendering your variable on the display
# - path where your script loads the variable,
# relative to the input folder
variables:
- id: document_text
view: text
path: document.txt
# output configuration
output:
# output variables
# - id to use when referencing your variable in the template
# - view to use when rendering your variable on the display
# - path where your script saves the variable,
# relative to the output folder
variables:
- id: choropleth
view: image
path: choropleth.svg
# test configuration
# - folder that contains an input subfolder with paths for
# input variables that define a specific test
tests:
- folder: tests/standard
# batch configuration
# - name of the batch; can include variable ids for batch template
# - folder that contains an input subfolder with paths for
# input variables that define a specific batch
# - variable configuration for batch template
- id: { id of the input variable that you want to batch }
path: { path containing different values for this variable }
batches:
- name: {document_text}
folder: batches/{document_text | normalize_key}
variables:
- id: document_text
path: datasets/proverbs.txt
# script configuration
script:
# folder where your script should run
folder: .
# command to use to run your script
command: python -c "$(jupyter nbconvert run.ipynb --to script --stdout)"