Site Generator
A tool for generating static html and gemtext pages from markdown-like source documents.
Source codeBy keeping the functionality simple and the code structure straightforward, adding and removing features becomes an inviting task.
> sitegen help sitegen usage: sitegen make [-psP] [<options>...] <out_dir> [<site_dir>] sitegen index [-plua] [<options>...] sitegen docs sitegen html_template sitegen gmi_template commands: make The main generation tool index Outputs site indeces in various formats docs Print the sitegen documentation in sitegen format html_template Print the default html template gmi_template Print the default gmi template
Building the site
> sitegen make --help sitegen make usage: sitegen make [-psP] [<options>...] <out_dir> [<site_dir>] arguments: out_dir the output folder for all generated content site_dir the input folder for the site itself, defaults to cwd options: --help show this text -p, --private include private content in the build --html <template> render html output with the template --gmi <template> render gmi output with the template -s, --silent don't output filenames as they are rendered -P <path>, --path <path> appended to the PATH when running shell commands
The site generator will generate /html and /gmi directories inside of the given out directories. My publishing process looks something like:
rm -rf out mkdir -p out/html out/gmi cp -r public-html/* out/html/ cp -r public/* out/html/ cp -r public/* out/gmi sitegen make out site && \ scp -r out/html/* username@clarity.flowers:/path/to/http/server &&\ scp -r out/gmi/* username@clarity.flowers:/path/to/gmi/server
Please note that these documents are full-fledged programs. Running the site generator on a document that you haven't vetted yourself is like running a shell script you got off the internet without making sure you know what it does. Use at your own risk.
Indexing
> sitegen index --help sitegen index usage: sitegen index [-plua] [<options>...] options: -p, --private include private content in the list -l <num>, --limit <num> limit the number of entries to the top num -u, --updates only include updates (and not additions) -a, --additions only include additions (and not updates)
Indexing allows you to create lists of files inside of your site, which you can combine with the ability to run arbitrary commands inside of documents to produce in-page tables of contents.
> sitegen index --limit 5 *
Document format
Sitegen documents are UTF8-encoded .txt files. The first few lines of the file are the title and metadata, broken by an empty line, followed by its contents.
Info
The beginning of a document starts with the title and the date it was written, followed by optional metadata information.
Site Generator Written 2021-02-04
You can keep track of the history of updates to the page:
Updated 2021-02-05 Added a bunch of features Updated 2021-02-06 You can have multiple updates now
You can also mark a page as private or unlisted
Private Unlisted
Private pages will only be generated if you include the "--private" option during generation. Unlisted pages will be generated, but won't show up during indexing.
The "using" setting declares what program the file should use as its shell (defaulting to $SHELL).
Using bash
Body
The remaining text comprises the document itself. Documents are raw text with special "blocks" marked by prefixing a line with special characters.
Raw text is copied literally in gemini, and separated into paragraphs by empty lines in html. Newlines are carried-over in the final content, so you'll need an editor that support text wrapping for large blocks of text.
There are two levels of heading available:
# Section heading ## Sub-section heading
Links are written gemini-style.
=> gemini://gemini.circumlunar.space/ About Gemini
Images must include a title (which will be used for the gemini link) and alt text (which will be used in html).
!> a_selfie.png A selfie ~~~ A selfie of me, wearing a red dress.
Links to internal documents need to have a different extension based on the output format. Using "*" as an extension will accomplish this.
=> computers.* Computers
Block quotes are formatted into paragraphs the same way raw text is:
> This line and... > ... this one are all one paragraph within this quote block. > > This is a separate paragraph in the same quote block.
There is only a single style of list, with only one line allowed per entry:
- list - items
Preformatted text is preceded by two spaces:
Here is a code block: echo "hello world"
Sometimes you'll want certain blocks to only be included in certain output formats. You can start the line with the given extension and write raw text that will be literally copied into matching documents.
.html <h1><img src="logo.png" alt="The logo for My Website"/></h1> .gmi # My Website
Lines prefixed with a semicolon are private. You can use them to keep track of personal notes that don't need to be shared, or to comment your documents. Private lines are only included in the final output if you pass the "--private" option during generation.
; This is a private line
Parsing for private lines happens during a pre-parsing step, which means your private lines can include formatting like any other.
; => somewhere_secret.* A private link
You can also run arbitrary commands with your shell and parse the output as formatted text!
: echo "# Hello world : => hello.png Hello world!"
The commands use the value set by the "using" info header, or otherwise default to "$SHELL". This has pretty huge implications as it allows you to write "plugins" in any language that generate data without constraints. The shell is invoked with its working directory as the directory of the document file, and the "$FILE" environment variable is set to the name of the file (without the .txt extension.)
A pattern I find myself using fairly often (including in this document) is to pipe the result of commands into sed in order to format the output as a preformatted block.
: sitegen index --help | sed 's/.*/ &/'
Example
For a complete example of a document, here is the source of the page you are currently reading:
Site Generator Written 2021-02-04 Using bash Updated 2021-02-06 Added a bunch of features Updated 2021-02-16 Blank lines are now handled literally in gemini Updated 2021-02-21 Rewrote to a more conventional documation format. Updated 2021-03-07 Added images Updated 2021-03-09 Added templates Updated 2021-08-06 Support for CRLF line endings, moved to sourcehut A tool for generating static html and gemtext pages from markdown-like source documents. => https://git.sr.ht/~clarity/sitegen Source code By keeping the functionality simple and the code structure straightforward, adding and removing features becomes an inviting task. : echo ' > sitegen help' : echo ' ' : sitegen help | sed 's/.*/ &/' # Building the site : echo ' > sitegen make --help' : echo ' ' : sitegen make --help | sed 's/.*/ &/' The site generator will generate /html and /gmi directories inside of the given out directories. My publishing process looks something like: rm -rf out mkdir -p out/html out/gmi cp -r public-html/* out/html/ cp -r public/* out/html/ cp -r public/* out/gmi sitegen make out site && \ scp -r out/html/* username@clarity.flowers:/path/to/http/server &&\ scp -r out/gmi/* username@clarity.flowers:/path/to/gmi/server Please note that these documents are full-fledged programs. Running the site generator on a document that you haven't vetted yourself is like running a shell script you got off the internet without making sure you know what it does. Use at your own risk. # Indexing : echo ' > sitegen index --help' : echo ' ' : sitegen index --help | sed 's/.*/ &/' Indexing allows you to create lists of files inside of your site, which you can combine with the ability to run arbitrary commands inside of documents to produce in-page tables of contents. : echo ' > sitegen index --limit 5 *' : echo ' ' : sitegen index --limit 5 * | sed 's/.*/ &/' # Document format Sitegen documents are UTF8-encoded .txt files. The first few lines of the file are the title and metadata, broken by an empty line, followed by its contents. ## Info The beginning of a document starts with the title and the date it was written, followed by optional metadata information. Site Generator Written 2021-02-04 You can keep track of the history of updates to the page: Updated 2021-02-05 Added a bunch of features Updated 2021-02-06 You can have multiple updates now You can also mark a page as private or unlisted Private Unlisted Private pages will only be generated if you include the "--private" option during generation. Unlisted pages will be generated, but won't show up during indexing. The "using" setting declares what program the file should use as its shell (defaulting to $SHELL). Using bash ## Body The remaining text comprises the document itself. Documents are raw text with special "blocks" marked by prefixing a line with special characters. Raw text is copied literally in gemini, and separated into paragraphs by empty lines in html. Newlines are carried-over in the final content, so you'll need an editor that support text wrapping for large blocks of text. There are two levels of heading available: # Section heading ## Sub-section heading Links are written gemini-style. => gemini://gemini.circumlunar.space/ About Gemini Images must include a title (which will be used for the gemini link) and alt text (which will be used in html). !> a_selfie.png A selfie ~~~ A selfie of me, wearing a red dress. Links to internal documents need to have a different extension based on the output format. Using "*" as an extension will accomplish this. => computers.* Computers Block quotes are formatted into paragraphs the same way raw text is: > This line and... > ... this one are all one paragraph within this quote block. > > This is a separate paragraph in the same quote block. There is only a single style of list, with only one line allowed per entry: - list - items Preformatted text is preceded by two spaces: Here is a code block: echo "hello world" Sometimes you'll want certain blocks to only be included in certain output formats. You can start the line with the given extension and write raw text that will be literally copied into matching documents. .html <h1><img src="logo.png" alt="The logo for My Website"/></h1> .gmi # My Website Lines prefixed with a semicolon are private. You can use them to keep track of personal notes that don't need to be shared, or to comment your documents. Private lines are only included in the final output if you pass the "--private" option during generation. ; This is a private line Parsing for private lines happens during a pre-parsing step, which means your private lines can include formatting like any other. ; => somewhere_secret.* A private link You can also run arbitrary commands with your shell and parse the output as formatted text! : echo "# Hello world : => hello.png Hello world!" The commands use the value set by the "using" info header, or otherwise default to "$SHELL". This has pretty huge implications as it allows you to write "plugins" in any language that generate data without constraints. The shell is invoked with its working directory as the directory of the document file, and the "$FILE" environment variable is set to the name of the file (without the .txt extension.) A pattern I find myself using fairly often (including in this document) is to pipe the result of commands into sed in order to format the output as a preformatted block. : sitegen index --help | sed 's/.*/ &/' ## Example For a complete example of a document, here is the source of the page you are currently reading: : sed 's/.*/ &/' ${FILE}.txt # Templates Documents are rendered into templates, which allow you to use variable escapes to insert your content and document metadata. ## Variables Variables are surrounded in two curly braces. The title of the document is {{title}} The available variables are: - title: The title you put at the top of your document - file: The name of the output file, without any extension - dir: The name of the file's directory (empty for root-level files) - written: The written date of the file - updated: The most recent updated date of the file, if any - parent: The path to the page one level "higher" than this one (".." for files named "index" and "." for all other files) - parent_name: The title of the parent page either "return home" or "{{dir}} index" - content: The rendered document content ## Date Formatting Dates can be formatted: Written {{written|Weekday, Month D, YYYY}} Written Monday, June 1, 2020 The format options are: - D - the day as one or two digits (1, 3, 15) - DD - the day as two digits (01, 03, 15) - Weekday - the day of the week (Monday, Friday) - Day - the day of the week, short (Mon, Fri) - M - the month as one or two digits (1, 10) - MM - the month as two digits (01, 10) - Month - the month, written out (January, October) - Mon - the month, short (Jan, Oct) - YY - the year as two digits (98, 20) - YYYY - the year as four digits (1998, 2020) Any non-format characters will be copied literally. ## Conditional Rendering You can render content conditionally by including a question mark inside variable escapes. All content inside (including more escapes) will only be rendered if that variable "exists". Some variables (like the title) will always exist, but others (like the updated date, back url, or directory) may need special care. {{parent?<a href="{{parent}}">{{parent_name}}</a>}} Note that {{parent?{{parent}}}} is identical to {{parent}} as empty variables will always be skipped over. ## Content If the location of the content isn't included, the content will be rendered after the template (treating the whole template as a header). You can't include the content more than once. ## Default templates html: : sitegen html_template | sed 's/.*/ &/' gemini: : sitegen gmi_template | sed 's/.*/ &/'
Templates
Documents are rendered into templates, which allow you to use variable escapes to insert your content and document metadata.
Variables
Variables are surrounded in two curly braces.
The title of the document is {{title}}
The available variables are:
- title: The title you put at the top of your document
- file: The name of the output file, without any extension
- dir: The name of the file's directory (empty for root-level files)
- written: The written date of the file
- updated: The most recent updated date of the file, if any
- parent: The path to the page one level "higher" than this one (".." for files named "index" and "." for all other files)
- parent_name: The title of the parent page
either "return home" or "{{dir}} index"
- content: The rendered document content
Date Formatting
Dates can be formatted:
Written {{written|Weekday, Month D, YYYY}} Written Monday, June 1, 2020
The format options are:
- D - the day as one or two digits (1, 3, 15)
- DD - the day as two digits (01, 03, 15)
- Weekday - the day of the week (Monday, Friday)
- Day - the day of the week, short (Mon, Fri)
- M - the month as one or two digits (1, 10)
- MM - the month as two digits (01, 10)
- Month - the month, written out (January, October)
- Mon - the month, short (Jan, Oct)
- YY - the year as two digits (98, 20)
- YYYY - the year as four digits (1998, 2020)
Any non-format characters will be copied literally.
Conditional Rendering
You can render content conditionally by including a question mark inside variable escapes. All content inside (including more escapes) will only be rendered if that variable "exists". Some variables (like the title) will always exist, but others (like the updated date, back url, or directory) may need special care.
{{parent?<a href="{{parent}}">{{parent_name}}</a>}}
Note that
{{parent?{{parent}}}}
is identical to
{{parent}}
as empty variables will always be skipped over.
Content
If the location of the content isn't included, the content will be rendered after the template (treating the whole template as a header). You can't include the content more than once.
Default templates
html:
<!DOCTYPE html> <html> <head> <meta charset="UTF-8"/> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <link rel="stylesheet" type="text/css" href="/style.css" /> <title>{{title}}</title> </head> <body> {{parent?<a href="{{parent}}">{{parent_name}}</a>}} <main> <header> <h1>{{title}}</h1> Written {{written|Month D, YYYY}}{{updated?, last changed {{updated|Month D, YYYY}}}} </header> {{content}} </main> </body> </html>
gemini:
# {{title}} Written {{written|Month D, YYYY}}{{updated?, last changed {{updated|Month D, YYYY}}}}