tacl script

Subcommands:

usage: tacl [-h] [--version]
            {align,catalogue,counts,diff,excise,highlight,intersect,join-works,lifetime,ngrams,normalise,prepare,query,results,search,split,stats,strip,sdiff,sintersect}
            ...

Analyse the text of corpora in various simple ways.

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

subcommands:
  {align,catalogue,counts,diff,excise,highlight,intersect,join-works,lifetime,ngrams,normalise,prepare,query,results,search,split,stats,strip,sdiff,sintersect}
    align               Show aligned sets of matches between two witnesses
                        side by side.
    catalogue           Generate a catalogue file.
    counts              List counts of n-grams in each labelled witness.
    diff                List n-grams unique to each sub-corpus.
    excise              Remove specified n-grams from specified works'
                        witnesses.
    highlight           Output a witness with specified n-grams visually
                        highlighted.
    intersect           List n-grams common to all sub-corpora.
    join-works          Join multiple TEI XML works into a single new work.
    lifetime            Generate a report on the lifetime of n-grams.
    ngrams              Generate n-grams from a corpus.
    normalise           Create a normalised copy of a corpus.
    prepare             Convert CBETA TEI XML files into an XML form suitable
                        for stripping.
    query               Run a query from a file.
    results             Modify a query results file.
    search              List witnesses containing at least one of the supplied
                        n-grams.
    split               Split an existing work into multiple works.
    stats               Generate summary statistics for a set of results.
    strip               Generate files for use with TACL from a corpus of TEI
                        XML.
    sdiff               List n-grams unique to each results file.
    sintersect          List n-grams common to all results files.
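
The listing above only gives a one-line summary of each subcommand. As a minimal sketch, the following Python helper builds the command line that requests the full help text for a subcommand. It assumes that each subcommand accepts `-h` in the same way the top-level `tacl` command does (the default behaviour of argparse-style subparsers, but not confirmed by the text above). The names `SUBCOMMANDS` and `help_command` are hypothetical and exist only for this illustration.

```python
import shutil
import subprocess

# Subcommand names copied from the usage text above.
SUBCOMMANDS = (
    "align", "catalogue", "counts", "diff", "excise", "highlight",
    "intersect", "join-works", "lifetime", "ngrams", "normalise",
    "prepare", "query", "results", "search", "split", "stats", "strip",
    "sdiff", "sintersect",
)


def help_command(subcommand=None):
    """Return the argv list that asks tacl, or one subcommand, for help."""
    if subcommand is None:
        # Top-level help, as documented in the usage text.
        return ["tacl", "-h"]
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown tacl subcommand: {subcommand!r}")
    # Assumption: every subcommand accepts -h, as argparse subparsers
    # do by default.
    return ["tacl", subcommand, "-h"]


if __name__ == "__main__":
    argv = help_command("ngrams")
    if shutil.which("tacl") is None:
        # tacl is not installed here, so only show what would be run.
        print("tacl is not on PATH; would run:", " ".join(argv))
    else:
        subprocess.run(argv, check=False)
```

Validating the name against the documented list before invoking the script gives a clear error for a mistyped subcommand instead of relying on the script's own usage message.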