Extensions

Overview

Extensions are small Python functions you register with nexa-gauge to customize behavior the CLI alone can't express. They share a common pattern:

  1. Decorate a function with @register_<thing>("name") in any Python file.
  2. Point the CLI at that file with --extension-file ./my_file.py (repeatable).
  3. Select the registered name at run time (e.g. --transform hotpot_qa).

There is no packaging, no sys.path setup, and no entry-point registration — the file is imported once before iteration so its decorators fire.

Available today:

| Extension | Decorator | Selector flag | Purpose |
| --- | --- | --- | --- |
| Transforms | @register_transform("name") | --transform <name> | Reshape one raw record into nexa-gauge's expected dict shape before scanning. |

Coming later: prompt overrides (@register_prompt), custom rubrics, and more. The --extension-file flag will load all of them; each gets its own selector flag.


Transforms

A transform is a small Python function that reshapes one raw record into the dict shape nexa-gauge expects, before any node sees it. Reach for it when your data (a Hugging Face dataset, exported production logs, or any local JSON) doesn't line up with the canonical fields and a simple column rename isn't enough.

When to reach for a transform

Two tools, two different problems:

| Mismatch | Tool |
| --- | --- |
| Same data, different column name (e.g. text → output) | --field LOGICAL=COLUMN |
| Structural reshape (nested dict, multiple source columns → one field, computed values) | @register_transform + --transform |
| Both | Compose: transform reshapes first, --field renames after. |

Reach for --field whenever it works — it's a one-line flag and covers the common case. Transforms exist for the rest.

A common trigger: hotpotqa/hotpot_qa rows look like this:

json
{
  "input": "...",
  "answer": "...",
  "context": {
    "title": ["Atacama Desert", "Chile"],
    "sentences": [
      ["The Atacama is a desert plateau.", "It spans 1,000 km."],
      ["Chile is in South America.", "Its capital is Santiago."]
    ]
  }
}

context is a nested dict, not a string or list of strings. There is no single column to alias into nexa-gauge's context field. The fix is a 10-line transform that zips title and sentences into a list of paragraphs.
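The zipping step can be sketched as a plain function. This shows only the reshaping logic under one assumed output format ("title: sentences joined by spaces"); the helper name `hotpot_context_to_paragraphs` is hypothetical, and in a real extension file the logic would sit inside a function decorated with @register_transform.

```python
def hotpot_context_to_paragraphs(context: dict) -> list[str]:
    """Zip the parallel `title` and `sentences` lists into one string per paragraph."""
    paragraphs = []
    for title, sentences in zip(context["title"], context["sentences"]):
        # Assumed format: "<title>: <sentence> <sentence> ..."
        paragraphs.append(f"{title}: {' '.join(sentences)}")
    return paragraphs


record = {
    "input": "...",
    "answer": "...",
    "context": {
        "title": ["Atacama Desert", "Chile"],
        "sentences": [
            ["The Atacama is a desert plateau.", "It spans 1,000 km."],
            ["Chile is in South America.", "Its capital is Santiago."],
        ],
    },
}

paragraphs = hotpot_context_to_paragraphs(record["context"])
# paragraphs[0] == "Atacama Desert: The Atacama is a desert plateau. It spans 1,000 km."
# paragraphs[1] == "Chile: Chile is in South America. Its capital is Santiago."
```

The resulting list of strings is the kind of value that can be returned under the context key.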

Write a transform

python
from ng_core import register_transform


@register_transform("my_dataset")
def my_dataset(record: dict) -> dict:
    # Reshape one raw record into nexa-gauge's expected dict shape.
    return {
        "case_id": record["id"],
        "input": record["q"],
        "output": record["a"],
        # ...context, reference as needed
    }
    }

For the full worked example — hotpotqa/hotpot_qa with its nested context field — see Hugging Face datasets → Reshape nested structures with @register_transform.

The contract:

  • Input: one raw record dict (whatever the adapter yields).
  • Output: a dict with any subset of case_id, input, output, context, reference. Other keys are ignored.
  • Pure and thread-safe: no I/O and no shared mutable state. Transforms run in the producer thread, before per-case parallel fan-out.
  • Errors surface as InputParseError with the record index, so they slot into the existing CLI error path.
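Because a transform is a pure function, the contract can be checked locally before a run. The sketch below is one possible smoke test, not part of nexa-gauge: `CANONICAL_KEYS` and `check_transform_output` are hypothetical names, and the transform is defined without its decorator so the snippet runs standalone.

```python
# The five output keys named in the contract above.
CANONICAL_KEYS = {"case_id", "input", "output", "context", "reference"}


def check_transform_output(result: object) -> set:
    """Return the keys that would be ignored; raise if the result is not a dict."""
    if not isinstance(result, dict):
        raise TypeError(f"expected a dict, got {type(result).__name__}")
    return set(result) - CANONICAL_KEYS


def my_dataset(record: dict) -> dict:
    # Same shape as the transform above, plus one extra key for demonstration.
    return {
        "case_id": record["id"],
        "input": record["q"],
        "output": record["a"],
        "notes": "not a canonical field",
    }


sample = {"id": "r1", "q": "What is 2+2?", "a": "4"}
ignored = check_transform_output(my_dataset(sample))
# ignored == {"notes"}
```

Running a check like this on a handful of sample records catches non-dict returns and misspelled keys before they reach the CLI.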

Note: geval and redteam are nexa-gauge metric configs — not dataset data. Don't try to construct them from a transform; configure them at the record level instead. See Data Schema.

Run it

Two flags wire the transform into a run:

bash
nexagauge run <node> \
  --input <source> \
  --extension-file ./my_transforms.py \
  --transform <name>

  • --extension-file points at the Python file(s) to import. Repeatable.
  • --transform selects which registered transform to apply per record.

The same flags work with nexagauge estimate. For the full invocation against hotpot_qa, see Hugging Face datasets.

Compose with --field

Transforms reshape; --field renames. They chain naturally:

code
raw record
    ↓  transform (optional, restructures)
shaped dict
    ↓  --field aliases (optional, renames columns)
scanner-ready dict
    ↓  scan
typed Inputs

Use both together when your transform produces a dict whose keys aren't quite the canonical names yet:

bash
nexagauge run eval \
  --input hf://my-org/dataset \
  --extension-file ./my_transforms.py \
  --transform my_dataset \
  --field input=user_question
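The ordering can be illustrated in plain Python. This sketch assumes that --field LOGICAL=COLUMN moves the value found under COLUMN to the key LOGICAL; whether nexa-gauge keeps or drops the original column is an assumption here, and `apply_field_aliases` and `reshape` are hypothetical names.

```python
def apply_field_aliases(shaped: dict, aliases: dict) -> dict:
    """Move each aliased column to its logical field name.

    `aliases` maps LOGICAL -> COLUMN, mirroring --field LOGICAL=COLUMN.
    """
    renamed = dict(shaped)  # leave the caller's dict untouched
    for logical, column in aliases.items():
        if column in renamed:
            renamed[logical] = renamed.pop(column)
    return renamed


def reshape(record: dict) -> dict:
    # Stand-in transform: flattens a nested payload but keeps a non-canonical key.
    return {"case_id": record["id"], "user_question": record["payload"]["question"]}


raw = {"id": "r1", "payload": {"question": "Where is the Atacama?"}}

# Step 1: the transform restructures. Step 2: the alias renames.
ready = apply_field_aliases(reshape(raw), {"input": "user_question"})
# ready == {"case_id": "r1", "input": "Where is the Atacama?"}
```

Keeping the two steps separate lets the transform stay focused on structure while naming is adjusted per run from the command line.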

Errors

All transform-related failures surface as InputParseError, so they render through the same CLI error path as adapter and scanner failures.

| Situation | Result |
| --- | --- |
| --transform set, name not registered | Exits with InputParseError listing the registered names. |
| --extension-file path does not exist | InputParseError: Extension file not found: <path> |
| Transform raises on a record | InputParseError(record_index=N); halts the run. |
| Transform returns non-dict | InputParseError: Transform '<name>' returned <type> on record <idx>, expected a dict. |

See also