Stage Inference Example with Python SDK

This example demonstrates how to use the Python SDK to upload a dataset to a stage, submit a batch inference job, and download the results.

```python
from materialized_intelligence.sdk import MaterializedIntelligence

# instantiate the SDK client
mi = MaterializedIntelligence()
mi.set_api_key("your_api_key")  # you can skip this if you set your key via `mi login`

# create a stage
stage_id = mi.create_stage("your_stage_id")
print(f"Created stage with ID: {stage_id}")

# Upload some files to the stage. In this case, the two files are in the
# same directory as this script. Both share the same schema:
# {
#     "id": int64,
#     "prompt": string,
# }
mi.upload_to_stage(stage_id, ["file1.parquet", "file2.parquet"])

system_prompt = "Label the following text as positive or negative."

json_schema = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["positive", "negative"]}
    },
    "required": ["label"]
}

# submit a batch inference job
mi.infer(
    stage_id,
    column="prompt",
    model="llama-3.2-3b",
    system_prompt=system_prompt,
    json_schema=json_schema,
    job_priority=1,
)
```
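Because the job was submitted with a JSON schema, each inference result should be an object whose `label` is restricted to the enum above. If you want to sanity-check a parsed result yourself, a hand-rolled sketch (not an SDK feature) might look like:

```python
# Mirrors the json_schema above: "label" is required and must be in the enum.
ALLOWED_LABELS = {"positive", "negative"}

def check_result(result: dict) -> bool:
    label = result.get("label")
    return isinstance(label, str) and label in ALLOWED_LABELS
```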

You can either poll for the status of the job from your code, or check in periodically with `mi jobs status <job_id>` in the CLI.
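If you prefer polling from code, a minimal generic polling loop can be sketched as follows. The `check_status` callable and the `"SUCCEEDED"`/`"FAILED"` status strings here are assumptions for illustration, not part of the SDK:

```python
import time

def poll_until_done(check_status, interval_s=10.0, timeout_s=3600.0,
                    terminal_states=("SUCCEEDED", "FAILED")):
    """Call check_status() every interval_s seconds until it returns a
    terminal state, or raise TimeoutError after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = check_status()
        if status in terminal_states:
            return status
        time.sleep(interval_s)
    raise TimeoutError("job did not finish within the timeout")

# usage, assuming a hypothetical status call on the client:
# final_status = poll_until_done(lambda: mi.get_job_status(job_id))
```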

Once the job is done, you can download the results:

```python
from materialized_intelligence.sdk import MaterializedIntelligence

mi = MaterializedIntelligence()
stage_id = "your_stage_id"

# Download the results into the `output_dir` directory; the output files
# are named the same as the files in the stage.
mi.download_from_stage(stage_id, output_path="output_dir")
```

You can then inspect the inference results:

```python
import polars as pl

df = pl.read_parquet("output_dir/*.parquet")

for row in df.iter_rows(named=True):
    print(f"prompt: {row['prompt']}, label: {row['inference_result']['label']}")
    break
```
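To summarize labels across every row instead of printing just the first one, a stdlib sketch over rows shaped like the output above (the sample rows here are synthetic):

```python
from collections import Counter

# synthetic rows in the same shape that df.iter_rows(named=True) yields
rows = [
    {"prompt": "great product", "inference_result": {"label": "positive"}},
    {"prompt": "arrived broken", "inference_result": {"label": "negative"}},
    {"prompt": "love it", "inference_result": {"label": "positive"}},
]

label_counts = Counter(row["inference_result"]["label"] for row in rows)
print(label_counts)  # Counter({'positive': 2, 'negative': 1})
```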