
Module redlite.metric.livecodebench

Classes

LiveCodeBenchMetric

class LiveCodeBenchMetric(
    endpoint: str = 'http://localhost:8000'
)

Metric to score Python code generation against input/output tests.

This metric is specific to the LiveCodeBench benchmark and interacts with a local server that runs the generated code against the test cases.

The server is a Docker image built from the redlite-livecodebench-grader GitHub repository.

After building the Docker image, run it like this:

docker run -it -p 8000:80 ilabs/redlite-livecodebench-grader:latest

  • endpoint (str, optional): URL of the server running the grading service. Default is http://localhost:8000.
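Because the metric depends on the grading server being up, it can help to confirm that something is listening at the endpoint before starting a run. The following is a minimal sketch using only the Python standard library. The helper names `grader_address` and `grader_is_reachable` are hypothetical and not part of redlite, and the check only opens a TCP connection. It does not call any route of the grading service, whose HTTP API is not described here.

```python
import socket
from urllib.parse import urlsplit


def grader_address(endpoint: str = "http://localhost:8000") -> tuple[str, int]:
    """Split an endpoint URL into the (host, port) pair to connect to.

    Hypothetical helper for illustration; not part of redlite.
    """
    parts = urlsplit(endpoint)
    if parts.hostname is None:
        raise ValueError(f"endpoint has no host: {endpoint!r}")
    # Fall back to the scheme's default port when the URL omits one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def grader_is_reachable(
    endpoint: str = "http://localhost:8000", timeout: float = 2.0
) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds.

    This only shows that some process accepts connections there. It does not
    verify that the process is the grading service.
    """
    try:
        with socket.create_connection(grader_address(endpoint), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

If `grader_is_reachable()` returns True, constructing the metric with the matching URL, for example `LiveCodeBenchMetric(endpoint="http://localhost:8000")`, should point it at the container started with the `docker run` command above, which publishes container port 80 on host port 8000.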

Ancestors (in MRO)

  • redlite._core.NamedMetric