Module redlite.metric.livecodebench
Classes
LiveCodeBenchMetric
class LiveCodeBenchMetric(
endpoint: str = 'http://localhost:8000'
)
Metric to score Python code generation against input/output tests.
This metric is specific to the LiveCodeBench benchmark and interacts with a local server that runs the generated code against the test cases.
The server is a Docker image built from the redlite-livecodebench-grader GitHub repository.
After building the image, run it with host port 8000 mapped to container port 80:
docker run -it -p 8000:80 ilabs/redlite-livecodebench-grader:latest
- endpoint (str, optional): URL of server running the grading service. Default is http://localhost:8000.
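Because the metric depends on the grading server being up, it can help to confirm that the endpoint accepts connections before starting a run. The sketch below is not part of redlite: `is_endpoint_reachable` is a hypothetical helper that uses only the Python standard library and checks TCP connectivity, not whether the grader itself is healthy.

```python
import socket
from urllib.parse import urlparse


def is_endpoint_reachable(endpoint: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds.

    Hypothetical helper for illustration; it is not provided by redlite.
    """
    parsed = urlparse(endpoint)
    host = parsed.hostname or "localhost"
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        # Open and immediately close a connection; no HTTP request is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

If the check returns True, the metric can be constructed as documented above, for example `LiveCodeBenchMetric(endpoint="http://localhost:8000")`; if it returns False, the container is probably not running or the port mapping differs from the default.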
Ancestors (in MRO)
- redlite._core.NamedMetric