This lab is a mini project that runs a smolagents agent and scores the quality of its answers with an answer checker (evaluator).

flowchart TD
  A[User question set] --> B[Run CodeAgent]
  B --> C[Generate predicted answer]
  C --> D[Evaluator scoring]
  D --> E{Score >= threshold?}
  E -- Yes --> F[Save PASS log]
  E -- No --> G[Record failure cause, then adjust prompt/tools]

Lab goals

  1. Run a smolagents CodeAgent.
  2. Implement automatic scoring based on a test set (question/answer pairs).
  3. When the average falls below the PASS threshold (e.g. 80 points), identify the cause of the failure.
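
The three goals can be tried offline before any model is involved. The following is a minimal sketch, not the lab script itself: `evaluate`, `fake_agent`, and the two-item dataset are hypothetical names introduced for illustration, and exact-match scoring against a threshold of 80 is assumed from goal 3.

```python
from typing import Callable


def evaluate(answer_fn: Callable[[str], str], dataset: list[dict], threshold: float = 80.0) -> dict:
    """Score each item 100 (exact match) or 0, then compare the average to the threshold."""
    rows = []
    for item in dataset:
        pred = answer_fn(item["q"]).strip()
        s = 100 if pred.lower() == item["a"].strip().lower() else 0
        rows.append({"question": item["q"], "gold": item["a"], "pred": pred, "score": s})
    average = sum(r["score"] for r in rows) / len(rows)
    return {"average_score": average, "pass": average >= threshold, "rows": rows}


def fake_agent(question: str) -> str:
    """Hypothetical stand-in for a real agent: a canned lookup with one wrong answer."""
    canned = {
        "What is 2+2?": "4",
        "What is the capital of South Korea?": "Busan",  # deliberately wrong
    }
    return canned.get(question, "")


report = evaluate(fake_agent, [
    {"q": "What is 2+2?", "a": "4"},
    {"q": "What is the capital of South Korea?", "a": "Seoul"},
])
print(report["average_score"], report["pass"])  # 50.0 False
```

Passing the answer function in as a parameter is a design choice of this sketch: the same loop can later be pointed at a real agent without touching the scoring logic.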

1) Preparation

Step 1-1. Create the working folder

  • Tool: terminal
  • Input: none
  • Command:
mkdir -p ~/hf-agents-day34 && cd ~/hf-agents-day34
  • Success check:
pwd

The output contains hf-agents-day34

Step 1-2. Virtual environment + package installation

  • Tool: Python 3.10+
  • Input: none
  • Command:
python3 -m venv .venv
source .venv/bin/activate
pip install -U smolagents
  • Success check:
python -c "import smolagents; print('OK')"

Prints OK
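
Because `pip install -U` pulls whatever release is newest, it can help to record which version actually landed in the virtual environment. The sketch below uses the standard-library `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("smolagents:", installed_version("smolagents") or "not installed in this environment")
```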

Step 1-3. Set the HF token

  • Tool: environment variable
  • Input: Hugging Face Access Token (replace the hf_xxx placeholder with your own token)
  • Command:
export HF_TOKEN="hf_xxx"
  • Success check:
echo ${HF_TOKEN:+set}

Prints set
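
The same check can be done from Python without ever printing the token. This is a sketch under two assumptions: the helper name `token_status` is hypothetical, and the `hf_` prefix test is only a heuristic based on the placeholder shown above, not a validity check against the Hub.

```python
import os


def token_status(env=os.environ) -> str:
    """Classify HF_TOKEN as 'missing', 'unexpected-format', or 'set' without revealing it."""
    token = env.get("HF_TOKEN", "")
    if not token:
        return "missing"
    if not token.startswith("hf_"):  # heuristic only; assumed token prefix
        return "unexpected-format"
    return "set"


print(token_status())
```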


2) Writing the mini project code

Step 2-1. Save day34_eval_loop.py

  • Tool: file editor
  • Input: the code below
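import json
from datetime import datetime, timezone
from pathlib import Path

# Newer smolagents releases renamed HfApiModel to InferenceClientModel;
# fall back to the old name on older installs.
try:
    from smolagents import CodeAgent, InferenceClientModel
except ImportError:
    from smolagents import CodeAgent, HfApiModel as InferenceClientModel

# Small question/answer test set used for exact-match scoring.
DATASET = [
    {"q": "What is the capital of South Korea?", "a": "Seoul"},
    {"q": "What is 2+2?", "a": "4"},
    {"q": "What is the file extension for Python files?", "a": ".py"},
]

RESULT_PATH = Path("eval_result.json")
PASS_THRESHOLD = 80  # minimum average score for a PASS


def build_agent() -> CodeAgent:
    # Authentication relies on the HF_TOKEN environment variable from Step 1-3.
    model = InferenceClientModel("Qwen/Qwen2.5-72B-Instruct")
    return CodeAgent(
        tools=[],
        model=model,
        max_steps=4,
        additional_authorized_imports=["json"],
    )


def ask(agent: CodeAgent, question: str) -> str:
    prompt = f"""
You are an assistant that answers briefly and accurately.
Question: {question}
Output only the answer on a single line, with no explanation.
"""
    out = agent.run(prompt)
    # The agent may return a non-string (e.g. the int 4), so coerce before stripping.
    return str(out).strip()


def score(pred: str, gold: str) -> int:
    # Exact match after trimming whitespace and lowercasing: 100 or 0.
    return 100 if gold.strip().lower() == pred.strip().lower() else 0


if __name__ == "__main__":
    agent = build_agent()

    rows = []
    total = 0
    for item in DATASET:
        pred = ask(agent, item["q"])
        s = score(pred, item["a"])
        total += s
        rows.append({
            "question": item["q"],
            "gold": item["a"],
            "pred": pred,
            "score": s,
        })

    avg = total / len(DATASET)
    passed = avg >= PASS_THRESHOLD

    result = {
        # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
        "checked_at": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
        "average_score": avg,
        "pass": passed,
        "rows": rows,
    }

    RESULT_PATH.write_text(json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8")
    print(f"saved: {RESULT_PATH}")

    # A non-zero exit makes the failure visible to shells and CI.
    if not passed:
        raise SystemExit(f"eval_failed:average_score={avg}")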
  • Success check:
python -m py_compile day34_eval_loop.py

Exits without errors


3) Run

Step 3-1. Run the evaluation loop

  • Tool: Python
  • Input: the default DATASET of 3 items
  • Command:
python day34_eval_loop.py
cat eval_result.json
  • Success check:
    • saved: eval_result.json is printed
    • average_score is present in the file
    • the run passes if pass is true

Step 3-2. Forced-failure test

  • Tool: code editor + Python
  • Input: deliberately change one gold answer in DATASET to a wrong value (e.g. replace the capital's answer with Busan)
  • Command:
python day34_eval_loop.py
  • Success check:
    • the script exits with eval_failed:average_score=...
    • eval_result.json records which item scored 0
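
To see which items failed without reading the JSON by eye, the result file can be filtered on the `score` field. The sketch below assumes the file layout written by day34_eval_loop.py (`rows` with `question`, `gold`, `pred`, `score`); `failed_rows`, `load_result`, and the inline `sample` are hypothetical names and data used only for illustration.

```python
import json
from pathlib import Path


def load_result(path: str = "eval_result.json") -> dict:
    """Read the result file written by the evaluation script."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def failed_rows(result: dict) -> list[dict]:
    """Return only the rows that scored 0."""
    return [row for row in result["rows"] if row["score"] == 0]


# Illustrative data shaped like eval_result.json, not real output.
sample = {
    "average_score": 50.0,
    "pass": False,
    "rows": [
        {"question": "What is 2+2?", "gold": "4", "pred": "4", "score": 100},
        {"question": "What is the capital of South Korea?", "gold": "Busan", "pred": "Seoul", "score": 0},
    ],
}

# Use the real file when it exists, otherwise fall back to the sample.
result = load_result() if Path("eval_result.json").exists() else sample
for row in failed_rows(result):
    print(f"FAIL: {row['question']!r} expected {row['gold']!r}, got {row['pred']!r}")
```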

Troubleshooting (3 or more)

  1. 401 Unauthorized or model call failure
  • Cause: HF_TOKEN is not set, or the token lacks permission
  • Fix:
echo ${HF_TOKEN:+set}

If set is not printed, set the token again and rerun

  2. ModuleNotFoundError: smolagents
  • Cause: the virtual environment is not activated, or the package was never installed
  • Fix:
source .venv/bin/activate
pip install -U smolagents
  3. eval_failed:average_score=...
  • Cause: the prompt is too loose or wordy, so the model's answer does not match the expected format exactly
  • Fix: constrain the prompt more strictly to output only the answer; if needed, strengthen the post-processing in the score function (strip, lowercasing)
  4. Code blocks or explanations mixed into the response
  • Cause: unstable model output format
  • Fix: add normalization in ask() that removes backticks
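
The backtick-removal fix from item 4 could look roughly like the following. This is a sketch, not the lab's code: `normalize` and `lenient_score` are hypothetical helpers, and the rule (drop fence markers and stray backticks, then trim and lowercase) is one reasonable reading of "normalization". It does not rescue answers wrapped in explanatory sentences; those still need a stricter prompt.

```python
import re


def normalize(text: str) -> str:
    """Remove code-fence markers and backticks, then trim and lowercase."""
    # Drop an opening fence together with its optional language tag, e.g. ```text\n
    text = re.sub(r"```[A-Za-z0-9_+-]*\n", "", text)
    # Drop any remaining backticks (closing fences, inline code marks).
    text = text.replace("`", "")
    return text.strip().lower()


def lenient_score(pred: str, gold: str) -> int:
    """Exact match on normalized strings: 100 or 0."""
    return 100 if normalize(pred) == normalize(gold) else 0


print(lenient_score("```text\nSeoul\n```", "Seoul"))  # 100
print(lenient_score("`.py`", ".py"))                  # 100
print(lenient_score("The answer is Seoul", "Seoul"))  # 0
```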

Checklist

  • venv created and smolagents installed
  • day34_eval_loop.py passes the syntax check
  • eval_result.json is generated
  • average score and PASS/FAIL threshold verified
  • error path verified with the forced-failure test

Reference links (in priority order)

  1. https://github.com/huggingface/agents-course
  2. https://huggingface.co/learn/agents-course
  3. https://huggingface.co/docs/smolagents

μƒμ„±ν˜• AI ν™œμš© κ³ μ§€

이 λ¬Έμ„œλŠ” μƒμ„±ν˜• AIλ₯Ό ν™œμš©ν•΄ μ΄ˆμ•ˆμ„ μž‘μ„±ν–ˆκ³ , μ‹€μŠ΅ 절차(도ꡬ/μž…λ ₯/μ‹€ν–‰λͺ…λ Ή/μ„±κ³΅νŒμ •)와 μ½”λ“œλŠ” μ‚¬λžŒμ΄ κ²€ν† ν•΄ ν™•μ •ν–ˆλ‹€.