This lab is a mini project that runs a smolagents agent and scores the quality of its answers with an answer checker (evaluator).
flowchart TD
    A[User question set] --> B[Run CodeAgent]
    B --> C[Generate predicted answers]
    C --> D[Evaluator scoring]
    D --> E{Score >= threshold?}
    E -- Yes --> F[Save PASS log]
    E -- No --> G[Record failure cause, then adjust prompt/tools]
Lab objectives
- Run a smolagents CodeAgent.
- Implement automatic scoring based on a test set (question/answer pairs).
- Identify the cause of failure when the score falls below the PASS threshold (e.g., 80 points).
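The PASS rule in the objectives can be sketched in a few lines. This is a minimal illustration, assuming exact-match grading where each question scores 0 or 100; the helper names `average_score` and `is_pass` are hypothetical and do not appear in the lab script.

```python
from __future__ import annotations


def average_score(scores: list[int]) -> float:
    """Mean of the per-question scores (each 0 or 100 under exact-match grading)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)


def is_pass(scores: list[int], threshold: float = 80.0) -> bool:
    """True when the average meets or exceeds the PASS threshold."""
    return average_score(scores) >= threshold


if __name__ == "__main__":
    print(is_pass([100, 100, 100]))  # all three questions correct
    print(is_pass([100, 100, 0]))    # one miss out of three questions
```

With only three questions, a single miss drops the average to about 66.7, so an 80-point threshold effectively requires every answer to be correct. A larger test set makes the threshold more forgiving.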
1) Preparation
Step 1-1. Create a working folder
- Tool: terminal
- Input: none
- Command:
    mkdir -p ~/hf-agents-day34 && cd ~/hf-agents-day34
- Success criterion: the output of pwd contains hf-agents-day34
Step 1-2. Virtual environment + package installation
- Tool: Python 3.10+
- Input: none
- Command:
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -U smolagents
- Success criterion: python -c "import smolagents; print('OK')" prints OK
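Beyond the import check, it can help to record which version was installed, since class names in smolagents have changed between releases. The following is a small sketch using only the standard library; `installed_version` is a hypothetical helper name.

```python
from __future__ import annotations

from importlib import metadata


def installed_version(package: str) -> str | None:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("smolagents")
    print(f"smolagents version: {version or 'not installed'}")
```

Run it inside the activated virtual environment. If it reports "not installed", the pip command above probably ran in a different environment.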
Step 1-3. Set the HF token
- Tool: environment variable
- Input: Hugging Face Access Token
- Command:
    export HF_TOKEN="hf_xxx"
- Success criterion: echo ${HF_TOKEN:+set} prints set
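The same check can be done from Python without printing the token value. This sketch mirrors the shell expansion `${HF_TOKEN:+set}`, which yields `set` only when the variable is defined and non-empty; `token_status` is a hypothetical helper name.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def token_status(env: Mapping[str, str] | None = None) -> str:
    """Report whether HF_TOKEN is defined and non-empty, without revealing it."""
    source = os.environ if env is None else env
    return "set" if source.get("HF_TOKEN") else "missing"


if __name__ == "__main__":
    print(token_status())
```

Passing a plain dictionary instead of the real environment keeps the function easy to test.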
2) Write the mini project code
Step 2-1. Save day34_eval_loop.py
- Tool: file editor
- Input: the code below
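    import json
    from datetime import datetime, timezone
    from pathlib import Path

    from smolagents import CodeAgent

    # Assumption: newer smolagents releases expose InferenceClientModel in place
    # of HfApiModel. Fall back to the older name when the new one is unavailable.
    try:
        from smolagents import InferenceClientModel as HfApiModel
    except ImportError:
        from smolagents import HfApiModel

    # Test set: each item has a question ("q") and its gold answer ("a").
    DATASET = [
        {"q": "What is the capital of South Korea?", "a": "Seoul"},
        {"q": "What is 2+2?", "a": "4"},
        {"q": "What is the file extension for Python files?", "a": ".py"},
    ]

    RESULT_PATH = Path("eval_result.json")
    PASS_THRESHOLD = 80  # minimum average score required for PASS


    def build_agent() -> CodeAgent:
        """Create a CodeAgent with no extra tools and a small step budget."""
        model = HfApiModel("Qwen/Qwen2.5-72B-Instruct")
        return CodeAgent(
            tools=[],
            model=model,
            max_steps=4,
            additional_authorized_imports=["json"],
        )


    def ask(agent: CodeAgent, question: str) -> str:
        """Send one question to the agent and return its answer as stripped text."""
        prompt = f"""
    You are an assistant that answers briefly and accurately.
    Question: {question}
    Output only the answer on a single line, with no explanation.
    """
        out = agent.run(prompt)
        return str(out).strip()


    def score(pred: str, gold: str) -> int:
        """Exact match after trimming whitespace and lowercasing: 100 or 0."""
        return 100 if gold.strip().lower() == pred.strip().lower() else 0


    if __name__ == "__main__":
        agent = build_agent()
        rows = []
        total = 0
        for item in DATASET:
            pred = ask(agent, item["q"])
            s = score(pred, item["a"])
            total += s
            rows.append({
                "question": item["q"],
                "gold": item["a"],
                "pred": pred,
                "score": s,
            })

        avg = total / len(DATASET)
        passed = avg >= PASS_THRESHOLD
        result = {
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated in Python 3.12.
            "checked_at": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
            "average_score": avg,
            "pass": passed,
            "rows": rows,
        }
        RESULT_PATH.write_text(json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8")
        print("saved: eval_result.json")

        # Exit with a non-zero status so the failure is visible to the caller.
        if not passed:
            raise SystemExit(f"eval_failed:average_score={avg}")
- Success criterion: python -m py_compile day34_eval_loop.py exits without errors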
3) Run
Step 3-1. Run the evaluation loop
- Tool: Python
- Input: the default DATASET (3 items)
- Command:
    python day34_eval_loop.py
    cat eval_result.json
- Success criteria:
    - saved: eval_result.json is printed
    - average_score appears in the output file
    - the run passes if pass is true
Step 3-2. Failure-induction test
- Tool: code editor + Python
- Input: deliberately make one gold answer in DATASET wrong (e.g., Seoul→Busan)
- Command:
    python day34_eval_loop.py
- Success criteria:
    - the run exits with eval_failed:average_score=...
    - eval_result.json records which item scored 0
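To see which items scored 0 without reading the JSON by hand, a short script can filter the result file. This sketch assumes the structure written by day34_eval_loop.py (a top-level `rows` list whose entries carry `question`, `gold`, `pred`, and `score`); `failed_rows` is a hypothetical helper name.

```python
from __future__ import annotations

import json
from pathlib import Path


def failed_rows(result: dict) -> list[dict]:
    """Return the rows whose score is 0 (missing scores count as failures)."""
    return [row for row in result.get("rows", []) if row.get("score", 0) == 0]


if __name__ == "__main__":
    path = Path("eval_result.json")
    if path.exists():
        result = json.loads(path.read_text(encoding="utf-8"))
        for row in failed_rows(result):
            print(f"MISS q={row['question']!r} gold={row['gold']!r} pred={row['pred']!r}")
    else:
        print("eval_result.json not found; run day34_eval_loop.py first")
```

Printing the gold answer next to the prediction makes it easier to tell a genuinely wrong answer from a formatting mismatch.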
Troubleshooting (3 or more)
401 Unauthorized or model call failure
- Cause: HF_TOKEN is not set, or the token lacks the required permissions
- Fix: run echo ${HF_TOKEN:+set}; if it does not print set, set the token again and rerun
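ModuleNotFoundError: smolagents
- Cause: the virtual environment is not activated, or the package was not installed
- Fix:
    source .venv/bin/activate
    pip install -U smolagents
eval_failed:average_score=...
- Cause: the prompt is verbose, so the response does not match the expected answer format
- Fix: restrict the prompt more strictly to "output only the answer"; if needed, strengthen the post-processing (strip, lowercasing) in the score function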
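One way to strengthen the post-processing in `score` is to normalize both strings before comparing them. The following is a sketch under assumptions, not the lab's confirmed implementation: the `normalize` helper and its specific rules (Unicode NFKC folding, whitespace collapsing, removal of surrounding quotes and a trailing period) are illustrative choices.

```python
import unicodedata


def normalize(text: object) -> str:
    """Normalize an answer for comparison (illustrative rules, adjust as needed)."""
    cleaned = unicodedata.normalize("NFKC", str(text))  # fold full-width forms
    cleaned = " ".join(cleaned.split())                 # collapse whitespace and newlines
    cleaned = cleaned.strip("\"'")                      # drop surrounding quotes
    cleaned = cleaned.rstrip(".")                       # drop a trailing period only
    return cleaned.strip().lower()


def score(pred: object, gold: object) -> int:
    """Exact match on normalized text: 100 or 0."""
    return 100 if normalize(pred) == normalize(gold) else 0


if __name__ == "__main__":
    print(score(" Seoul.\n", "seoul"))
    print(score('".py"', ".py"))
    print(score("4", "5"))
```

Only the trailing period is removed, so a gold answer such as `.py` keeps its leading dot. Each added rule makes the grader more lenient, so it is worth confirming that it does not start accepting answers that are actually wrong.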
Response mixes in code blocks or explanations
- Cause: unstable model output format
- Fix: add normalization in ask() that removes backticks
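The backtick cleanup could look like the sketch below, which `ask()` might call on the agent output before returning it. The helper name `strip_code_markup` and the regular expression are assumptions for illustration; the sketch unwraps a multi-line code fence if the whole response is one, then removes any remaining backticks.

```python
import re

# Matches a response that is entirely one fenced block:
# three backticks, optional language tag, newline, body, three backticks.
_FENCE = re.compile(r"^`{3}[\w+-]*[ \t]*\n(.*?)\n?`{3}$", re.DOTALL)


def strip_code_markup(text: object) -> str:
    """Remove a wrapping code fence and any leftover backticks from a response."""
    cleaned = str(text).strip()
    match = _FENCE.match(cleaned)
    if match:
        cleaned = match.group(1).strip()
    # Assumption: expected answers never contain literal backticks.
    return cleaned.replace("`", "").strip()


if __name__ == "__main__":
    fenced = "`" * 3 + "python\n.py\n" + "`" * 3
    print(strip_code_markup(fenced))
    print(strip_code_markup("`Seoul`"))
```

This does not remove explanatory sentences. If the model keeps adding prose around the answer, tightening the prompt is likely the more reliable fix.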
Checklist
- venv created and smolagents installed
- day34_eval_loop.py passes the syntax check
- eval_result.json is generated
- Average score and PASS/FAIL threshold checked
- Error path verified with the failure-induction test
Reference links (in priority order)
- https://github.com/huggingface/agents-course
- https://huggingface.co/learn/agents-course
- https://huggingface.co/docs/smolagents
Generative AI usage notice
This document was drafted with the help of generative AI; the lab procedure (tools/inputs/commands/success criteria) and the code were reviewed and finalized by a human.