Choosing the Right Tool: A Decision Framework You Can Use Monday Morning

Putting the module together

You have learned the three tool families (rules, ML, LLMs), the six things LLMs do well, the six ways they fail, and how to decode hype terms. This final concept in Module 1 combines all of that into a decision process you can apply whenever someone proposes a new feature.

Step 1: Write the requirement as an input-output contract

Before choosing a tool, be precise about what the feature actually does. Describe it as an API contract: what goes in, what comes out, what the constraints are.

Good: "Input: a customer email (plain text). Output: a JSON object with fields customer_name, issue_category, and urgency. Constraint: must work for English and Spanish emails."

Bad: "Make the support experience smarter."

If you cannot write the input-output contract, you are not ready to choose a tool. Go back and clarify the requirement.
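One way to make the contract concrete is to write it down as a type plus a validator. The sketch below does this for the "Good" example above. The function name, the class name, and the allowed urgency values are assumptions made for illustration; the original contract only names the three fields.

```python
import json
from dataclasses import dataclass

# Assumed set of urgency levels; the contract above does not enumerate them.
ALLOWED_URGENCY = {"low", "medium", "high"}


@dataclass(frozen=True)
class TicketFields:
    """Output side of the contract: the three fields named in the requirement."""
    customer_name: str
    issue_category: str
    urgency: str


def parse_ticket_fields(raw_json: str) -> TicketFields:
    """Check that a JSON string satisfies the output contract, or raise ValueError."""
    data = json.loads(raw_json)  # invalid JSON raises JSONDecodeError, a ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    values = {}
    for name in ("customer_name", "issue_category", "urgency"):
        value = data.get(name)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty field: {name}")
        values[name] = value.strip()
    if values["urgency"] not in ALLOWED_URGENCY:
        raise ValueError(f"unexpected urgency: {values['urgency']}")
    return TicketFields(**values)
```

If you can write a validator like this, the requirement is precise enough to evaluate tools against. If you cannot decide what the validator should check, that is the signal to go back and clarify.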

Step 2: Can rules handle it?

Rules work when you can define exact conditions, the number of exceptions stays small, and you need strong auditability. Validate an email format? Rules. Block requests from banned IPs? Rules. Calculate tax based on state? Rules.

Rules fail when the input is natural language, when the categories are fuzzy, or when you find yourself writing more than 20 special cases. At that point, you are building a bad ML model by hand.
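The three rule-friendly cases from this step can each be written in a few lines, which is a rough test of whether rules fit. This is a minimal sketch: the email pattern is deliberately simple (not a full RFC-compliant check), the IP addresses come from reserved documentation ranges, and the tax table is an illustrative placeholder rather than real tax logic.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Simplified shape check: something@something.tld, no spaces, exactly one "@".
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Hypothetical blocklist using documentation-only address ranges.
BANNED_IPS = frozenset({"203.0.113.7", "198.51.100.23"})

# Illustrative rates only; real tax rules have many more conditions.
STATE_TAX_RATES = {"CA": Decimal("0.0725"), "TX": Decimal("0.0625")}


def is_valid_email_format(value: str) -> bool:
    """Exact condition: the whole string must match the pattern."""
    return EMAIL_PATTERN.fullmatch(value) is not None


def is_blocked(ip: str) -> bool:
    """Exact condition: set membership."""
    return ip in BANNED_IPS


def sales_tax(amount: Decimal, state: str) -> Decimal:
    """Exact condition: table lookup, then round to cents."""
    rate = STATE_TAX_RATES[state]  # unknown state raises KeyError on purpose
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Each function is auditable: you can point at the exact line that produced a decision. When a function like these starts collecting a long chain of special-case branches, that is the sign described above that rules are no longer the right fit.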

Step 3: Do you have labeled data and strict latency needs?

Classical ML is the right pick when you have thousands of labeled examples, the categories are stable, and you need fast inference (single-digit ms) at high volume. Spam detection with a million labeled emails and p95 under 5ms is a textbook ML use case.

If you do not have labeled data, or the task changes frequently, classical ML is hard to start with. This is the gap LLMs fill: they work on day one without task-specific training data.

Step 4: Is the input open-ended language?

LLMs are the right tool when the input is messy natural language and the output is a structured interpretation or a draft. Summarize a conversation, extract fields from a paragraph, rewrite text in a different tone, draft a response.

The key constraint: if the output must be perfectly correct (legal decisions, financial calculations, medical advice), you need strong safeguards and often a different primary tool with the LLM as an assistant rather than the decision-maker.

Start with the simplest tool that works. Move to the next tool in the order rules, classical ML, LLM only when the simpler option fails.
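Steps 1 through 4 can be read as an ordered checklist, sketched here as a small function. The field names and the 1,000-example threshold are assumptions chosen to stand in for "thousands of labeled examples"; the 20-special-case limit comes from Step 2. Treat it as a conversation aid, not a substitute for judgment.

```python
from dataclasses import dataclass
from enum import Enum


class Tool(Enum):
    CLARIFY = "clarify the requirement first"
    RULES = "rules"
    CLASSICAL_ML = "classical ML"
    LLM = "LLM"


@dataclass(frozen=True)
class Feature:
    has_io_contract: bool            # Step 1
    exact_conditions: bool           # Step 2
    special_cases: int               # Step 2
    labeled_examples: int            # Step 3
    categories_stable: bool          # Step 3
    open_ended_language: bool        # Step 4
    must_be_perfectly_correct: bool  # Step 4 constraint

MAX_SPECIAL_CASES = 20         # from Step 2
MIN_LABELED_EXAMPLES = 1_000   # assumed reading of "thousands"


def choose_tool(f: Feature) -> tuple[Tool, str]:
    """Return the first tool in the order rules, ML, LLM that fits, plus a note."""
    if not f.has_io_contract:
        return Tool.CLARIFY, "write the input-output contract first"
    if f.exact_conditions and f.special_cases <= MAX_SPECIAL_CASES:
        return Tool.RULES, ""
    if f.labeled_examples >= MIN_LABELED_EXAMPLES and f.categories_stable:
        return Tool.CLASSICAL_ML, ""
    if f.open_ended_language:
        if f.must_be_perfectly_correct:
            return Tool.LLM, "add safeguards; keep the LLM as assistant, not decision-maker"
        return Tool.LLM, ""
    return Tool.CLARIFY, "no step matched; revisit the requirement"
```

The order of the checks is the point: the function never reaches the LLM branch if a simpler tool already fits.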

Worked examples

Example 1: "Extract invoice fields from emails." Input: unstructured email text. Output: JSON with invoice_number, amount, currency. The input is messy language, and no labeled data exists yet. Tool: LLM for extraction, with schema validation and human review for high amounts.
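The "schema validation and human review" part of Example 1 is ordinary deterministic code that runs after the LLM call. The sketch below assumes the model returns a JSON string; the function name, the review threshold, and the three-letter currency check are hypothetical choices, and the LLM call itself is left out.

```python
import json
import re
from decimal import Decimal, InvalidOperation

REQUIRED_FIELDS = ("invoice_number", "amount", "currency")
CURRENCY_PATTERN = re.compile(r"[A-Z]{3}")  # assumed: three-letter currency codes
REVIEW_THRESHOLD = Decimal("10000")         # hypothetical; ignores currency differences


def check_invoice(llm_output: str) -> tuple[str, str]:
    """Return (status, reason); status is 'accepted', 'needs_review' or 'rejected'."""
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError:
        return "rejected", "output is not valid JSON"
    if not isinstance(data, dict):
        return "rejected", "output is not a JSON object"
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        return "rejected", "missing fields: " + ", ".join(missing)
    number, currency = data["invoice_number"], data["currency"]
    if not isinstance(number, str) or not number.strip():
        return "rejected", "invoice_number must be a non-empty string"
    if not isinstance(currency, str) or not CURRENCY_PATTERN.fullmatch(currency):
        return "rejected", "currency must be a three-letter code"
    try:
        amount = Decimal(str(data["amount"]))
    except InvalidOperation:
        return "rejected", "amount is not a number"
    if not amount.is_finite() or amount <= 0:
        return "rejected", "amount must be a positive, finite number"
    if amount >= REVIEW_THRESHOLD:
        return "needs_review", "amount at or above review threshold"
    return "accepted", ""
```

The LLM handles the messy language; the validator decides whether its output is allowed into the rest of the system. Rejected outputs can be retried or sent to a person, depending on volume.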

Example 2: "Search our FAQ and answer user questions." Input: user question. Output: answer grounded in FAQ content. Tool: retrieval (vector search) plus an LLM that writes the answer from the retrieved passages. This is retrieval-augmented generation (RAG), covered in Modules 6 and 7.

Example 3: "Validate email format." Input: string. Output: valid or invalid. Tool: regex. Deterministic, free, instant. No LLM needed.

Example 4: "Detect spam in 10 million emails per day." Input: email features. Output: spam/not-spam score. Tool: classical ML. You have millions of labeled examples, need sub-5ms latency, and the cost of an LLM per email at this volume would be thousands of dollars per day.
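The cost claim in Example 4 is simple arithmetic you can redo with your own numbers. In the sketch below, only the email volume comes from the example; the tokens per email and the price per million tokens are placeholder assumptions, not quotes from any provider.

```python
EMAILS_PER_DAY = 10_000_000          # from Example 4
TOKENS_PER_EMAIL = 500               # assumption: prompt plus email plus short answer
PRICE_PER_MILLION_TOKENS_USD = 0.50  # assumption: hypothetical blended price


def daily_llm_cost_usd(emails: int, tokens_per_email: int, price_per_million: float) -> float:
    """Total tokens per day, converted to millions, times the price per million."""
    total_tokens = emails * tokens_per_email
    return total_tokens / 1_000_000 * price_per_million


cost = daily_llm_cost_usd(EMAILS_PER_DAY, TOKENS_PER_EMAIL, PRICE_PER_MILLION_TOKENS_USD)
print(f"Estimated LLM cost: ${cost:,.2f} per day")  # prints: Estimated LLM cost: $2,500.00 per day
```

Under these assumptions the bill lands in the thousands of dollars per day, before considering that a single LLM call typically takes far longer than the sub-5ms budget. Changing the assumptions moves the number, but the structure of the estimate stays the same.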

Example 5: "Block SQL injection." Input: request parameter. Output: allow or block. Tool: parameterized queries as the primary defense, with input validation and WAF rules as additional layers. Never rely on LLM judgment for security gates. The LLM can be tricked by adversarial input; a parameterized query treats the input as data no matter what it contains.
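A parameterized query is the deterministic core of Example 5. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table and function names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user(conn: sqlite3.Connection, username: str) -> list:
    """Look up a user. The '?' placeholder passes username as data, never as SQL text."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cursor.fetchall()


conn = make_demo_db()
print(find_user(conn, "alice"))        # prints: [(1, 'alice')]
print(find_user(conn, "' OR '1'='1"))  # prints: []
```

The classic injection string matches nothing because the database compares it literally against the username column. No judgment call is involved, so there is nothing for an attacker to persuade.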

The one rule that matters

If a mistake is expensive or dangerous, prefer predictable, testable approaches (rules, classical ML with measured error rates) and add humans. If a mistake is tolerable and the output is a draft, an LLM can be a strong accelerator. That one rule settles most "should we use AI?" conversations.

What comes next: how the engine works

You now have a map of the AI landscape. You know the tools, the capabilities, the failure modes, the hype, and a decision process.

Module 2 takes you inside the LLM engine. Not to make you a researcher, but to give you the vocabulary every later module depends on: what tokens are (and why they matter for billing), how the model generates text one token at a time, what temperature and sampling controls do, and what the context window means for your product design. That vocabulary is what separates a developer who uses AI tools from one who builds reliable AI features.
