LangChain in Action: How to Build Intelligent AI Applications Easily and Efficiently?
LangChain - Elements

ChatOpenAI
🌟 EDUCATIONAL EXAMPLE 🌟
📌 This is a minimal and short working example for educational purposes.
⚠️ Not optimized for production!
📦 Versions used:
- "@langchain/core": "^0.3.38"
- "@langchain/openai": "^0.4.2"
🔄 Note: LangChain is transitioning from a monolithic structure to a modular package structure. Ensure compatibility with future updates.
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  temperature: 0.7,           // randomness / creativity of the output
  modelName: "gpt-3.5-turbo", // which OpenAI chat model to use
  openAIApiKey: process.env.OPENAI_API_KEY, // read the API key from the environment
  topP: 1,                    // nucleus sampling threshold
  frequencyPenalty: 0,        // penalize tokens that repeat frequently
  presencePenalty: 0,         // penalize tokens that have already appeared
  maxTokens: 500,             // upper bound on the length of the response
});
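Once the model is configured, it can be called like any other LangChain chat model. Here is a minimal sketch (assuming the packages listed above are installed and OPENAI_API_KEY is set in the environment; the prompt and the printed reply are purely illustrative):

// Send a single prompt to the configured model and print the reply.
const response = await model.invoke("Where is the Eiffel Tower located?");
console.log(response.content); // e.g. "The Eiffel Tower is located in Paris, France."

invoke also accepts an array of messages (for example SystemMessage and HumanMessage from @langchain/core/messages) when you need more control over the conversation.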
Temperature controls the randomness or creativity of the model's output. It affects how likely the model is to choose less probable words during generation.
- 0.0 → 🔒 Most deterministic (least creative)
- ~0.7 → ✨ Balanced — creative but still relevant
- >1.0 → 🎲 Very random and surprising
- >1.5 → 🤪 Risk of incoherent or nonsensical output
After the model predicts token probabilities (via softmax), temperature modifies them like this:
- Lower temperature → sharpens the distribution (model favors high-probability tokens)
- Higher temperature → flattens it (more tokens have similar chances)
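To make this concrete, here is a small TypeScript sketch with made-up scores for three candidate tokens (not the model's real logits): dividing each score by the temperature before applying softmax sharpens the distribution at low values and flattens it at high values.

// Toy scores ("logits") for three candidate continuations: "Paris", "France", "Europe".
const logits = [2.0, 1.0, 0.1];

// Softmax with temperature: divide each score by T before normalizing.
// (T = 0 is treated as greedy "always pick the top token" in real APIs; avoid dividing by zero here.)
function softmaxWithTemperature(scores: number[], temperature: number): number[] {
  const exps = scores.map((s) => Math.exp(s / temperature));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

console.log(softmaxWithTemperature(logits, 0.2)); // ≈ [0.993, 0.007, 0.000] → almost always "Paris"
console.log(softmaxWithTemperature(logits, 1.0)); // ≈ [0.659, 0.242, 0.099] → usually "Paris"
console.log(softmaxWithTemperature(logits, 1.5)); // ≈ [0.557, 0.286, 0.157] → alternatives get a real chance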
Examples
- temperature: 0.0 - “The Eiffel Tower is located in...” → Always completes with “Paris”
- temperature: 1.0 - “The Eiffel Tower is located in...” → Might complete with “Paris”, “France”, or even “a romantic part of Europe”
- temperature: 1.5 - “The Eiffel Tower is located in...” → Could say “a dream of ancient astronauts” — more artistic, but less accurate
| Use Case | Suggested Temperature |
|---|---|
| Factual Q&A | 0 – 0.3 |
| Chat assistants | 0.5 – 0.8 |
| Storytelling / poetry | 0.8 – 1.2 |
| Brainstorming | 1.0 – 1.5 |
What does topP do?
topP is a sampling parameter used to control the randomness and creativity of the model’s responses. It is also known as nucleus sampling.
Instead of picking the next word only from the top 1 or 2 probable options, topP lets the model choose from the smallest group of words whose cumulative probability adds up to P.
Examples
- topP: 1 → Consider all possible words (most random; behaves like no filtering).
- topP: 0.9 → Only the smallest group of words covering the top 90% of the probability mass is considered.
- topP: 0.1 → Extremely restrictive; only the few most likely words (covering just 10% of the probability mass) are considered (less creative, more focused).
Use Cases:
- Use lower topP (e.g., 0.5) when you want focused or deterministic output.
- Use higher topP (e.g., 0.95 or 1) for creative, open-ended text like poetry or storytelling.
You can use topP alongside temperature, but it is usually best to adjust only one of them to control randomness. If you use both, experiment to find the right balance.
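Here is a rough TypeScript sketch of the nucleus-sampling idea with made-up probabilities (the real filtering happens inside the provider's API, not in LangChain): sort the candidates, keep the smallest group whose cumulative probability reaches topP, and sample only from that group.

// Toy next-token probabilities (already normalized by softmax).
const candidates = [
  { token: "Paris", p: 0.66 },
  { token: "France", p: 0.24 },
  { token: "Europe", p: 0.07 },
  { token: "somewhere", p: 0.03 },
];

// Keep the smallest group of tokens whose cumulative probability reaches topP.
function nucleus(tokens: { token: string; p: number }[], topP: number) {
  const sorted = [...tokens].sort((a, b) => b.p - a.p);
  const kept: typeof sorted = [];
  let cumulative = 0;
  for (const t of sorted) {
    kept.push(t);
    cumulative += t.p;
    if (cumulative >= topP) break; // enough probability mass collected
  }
  return kept;
}

console.log(nucleus(candidates, 1).map((t) => t.token));   // all four tokens stay in play
console.log(nucleus(candidates, 0.9).map((t) => t.token)); // ["Paris", "France"] (0.66 + 0.24 = 0.90)
console.log(nucleus(candidates, 0.1).map((t) => t.token)); // ["Paris"] only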
What does frequencyPenalty do?
It penalizes words that have already appeared in the output so far. The higher the value, the less likely the model is to repeat the same words.
Value range:
- 0: No penalty at all → The model may repeat words freely. (In the code above, frequencyPenalty: 0 means the model will not penalize repeated words and is free to use the same word multiple times if it thinks it's appropriate.)
- 1: Moderate penalty → Reduces repetition.
- 2: Strong penalty → Avoids repeating any word.
Use case: If you're getting repetitive answers, increase frequencyPenalty to make the output more diverse.
Examples
- With frequencyPenalty: 0 → Output: "The cat is on the mat. The cat is on the mat."
- With frequencyPenalty: 1 → Output: "The feline rests on the rug."
What does presencePenalty do?
presencePenalty is a parameter that encourages the model to introduce new topics rather than repeating what's already been mentioned.
It penalizes tokens (words) that have already appeared in the generated text at least once, making the model more likely to explore new ideas.
Value range:
- 0: No penalty → The model may repeat concepts.
- 1: Moderate penalty → Encourages some variation.
- 2: Strong penalty → Pushes the model to keep introducing new topics or vocabulary.
Difference vs. frequencyPenalty:
- frequencyPenalty: Penalizes repeated words based on how frequently they appear.
- presencePenalty: Penalizes words simply for appearing at all before — even once.
Use presencePenalty when you want the model to:
- Be more creative
- Introduce new topics
- Avoid circling back to previously mentioned ideas
💡 Examples
- With presencePenalty: 0 Output: “Paris is a beautiful city. Paris has amazing food. Paris is known for its fashion.”
- With presencePenalty: 1 Output: “Paris is a beautiful city. The cuisine is delightful. It’s also a global hub for fashion.”
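As a rough sketch of how the two penalties differ (the adjustment below loosely follows the penalty formula described in OpenAI's API documentation; the scores and counts are made up for illustration): each candidate token's score is reduced by frequencyPenalty for every previous occurrence, plus presencePenalty once if it has appeared at all.

// Illustrative score adjustment for a candidate token that has already been generated.
function penalizedScore(
  score: number,            // raw score before penalties
  timesAlreadyUsed: number, // how many times this token appeared in the output so far
  frequencyPenalty: number,
  presencePenalty: number
): number {
  const frequencyTerm = frequencyPenalty * timesAlreadyUsed;       // grows with every repetition
  const presenceTerm = timesAlreadyUsed > 0 ? presencePenalty : 0; // applied once, if seen at all
  return score - frequencyTerm - presenceTerm;
}

// "Paris" already appeared 3 times and has a raw score of 2.0:
console.log(penalizedScore(2.0, 3, 0, 0)); // 2.0  → no penalties, repetition is allowed
console.log(penalizedScore(2.0, 3, 1, 0)); // -1.0 → frequencyPenalty punishes each repetition
console.log(penalizedScore(2.0, 3, 0, 1)); // 1.0  → presencePenalty punishes it once, not per repeat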
maxTokens
maxTokens sets the maximum number of tokens (words, parts of words, or symbols) that the model can generate in its response.
A token can be:
- A word ("apple" → 1 token)
- A part of a word ("fantastic" → "fant" + "astic" → 2 tokens)
- A punctuation mark (".", "?", etc.)
Example: "Hello there!" = 3 tokens: ["Hello", "there", "!"]
Why is maxTokens important?
- 🧾 Length of the response
- 💰 Cost (since usage is billed per token)
- 🚀 Performance/speed
⚠️ Important: maxTokens only applies to the response, not the prompt. If your prompt + response exceeds the model’s token limit (e.g. 4096 for gpt-3.5-turbo), you'll get an error.
💡 Tip: Use maxTokens to
- Limit verbose answers
- Avoid accidental runaway responses
- Save costs and time
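To see the effect, you can create a model with a deliberately tight budget; a quick sketch (reusing the ChatOpenAI import from the first example; the exact wording of the truncated reply will vary):

// Same setup as above, but the reply is cut off after roughly 20 tokens.
const terseModel = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  openAIApiKey: process.env.OPENAI_API_KEY,
  maxTokens: 20,
});

const shortReply = await terseModel.invoke("Explain the history of the Eiffel Tower.");
console.log(shortReply.content); // a reply truncated after about 20 tokens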
What is Softmax?
The softmax function turns a list of numbers into probabilities that:
- Are positive
- Sum up to 1
It is often used in:
- Classification models (like picking the most likely class)
- Language models (choosing the next word/token based on probabilities)
Formula: For inputs [z₁, z₂, ..., zₙ]: softmax(zᵢ) = e^(zᵢ) / Σ e^(zⱼ) for all j
It emphasizes higher values, making the biggest input even more dominant.
Intuition: If a model outputs [2.0, 1.0, 0.1], softmax transforms it into roughly [0.66, 0.24, 0.10]. This shows the model is about 66% confident about the first option.
// Exponentiate each value, then divide by the sum so the results add up to 1.
function softmax(arr: number[]): number[] {
  const exps = arr.map((x) => Math.exp(x));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
console.log(softmax([2.0, 1.0, 0.1]));
// ➜ [0.659, 0.242, 0.099] (approx.)
LangChain & Softmax?
LangChain does not expose softmax directly, but under the hood:
- Models like ChatOpenAI use softmax to score how likely each possible next token is.
- Temperature and topP control how the next token is sampled from that softmax distribution.
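As a rough end-to-end sketch of that last point (toy numbers again, and a simplified version of what the provider's API does internally): temperature reshapes the softmax distribution, topP trims it, and the next token is drawn at random from whatever remains.

// Candidate tokens and their raw scores.
const nextTokens = ["Paris", "France", "Europe"];
const rawScores = [2.0, 1.0, 0.1];
const temperature = 0.7;
const topP = 0.9;

// 1. Softmax with temperature.
const scaled = rawScores.map((s) => Math.exp(s / temperature));
const total = scaled.reduce((a, b) => a + b, 0);
const probs = scaled.map((e) => e / total); // ≈ [0.77, 0.18, 0.05]

// 2. Keep the smallest set of tokens whose cumulative probability reaches topP.
const ranked = probs
  .map((p, i) => ({ token: nextTokens[i], p }))
  .sort((a, b) => b.p - a.p);
const kept: { token: string; p: number }[] = [];
let mass = 0;
for (const c of ranked) {
  kept.push(c);
  mass += c.p;
  if (mass >= topP) break; // here: "Paris" and "France" survive
}

// 3. Sample from the surviving tokens in proportion to their probability.
let r = Math.random() * mass;
let next = kept[kept.length - 1].token;
for (const c of kept) {
  r -= c.p;
  if (r <= 0) {
    next = c.token;
    break;
  }
}
console.log(next); // most often "Paris", occasionally "France"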