Prof. Rao's post is a good starting point for LLM practitioners to become familiar with classic (i.e., predating LLMs) AI concepts and theories, for two reasons: 📌 To understand that some things are theoretically beyond the reach of LLMs and not waste resources chasing them. 📌 To hedge against the possibility that LLMs may be replaced by other technologies and/or undergo fundamental technological revamps, which would render all existing investments in LLMs obsolete. #artificialintelligence #machinelearning #deeplearning
𝕎𝕙𝕪 #𝔸𝕀 𝕗𝕠𝕝𝕜𝕤 𝕟𝕖𝕖𝕕 𝕒 𝕓𝕣𝕠𝕒𝕕 𝕓𝕒𝕤𝕖𝕕 𝕀𝕟𝕥𝕣𝕠 𝕥𝕠 #𝔸𝕀

👉 As I go around giving talks/tutorials on the planning and reasoning abilities of LLMs, I am constantly surprised by the rather narrow, ML-centric background that grad students/young researchers have in #AI. This seems to be especially the case with those who think LLMs are already doing planning and reasoning etc. Most of them don't seem to know much about the many topics taught in a broad-based Intro to #AI course--such as combinatorial search, logic, CSPs, the difference between inductive vs. deductive reasoning (aka learning vs. inference), soundness vs. completeness of inference/reasoning etc.

I can understand why a strong background in ML and DL is sine qua non these days for using/applying the current #AI technology. That doesn't, however, mean that other things, typically not covered in ML courses but covered in Intro #AI courses, are expendable. If you don't know those concepts, you are more likely than not to re-invent crooked wheels (see this for examples of how people get tripped up: https://lnkd.in/gUPPb7s4).

All this is particularly relevant for those busy building empirical scaffolds over LLMs (the "LLMs are Zero-shot <XXX>" variety). Most often, these young researchers are coming from NLP. At one point, NLP used to be NLU, and students had quite a firm grasp of logic (e.g., Montague Semantics!). But over the years, NLU became NLP, which in turn has become Applied Machine Learning, and students don't quite have the background in logic/reasoning etc. Now that LLMs have basically "solved" the "processing" tasks--such as information extraction, format conversion etc.--NLP folks are trying to turn to reasoning tasks, but often lack the necessary background. (See this unsolicited advice to NLP students: https://lnkd.in/gKTdsH2P)

Background in the standard Intro #AI topics like search/CSP/logic is useful even if you don't plan on directly using those techniques (e.g., because you want differentiable everything to make use of your SGD hammer). Like MDPs, they provide a normative basis for many of the deeper reasoning tasks AI systems will have to carry out when they broaden their scope beyond statistical learning. Without that background, you will likely try to pigeonhole everything into the "in/out of distribution" framework, when what you need to think about is "in/out of the deductive/knowledge closure" (see https://lnkd.in/gTWVibdt).

One of the other things you get exposed to in a standard Intro #AI course is the computational complexity of the various reasoning tasks. People who jumped in directly via applied ML might understand a bit of sample complexity (maybe?), but are not that attuned to reasoning complexity.

(Contd. in the comment below)
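For readers who never took that Intro #AI course: the combinatorial search/CSP machinery mentioned above fits in a few lines. Here is a minimal sketch of a backtracking CSP solver on a toy map-coloring instance (the variables, domains, and constraints here are invented purely for illustration):

```python
# Minimal backtracking search over a constraint satisfaction problem (CSP):
# depth-first search through partial assignments, pruning any assignment
# that violates a binary constraint. Toy example only.

def backtrack(assignment, variables, domains, constraints):
    """Return a complete consistent assignment, or None if none exists."""
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        # Check every constraint whose two variables are both assigned.
        if all(ok(assignment[a], assignment[b])
               for (a, b), ok in constraints.items()
               if a in assignment and b in assignment):
            result = backtrack(assignment, variables, domains, constraints)
            if result is not None:
                return result
        del assignment[var]  # undo and try the next value (backtrack)
    return None

# Toy map-coloring instance: three mutually adjacent regions, three colors.
variables = ["WA", "NT", "SA"]
domains = {v: ["red", "green", "blue"] for v in variables}
neq = lambda x, y: x != y  # adjacent regions must differ in color
constraints = {("WA", "NT"): neq, ("WA", "SA"): neq, ("NT", "SA"): neq}

solution = backtrack({}, variables, domains, constraints)
print(solution)  # adjacent regions get distinct colors
```

The point is not this particular toy: it is that systematic search over an exponential space, with pruning via constraints, is a *normative* model of what solving such problems requires--something no amount of "in-distribution" pattern matching substitutes for.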
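The "in/out of deductive closure" distinction can also be made concrete with a tiny sketch. In a forward-chaining system over Horn-style rules, a fact is "in" the closure if it is *provable* from the knowledge base--even if it was never explicitly stated--which is a different question from whether it is statistically "in distribution." (The facts and rules below are made up for illustration.)

```python
# Forward chaining to a fixpoint: compute the deductive closure of a set
# of facts under definite (Horn-style) rules of the form body -> head.

def deductive_closure(facts, rules):
    """Repeatedly fire rules whose bodies are satisfied, until fixpoint."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if set(body) <= closure and head not in closure:
                closure.add(head)
                changed = True
    return closure

# Toy knowledge base (invented predicates, for illustration only).
facts = {"bird(tweety)"}
rules = [(["bird(tweety)"], "has_wings(tweety)"),
         (["has_wings(tweety)"], "can_fly(tweety)")]

closure = deductive_closure(facts, rules)
# "can_fly(tweety)" is IN the closure although it was never stated;
# "fish(tweety)" is OUT, since nothing entails it.
print(sorted(closure))
```

Soundness and completeness, as taught in Intro #AI, are properties of exactly this kind of procedure: does it derive only entailed facts, and does it derive all of them?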