This week I'm acting as a Data Engineer. Why? Radix is managing a massive PayPal merchant with over 600K active subscriptions, adding roughly 15K new subscription trials per month. That base generates a large volume of secondary data every hour: failed and successful payments, refunds, and, most importantly, subscription status changes (Active, Past Due, or Cancelled).
Our current data pipeline handles 95% of our customers perfectly fine, but a few customers have data volumes that demand something more powerful. Radix's bread and butter is data accuracy, so ingestion, processing, and storage have to be rock solid; otherwise we can't help these merchants track and analyze their revenue metrics.
The catch: as a small startup, we need to spend our resources carefully to grow our margin per customer without compromising quality. A well-implemented data pipeline definitely helps.
🔗 Data Fetching → Data Transport → Data Processing
1) AWS Fargate: Running a Python script that fetches data from the PayPal API in batches (see the fetch sketch after this list).
2) SQS: Queuing messages and buffering data until it's ready for processing, which is perfect for batch work (see the SQS sketch after this list).
- Note: While Kafka is ideal for real-time data streaming, our PayPal APIs update every 4 hours, so it’s not necessary for our current needs. Plus, Kafka can be costly!
3) MongoDB vs. S3 for raw storage: MongoDB can hold the raw batches, but with gigabytes of data per fetch it gets expensive. For speed and cost, AWS S3 is the better home for raw data.
- Note: Land the raw batches in S3, process them from there, and store only the processed results in MongoDB. That combination saves a lot of money (see the raw-storage sketch after this list).
4) Apache Spark: Our secret sauce! With the right architecture it chews through terabytes of data, in large batches or near real time via Structured Streaming. This is where we compute the vital revenue metrics: MRR, churn, LTV, and subscription status counts (see the Spark sketch after this list).
- Note: If real-time metrics are needed, process the data directly from the transport layer (SQS or Kafka), with Kafka being the better fit for streaming. Always look for ways to reduce compute and save costs!
5) Saving the processed data, which is the real bread and butter. You have several options:
- Once Apache Spark has processed the data, write it straight back to MongoDB in a new collection (shown at the end of the Spark sketch below).
- Alternatively, save the processed data to a separate database or data lake. In my case, the processed metrics go to MongoDB while the raw batches stay in S3, as noted in step 3.
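To make the steps concrete, here are a few rough Python sketches. First, the Fargate job from step 1. The endpoint path, parameters, and response fields follow PayPal's Transaction Search REST API, but treat them as assumptions and check the current docs; the point is paged, batch-style fetching.

```python
"""Step 1 (sketch): batch-fetch PayPal data from a Fargate task."""
import os
import requests

PAYPAL_BASE = "https://api-m.paypal.com"  # live environment (assumption)


def get_access_token() -> str:
    # OAuth2 client-credentials flow; credentials are injected as Fargate secrets.
    resp = requests.post(
        f"{PAYPAL_BASE}/v1/oauth2/token",
        auth=(os.environ["PAYPAL_CLIENT_ID"], os.environ["PAYPAL_SECRET"]),
        data={"grant_type": "client_credentials"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]


def fetch_transactions(start_date: str, end_date: str, page_size: int = 500):
    """Yield one page (batch) of transactions at a time."""
    token, page = get_access_token(), 1
    while True:
        resp = requests.get(
            f"{PAYPAL_BASE}/v1/reporting/transactions",  # check PayPal docs for the exact path/fields
            headers={"Authorization": f"Bearer {token}"},
            params={"start_date": start_date, "end_date": end_date,
                    "page": page, "page_size": page_size},
            timeout=60,
        )
        resp.raise_for_status()
        body = resp.json()
        yield body.get("transaction_details", [])
        if page >= body.get("total_pages", 1):
            break
        page += 1
```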
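Step 2, queuing on SQS with boto3. The queue URL and message shape are placeholders I made up. One real constraint worth knowing: SQS messages max out at 256 KB, so for big batches you usually enqueue a pointer to the data (e.g. an S3 key) rather than the records themselves, which ties in nicely with step 3.

```python
"""Step 2 (sketch): hand each fetched batch to SQS."""
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/paypal-raw-batches"  # placeholder


def enqueue_batch(batch_pointer: dict) -> None:
    # SQS caps messages at 256 KB, so we send a small pointer such as
    # {"s3_key": "paypal/dt=2024-05-01/batch-120000.json.gz", "record_count": 500}
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(batch_pointer))
```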
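Step 3, landing the raw batches in S3. The bucket name and key layout are made up; the idea is cheap, append-only raw storage, partitioned by date so Spark only reads what it needs.

```python
"""Step 3 (sketch): write a raw batch to S3, date-partitioned and gzipped."""
import gzip
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = "radix-paypal-raw"  # placeholder bucket name


def store_raw_batch(records: list) -> str:
    now = datetime.now(timezone.utc)
    key = f"paypal/dt={now:%Y-%m-%d}/batch-{now:%H%M%S}.json.gz"
    # One JSON record per line, compressed, so Spark can read it directly.
    body = gzip.compress("\n".join(json.dumps(r) for r in records).encode("utf-8"))
    s3.put_object(Bucket=BUCKET, Key=key, Body=body)
    return key  # this key is what gets enqueued on SQS
```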
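Finally, steps 4 and 5: aggregating with PySpark and saving the results. The column names (subscription_id, status, plan_amount) are assumptions about the raw schema, and the write block assumes the MongoDB Spark Connector (10.x option names) plus s3a support are configured on the cluster; adjust for your own setup.

```python
"""Steps 4-5 (sketch): compute revenue metrics with PySpark, save to MongoDB."""
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("radix-paypal-metrics")
    .config("spark.mongodb.write.connection.uri", "mongodb://user:pass@host/radix")  # placeholder
    .getOrCreate()
)

# Raw subscription snapshots landed by the ingestion steps above.
subs = spark.read.json("s3a://radix-paypal-raw/paypal/dt=*/")

# One row per status with subscription counts and summed recurring amounts;
# MRR comes from the ACTIVE rows, churn falls out of the CANCELLED rows.
metrics = subs.groupBy("status").agg(
    F.count("subscription_id").alias("subscriptions"),
    F.sum("plan_amount").alias("recurring_amount"),
)

# Step 5: write the processed metrics back to MongoDB in their own collection.
(
    metrics.write.format("mongodb")
    .option("database", "radix")
    .option("collection", "revenue_metrics")
    .mode("append")
    .save()
)
```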
#DataEngineering #AWS #MongoDB #ApacheSpark #PayPal #DataAnalytics #Innovation #TechForGood