Better results scale with compute—more on learning, more on search. A good pretrained model takes you far, but test-time compute takes you further. It's important to recognize this new paradigm of scaling test-time compute, even for embedding models. https://lnkd.in/eFVArgjb
Jina AI
Software Development
Sunnyvale, California 16,660 followers
Your Search Foundation, Supercharged!
About us
Jina AI is a leading search AI company. Our Search Foundation consists of embeddings, rerankers, and small language models to help businesses build powerful GenAI and multimodal search applications.
- Website
- https://jina.ai
- Industry
- Software Development
- Company size
- 11-50 employees
- Headquarters
- Sunnyvale, California
- Type
- Privately Held
- Founded
- 2020
- Specialties
- Neural Search, Information Retrieval, Search, RAG, embeddings, reranker, and rerank
Locations
- 710 Lakeway Dr, Suite 200, Sunnyvale, California 94085, US (Primary)
- Prinzessinnenstraße 19-20, Berlin, 10969, DE
- No.48 Haidian West St, Beijing, CN
- Fuan Technology Building, Shenzhen, Guang Dong, CN
Updates
-
Comparing how long-context embedding models perform with different chunking strategies to find the optimal approach for your needs. https://lnkd.in/epJcKVg6
Still Need Chunking When Long-Context Models Can Do It All?
jina.ai
-
You use our embedding models to do what? This might be one of the most "out-of-domain" applications of embeddings we’ve seen at EMNLP 2024. https://lnkd.in/eanD3fCR
Watermarking Text with Embedding Models to Protect Against Content Theft
jina.ai
-
Jina-CLIP v2 is a 0.9B multimodal embedding model with multilingual support for 89 languages, 512x512 image resolution, and Matryoshka representations. https://lnkd.in/epHA8txA
Jina CLIP v2: Multilingual Multimodal Embeddings for Text and Images
jina.ai
-
Is Meta-Prompt the new norm for API specs? Feed it to LLMs to generate code that reliably integrates Jina's APIs, saving you from the usual trial-and-error process. https://lnkd.in/e5St-mcm
Meta-Prompt for Better Jina API Integration and CodeGen
-
Search/acc! ⏩ Probably the hottest BoF session at #EMNLP2024! Nearly 100 researchers packed the room for 12 back-to-back talks on search foundation models - covering everything from code embeddings, distilled rerankers, ColPali, ColBERT, late chunking & smaller LMs. Killer lineup featuring Hans Ole Hatzel, Revanth Gangi Reddy, Björn Deiseroth, Rohan Jha, Manuel Faysse, Zhichao Xu, Jonghyun Song, Nayoung Choi, Rui Meng and many others! Big thanks to everyone who came out and showed such enthusiasm for search foundation models! We are looking forward to seeing you at EMNLP2025!
-
v3 + Late Chunking = cross-lingual positional alignment? In this Colab notebook, we show cross-lingual retrieval using jina-embeddings-v3 on different translations of Chapter 1 of "Alice's Adventures in Wonderland". Two interesting observations: (1) the cross-lingual matches are remarkably accurate, and (2) the matched segments appear at roughly the same position in their respective versions, showing good alignment. This video explains why. Try it yourself; it runs smoothly on Google Colab's free T4 tier: https://lnkd.in/eu7CrX5A
-
At EMNLP 2024 Miami? Join us for a Birds of a Feather session focusing on embeddings, rerankers, and small LMs for better search. https://lnkd.in/eAVvryCS
Call for Participants: EMNLP 2024 BoF on Embeddings, Reranker & Small LMs for Better Search
jina.ai
-
Learn how Jina-CLIP enhances OpenAI's CLIP with better retrieval accuracy and more diverse results through unified text-image embeddings. https://lnkd.in/erq-fbPG
Beyond CLIP: How Jina-CLIP Advances Multimodal Search
jina.ai
-
We trained three small language models to better segment long documents into chunks, and here are the key lessons we learned. https://lnkd.in/eQwnWzHw
Finding Optimal Breakpoints in Long Documents Using Small Language Models
jina.ai
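Several of the updates above mention late chunking. As a rough illustration only, the sketch below shows the pooling step as it is commonly described: the whole document is embedded first, and each chunk vector is then the mean of the token vectors that fall inside that chunk's token span. The function name `late_chunk_pool`, the toy token vectors, and the spans are all hypothetical; in practice the token embeddings would come from a long-context embedding model run over the full document, and this is not presented as Jina's actual implementation.

```python
from typing import List, Sequence, Tuple


def late_chunk_pool(
    token_embeddings: Sequence[Sequence[float]],
    spans: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Mean-pool token vectors inside each [start, end) token span.

    Hypothetical helper: `token_embeddings` is assumed to hold one
    contextual vector per token of the *whole* document, so every chunk
    vector still reflects context from outside its own span.
    """
    chunk_vectors = []
    for start, end in spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid span: ({start}, {end})")
        window = token_embeddings[start:end]
        dim = len(window[0])
        # Average each dimension over the tokens in this span.
        chunk_vectors.append(
            [sum(vec[d] for vec in window) / len(window) for d in range(dim)]
        )
    return chunk_vectors


# Toy stand-in for model output: 4 tokens, 2 dimensions (made-up values).
tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
# Two chunks covering tokens 0-1 and tokens 2-3 (assumed boundaries).
spans = [(0, 2), (2, 4)]

chunk_vectors = late_chunk_pool(tokens, spans)
print(chunk_vectors)
```

With naive chunking, each chunk would instead be embedded on its own, so its vector could not draw on the rest of the document. Here only the pooling boundaries change; the token vectors are computed once for the full text.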