Semantic Query Engines with Matthew Russo (MIT)

3 points by CShorten 3 months ago
AI is transforming database systems. Perhaps the biggest impact so far has been natural language to query language translation, or Text-to-SQL. However, another massive innovation is brewing.

I am SUPER EXCITED to publish the 131st episode of the Weaviate Podcast with Matthew Russo, a Ph.D. student at MIT!

AI presents new Semantic Operators for our query languages. For example, we are all familiar with the WHERE filter. Now we have AI_WHERE, in which an LLM or another AI model computes the filter value at query time, without it needing to be already stored in the database!

```sql
SELECT * FROM podcasts AI_WHERE 'Text-to-SQL' in topics
```

Semantic Filters are just the tip of the iceberg. The roster of Semantic Operators further includes Semantic Joins, Map, Rank, Classify, Groupby, and Aggregation!

And it doesn't stop there! One of the core ideas of Relational Algebra, and of how it has influenced database systems, is query planning: finding the optimal order in which to apply filters. For example, let's say you have two filters: the car is red, and the car is a BMW. Now let's say the dataset contains only 100 BMWs, but 50,000 red cars! Applying the BMW filter first shrinks the set that the next filter has to scan.

This foundational idea has all sorts of extensions now that LLMs are involved! This opportunity is giving rise to new query engines and declarative optimizers such as Palimpzest, LOTUS, and others!

There are so many interesting nuggets in this podcast. I loved discussing these things with Matthew, and I hope you find it interesting!

YouTube: https://youtu.be/koPBr9W4qU0

Spotify: https://spotifycreators-web.app.link/e/ddUhVMmLoYb

Medium: https://medium.com/@connorshorten300/semantic-query-engines-with-matthew-russo-weaviate-podcast-131-131a42bbc521
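To make the filter-ordering point concrete, here is a minimal Python sketch that counts how many predicate evaluations each ordering needs. The 100 BMWs and 50,000 red cars come from the post; everything else is an assumption for illustration: the 10-car overlap, the 9,910 other cars, and the names `Car`, `build_cars`, and `run_plan` are all made up, and the predicates are plain Python functions standing in for what could be an LLM call per row in a semantic query engine. It is not how Palimpzest or LOTUS implement their optimizers.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Car:
    color: str
    make: str


def build_cars() -> List[Car]:
    """Toy dataset with 50,000 red cars and 100 BMWs, as in the post.

    The overlap (10 red BMWs) and the 9,910 other cars are made-up numbers.
    """
    return (
        [Car("red", "Toyota")] * 49_990
        + [Car("red", "BMW")] * 10
        + [Car("blue", "BMW")] * 90
        + [Car("blue", "Toyota")] * 9_910
    )


def is_red(car: Car) -> bool:
    # Stand-in for a filter; in a semantic engine this could be a model call.
    return car.color == "red"


def is_bmw(car: Car) -> bool:
    return car.make == "BMW"


def run_plan(
    rows: Sequence[Car], predicates: Sequence[Callable[[Car], bool]]
) -> Tuple[List[Car], int]:
    """Apply predicates in the given order.

    Returns the surviving rows and the total number of predicate evaluations.
    """
    calls = 0
    current = list(rows)
    for predicate in predicates:
        calls += len(current)  # every surviving row is evaluated once
        current = [row for row in current if predicate(row)]
    return current, calls


cars = build_cars()
red_first, calls_red_first = run_plan(cars, [is_red, is_bmw])
bmw_first, calls_bmw_first = run_plan(cars, [is_bmw, is_red])

print(len(red_first), calls_red_first)  # 10 110000
print(len(bmw_first), calls_bmw_first)  # 10 60100
```

Both orderings return the same 10 cars, but running the more selective BMW filter first cuts the work from 110,000 evaluations to 60,100. When each evaluation is a paid model call rather than a cheap comparison, that gap is the kind of cost a query optimizer would try to minimize.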