“The USA Math Olympiad is an extremely challenging math competition for the top US high school students… Hours after it was completed…a team of scientists gave the problems to some of the top large language models, whose mathematical and reasoning abilities have been loudly proclaimed… The results were dismal: None of the AIs scored higher than 5% overall”
—Ernest Davis & Gary Marcus, Reports of LLMs mastering math have been greatly exaggerated
https://garymarcus.substack.com/p/reports-of-llms-mastering-math-have
#mathematics #llms #llm #ai
A California federal judge ruled Thursday that three authors suing Anthropic over copyright infringement can bring a class action lawsuit representing all U.S. writers whose work was allegedly downloaded from libraries of pirated works.
Even though I am probably one of the affected authors, lawsuits like this make me nervous. If the decision comes down in favor of Anthropic, it sets a precedent for repeating what they and others have done. I am very skeptical that these issues will be appropriately settled in the courts; we needed proper regulation of this industry two years ago. It's likewise worth noting that OpenAI claims at least 10x the traffic of Anthropic's various products.
Also, I've been in these kinds of lawsuits before. We'll end up getting a coupon for $1 off use of Claude if the class wins, or something comparably absurd. (*)
#AI #GenAI #GenerativeAI #LLM #copyright #theft #lawsuit #Anthropic #Claude
(*) Years ago I was inadvertently part of a class action lawsuit against Poland Spring because I bought their water during the period covered by the lawsuit. They were found guilty of deceptive marketing because they were mixing tap water in with the "spring water" they claimed to be selling. I was awarded a $1, maybe $5, coupon to buy Poland "Spring" water.
⇒ Please help me find #GenAI truth-telling sites! ⇐
In the past I've come across several websites that effectively debunk #GenerativeAI hype.
However, now that I actually need them, to help me make the case at work for strong oversight of the company's GenAI use, I can't find any of them.
It seems like no matter what search terms and search engine I use, I get garbage search results (hype, indeed!).
What are your go-to websites for debunking #AI hype? #tech #LLM
What I thought then is still true today: to make something like a software agent legitimately useful for a lot of people would require a large amount of low-level grunt work and non-technical work (2) of the sort that the typical Silicon Valley company is unwilling to do. (3) The technology is the absolute easiest part of this task. Throwing a Bigger Computer at the problem leaves all those other pieces of work undone. It's like putting a bigger engine in a car with no wheels, hoping that'll make the car go.
By the way #AI companies and VCs, I'm available for contract work and have done due diligence research before if you ever want to stop wasting everyone's time and money!
#AI #GenAI #GenerativeAI #LLM #agents #hype #SiliconValley #VentureCapital #dev #tech
(1) Which we've been told repeatedly is essentially infinite time in the tech world.
(2) Establishing semantic data standards and convincing a large enough number of people to implement them being an important component. LLMs do not magically develop protocols and solve all the ETL-style problems of translating among different ones. The Semantic Web didn't really stick for a lot of reasons, but one reason is that it's hard!
(3) Back when I was still in the startup world I was asked several times by VCs to tell them what I thought about some new startup that claimed to be able to magically clean and fuse data. I think they're still very keen on investing in this style of magic, because doing the work for real requires an intense amount of human labor, but I think where companies landed was invisibilizing low-paid workers in other countries and pretending a computer did their work. Which has also been happening for well over a quarter of a century.
Google AI Search Shift Leaves Website Makers Feeling ‘Betrayed’
The now-ubiquitous AI-generated answers — and the way Google has changed its search algorithm to support them — have caused traffic to independent websites to plummet, according to Bloomberg interviews with 25 publishers and people who work with them.
Remember when Facebook told everyone they should change all their content to video, because it got more traffic? And then that turned out to be such a blatant falsehood that companies went bankrupt trying to do this?
#AI #GenAI #GenerativeAI #LLM #Google #Gemini #AISlop
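I happened to come across the concepts of fourth- and fifth-generation programming languages. It turns out SQL counts as a fourth-generation language, specifically one for data management and manipulation, while the goal of fifth-generation languages was to let computer users skip programming altogether and focus only on defining the problem and the conditions under which a solution applies. But the attempts of the 1980s and 1990s failed: people found that, for a concrete problem with given boundaries, generating the corresponding computer algorithm to solve it is itself a hard challenge, and completing that task still depends on a programmer's insight.
But this definition made me think: the various agents and the MCP protocol that the industry has built on top of large models in recent years are aimed in exactly this direction, and they look far more promising than the efforts of 40 years ago.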