Welcome to the Techmeme Ride Home for Tuesday, July 30, 2024. I'm Brian McCullough. Today, Perplexity wants to share ad revenue with publishers, but lots of AI companies are continuing to gamble with scraping. Meta's new Segment Anything 2 model, AI influencers on Instagram, Canva makes an AI acquisition, and in non-AI news, Meta makes a huge settlement with Texas.
Here's what you missed today in the world of tech. Man, it's all AI stuff today. Perplexity is launching a program to share ad revenue with partners such as Time, Der Spiegel, Fortune, and WordPress.com after weeks of plagiarism accusations. I guess that's one way to do it. Quoting The Verge: Under this program, when Perplexity features content from these publishers in response to user queries, the publishers will receive a share of the ad revenue.
Publishing partners will also get a free one-year subscription to Perplexity's Enterprise Pro tier and access to Perplexity's developer tools, plus insights, such as how frequently their articles appear in search queries, through ScalePost.ai, a new AI startup that helps secure partnerships between AI companies and publishers.
Dmitry Shevelenko, Perplexity's chief business officer, declined to share exact deal terms, but said that the revenue share is a multi-year agreement with a double-digit percentage, consistent across all publishers, with especially favorable terms for the initial partners. Perplexity spokesperson Sara Platnick added that payments are made on a per-source basis,
meaning publishers are compensated for each article used in responses. The program will temporarily provide cash advances on revenue to publishers as Perplexity builds a long-term advertising model. The advances aren't a licensing fee for content like OpenAI's deals. "It's a much better revenue split than Google, which is zero," Automattic CEO Matt Mullenweg told me via direct message. The publishing agreement doesn't cover WordPress.org, but Automattic will be sending payments to
direct customers of WordPress.com. "The amount, I don't know. Probably small to start because they don't make much revenue now, but if Perplexity is the next Google, which I think it has a chance of being, these numbers could become meaningful, and we're looking to help publishers get paid in every way we can," he said. This new program comes a month after a Forbes editor found the publication's paywalled reporting plagiarized in Perplexity's new product, Pages, an AI-powered
tool that lets users create a report or article based on prompts. The AI-generated version of the Forbes story, along with an AI-generated Perplexity podcast of the story, was then sent to subscribers via a mobile push notification, Forbes reported. Wired then published an investigation that found Perplexity's AI was, quote, "paraphrasing Wired stories, and at times summarizing stories inaccurately and with minimal attribution." Forbes has since threatened legal action against Perplexity.
Shevelenko told me the company started work on this program back in January, well before the blowback, saying the team took inspiration from X's ad revenue sharing program. Perplexity planned to launch this program last month amid the drama but decided to hold off until now, he said. I asked him if this was a well-timed apology tour or if it was just a stopgap to prevent lawsuits. Quote, "We don't want people saying nasty things about us more than we don't want to get sued,"
Shevelenko said. End quote. Yeah, but you get the sense that other folks are making the strategic calculation to just go ahead and risk getting sued at this point. For example, some popular sites like Condé Nast's titles and Reuters.com modified their robots.txt files to block Anthropic-specific bots, but Anthropic has allegedly just made new bots with other names. Other folks are apparently doing this as well. Quoting 404 Media:
Hundreds of websites trying to block the AI company Anthropic from scraping their content are blocking the wrong bots, seemingly because they are copy-pasting outdated instructions into their robots.txt files, and because companies are constantly launching new AI crawler bots with different names that will only be blocked if website owners update their robots.txt.
In particular, these sites are blocking two bots no longer used by the company while unknowingly leaving Anthropic's real and new scraper bot unblocked. This is an example of, quote, "how much of a mess the robots.txt landscape is right now," the anonymous operator of Dark Visitors told 404 Media. Dark Visitors is a website that tracks the constantly shifting landscape of web crawlers and scrapers, many of them operated by AI companies, and which helps website owners regularly update
their robots.txt files to prevent specific types of scraping. The site has seen huge increases in popularity as more people try to block AI from scraping their work. Last week, repair guide site iFixit said that Anthropic's crawlers had hit its website nearly a million times in one day. And the coding documentation deployment service Read the Docs published a blog post saying that
various crawlers had hit its servers at a huge scale. One crawler, it said, accessed 10 terabytes' worth of files in a single day and 73 terabytes total in May. This cost us over $5,000 in bandwidth charges, and we had to block the crawler, they wrote. We are asking all AI companies to be more respectful of the sites they are crawling. They are risking many sites blocking them for abuse, irrespective of the other copyright and
moral issues that are at play in the industry. The Anthropic finding was published in a paper by the Data Provenance Initiative that more broadly shows the pervasive confusion content creators and website owners face when trying to block AI tools from being trained on their work. The onus of blocking AI scrapers is put entirely on website owners, and the number of scrapers is constantly increasing. New scraper bots, often called user agents, are popping up all the time.
AI companies sometimes ignore the stated wishes of website owners, and bots that are seemingly connected to well-known companies sometimes aren't connected to them at all.

As best as I can tell, the calculation here is: if we scrape to build our model, once we have the model, fine, we'll take what comes. But if you don't even have a model to begin with, you don't have anything. So scrape first and find out what happens later, I guess.
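
To make the robots.txt problem from this segment concrete, here is a minimal Python sketch. It is not from 404 Media or the Data Provenance Initiative paper. It uses the standard library's urllib.robotparser, and the crawler names (OldExampleBot, LegacyExampleBot, NewExampleBot), the robots.txt contents, the helper name blocked_agents, and the example.com URL are all hypothetical. It only illustrates how rules written for retired bot names can leave a renamed crawler unblocked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: it disallows two retired crawler names and
# says nothing about any crawler launched later under a new name.
ROBOTS_TXT = """\
User-agent: OldExampleBot
Disallow: /

User-agent: LegacyExampleBot
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/article"):
    """Map each user agent to True if robots_txt disallows it from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


# Check the two retired names plus a (hypothetical) newly launched crawler.
status = blocked_agents(
    ROBOTS_TXT,
    ["OldExampleBot", "LegacyExampleBot", "NewExampleBot"],
)
for agent, blocked in status.items():
    print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

Under these assumptions, the two listed names come back as blocked while the new name is allowed, because no rule mentions it and the file has no catch-all "User-agent: *" group. A site owner would need to add the new name, or a wildcard rule, for the block to apply, and even then robots.txt is only a request that a crawler can ignore.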
Meta has released the Segment Anything Model 2, with support for object segmentation in videos and images. The code and weights are available under an Apache 2.0 license. Quoting TechCrunch: Segmentation is the technical term for when a vision model looks at a picture and picks out the parts. This is a dog. This is a tree behind the dog. Hopefully, and not, this is a tree growing out of a dog. This has been happening for decades, but recently it's gotten way better
and faster, with Segment Anything being a major step forward. Segment Anything 2 is a natural follow-up in that it applies natively to video and not just still images. Though you could, of course, run the first model on every frame of a video individually, it's not the most efficient workflow. Scientists use this stuff to study, like, coral reefs and natural habitats, things like that. But being able to do this in video and have it be zero-shot and tell it what you want, it's pretty
cool, Mark Zuckerberg said in a conversation with Nvidia CEO Jensen Huang. Processing video is, of course, much more computationally demanding, and it's a testament to the advances made across the industry in efficiency that SA2 can run without melting the data center. Of course, it's still a huge model that needs serious hardware to work, but fast, flexible segmentation was practically
impossible even a year ago. The model will, like the first, be open and free to use, and there's no word of a hosted version, something these AI companies sometimes offer, but there is a free demo. Naturally, such a model takes a ton of data to train, and Meta is also releasing a large annotated database of 50,000 videos that it had created just for this purpose. In the paper describing SA2, another database of over 100,000 internally available videos was also used for training,
and this one is not being made public. I've asked Meta for more information on what this is and why it's not being released. Our guess would be that it's sourced from public Instagram and Facebook profiles. End quote.

Meta has also rolled out AI Studio in the US, letting users create and share AI chatbots, and Instagram creators set up chatbots to answer DM questions and story replies. Quoting Engadget: The next time you DM a creator on Instagram, you might get a reply from their AI.
Meta is starting to roll out its AI Studio, a set of tools that will allow Instagram creators to make an AI persona that can answer questions and chat with their followers and fans on their behalf. According to Meta, the new creator AIs are meant to address a long-running issue for Instagram users with large followings: it can be nearly impossible for the service's most popular
users to keep up with the flood of messages they receive every day. Now, though, they'll be able to make an AI that functions as, quote, "an extension of themselves," says Connor Hayes, who is VP of product for AI Studio at Meta. "These creators can actually use the comments that they've made, the captions that they've made, the transcripts of the reels that they've posted, as well as any custom instructions or links that they want to provide so that the AI can answer on their behalf," Hayes
tells Engadget. Mark Zuckerberg has suggested he has big ambitions for such chatbots. In a recent interview with Bloomberg, he said he expects there will eventually be hundreds of millions of creator-made AIs on Meta's apps. However, it's unclear if Instagram's users will be as interested in engaging with AI versions of their favorite creators. Meta previously experimented with AI chatbots that took on the personalities of celebrities like Snoop Dogg and Kendall Jenner,
but those characters proved to be largely underwhelming. "One thing that ended up being somewhat confusing for people was, am I talking to the celebrity that is embodying this AI, or am I talking to an AI and they're playing the character?" Meta's Hayes says about the celebrity-branded chatbots. "We think that going in this direction, where the public figures can represent themselves or an AI that's an extension of themselves, will be a lot clearer." End quote. AI Studio isn't just for
creators, though. Meta will also allow any user to create custom AI characters that can chat about specific topics, make memes, or offer advice. Like the creator-focused characters, these chatbots will be powered by Meta's new Llama 3.1 model. Users can share their chatbot creations and track how many people are using them, though they won't be able to view other users' interactions with them.

Being in control of my health means being super mindful of what I put in my body, from food to
supplements. I'm always trying to find the best option out there, which is why I'm so excited to tell you about Thorne. Thorne takes a personalized, innovative, and scientific approach to health and wellness with their supplements. They manufacture all their supplements in the US using top-notch ingredients sourced globally, plus they team up with leading medical professionals to bring you
highly effective nutritional supplements. Whether it's their B complex, creatine, magnesium CitraMate, or Basic Prenatal, Thorne's got all the supplements I need to help promote and maintain my health goals. I'm a fan of their super greens. Helps me wake up better than coffee does, helps me focus to do the show. Thorne's not just my go-to, it's trusted by over 5 million customers, 47,000 health care pros, loads of pro athletes, 100-plus pro sports teams, and multiple US national teams.
Give your body what it really needs with Thorne. Go to thorne.fit/ridehome and use code RIDEHOME for 10% off your first order. That's T-H-O-R-N-E dot F-I-T slash ridehome, code RIDEHOME, for 10% off your first order. thorne.fit/ridehome, code RIDEHOME. These statements have not been evaluated by the Food and Drug Administration.

This episode is brought to you by Dragon Ball Legends, the mobile fighting game. Dragon Ball Legends is the first card-based fighting mobile game based on the
widely popular Dragon Ball anime series. This impressive game, based on the adventures of the Dragon Ball characters, has high-quality 3D graphics and authentic voice acting. Relive your favorite moments and fight scenes with epic cinematics. From Kid Goku to Beerus, you can collect and train all your favorite characters. The game allows you to collect them all, from all Dragon Ball sagas, including Dragon Ball Super. Fight with Goku, add them to your team, and use the intuitive
controls to battle players around the world in friendly matches. Challenge your friends, compete in leagues to rank up the leaderboard, and defeat powerful foes in co-op. Are you ready to become a legend? Download Dragon Ball Legends for free on iOS or Android.

Canva has acquired generative AI startup Leonardo.ai. Quoting TechCrunch: The financial terms of the deal weren't disclosed, but Canva co-founder and Chief Product Officer Cameron Adams said it's a mix of cash and stock. All of Leonardo.ai's 120 employees will be
joining Canva, including the executive team. "Leonardo will continue to run independently of Canva with a focus on rapid innovation, research, and development, now backed by Canva's resources," Adams told TechCrunch. "We'll keep offering all of Leonardo's existing tools and solutions. This acquisition aims to help Leonardo develop its platform and deepen their user growth with our investment, including by expanding their API business and investing in foundational model R&D."
Sydney-based Leonardo.ai, founded in 2022, was originally meant to focus on video game asset creation; the startup's founders met while working at a video game company. But then Leonardo.ai's team decided to build out the platform to meet more scenarios, like creating and training AI models
for image creation across industries such as fashion, advertising, and architecture. Today, Leonardo.ai offers collaboration tools and a private cloud for models, including video generators, as well as access to APIs that let customers build their own tech infrastructure on top of Leonardo.ai's platform. Leonardo.ai differentiates itself from other generative AI art platforms by the amount of control that it gives users, co-founders Jachin Bhasme, J.J. Fiasson, and Chris Gillis
told TechCrunch in an interview last December. For example, Leonardo.ai's Live Canvas feature enables users to enter a text prompt and then make a quick sketch of what they want the end result to look like. As the user sketches, Leonardo.ai creates a photorealistic image based on both text and sketch prompts in real time. It's unclear how Leonardo.ai trains its in-house generative models, like its flagship model Phoenix. It's an important question to ask about any generative AI service given
the legal ramifications of training models on copyrighted content sans permission. Leonardo.ai's PR kept it vague when asked for clarification, saying only that the models are trained on licensed, synthetic, and publicly available/open-source data. Leonardo.ai has over 19 million registered users, and its tools have been used to create more than a billion images. Leonardo.ai is Canva's eighth acquisition overall and its second acquisition this year, coming three months after it bought
UK design company Affinity for an estimated $380 million. Canva also owns presentations startup Zeetings, free stock photography sites Pixabay and Pexels, and Czech-based product mockup app Smartmockups.

Finally, would you believe, non-AI news? Meta has agreed to pay $1.4 billion to settle Texas's lawsuit accusing Meta of using facial recognition tech to collect biometric data of
millions of Texans without consent. The terms of the settlement, disclosed on Tuesday, mark the largest accord ever by any single state, according to the lawyers for Texas, whose legal team included the plaintiffs' firm Keller Postman. The lawsuit, filed in 2022, was the first major case to be brought under Texas's 2009 biometric privacy law, according to law firms tracking the
litigation. A provision of the law provides damages of up to $25,000 per violation. Texas accused Facebook of capturing biometric information billions of times from photos and videos that users uploaded to the social media platform as part of a free, discontinued feature called Tag Suggestions. A spokesperson for Meta said the company is pleased to resolve the matter and looks forward to, quote, exploring future opportunities to deepen our business investments in Texas, including
potentially developing data centers. It has continued to deny any wrongdoing. Texas and Meta said they reached an accord in May, weeks before the start of a trial in state court was scheduled to begin. Meta separately agreed to pay $650 million in 2020 to settle a biometric privacy class action that was brought under an Illinois privacy law that is considered one of the nation's most
stringent. The company also denied wrongdoing. Alphabet's Google separately is fighting a lawsuit by Texas accusing the company of violating the state's biometric law.

Nothing more for you today. Talk to you tomorrow.