
Mon. 08/21 – Twit Pics Disappear

Aug 21, 2023, 16 min

Episode description

More X shenanigans over the weekend. Some solid evidence that some major LLMs have in fact been trained on copyrighted material. A ton of it, in fact. As Arm prepares to IPO, who might join them, depending on how things go? Bad news for Adyen is probably bad news for Stripe. And the rise of high tech sailing ships.


See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Transcript

Welcome to the Techmeme Ride Home for Monday, August 21st, 2023. I'm Brian McCullough. Today: more X shenanigans over the weekend; some solid evidence that major LLMs have in fact been trained on copyrighted material, a ton of it perhaps; as Arm prepares to IPO, who might join them depending on how things go; bad news for Adyen is probably bad news for Stripe as well; and the rise of high-tech sailing ships. Here's what you missed today in the world of tech.

There was some drama at the artist formerly known as Twitter over the weekend. First, Elon Musk tweeted that the block feature is gonna go away, except in the case of DMs, to be replaced with, I guess, just better muting or something. Again, it's starting to really feel like Elon is almost daring certain users to abandon the platform at this point. But then again, Elon tweets a lot of things on X, and some of those do happen and some of them do not come to pass. And yes, I'm aware that I just called it a tweet.

Then Saturday or Sunday came word that many images uploaded directly to Twitter between 2011 and 2014 are not loading, and links from those years that use Twitter's native shortening service are broken too. Quoting Forbes: Twitter, the social media platform officially known as X, appears to have deleted all images from the website that were posted between 2011 and 2014.

Links that use Twitter's native shortening service are also broken. It's not immediately clear if this was an intentional act or an error, but whatever's happening is causing concern among users who've been on the site for over a decade. News about the photo deletions on Twitter first went viral on Saturday after user Tom Coates tweeted about it. I confirmed that my own photos on the platform from mid-2011 to 2014 have been deleted and links no longer work, as you can see in the tweet below. It appears that Twitter's link-shortening domain, t.co, the new URL that Twitter generates so it can track user activity, is the likely culprit behind why images no longer display and links no longer work.

Twitter launched in 2006 but didn't support native image uploads until the summer of 2011. Several image hosting services sprung up to support Twitter, like Twitpic, but that service shut down in 2014 and many images from those early days are lost. But it now seems images that were posted to Twitter directly from 2011 to 2014 could be in danger as well, since they're no longer loading on the site. Some users on the Reddit forum DataHoarder, which tracks data preservation from the internet age, speculate that Twitter has broken something in an effort to migrate the site to x.com, which Twitter owner Elon Musk has held for a number of years. But that's simply a logical guess at this point and hasn't been confirmed. Another popular theory is that Twitter is attempting to save money on image hosting fees, another guess that hasn't been confirmed by anyone officially at Twitter.

According to an analysis from the Atlantic, Books3, a data set used to train Meta's LLaMA, BloombergGPT, and EleutherAI's GPT-J, among others, contains more than 170,000 books from authors

like Stephen King and Junot Díaz.

In a lawsuit filed in California last month, the writers Sarah Silverman, Richard Kadrey, and Christopher Golden alleged that Meta violated copyright laws by using their books to train LLaMA, a large language model similar to OpenAI's GPT-4, an algorithm that can generate text by mimicking the word patterns it finds in sample text. But neither the lawsuit itself nor the commentary surrounding it has offered a look under the hood. We have not previously known for certain whether LLaMA was trained on Silverman's, Kadrey's, or Golden's books, or any others, for that matter. In fact, it was. I recently obtained and analyzed the data set used by Meta to train LLaMA. Its contents more than justify a fundamental aspect of the authors' allegations: pirated books are being used as inputs for computer programs that are changing how we read, learn, and communicate. The future promised by AI is written with stolen words.

As a writer and computer programmer, I've been curious about what kinds of books are used to train generative AI systems. Earlier this summer, I began reading online discussions among academic and hobbyist AI developers on sites such as GitHub and Hugging Face. These eventually led me to a direct download of the Pile, a massive cache of training text created by EleutherAI that contains the Books3 data set plus material from a variety of other sources: YouTube video subtitles, documents and transcriptions from the European Parliament, English Wikipedia, emails sent and received by Enron Corporation employees before its 2001 collapse, and a lot more.

Upwards of 170,000 books, the majority published in the past 20 years, are in LLaMA's training data. In addition to work by Silverman, Kadrey, and Golden, nonfiction by Michael Pollan, Rebecca Solnit, and Jon Krakauer is being used, as are thrillers by James Patterson and Stephen King and other fiction by George Saunders, Zadie Smith, and others. These books are part of a data set called Books3, and its use has not been limited to LLaMA. Books3 was also used to train Bloomberg's BloombergGPT, EleutherAI's GPT-J, a popular open-source model, and likely other generative AI programs now embedded in websites across the internet. A Meta spokesperson declined to comment on the company's use of Books3. Bloomberg did not respond to emails requesting comment, and Stella Biderman, EleutherAI's executive director, did not dispute that the company used Books3 in GPT-J's training data.

Of the 170,000 titles, roughly one-third are fiction, two-thirds nonfiction. They're from big and small publishers. To name a few examples, more than 30,000 titles are from Penguin Random House and its imprints, 14,000 from HarperCollins, 7,000 from Macmillan, 1,800 from Oxford University Press, and 600 from Verso. The collection includes fiction and nonfiction by Elena Ferrante and Rachel Cusk.

It contains at least nine books by Haruki Murakami, five by Jennifer Egan, seven by Jonathan Franzen, nine by bell hooks, five by David Grann, and nine by Margaret Atwood. Also of note: 102 pulp novels by L. Ron Hubbard, 90 books by the Young Earth creationist pastor John F. MacArthur, and multiple works of aliens-built-the-pyramids pseudo-history by Erich von Däniken. In an emailed statement, Biderman wrote in part, quote, we work closely with creators and rights holders to understand and support their perspectives and needs. We are currently in the process of creating a version of the Pile that exclusively contains documents licensed for that use. End quote.

Word that the dam might break later today, when, you'd imagine sometime after the close of trading, probably, Arm is expected to file its much-anticipated IPO prospectus. Now the question is, will the dam really break after this? Will the IPO be successful enough that other tech startups

might test public market waters? And if so, who? Well, quoting the Financial Times: A group of Silicon Valley's biggest private tech companies are dusting off long-delayed plans to list their shares, with the upcoming initial public offering of chip designer Arm set to provide a new gauge for market sentiment. Grocery delivery group Instacart, software company Databricks, and identity verification startup Socure are among those considered candidates to launch stock market debuts by next year, according to people familiar with their thinking. They would follow Arm's blockbuster public offering, which is expected as soon as next month, according to people familiar with the plans.

That IPO provides an unusual test of investors' thinking. The UK-based chip designer was public for 18 years before being taken private by SoftBank for 24 billion pounds in 2016. That should ease its passage back onto the public market, according to investors, but it also makes it harder for other startups to draw firm conclusions. Arm is among the first big tech companies to attempt an IPO in 18 months, with several well-funded startups such as Stripe having put off float plans during a turbulent period for public tech stocks. Instacart could be among the first to follow, with an IPO before the end of this year, according to two people close to the matter. It first filed its intention to

list in New York last May but delayed plans due to market conditions. The grocery delivery company's valuation has plunged from a peak of $39 billion in March 2021 to $12 billion in May of this year, according to people with direct knowledge of the company's financial details. It will make a decision depending on whether public markets stabilize later in the year, said the people.

Nasdaq, the New York stock exchange on which Arm plans to list, has in 2023 recovered the bulk of last year's losses, and investors are increasingly confident that a small number of startups that shelved plans to list in 2021 could soon revive them. Josh Wolfe, co-founder of venture capital firm Lux Capital, said a slim sliver of an IPO window may open later this year. When it does, singular category-defining companies would be strong standalone public new listings, he added. Databricks, which posted revenues of more than $1 billion in June and acquired OpenAI competitor MosaicML for $1.3 billion, is a candidate to IPO, according to Wolfe, who is one of its venture capital backers.

ID verification company Socure, which is valued at $4.5 billion, hinted at an IPO in 2021 but pulled plans when the market soured. Socure this year secured a $95 million credit facility from JPMorgan, has hired a new chief financial officer with IPO experience, and is preparing for an IPO as soon as next year, according to founder Johnny Ayers. End quote.

It's tough to be a business of one. You got to do everything yourself. Not only does your admin work cost you an arm and a leg, it also takes you away from your own billable hours. There's got to be a better way, and there is: collective.com. Collective was built specifically for businesses of one that are making over $60,000 in profit a year. Collective handles all of the stuff that used to cost a pile of money for a fraction of the cost: taxes, bookkeeping, accounting, even payroll. And if you're set up as an LLC or just a sole proprietorship, Collective can elect your S corp tax status, which could save you thousands on your taxes. In fact, Collective members save an average of $10,000 per year on taxes. With the structure, a Collective membership pays for itself within just a few months, and it's a hundred percent tax deductible. The price goes up on September 1st, so lock in your lower rate for a full year today. To sweeten the deal, get an extra $100 off when you go to collective.com slash ride. But you've got to do it before September 1st. A team of experts to handle your business formation, accounting, bookkeeping, payroll, and business taxes at a small fraction of the cost, plus save potentially thousands of dollars each year on taxes. That's collective.com slash ride.

Archie the podcast puppy enjoyed his Nom Nom this morning. Nom Nom delivers fresh dog food with every portion personalized to your dog's needs, so you can bring out their best. Nom Nom's made with real, whole food you can see and recognize, without any additives or fillers that contribute to bloating and low energy. That's because Nom Nom uses the latest science and insights to make real, good food for dogs. Their nutrient-packed recipes are crafted by board-certified veterinary nutritionists, made fresh, and shipped free to your door. Nom Nom's already delivered over 40 million meals to good dogs like yours, inspiring millions of clean bowls and tail wags. Archie loves Nom Nom. Unlike our previous beagle, who ate everything, Archie can be picky. He'll actually turn things down. But he took to Nom Nom right away and loves it. Nom Nom comes with a money-back guarantee, so if your dog's tail isn't wagging like Archie's is within 30 days, Nom Nom will refund your first order. Go right now for 50% off your no-risk two-week trial at trynom.com slash ride, spelled T-R-Y-N-O-M dot com slash ride. For 50% off, trynom.com slash ride.

Now, you just heard Stripe mentioned in that last piece, but I'm wondering if Stripe might not be among those lining up to go out the IPO door. That's because Stripe's rival Adyen just saw $20 billion wiped off its valuation in a single day after a bad earnings report. So if you're a Stripe investor and you're looking at comps, this ain't good. Quoting CNBC: The company's shares plummeted 39% on Thursday, erasing 18 billion euros, or $20 billion, from Adyen's market capitalization as investors dumped the stock after the firm reported its slowest revenue growth on record.

Identified as one of the top 200 fintech companies globally by CNBC and Statista, Adyen is a payment services firm that works with customers including Netflix, Meta, and Spotify. It also sells point-of-sale systems for physical stores and handles payments online and in store. More than a processor, Adyen is what is known as a payment gateway, meaning it uses technology to enable merchants to take card payments and transactions through online stores. The company takes a small cut off of every deal that runs through its platform. Adyen last week

reported results for the first half of the year that came in well below expectations. The company's revenue of 739.1 million euros for the period was up 21% year over year but also showed Adyen's slowest sales growth on record. Analysts had expected 853.6 million euros of revenue and 40% year-on-year growth, according to Refinitiv Eikon forecasts. Adyen has typically been viewed as a growth stock after consistently reporting revenue growth of 26% each half-year period since its 2018 stock market debut.

Adyen said in a letter to shareholders last week that its EBITDA margin fell to 43% in the first half of 2023 from 59% in the same period a year ago. Adyen has historically been a lean business, opting to hire fewer people overall than its main competitor Stripe, which has roughly doubled its staff. Simon Taylor, head of strategy at Sardine AI, said Adyen might face a natural ceiling to what business size it can reach before having to reduce its margins to grow again. Ultimately, they're subject to the same macro headwinds as everyone in e-commerce is,

Taylor told CNBC, and they still grew 21%. Incumbents would kill for that. End quote.

Sources say the UK is in talks with Nvidia, AMD, and Intel to buy up to a hundred million pounds' worth of GPUs for what's being called a national AI research resource, though some officials want to spend far more, especially given that that's a fraction of what, say, Saudi Arabia recently purchased. But again, this is a trend of nation states trying to buy GPUs. Quoting the Guardian: Taxpayer money will be used as part of a drive to build a national AI resource in Britain, similar to those under development in the US and elsewhere. It is understood that the funds will be used to order key components from major chip makers Nvidia, AMD, and Intel. But an official briefed on the plans told the Guardian that the one hundred million pound offer by the government

is far too low relative to investments by peers in the US, EU, and China. The official confirmed, in a move first reported by the Telegraph, which also revealed the investment, that the government is in the advanced stages of ordering up to five thousand graphics processing units from Nvidia. Rishi Sunak's government revealed plans in May to invest over one billion pounds over 10 years in semiconductor research, design, and production, a step dwarfed by the US's 52 billion dollar CHIPS Act and the EU's subsidies in the neighborhood of 43 billion euros, or 37 billion pounds. A hold-up in progress triggered by relatively weak investment could leave the UK exposed amid mounting geopolitical tensions over AI chip technology. End quote.

So again, this is a theme at this point. Nvidia itself, all by its lonesome as a single company, is a key geopolitical bottleneck, especially if you believe, as these governments do, that AI will usher in a new computing era for the twenty-first century and you don't want to be left behind. Just thinking idly now, but how do you imagine the US government thinks about that, thinks about Nvidia now?

Finally today, a cargo ship that

harnesses wind power has made its maiden voyage. Quoting Quartz: The first vessel of its kind to be retrofitted with the technology, called WindWings, the Pyxis Ocean has set sail from China with a lofty goal of helping the maritime industry decarbonize. Agribusiness giant Cargill chartered the Mitsubishi Corporation vessel. The WindWings, described as an advanced wind-assisted propulsion and route optimization system in today's press release, have been developed by UK-based design and engineering firm BAR Technologies and manufactured by Yara Marine Technologies. Harnessing wind alone along the journey could lead to up to a thirty percent reduction in fuel consumption, also cutting the ship's carbon emissions. If the ship can stay the course, it could open doors to a greener future for the polluting industry, retrofitting a solution to decarbonize existing vehicles

while offering new ones a sustainable edge design. End quote.

Some more facts, in the typical Quartz house style. Thirty percent: the reduction in fuel consumption and CO2 emissions that WindWings can achieve on average trading patterns, according to simulations. This could be even higher if used in combination with alternative fuels, Cargill and BAR Technologies said in the press release. Thirty-seven meters: that's the size of the solid wing sails, made from the same materials as wind turbines, featured in the system, which are fitted to the deck of the bulk cargo ship. 751 feet: the length of the ship Cargill has chartered, equivalent to two American football fields. So this isn't some sort of tiny yacht. Twenty-three: large ships currently equipped with some form of wind-assist technology over the past twelve years. Gavin Allwright, secretary of the International Windship Association, in a statement in July said the figure is expected to double over the next 12 months. Half: that's the share of new-build ships, quote, that will be ordered with wind propulsion, according to BAR Technologies. End quote.

We were promised flying cars, but instead we're returning to sailing ships, which, if it works like it looks like it does, why not? Talk to you tomorrow.

This transcript was generated by Metacast using AI and may contain inaccuracies.