news4global
Would Large Language Models Be Better If They Weren’t So Large?

May 30, 2023

When it comes to artificial intelligence chatbots, bigger is typically better.

Large language models like ChatGPT and Bard, which generate conversational, original text, improve as they are fed more data. Every day, bloggers take to the internet to explain how the latest advances — an app that summarizes articles, A.I.-generated podcasts, a fine-tuned model that can answer any question related to professional basketball — will “change everything.”

But making bigger and more capable A.I. requires processing power that few companies possess, and there is growing concern that a small group, including Google, Meta, OpenAI and Microsoft, will exercise near-total control over the technology.

Also, bigger language models are harder to understand. They are often described as “black boxes,” even by the people who design them, and leading figures in the field have expressed unease that A.I.’s goals may ultimately not align with our own. If bigger is better, it is also more opaque and more exclusive.

In January, a group of young academics working in natural language processing — the branch of A.I. focused on linguistic understanding — issued a challenge to try to turn this paradigm on its head. The group called for teams to create functional language models using data sets that are less than one-ten-thousandth the size of those used by the most advanced large language models. A successful mini-model would be nearly as capable as the high-end models but much smaller, more accessible and more compatible with humans. The project is called the BabyLM Challenge.

“We’re challenging people to think small and focus more on building efficient systems that way more people can use,” said Aaron Mueller, a computer scientist at Johns Hopkins University and an organizer of BabyLM.

Alex Warstadt, a computer scientist at ETH Zurich and another organizer of the project, added, “The challenge puts questions about human language learning, rather than ‘How big can we make our models?’ at the center of the conversation.”

Large language models are neural networks designed to predict the next word in a given sentence or phrase. They are trained for this task using a corpus of words collected from transcripts, websites, novels and newspapers. A typical model makes guesses based on example phrases and then adjusts itself depending on how close it gets to the right answer.

By repeating this process over and over, a model forms maps of how words relate to one another. In general, the more words a model is trained on, the better it will become; every phrase provides the model with context, and more context translates to a more detailed impression of what each word means. OpenAI’s GPT-3, released in 2020, was trained on 200 billion words; DeepMind’s Chinchilla, released in 2022, was trained on a trillion.
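The “guess the next word, then adjust” loop described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: real large language models are neural networks trained by gradient descent on billions of words, whereas the sketch below simply counts which word tends to follow which (a bigram model); the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

def train(corpus_words):
    """Record how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus_words, corpus_words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Predict the next word: the one most frequently observed after `word`."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

# A tiny made-up corpus; more text would give the model more context,
# mirroring the point above that more words mean better predictions.
corpus = "the cat sat on the mat and the cat slept".split()
model = train(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

Scaling the same idea up — longer contexts, learned representations instead of raw counts, vastly more text — is, loosely, what separates this toy from GPT-3 or Chinchilla.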

To Ethan Wilcox, a linguist at ETH Zurich, the fact that something nonhuman can generate language presents an exciting opportunity: Could A.I. language models be used to study how humans learn language?

For instance, nativism, an influential theory tracing back to Noam Chomsky’s early work, claims that humans learn language quickly and efficiently because they have an innate understanding of how language works. But language models learn language quickly, too, and seemingly without an innate understanding of how language works — so maybe nativism doesn’t hold water.

The challenge is that language models learn very differently from humans. Humans have bodies, social lives and rich sensations. We can smell mulch, feel the vanes of feathers, bump into doors and taste peppermints. Early on, we are exposed to simple spoken words and syntaxes that are often not represented in writing. So, Dr. Wilcox concluded, a computer that produces language after being trained on gazillions of written words can tell us only so much about our own linguistic process.

But if a language model were exposed only to words that a young human encounters, it might interact with language in ways that could address certain questions we have about our own abilities.

So, together with a half-dozen colleagues, Dr. Wilcox, Mr. Mueller and Dr. Warstadt conceived of the BabyLM Challenge, to try to nudge language models slightly closer to human understanding. In January, they sent out a call for teams to train language models on the same number of words that a 13-year-old human encounters — roughly 100 million. Candidate models would be tested on how well they generated language and picked up its nuances, and a winner would be declared.

Eva Portelance, a linguist at McGill University, came across the challenge the day it was announced. Her research straddles the often blurry line between computer science and linguistics. The first forays into A.I., in the 1950s, were driven by the desire to model human cognitive capacities in computers; the basic unit of information processing in A.I. is the “neuron,” and early language models in the 1980s and ’90s were directly inspired by the human brain.

But as processors grew more powerful, and companies started working toward marketable products, computer scientists realized that it was often easier to train language models on enormous amounts of data than to force them into psychologically informed structures. As a result, Dr. Portelance said, “they give us text that’s humanlike, but there’s no connection between us and how they function.”

For scientists interested in understanding how the human mind works, these large models offer limited insight. And because they require tremendous processing power, few researchers can access them. “Only a small number of industry labs with huge resources can afford to train models with billions of parameters on trillions of words,” Dr. Wilcox said.

“Or even to load them,” Mr. Mueller added. “This has made research in the field feel slightly less democratic lately.”

The BabyLM Challenge, Dr. Portelance said, could be seen as a step away from the arms race for bigger language models, and a step toward more accessible, more intuitive A.I.

The potential of such a research program has not been ignored by bigger industry labs. Sam Altman, the chief executive of OpenAI, recently said that increasing the size of language models would not lead to the same kind of improvements seen over the past few years. And companies like Google and Meta have also been investing in research into more efficient language models, informed by human cognitive structures. After all, a model that can generate language when trained on less data could potentially be scaled up, too.

Whatever profits a successful BabyLM might hold, for those behind the challenge, the goals are more academic and abstract. Even the prize subverts the practical. “Just pride,” Dr. Wilcox said.

© 2022 Designed by news4global
