
Article

23 May 2023

Author:
Gabriel Nicholas and Aliya Bhatia, Center for Democracy & Technology

New report highlights the shortcomings of large language models in analysing non-English content

"Lost in Translation: Large Language Models in Non-English Content Analysis", 23 May 2023.

...A new report from CDT examines the new models that companies claim can analyze text across languages. The paper explains how these language models work and explores their capabilities and limits...

...In the past, it has been difficult to develop artificial intelligence (AI) systems — and especially large language models — in languages other than English because of what is known as the resourcedness gap. This gap describes the asymmetry in the availability of high quality digitized text that can serve as training data for a model. English is an extremely highly resourced language, whereas other languages, including those used predominantly in the Global South, often have fewer examples of high quality text (if any at all) on which to train language models...

...while multilingual language models show promise as a tool for content analysis, they also face key limitations:

  1. Multilingual language models often rely on machine-translated text that can contain errors or terms native language speakers don’t actually use. 
  2. When multilingual language models fail, their problems are hard to identify, diagnose, and fix.
  3. Multilingual language models do not and cannot work equally well in all languages.
  4. Multilingual language models fail to account for the contexts of local language speakers.

These shortcomings are amplified when the models are used in high-risk contexts. If these models are used to scan applications for asylum, for example, errant systems may limit a user’s ability to access safety. In content moderation, misinterpretations of text can result in takedowns of posts, which may erect barriers to information, particularly where little information is available in a particular language...

...Governments, technology companies, researchers, and civil society should not assume these models work better than they do, and should invest in greater transparency and accountability efforts in order to better understand the impact of these models on individuals’ rights and access to information and economic opportunities. Crucially, researchers from different language communities should be supported and be at the forefront of the effort to develop models and methods that build capacity for tools in different languages...
