Session
Organizer 1: Yug Desai, South Asian University
Organizer 2: Ihita Gangavarapu
Organizer 3: Daniele Turra, Internet Society
Organizer 4: Purnima Tiwari
Speaker 1: Daniele Turra, Private Sector, Western European and Others Group (WEOG)
Speaker 2: Melissa Muñoz Suro, Government, GRULAC
Speaker 3: Abraham Fifi Selby, Technical Community, African Group
Speaker 4: Bianca Kremer, Civil Society, GRULAC
Ihita Gangavarapu, Technical Community, Asia-Pacific Group
Yug Desai, Civil Society, Asia-Pacific Group
Purnima Tiwari, Civil Society, Asia-Pacific Group
Roundtable
Duration (minutes): 60
Format description: The roundtable structure is well suited to this session because it allows for an active, participatory conversation, which is essential for delving into the complexity of open-source artificial intelligence. The roundtable layout puts participants on a more equal footing, encouraging active participation and making eye contact and connection easier. The 60-minute format strikes a balance between the depth of the discussion and the expertise level of the audience, ensuring that important topics are covered without overwhelming participants. The planned methodology, which includes polls, audience interventions, and policy questions, allows for cross-examination of the ideas presented by the speakers. The format also encourages inclusivity, as there are many opportunities for both in-person and virtual participants to contribute and bring a variety of viewpoints to the conversation.
1. In which ways can open-source models prevent a few large entities from monopolising the AI landscape? What governance structures could be necessary to manage this?
2. How does open-sourcing influence innovation rates within the AI industry, and what are the long-term implications of open-source AI on the structure of the tech industry?
3. What specific risks does open-sourcing pose, such as increased potential for misuse or reduced incentives for large-scale investment in AI research? How can these risks be mitigated while still promoting open development and harnessing the opportunities?
What will participants gain from attending this session?
1. A deeper understanding of the potential benefits and challenges of open-sourcing large language models (LLMs) and AI systems more broadly, along with insights into the current state and capabilities of open-source LLMs compared to proprietary models from big tech companies. Participants will learn about the latest developments and progress in this rapidly evolving field.
2. Perspectives on the governance structures and policies that may be needed to manage the risks and ethical concerns associated with open-sourcing powerful AI systems, such as potential misuse or reduced incentives for private investment.
3. An appreciation of the long-term implications, both positive and negative, that wider availability of open-source AI could have on the structure of the technology industry, business models, and the distribution of benefits from AI innovation.
4. A nuanced understanding of the strategic, economic, and social factors at play in the debate around open vs. proprietary AI development paradigms.
Description:
The development and dissemination of AI, particularly Large Language Models (LLMs), are increasingly dominated by major tech companies, raising critical issues around access, control, and equity. While proprietary models accelerate innovation and economic gain for some, they risk consolidating power and limiting technological diversity. Open-sourcing LLMs offers a pathway to democratise AI, potentially reducing costs and fostering inclusive innovation by enabling more stakeholders to participate in AI development and application. This roundtable will explore the strategic, economic, and social implications of open-sourcing LLMs, including the potential to counteract monopolistic control and encourage a broader distribution of technological and economic benefits. The discussion will centre on the state of open-source AI, particularly LLMs, and their potential to match proprietary models.
1. Adds further to the discussions centred around AI from the youth track of IGF 2024.
2. A policy brief or report summarising key findings, insights, and policy recommendations generated during the session will be shared widely with IGF participants, policy makers, and relevant stakeholders.
3. The outcome of this event will be documented in the official report of Youth IGF India 2024, which is shared with the youth of India and the various bodies supporting the cause.
Hybrid Format: The Structure: Roundtable (60 minutes)
Introduction & Opening Remarks: 5 min
Policy question to speakers: 10 min
Audience intervention: 5 min
Policy question discussion: 10 min
Audience intervention: 5 min
Policy question to speakers: 10 min
Audience poll: 5 min
Q&A with the hybrid audience: 10 min
The session is designed so that onsite and online participants have enough opportunities to intervene throughout the session, not just towards the end. The use of polls (tools such as Mentimeter) will ensure inputs in a hybrid format. In addition, after every discussion of a policy question by the speakers, participants are given an opportunity to share their interventions on the topic. Towards the end of the session, there is a live Q&A and discussion with everyone. In addition to the chat box, the online moderator will note the requests of online participants and inform the onsite moderator.
Report
1. The true democratization of AI requires more than just open source code: it needs supporting infrastructure, technical expertise, high-quality data, and sustainable funding mechanisms. Public-private partnerships must provide these resources to ensure open source models can effectively compete with proprietary alternatives in serving diverse global needs. Simply making models open source doesn't automatically solve access and equity issues.
2. There's a tension between regulation and open source as paths to democratization. Ultimately, a mixed approach of open source development and regulatory frameworks offers the most promising path for preventing AI monopolies. Success cases show open source models can effectively serve local needs when backed by proper infrastructure and governance support.
3. Language and cultural representation is a critical issue in AI development. Open source models provide an opportunity for underrepresented communities to adapt and improve models to better serve local languages, cultural contexts, and specific needs that may not be prioritized by large commercial AI companies.
1. Develop robust governance structures and regulations that ensure open source AI development remains truly accessible while protecting data sovereignty and promoting ethical AI development that serves the public interest rather than just commercial goals.
2. Establish collaborative frameworks between governments, academia, the private sector, and civil society to develop and maintain shared AI infrastructure and resources, particularly focusing on supporting Global South nations. Regional cooperation networks among Global South nations to share resources, expertise, and infrastructure would be a strong start, making AI development more cost-effective and sustainable for smaller economies.
Introduction: Democratising access to AI is about making sure that AI technologies and resources serve and are accessible to a broad range of people, not just large corporations and highly skilled actors. The goal is to empower others, such as small businesses, educators, researchers, civil society organizations, and nations small and big. LLMs in particular are dominated by major technology companies, which raises critical issues around access, control, and equity, with trends towards consolidating power and limiting technological diversity.
Daniele Turra tackles the linkages between open source, innovation, and the structure of the tech industry by highlighting the history and philosophical underpinnings of FOSS. Open source is tightly related to the concept of freedom. The four core freedoms associated with the open source movement are the freedoms to use, study, redistribute, and modify code, such that no actor, large or small, is entitled to own strong intellectual property over that code. In the case of LLMs, there is an additional layer of complexity, as an LLM tool consists of both a model and its weights; a truly open source LLM provides access to both. On the other end of the spectrum, we have fully closed models that are accessed through an API or an online interface, with no way to look under the hood. Despite these complexities, we should make sure that an AI model embodies all four freedoms before it is labelled open source.
We should also take into account that many of these tools are built on the efforts of the open source community. Consider the entire supply chain: data collection, data storage, data preparation, algorithm training, and application development. At each of these stages, having an open source, community-supported solution benefits both public and private actors. This raises the question of whether it is even possible to have something that is fully closed source.
In response to an audience intervention, Daniele also emphasised that open source is both a tool and a philosophy. It may not be the best way to regulate monopolies, but it is an important means of sharing knowledge and giving people the opportunity to acquire skills and know-how without being blocked by intellectual property. He also cautions against the open-washing of models that may be only partially open source, given a complex supply chain in which the entire process may not be transparent. We should therefore be more careful in categorising these models.
Melissa Muñoz Suro brings the government perspective from the Dominican Republic. One of the focus areas in the National AI Strategy of the Dominican Republic is the development of Taina, an AI model based on open source frameworks aimed at improving and personalising the delivery of government services. Data for this initiative is being collected from various government systems as well as directly from citizens who volunteer information about how they use government services. This method allows them to analyse how Dominicans communicate, so that the AI reflects the culture and language of the people. The project is a collaboration between the government, citizens, and local universities; the universities ensure that the data is accurate, well-structured, and aligned with privacy standards.
The availability of open source tools is essential for smaller economies like the Dominican Republic, as it profoundly impacts innovation and the structure of the tech industry in the country. It allows them to break the dominance of big companies and build systems that reflect local language and culture while specifically addressing local needs and challenges. Creating such tools also opens the door for collaboration with other countries in the region with similar contexts. Open source is about how technology fundamentally serves people and empowers governments to create more opportunities. It is a key tool in creating a more inclusive, accessible, and people-centric tech industry.
Abraham Fifi Selby shares the African experience with respect to rapid AI innovation and the role of open source. In Africa, funding for AI-related startups and research is limited, and infrastructure is sparse. With limited investment coming in, open source AI systems are helping young people innovate, as they reduce costs and provide a platform to build on. Secondly, there is considerable room for collaboration between advanced countries and the Global South, as the Global South holds much of the data needed to feed these models. It is also important for the models to then cater to local needs; most important in this regard is multilingualism, given the many languages spoken across Africa. AI ethics is another gap that must be filled, as it frequently receives too little attention where the focus is on utilising the technology for development. Finally, it would also help to have more investment and collaboration to build indigenous models in Africa.
Bianca Kremer also elucidates the difference between open source LLMs, where both the source code and training data are publicly available for everyone to use, modify, and improve, and closed LLMs, where there is no access to the underlying architectures and data. Open source models can be improved by the community to rid them of biases, algorithmic racism, and other issues that larger companies may have little economic incentive to address. She also highlighted efforts underway in Brazil to develop local LLMs, built through collaboration among public universities, that cater to Portuguese-speaking audiences.
The speakers also dealt with the issue of compute: most entities do not have access to the large GPU clusters that big corporations have. We need a framework in place that makes it possible to share these limited resources, especially with civil society and researchers. Despite all the use cases and promising examples, open source models still lag behind proprietary models. Additionally, open source projects are not end-to-end solutions and require sustained effort, finances, infrastructure, and expertise to develop, maintain, and innovate with. These trade-offs are important to consider.
The audience was most concerned about the monopolisation of the LLM ecosystem by a handful of actors and the resulting inequity. Distributing computing power to boost the open source ecosystem was seen as important in this regard, as the big corporations have little incentive to open up.