2015 11 13 Open Forum - European Commission/Global Internet Policy Observatory (GIPO) Workshop Room 2 FINISHED

The following are the outputs of the real-time captioning taken during the Tenth Annual Meeting of the Internet Governance Forum (IGF) in João Pessoa, Brazil, from 10 to 13 November 2015. Although it is largely accurate, in some cases it may be incomplete or inaccurate due to inaudible passages or transcription errors. It is posted as an aid to understanding the proceedings at the event, but should not be treated as an authoritative record. 

 

***

>> Hello? Good afternoon, everybody. I'm very sorry that we had to start a little bit later due to organizational reasons. My name is Kasia. I'm responsible for the stakeholder engagement initiative. I'm going to be the moderator of today's session. Very briefly about the agenda: we have asked Cristina Monti from the European Commission to introduce the GIPO project. Then we have a remote presenter today, Luis Meijueiro, who is going to present the demo of the GIPO tool, and that's the main purpose of today's meeting. Finally, Stefaan Verhulst from The Gov Lab will present and discuss how GIPO can be an information engine for other initiatives. And at the very end I'm going to present the results of the survey towards the federation roadmap for GIPO. After that we're going to have a Q&A session, so please feel free to ask any questions that you have during this session.

We also have remote participants, and hopefully they can hear us now. And so I'm going to give the floor to Cristina now.

>> CRISTINA MONTI: Thank you, Kasia. And thank you very much to all of you who are here today on the very last day of IGF. Very briefly I will introduce to you what we are trying to do with the creation of the Global Internet Policy Observatory. As you might know, the idea is not really new. Yesterday, I think there was a very interesting workshop on different initiatives that are trying to cope in one way or another with the same kind of problem: the amount of information that we have to deal with and to digest concerning Internet governance developments, and how do we make the multistakeholder process that is sometimes quite overwhelming and sometimes even ‑‑ How do we make the process more accessible and transparent for stakeholders who do not have the resources to get involved and to follow all those issues that are constantly emerging. And so this is particularly intended for the weaker stakeholders like civil society or developing countries.

So the idea to establish this observatory was supported by the European Commission, and there was this initiative to provide a tool, a technical tool which could monitor and analyze Internet policy, but also technological and regulatory developments across the world. We did a feasibility study last year to discuss what technologies are available in different areas, and there are plenty of technologies that could help us in this effort, ranging from automatic data collection to text mining, and the idea is really to make the most of this technology. So the accent on automation is very strong for this project.

So GIPO is now being developed by a team of external experts as a European Commission project. So we decided to put some money in this project, and then in April we launched a website which is called GIPOnet.org where you can find information about the stages of the project but also about how you can get involved in it, and this is also very important. We want to ensure outreach and inform and consult all different stakeholders. And we do this through regular webinars, social media campaigning, and presentations like this one.

We are facing many issues where we need to use the knowledge of others in order to make a tool which is useful for the users. For instance, just to give you a hint, one of the big obstacles we were discussing is taxonomies. How do we make sure that this initiative is not just a replication of others but really brings value? How can we ensure synergies with others? And then so the issue of taxonomies is quite prevalent in that sense. But also issues like multilingualism. On this subject a lot has been done in the European Union but also in the UN, and we are trying to reach out also to different communities to bring in their knowledge. So in this meeting, I think we will showcase what has been done so far, and we will try also to get your input because, again, it's crucial for us to get different stakeholders involved in all stages of the development of the tool so that it can be really a useful tool for you as users.

Before the project is completed, we will have to have a conversation also about who will manage the tool. And here I would like to stress that the European Commission does not wish to be the owner of this project. The idea is that the tool should serve the global community, so we're really open, and we are now engaged in discussions already with several partners and organizations on how best to manage the tool and be responsible for the tool.

GIPO has also established an advisory group which works independently from the European Commission, and it's composed of 12 members that bring in also different knowledge and expertise, and they provide guidance on different aspects of the project. They have already started to discuss technical, legal and cooperation issues related to the development of the tool. So I think with this brief introduction, I will give you the floor back. Thank you.

>> KASIA JAKIMOWICZ: So I guess everybody is excited to see the practical side of the tool. We're going to try to connect with Luis, who is responsible for the technological development of the GIPO tool, and he's going to lead us through the tool itself as well as present some videos of the demo of the tool. Let's see whether Luis can connect with us. I would ask the technical team to give Luis the floor. Luis, we cannot hear you right now.

>> We have Luis Meijueiro who is going to present hopefully.

>> LUIS MEIJUEIRO: Hello?

>> Hi. Luis, could you share your screen with us?

>> LUIS MEIJUEIRO: Okay. Right now. Hello, everybody. Thank you, Kasia. I will start now my presentation in a moment. Okay. Do you see it?

>> KASIA JAKIMOWICZ: Yes.

>> LUIS MEIJUEIRO: Okay. Right. Hello, everyone. Good afternoon. Good evening here in Europe. My name is Luis Meijueiro from CTIC. I'm going to present to you the observatory tool that is the main part of this initiative, the GIPO initiative.

Okay. Let's start talking about what the observatory tool is. The observatory tool is an automated tool that compiles and analyzes by itself, without human intervention, information and Internet governance data gathered from many different sources, on various different topics. Then the automatic analysis of the information makes it valuable to the community for further usage. That's the best way to use it and analyze it and to share it with everyone.

Okay. So I will explain to you the main components. It's (Inaudible) a tool. There is a user interface. The user interface enables people to search and comment and participate in this information gathering. The information we are talking about is news, articles, events, documents in general. The tool also has a dashboard that displays the data gathered in a useful way, maybe by using some kind of graph or map. The system is able to create the content. Not just to view it, but some authorized users will be able to complete the contents and to add more and to elaborate the content in order to better understand it. And also the community will be able to evaluate the quality of the contents, the usefulness, and rate it in a score system.

But also this tool will deliver the compiled information for machines, not only for humans, in a way that another machine, maybe another observatory, can integrate and interoperate with our tool and get the content and make queries to our database.

So I will talk now about the origin of the content. We gather content from the Internet ‑‑ public content. The content has to be open and available for everyone. We cannot pass any access control, so we will not get information that is behind closed doors. We need and we will collect open information. And we will get it from (Inaudible) feeds, which are the usual way for machines to get information on the Internet, but also from web pages that are made for humans. And we will get their content by scanning the HTML code and getting the most important parts. But also we will get content from social media. Social media is very important. There is a lot of debate in the social media. So we'll try to gather content from it and Google Plus.
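The collection step described here, reading machine-oriented feeds and pulling the readable text out of HTML, can be sketched roughly as follows. This is only an illustration under stated assumptions, not GIPO's actual code: it assumes the feeds are RSS 2.0 (the captioning leaves the word inaudible), and the sample feed, `TextExtractor`, `html_to_text` and `read_feed` are hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical RSS 2.0 feed standing in for an open, public source.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example source</title>
<item><title>Net neutrality vote</title><link>http://example.org/a</link>
<description>&lt;p&gt;Parliament &lt;b&gt;votes&lt;/b&gt; today.&lt;/p&gt;</description></item>
<item><title>IGF 2015 opens</title><link>http://example.org/b</link>
<description>Forum opens in Brazil.</description></item>
</channel></rss>"""


class TextExtractor(HTMLParser):
    """Collects visible text and skips script and style elements."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce an HTML fragment to its readable text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.parts)


def read_feed(xml_text):
    """Turn each feed entry into a plain dictionary of title, link and text."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "text": html_to_text(item.findtext("description", default="")),
        })
    return items
```

Calling `read_feed(SAMPLE_FEED)` yields one dictionary per feed entry; a real collector would fetch the feed over the network and would need more careful handling of malformed markup.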

But don't get this in a bad concept, thinking of it as an (Inaudible). It's just a useful way to get content from machines. But the tool will also analyze and classify information by itself without human intervention. Making use of semantic technologies and knowledge technologies, it will try to classify the information and score it in order to display it.

So for us relevancy of information is the key. We first collect only the interesting information, not all the noise that is in the whole net. The tool will do this automatically, using some scoring and some algorithms, and you can see how the scores are made and how they are used. Then the tool will mark the documents with a lot of metadata about the information gathered that is useful, like key words and classification under topics, which is very useful to search the content and to search in the tool for information on a specific topic, for instance.

So about the users, first, there are two types of users. In users, the use ‑‑ the tool will be accessible by anonymous users. We don't ask you for any personal data, and you will only use it as a web page. You will view and search the documents and comment. You can also do some advanced features, like adding sources to the system. You can collaborate with us and suggest sources to us, and also participate in the maintenance of the algorithms, to try to make them better and better and try not to miss anything that could be interesting, and also in the maintenance of the categorization of the information in the tool.

But apart from this document, there is a way to see the interface. This is the interface of the tool, only the home page. It goes better because it will deliver it to the first public at the end of this year, so we will go ahead and make a test. And you only need to in this case because you need to use your name and password. We are logging into the tool. And you will be presented with a home page. This home page is very basic. It will show you some parts like a simple search box where you can search in the tool. And also there is a menu with the home, this same page, with access to the sources and the items of information, to the analysis, and different lists that you can see. You see a lot with this home page. Let's see one of the items, like this one, for instance. Okay. Here is the whole content of the item; in this case it is a news item. You can see the home page. And you have some buttons. You can search this content. You can edit the content. This is a part of the tool for authorized users: you may cut some parts and complete and make your corrections. You can export. This feature is to share the content with other machines, and you can collect content available for selection and exporting. You can vote up or vote down. This is a part of the tool. And here you see the many parts that automatically the content gathers from the machine, and you can comment on the item. This is more quickly the basic features of Internet information.

Okay. The sources, it's simple in this case, you only need the URL. Here is a list of many, many sources that are currently alive and collecting items. And about the item, you cannot ‑‑ an item manually, just in case you detect some important thing. But the most important is that you can go to the analysis page, where all the items live and which you can use to search for items in a way that you can filter by a lot of different kinds of things, tags, or many, many ways. And you can also see the list of exported items to check what items we currently share with other observatories.

Okay. Let's go ahead. That was a very, very quick overview of the main interface of the tool. So let's go step by step with the tool. Now, I will talk to you about the content analysis. That is the best part of this tool, how this tool analyzes the content by itself. There is no human. Okay. We use semantic technologies, and we do key word extraction from the text that we get from the original source, from that article or that document, and we use semantic services to classify the content, you can say, to understand what the content is about. And then we also classify it in one of the different baskets. We call it (Inaudible) in the Internet Governance field. We use what we call a faceted search because we classify under facets. The Internet governance taxonomy is one facet, but another is the type of content, news, or what it is. Another facet can be the date, the date of the content, the year, the month. Also another facet is that we can localize the contents and we can see what is in that content and what it's talking about, what is being impacted by that content.

And also apart from this automatic work you can also include a human in the process to do that classification or text interpretation. We see an example in this case, we focus on an item. This item is about neutrality. And you see here that you have the title. A human can see the focus is on neutrality, and the machine classified it under governance principles. That is one of our facets. And these are all the semantic tags. You can see neutrality is one that the machine detects. We have one here and here. If we click on neutrality, the machine gets from the database every item that talks about neutrality that we have collected all over the time. We have many pages here collected from several months. And you see a lot of items. But this is only a list, so we can search things in this list. So you can go to the items view, and going to the faceted search it is easier to look at the content. If we want to see the content, if we knew that it was about India, then we have here on the tool makes that filter of items that are talking about (Inaudible), but I have a lot of information about every item. We can see every tag or classification. If we go on and reclassify it again, I select (Inaudible), and only the list of items surfaces. If I know this is a news item, I know it. Here is the item. Now we begin with ‑‑ so only with this we get to the information very, very quickly.
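The faceted search walked through in the demo, narrowing a list by tag, then by country, then by type of content, can be sketched as below. This is a minimal illustration only, with invented sample items and hypothetical function names (`faceted_search`, `facet_counts`); it does not reflect how the GIPO tool stores or indexes its data.

```python
from collections import Counter

# Invented sample items; the facets mirror those named in the talk
# (type of content, date, location) plus the semantic tags.
ITEMS = [
    {"title": "Net neutrality rules debated", "type": "news", "year": 2015,
     "country": "India", "tags": {"net neutrality", "regulation"}},
    {"title": "Neutrality report published", "type": "document", "year": 2015,
     "country": "Brazil", "tags": {"net neutrality"}},
    {"title": "Cybersecurity workshop", "type": "event", "year": 2014,
     "country": "India", "tags": {"security"}},
]


def faceted_search(items, tag=None, **facets):
    """Keep items that carry the tag and match every requested facet value."""
    results = []
    for item in items:
        if tag is not None and tag not in item["tags"]:
            continue
        if all(item.get(name) == value for name, value in facets.items()):
            results.append(item)
    return results


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items)
```

For example, `faceted_search(ITEMS, tag="net neutrality", country="India", type="news")` narrows the list step by step in the way the demo does, and `facet_counts` gives the numbers a user interface would show next to each facet value.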

Okay. So getting messy with these things, I'm going into more technical things. How do we do the filtering of content? Okay. A filter for us in the tool is a gate for a source of information. If we do not apply a filter, nothing is collected from a given source or from a given ‑‑ we at least have to give the tool one filter saying what we are looking for, some kind of pattern, some kinds of types of information. Then when those matches occur and we have text in that source that is interesting to get, these filters put a score on it. We have a rating system that depends on some kind of coincidence of key words, and so on. We assign every item of information a given score.
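The idea of a filter acting as a gate that also assigns a score can be sketched as follows. The scoring rule here (sum of the weights of the matching patterns) is an assumption made for the example, since the talk does not give the actual formula; `Filter`, `score_item` and `collect` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class Filter:
    pattern: str  # regular expression searched for in the item text
    weight: int   # score contributed when the pattern matches


def score_item(text, filters):
    """Return (score, matched patterns) for one piece of text."""
    matched = [f for f in filters if re.search(f.pattern, text, re.IGNORECASE)]
    return sum(f.weight for f in matched), [f.pattern for f in matched]


def collect(texts, filters):
    """Keep only texts that pass at least one filter, highest score first."""
    if not filters:
        return []  # without a filter the gate stays closed for the source
    kept = []
    for text in texts:
        score, why = score_item(text, filters)
        if why:
            kept.append((score, text, why))
    return sorted(kept, key=lambda row: row[0], reverse=True)
```

Each returned row carries the matched patterns alongside the score, so a person tuning the filters can see why an item was collected.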

I will show it to you quickly also in this video. Okay. Here we are again in the list of items. You see here in the first column a number. Usually the list is ordered by these numbers. The numbers sort by relevance, the relevance as a whole of the item. We go to a source, we have a section of filters, and these are something that gives a score for different kinds. In this case is a group I'm talking about. When detecting something about net neutrality and (Inaudible) a kind of weight. You see how. We have the simulation. The simulation is a process in this part of the tool for an authorized user who can work on the algorithms and customize them to get the right information. So we simulate, and the tool makes a simulation and presents to us what can be collected from the source and what score would be given and why. Here you can see the logic of what kind of algorithm. This is only a simulation, and the authorized users can fine tune this. When he's happy with it, then apply it. And the machine will get and keep going and make and send items by mail and social media –

(Connection Lost)

>> STEFAAN VERHULST: Thank you, Luis, for the introduction and overview of the current status of GIPO. I'm Stefaan at The Gov Lab and we are working on another mapping effort, which is totally complementary to GIPO, by tapping into the expertise and intelligence and energy that is out there in the community. So we are more of a crowdsourced effort and also more of an effort to really start mapping the connections, as opposed to what GIPO wants to do, which is really to have a constant feed of updates and news. And so we believe that for any map to be successful, it has to be comprehensive with regard to the issue or issues it seeks to address, it has to be accurate, but it also has to be up-to-date. And so for that reason we've started talking and looking into possible ways for collaboration with GIPO from, I think, day one almost, in order to really start thinking about how the different functionalities can actually add value if you bring them together.

And so here on the screen we have a mock-up of what we hope to realize in close collaboration with GIPO, in which you would have our content that is provided by the community, but then ultimately enhanced by a feed that GIPO would provide, and that's the reason we would call it GIPO -- anyway, we would use the GIPO engine in order to make sure that people who visit the site also have access to up-to-date ongoing activities in addition to the resources, in addition to the background that is out there.

And so that's the current mock-up that we have. As you have seen, at the moment we are moving ahead and the -- when it becomes operational we would ultimately integrate it as well. So that's just one example of a possible collaboration and we are very pleased to have that opportunity and to be able to tap into this public interest tool that the European Commission is funding and that is being developed by Kasia, Luis and their colleagues.

>> Thank you. Stefaan, I want to mention one thing, that we started to collaborate from day one. We had the workshop of all the observatories and mapping studies, and I think this is the most important thing, that we are trying to achieve synergies between different initiatives and not to repeat the same things and not to invest resources in the same things. And in the scope of that, while (?) GIPO, because it's a process in creation, it's going to continue for two years to come and we're inviting all the initiatives to start the project, and in the scope of GIPO we started to talk to all the other initiatives and observatories. So I wanted to present very briefly, because we have only 19 minutes, the first results of the survey towards the federation (?). And this is a short -- that is meant to find out the possible scenarios of collaboration between GIPO and other initiatives. I'm going to show you the first results because the end report will be published by the end of 2015, so if you feel that your initiative should be involved, just -- I'm going to share the details with you later on, just access the site and see the survey and you're invited to talk to us and be in our meeting. So we invited initiatives to cooperate and talk with us about possible synergies. We got 16 replies, and we managed to attract stakeholders from the representing world trade organizations and different initiatives and geographical coverage as well, you can see here. This gives you a little bit of a background on what the landscape of Internet Governance observatories and mapping initiatives is. So as you can see they're from very different areas, so we have NGOs, we have networks of resource centers, we have initiatives that are backed up by industry or just independent voluntary projects, and some of them are observatories, some are repositories of knowledge, there are (?) social map and network of scholars and centers, research projects and reports.

And they cover different level thematic things. So we have initiatives that have covered (?) but there are initiatives that just focus on Europe or Africa or Asia or South America, or Australia. So as you can see we attracted a broad range of initiatives all over the world. And some of them are targeting niche projects -- they're all diplomats digital world and some attract a wide target group. It depends, and some don't really state what the target group is.

As you can see some of them focus on Internet Governance, but there are also initiatives that have a much broader scope, and here you can also see what's the main prevailing subject for most of them. This is going to be security and trust issues. Oh. We skipped -- so and very briefly because it's very interesting, that as you can see this is kind of a very young landscape, because most of the initiatives were started around 2012 with some exceptions, and 75% of them have below five people working on them, and very often they work on a voluntary basis or part-time or are assigned to other projects. So this is why some of the financial sustainability is (?).

So just to go on very quickly, other initiatives and why also GIPO is important here and these are some of the issues we're dealing with. Of course resources are a challenge, so as you can see, also -- see the needs for sustained resources, financial and human, and of course human resources is a big challenge, so this is why we have GIPO, because GIPO is an automated tool that helps people to focus on creating content. So they don't have to focus resources on developing technology. They can just focus on added value and bring -- and create a -- searching for information. (?) information is a challenge for most of the initiatives and as you can see from Luis' presentation, GIPO is attacking the problem as well and trying to assess the relevance of information, focus on the information that is relevant and useful, and languages are a big challenge that GIPO is going to tackle as well. It's not really -- Cristina right now. We are focusing on five languages as far as I know, but we're going to try to include more if it's possible.

>> That's why we need more partners and people in organizations that can get on board and help us.

>> But on a technical level we're going to try to tackle the problem from a technological perspective and approach (?) on how maybe Luis can cover it in more detail. And of course taxonomy and semantics is a big challenge and this is why (?) initiatives try to talk together, because we want to achieve as much interoperability as possible and having the same semantics, key words, hashtags, is very important, and we also came to a conclusion after the Danish discussions that it's not about the taxonomy and big taxonomy, because we have those discussions about taxonomies and how to divide the Internet Governance subjects into categories. But it's more about an underlying dictionary of key words that you can then assign however you want. So as you can see, we are trying to be very practical and hopefully as a result of this project and talking to other initiatives and also as a result of the initiative of Stefaan, at some point we're going to reach a common dictionary of key words with all the initiatives together and that will safeguard the interoperability of all the initiatives in the future, or at least make it more feasible. So that's what we are hoping for.
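The point about sharing a dictionary of key words rather than one big taxonomy can be illustrated with a short sketch. Everything in it is hypothetical: the key words, the two taxonomies and the `categorize` function are invented for the example and are not taken from GIPO or from any of the surveyed initiatives.

```python
# Shared vocabulary that every initiative would tag its items with.
SHARED_KEYWORDS = {"net neutrality", "encryption", "privacy", "domain names"}

# Each initiative keeps its own way of grouping the same key words.
TAXONOMY_A = {"net neutrality": "Infrastructure", "encryption": "Security",
              "privacy": "Human rights", "domain names": "Infrastructure"}
TAXONOMY_B = {"net neutrality": "Economic", "encryption": "Cybersecurity",
              "privacy": "Legal", "domain names": "Critical resources"}


def categorize(keywords, taxonomy):
    """Group an item's shared key words under one initiative's categories."""
    known = sorted(k for k in keywords if k in SHARED_KEYWORDS)
    grouped = {}
    for keyword in known:
        category = taxonomy.get(keyword, "Uncategorized")
        grouped.setdefault(category, []).append(keyword)
    return grouped
```

The same tagged item can then be exchanged between initiatives and still land in each one's own categories, which is the kind of interoperability the speaker is hoping for.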

Okay. So the whole idea about the survey and about talking to the other initiatives is to see whether we can collaborate and wherever there is a potential for interoperability. As of now we talked to 16 initiatives and all of them expressed interest in collaboration, and the majority of the initiatives expressed interest in partial integration of services and collaboration on some level, and as you can see, we can collaborate either by (?) of content, organizing common events or trainings, across common -- common cooperation is very important and we can also cross-communicate our activities. From what we checked with the feasibility of interoperability, most of the initiatives have at least RSS (?) that they can share, but some of them still are working on it, so it's also important to talk to all the initiatives beforehand so when they create new pages or new mapping exercises, they think about how useful they can be to other partners and take into consideration that it would be great if they could be interoperable from the start.

And yeah -- and if you feel that your initiative can contribute to the common discussion, you can contact me or you can access our Web page and you can fill in the survey. So the link is here for you to use.

So that was very short from my side. We have a few minutes to go for any questions and any comments that you have and I would like to give the floor to all of you. Thank you.

Yes. So there is another call from Cristina that all the slides will be available online and all the demos will be available online, so please feel free to access our site in the next days to use all the materials. And we also organize Webinars on an ongoing basis, so every month there is a Webinar on future developments and common collaboration. So I would like to invite you to take part in future Webinars.

Anybody else? Yes, please. Can you introduce yourself?

>> (?) from the European Parliament. I'm interested, since GIPO is supposed to provide not just a repository but a filter of information and present the most relevant information, in what your conception of relevancy is there. Is it only about how much the articles in question or the resources have been used in the past, or is there also some kind of underlying conception of quality or relevancy? Because the problem we're always trying to solve here is that on the one hand we want all the stakeholder groups to be equally involved in the discussion, but at the same time, of course, there are also more important, more dominant parts of the debates that draw a lot of attention. So I would be interested to hear a bit more about that.

>> Yeah, Luis, I think you can answer this question. Can we hear you?

>> LUIS MEIJUEIRO: Yes, can you hear me?

>> Yes, a little bit louder, please.

>> LUIS MEIJUEIRO: Yes. Okay. Talking about relevance I said --

>> Louder, Luis. We cannot hear you.

>> You cannot hear. I will try to --

>> You can shout.

>> LUIS MEIJUEIRO: My mic --

>> Yes, now.

>> Yes, do you hear me?

>> Yes.

>> LUIS MEIJUEIRO: Okay. So talking about relevance, we're at first using a simple approach, which is by quantity. It can talk about several issues, especially issues related to Internet Governance. So the more issues the (?) talks about, the more relevant it will get. This is only our first approach because if we only -- the (?) mainly are not expressing Internet Governance, but we can -- we can control very, very accurately the scoring of (?). So we can define rules, make rules, algorithms, to score this information. So we can define a rule that says, okay, this is more relevant not only because it talks about a specific topic of relevance, but because it's talking about two topics specifically, and I know -- a human knows -- that this combination of topics when found together is more relevant than finding them isolated. That's an example. We can do an algorithm to detect that and then score it more. We are only starting today. The possibilities are very promising.

>> (?) to this, thank you, Luis, for the technical explanation. To me it's important to say that the algorithm will be completely transparent and open, for those who can understand how it works. So -- so the other parts of the tool will be using open source software as much as possible to ensure transparency of how the data is collected and how -- sorry, by using also open (?) software the tool will be able to develop also in the future.

>> And I think there is also additional (?) that we show -- there is an option evaluating content within the tool. So that also gives you additional kind of relevancy because you can -- you can actually comment on the content and comment whether it's relevant or not, and that can be analyzed later on and adjusted.

I don't have any comments from remote participation. No? Okay. Thank you. Any other comments? Yes, please.

>> Thank you very much. My name is Utina Vid. I'm representing the European network of children organization network, and online safety. Some of you have already (?) if I make maybe similar points again. But we would certainly welcome the observatory, you know, in terms of being users of information, because as NGOs we often lack the resources to do independent research. So just backing it sometimes with other research will be helpful, but also more importantly, perhaps, we also see ourselves as active contributors to the observatory, because as NGOs we are completely independent, as the name suggests, independent of government, independent of any industry groups, for example, so we can challenge. But that brings me a bit to the question about NGOs registering, because the list was very quick and I couldn't see any NGOs there, and maybe there are some. But what if an NGO that you don't like applies to be a member, so everyone can think about hate groups, you know, who have a nice name like cultural -- something like that. I mean, who decides to be the gatekeeper and, you know, what opportunities would people have to challenge a decision?

>> Well, maybe -- at the beginning I mentioned very briefly that now we are really focusing on the tool, on the technical tool. At some point we will need to have a discussion about governance. What you are telling me is who is going to decide who is on the list, who likes whom. I mean, for the moment the tool is just a tool. There is no management behind the tool, but in principle it should be open, and maybe we should also make a distinction. One thing is the list of sources that the system will track automatically, so, for instance, if you're an NGO producing papers or information about Internet Governance-related issues, the system will track you. You don't need to subscribe to the system. The system will find the information automatically, and even if it is in another language. So that's -- that's the idea.

So maybe we need to make this distinction about list of sources and then list of partnering organizations, which is something different, and this is something that we are still discussing, but again, the focus at the moment is really on the automation part.

>> Maybe I can add something to it, because I think it's a very valid comment. So from the technology perspective, you have a list of sources, right? So you have a list of free (?) so we are checking which information -- if you have a Web page or you have an observatory, we are checking whether it's available in machine readable form so we can use it. But somebody has to accept this list of sources and we also have an advisory group that proposes sources and that can check and comment on it. So if there is some kind of -- at the beginning you need to create a list of sources, so of course we're not really -- we don't assess, really, content, but of course we're not going to promote the content if it's not, you know -- yeah, depends on the point of view. But there is a list of sources that it needs to be put into and accepted, so if there is something really incredibly inappropriate, probably somebody would decide on it, but I don't know what the process behind it is. Maybe there has to be some process developed behind it. Yeah. And that's a valid comment.

And the second thing is that there is a list of initiatives that we cooperate with, and this is a process in creation, so feel free to connect with us and talk to us and we can talk about the -- the collaboration in the future.

Oh, we see that there is only one minute left. If you have other questions or comments, please do. If not, we're going to close the session, and thank you all for your participation. And I invite you one more time to access the Web site and follow our Webinars and materials. Thank you very much.

(Applause)