Simone Polillo / Sociology

Maps of Knowledge

Knowledge is created as much through disagreement, debate and conflict, as through consensus, introspection, and dispassionate assessment of the rationale behind one’s arguments, and the evidence one can gather in their support. Yet, when teaching a discipline, emphasis is often on already-made knowledge and not on knowledge-in-the-making, and so the picture is one of collective harmony and shared commitment to uncontroversial goals, not of the messy process of discovery and creativity that engages, and motivates, producers of knowledge.

My “Dream Idea” is to take advantage of developments in the analysis of bibliographic data, specifically the application of network techniques to the study of citation communities, to expose students to the conflictual nature of intellectual production, and, together, to reflect upon and theorize about the processes through which consensus is (however provisionally) achieved in a discipline as to what constitutes knowledge worth thinking about and using. I envision a small group of advanced students (10-12, from the humanities and the social sciences, but potentially from other domains too), with an interest in a diverse set of ideas, and a commitment to figuring out why those ideas took the shape they did, what ideas they had to fight against and displace, and what ideas they are likely to be challenged by in the future.

This project is motivated by intellectual as much as by technical reasons. As the analysis of “big data” increasingly draws public attention, the tools to analyze large citation datasets from repositories like Web of Science are now readily available online for free, and require only a basic command of programming languages like Python in order to be put to use. Moreover, well-respected organizations (like Google) have produced and made available free tutorials on programming that provide a more than adequate introduction to the kinds of techniques one needs for the analysis of “big data.” Finally, sophisticated yet intuitive network-visualization software is now freely available as well, putting the graphic visualization of data within easy reach of non-experts. Analyzing citation data is therefore now possible for students who have only some degree of familiarity with programming languages and visualization techniques, but who are keen on exploring these exciting possibilities.

Our first meeting will be dedicated to figuring out exactly what we want to study. What debates are we interested in, and why? How much do we know about potential controversies around the ideas we want to explore, and what is it going to take to gain a better preliminary understanding? Once we have singled out the key ideas whose development we want to follow, through bibliographic research we can find out which authors (in papers or books) were crucial in disseminating them, and we can also visualize how networks of collaborators and opponents formed over time behind these ideas. It is therefore time to introduce our methodology.

I envision meeting with the students over two or three evening sessions to acquire basic programming skills, introducing them to the nuts and bolts of Python as we work through internet tutorials and the quizzes they propose to enhance learning. Potentially, researchers from Scholars’ Lab could be involved in this part of the project. Importantly, our aim is not to learn the language for its own sake, but to apply what we learn to the study and visualization of bibliographic networks, especially as they evolve over time. So we will focus on Python only to gain a working understanding of publicly available scripts that will then do the heavy lifting of extracting bibliographic data from Web of Science and making them readable to network visualization software.
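
To give a flavor of what such a script does, here is a minimal sketch, not one of the publicly available scripts themselves. It assumes a tagged plain-text export in which each line begins with a two-letter field tag (AU for author, PY for year, CR for cited references), continuation lines are indented, and each record closes with ER. The sample records, the node labels, and the function names are all hypothetical, invented for illustration; the output is a simple edge list with Source and Target columns, a layout that network visualization software can typically import.

```python
import csv
import io

# Hypothetical sample in a tagged plain-text layout (an assumption about
# the export format; real exports carry many more fields).
SAMPLE_EXPORT = """\
PT J
AU Doe, J
TI A hypothetical first paper
PY 2001
CR Smith A, 1990, J HYPOTHETICAL STUD, V1, P1
   Jones B, 1995, J HYPOTHETICAL STUD, V6, P20
ER

PT J
AU Roe, R
TI A hypothetical second paper
PY 2003
CR Smith A, 1990, J HYPOTHETICAL STUD, V1, P1
ER

EF
"""


def parse_records(text):
    """Split a tagged export into a list of {tag: [values]} dictionaries."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        head, body = line[:2], line[3:].strip()
        if head == "ER":  # end of one record
            records.append(current)
            current, tag = {}, None
        elif head == "EF":  # end of file marker
            break
        elif head.strip():  # a new field tag starts here
            tag = head
            current.setdefault(tag, []).append(body)
        elif tag is not None:  # indented continuation of the previous field
            current[tag].append(line.strip())
    return records


def citation_edges(records):
    """Return (citing work, cited reference) pairs, one per citation."""
    edges = []
    for rec in records:
        author = rec.get("AU", ["Unknown"])[0]
        year = rec.get("PY", ["n.d."])[0]
        source = f"{author} ({year})"  # illustrative label for the citing work
        for cited in rec.get("CR", []):
            edges.append((source, cited))
    return edges


def edges_to_csv(edges):
    """Serialize the edge list as CSV text with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


records = parse_records(SAMPLE_EXPORT)
edges = citation_edges(records)
print(edges_to_csv(edges))
```

In a seminar session we would read a real export from a file rather than from an embedded string, and we would need to decide how to label citing works so that the same paper always receives the same name.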

Once we understand how to set up our code, and what the code actually does, it will be time to select our sources (what journals are we looking for? How well covered are they?). After this step, extracting the data should be relatively straightforward, and will allow us to move to the fun part: visualization! Free software like Gephi is particularly user-friendly, so we will feed our data to it and begin exploring different ways of visualizing our networks. While we will pay some attention to the mechanics of visualization (there are several algorithms we can rely upon, so understanding how they actually organize the data is important), we will also try to develop an aesthetic sense for best practices in network mapping. Dealing with different ideas in different fields will be tremendously inspiring in this respect. We will begin gaining a sense of what debates actually look like: for instance, does the size of a citation community matter to the importance of an idea regardless of field of inquiry? Do contenders have to be equally matched in terms of supporters? How many competing ideas and debates can appear at once in any given field?

Visualizing how these dynamics play out in diverse fields will have several pedagogical outcomes:

It will help us concretely understand how different debates unfold in different disciplines. Gaining an appreciation of how different fields work will enhance interdisciplinarity.

We will witness research in the making, and thereby reflect upon what (if anything) new methods of discovery afford compared to more traditional approaches.

We will attempt to theorize why different fields work differently, thereby practicing an empirically grounded sociology of knowledge.
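
The question raised above, whether the size of a citation community matters, can be made concrete with a small sketch. It assumes we already have an edge list of (citing work, cited work) pairs; the names below are hypothetical. It groups works that are linked by any chain of citations and reports how large each group is, and it counts how often each work is cited. This is only a crude stand-in for a "citation community": connected groups are not the same thing as the communities a dedicated detection algorithm would find, and the seminar would want to compare the two.

```python
from collections import Counter, defaultdict, deque

# Hypothetical edge list: (citing work, cited work)
EDGES = [
    ("Doe 2001", "Smith 1990"),
    ("Roe 2003", "Smith 1990"),
    ("Roe 2003", "Jones 1995"),
    ("Lee 2005", "Park 1998"),
]


def connected_groups(edges):
    """Group works joined by any chain of citations, largest group first."""
    neighbors = defaultdict(set)
    for citing, cited in edges:
        # Direction is ignored here: a citation links both works.
        neighbors[citing].add(cited)
        neighbors[cited].add(citing)

    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(group)
    return sorted(groups, key=len, reverse=True)


def times_cited(edges):
    """Count how many citing works point at each cited work."""
    return Counter(cited for _, cited in edges)


groups = connected_groups(EDGES)
counts = times_cited(EDGES)
for group in groups:
    print(len(group), sorted(group))
print(counts.most_common(1))
```

With real data, the interesting step is interpretive rather than computational: deciding whether a large group reflects the importance of an idea or simply the publishing habits of a field.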

Budget and Schedule

I envision this project taking the form of a seminar, meeting once a week over one semester (at least at the beginning), over pizza, in the evening so as to accommodate everyone’s schedule. Costs will therefore be limited to feeding us (about $100 per meeting, about 10 meetings, for a total of $1,000). Towards the end of our seminar, I hope we will be able to present the results of our research in the context of an undergraduate symposium (e.g. as part of the Sociology department’s end-of-year research symposium). To prepare for that presentation and to debrief/celebrate afterward, I anticipate meeting twice with the students over dinner at my home or in a restaurant at a total cost of approximately $600.

Total cost ~$1,600