Answering MOSS Questions at PyData Global

We enjoyed our session at PyData Global yesterday, where we showed off our current (pre-)prototype. Over 50 people attended, and we got lots of good comments and questions.

The questions below have been edited for punctuation and spelling; each is followed by my answer.

Q: How can we access this for personal use?
A: The app is not available yet. We hope to start prototyping in early 2024.

Q: This is amazing, thanks. How scalable is the graph rendering?
A: We’re hitting the limits of what Kumu can support, but the actual app should be able to handle things.

Q: It looks like the graph lists Python packages only? If yes, are there any plans to expand to other languages, e.g., Julia?
A: We are not excluding any programming languages. As long as a software project is used for science, we will want it on the map.

Q: Is the data on the platform updated in real time? And also how can we verify the authenticity of the links between the projects?
A: The pre-prototype you’re looking at does not yet use real-time data. That is a feature we will be looking at once we start prototyping the actual MOSS app. We plan to develop processes for verification etc. together with the community.

Q: Is it possible to use node2vec on each node or between nodes?
A: I can’t answer this question specifically, but I know we want to support advanced search and analysis functions. MOSS is intended to be a work tool (e.g., for scientists) and a research tool (for people studying open source or OSS in science).

Q: What are the various interest groups of OSSci and how to join them?
A: Domain-specific (vertical): Chemistry / Material Science, Life Sciences / Healthcare, Climate and Sustainability. Cross-domain (horizontal): Reproducible Science, Map of Science. Learn more…

Q: For some relationships that you just demo’d, like finding libraries similar to Matplotlib, are those relationship data points created manually, or automated through, say, some LLM?
A: All manual so far. Obviously, we hope to automate much or most of this (top down) while also maintaining options for individuals to add or enhance the data (bottom up).

Q: What is the name of this graph visualization software that Jonathan is using?
A: Kumu, we love it!

Q: Just curious, how much time did it take to map all these projects?
A: We started using Kumu in July. Jon has put significant hours into getting us to this stage.

Q: What is the rough size of your full graph (e.g., in terms of number of vertices and edges)?
A: We’re currently at about 10k nodes, with 30–50k connections.

Got any other questions? Ask away!