Guidance around in-person or synchronous discussions

rgommers · August 6, 2024, 5:10pm

Hi all,

The purpose of this post is to discuss writing up a few guidelines around development discussions during audio/video calls and in-person meetings. I’m hoping we can discuss the key points here, and then open a PR with changes to the SciPy Core Developer Guide and/or the SciPy Project Governance document.

The immediate trigger for opening this discussion now was the fairly contentious discussions around the dev process used for the stats.distributions overhaul (xref gh-15928 and gh-21050). However, there have been more instances of this coming up, from “how should this work” and “why didn’t I know about this” questions to (sometimes) disagreements. We have people meeting at in-person events, we have active grants, we have maintainers who work in the same company or institute, we have bi-weekly community and newcomers meetings, we have folks self-organizing and collaborating around particular topics, we regularly have internships focused on contributing to SciPy, and more. And our written docs and policies say next to nothing about any of this. So I think it’s time to try and rectify that a bit. That isn’t going to lead to major changes in how we work, but hopefully it’ll capture a common understanding and help with some knowledge transfer around things that currently have to be learned by osmosis.

Here is a set of points/guidelines that came to mind for me:

Decisions get made in public, in this forum (previously the mailing list) and on GitHub - according to our current governance policy and many years of established practice. This didn’t/doesn’t change - any discussion elsewhere has to lead to written proposals and requires time for open discussion and (ideally full) consensus before finalizing a proposed change.
Higher-bandwidth discussions between SciPy maintainers & contributors - in person or in audio/video calls - are:
- A normal part of contributing to the project,
- Desirable for some - we’re working with other people and building connections is part of the fun,
- Less desirable for some other people, for a variety of reasons from time constraints to personal preferences or perhaps even a disability; contributing asynchronously only is perfectly fine and normal,
- Potentially a more effective way of collaborating (but not necessarily for everyone or all topics),
- Not required for participating in the decision making process for any topic,
  - It’s up to folks doing the higher-bandwidth discussions to switch to async mode early/often enough to ensure everyone gets a chance to participate to the extent they want.
Some suggested guidelines for ways of working:
- For meetings that happen regularly, ideally they will have:
  - public announcements,
  - an entry in the SciPy community calendar at Scientific Python - Community Calendars 📅,
  - meeting minutes and other relevant artifacts (e.g., slides for a presentation given on behalf of the projects) added to GitHub - scipy/archive: Archive for public materials such as meeting minutes, logos, and presentations, and
  - relevant discussion on a particular issue or PR summarized in a comment on GitHub.
- This ideal setup in the point above isn’t always achievable for various reasons (meeting format, time constraints of participants, etc.).
  - That happens and isn’t a problem - working in private is not wrong and can be quite productive - as long as the folks involved realize that if it isn’t posted publicly yet, it can’t be decided on.
  - The longer you wait before making something public, the higher the risk someone will disagree with some choices you made along the way and ask for significant changes, so consider this carefully when working on something in private.
  - If you do start out working in private on some topic/design, remember to “switch gears” to public mode - explicitly think about when you do this (typically when opening either a PR or an issue with a design proposal).

I hope this is helpful. Feedback, other thoughts and ideas are very welcome as always. I’ll give this at least a week before thinking about opening a PR.

Cheers,
Ralf

drammock · August 6, 2024, 7:50pm

FWIW, MNE-Python has been grappling with many of the same issues. I think the challenges are going to be common to many projects, and the responses may be different depending on project or context. It’s probably worth thinking about how much of the resulting guidance might make sense as a SPEC. There’s a “governance” SPEC currently in draft mode; it might fit there, or it might make more sense as its own “decision-making” SPEC.

betatim · August 7, 2024, 7:04am

I think it is a good idea to discuss this topic and as a result have a form of social agreement on how to handle these situations. I think a lot of the “big picture” outcomes of such a discussion will be mainstream. At least from having participated in this kind of “how do we as a group communicate and make decisions” discussion in several groups and settings, it seems that the “big picture” outcome is always more or less the same. However, I think there is value in going through the exercise of discussing it within each new group/setting, and a trap is to think “ah this is all obvious and of course we will do it like this”.

While at first it seems like a time sink to do this exercise again and again, the more often people have participated in it, the quicker it seems to go. So that is good. I think of it as something similar to the partner check in rock climbing. It takes a bit of time to learn when you are new to climbing. Then you establish a routine with your climbing partner and it goes real quick. Then you climb with someone new and it might feel like it slows you down because they do it a little different from you. In the end it is still a good idea to do it though.

This also means that I am not so sure that having a guide that says more than “Please make sure to discuss this and form a consensus in your project. Here are some ideas/things others do” is particularly useful. The process is useful for getting agreement amongst people, making sure those with worries feel heard, and getting things tweaked.

This brings me to one of the things that I think is hardest about having mixed sync and async communication: after having discussed something in person/high bandwidth it can feel tedious to have to re-re-re-discuss points in the resulting async conversation. Especially if it is a bit of a hairy topic. The thing to establish as part of the discussion on “how we discuss things” is that it is normal, acceptable and needed to re-discuss things, and that any hint of “oh we already decided this” is going to lead to conflict. Similarly you need a bit of compassion from those who did not participate in the sync conversation to understand that others might be exhausted regarding a topic that they already discussed in person. Not sure really what the solution is other than awareness and reaffirming that it is ok to re-discuss and that it can be hard on those who took part in the sync discussion.

In some ways this isn’t a new problem: in long-lived open-source projects it feels like almost all possible topics have already been discussed in the past. So when they come up again 3 years later we have the same situation.

Another tricky thing is striking a balance between re-re-re-discussing things and making a decision and moving forward.

tupui · August 7, 2024, 9:21am

I think here that responsibility is key. Things like silent consensus are quite slow and also require people to always be on the lookout. I would rather see people empowered and get specific responsibilities in a project. This can help move things not only faster but also set clearer and more transparent rules and processes. Of course you need ways to contest the decisions of subgroups when there are legitimate concerns. Though assuming said subgroup is skilled, there should not be that many events requiring escalation.

Here as well, this is not something unseen. A lot of (non-profit) organizations around the world work like that. We are just not used to doing that in our area. It does not mean that it would not work though.

rgommers · August 7, 2024, 10:50am

Thank you for the thoughtful post @betatim! I agree with pretty much all of it, including that there is value in discussing it in each project, and with the points of friction you identify.

ilayn · August 9, 2024, 11:20am

I won’t touch the Code of Conduct parts; we have a policy about that and I don’t want to influence that procedure, if one is ongoing.

I think we are using a bit of a too formal language to describe our issues.

We don’t have sufficient maintainer capacity.
Maintainers want to work on things they like.
They have strong opinions about where SciPy should go.
Open source maintenance is very difficult and getting more difficult for SciPy.

When we don’t have sufficient capacity it is down to a few members of the active maintainer list and they take on often seemingly impossible tasks and work on it tirelessly. Then someone else comes and points to a corner that breaks the whole design rationale or drains the morale about the design.

One thing we need to get engraved in our minds is that this is normal and will happen all the time. We can work to reduce its effect or make it more obvious and try to be respectful, but it will happen. It is thankless and very, very annoying work. Also, it is work done with complete strangers. We don’t need to be in-person or socially connected to achieve results. It is nice to bond but not necessary.

Having said that, we are not as big as the PSF or the Linux Foundation. We can’t delegate responsibility to a team to work on things individually because we don’t have that specialization. We barely manage to have GSoC mentorships.
But content-wise this is also a very well known issue across open source. Example response from free-threaded python saga

If you squint your eyes it is almost the same occurrence with the same response. The difference between this and our case is that Python is handled by the PSF, which can afford to assign teams and working groups. We don’t have enough of that kind of capacity to divide and assign to work on things; moreover, we don’t get this kind of transparency and, more importantly, we don’t know how to evaluate the progress. If stats folks say it’s done, I’d say good job. I don’t have the time or the interest to go through the work except a surface scan. But somebody better look at it. Many things, including work I contribute, go in with trust and not with scrutiny. That’s just an unfortunate fact and we are OK with it. It’s just the way things are.

In other words, the PSF is doing a very good job of keeping everyone (who is interested) up-to-date. Some folks are still being lazy and jumping into discussions without reading things, typically leading to dissatisfaction. But nevertheless, if you are willing to spend the time you can get yourself up-to-date quickly and join the conversation. In our case this is not possible.

This is especially important for stats and special, because they are in high demand these days due to ML or whatever they are doing with distributions. This does not mean the rest is unimportant; in fact the rest is getting more like LAPACK, whose existence you only notice when it breaks.

Making it a bit more specialized to the case that caused this discussion; take scipy.special: I just can’t understand what happens in special anymore even though last month I wrote a good chunk of it from scratch in C with relative success (and with a number of bugs). But now, I don’t even know where it is. Everything is scattered across ufuncs and templates. But that’s fine because I trust that they know what they are doing. However I don’t know what this CuPy business is all about. I have very strong opinions against C++ usage but fine I lost that argument. I have been working with @steppi and @izaid quite productively; mostly disagreeing but still made great progress. I am certain that if somebody would be involved, they would be included in the discussions. In the stats case, I also trust our folks know what they are doing. But probably Robert also knows what he is doing. But he can’t jump in because there is not much to jump into.

Similar things are also happening with Quansight internships; for the record, I have absolutely no problem with them, quite the opposite. It is wonderful that they are happening. And we have folks on that company’s payroll; it is nothing short of a miracle that Quansight exists. But the structure and the relationship are not transparent.

Again just to make sure, I am not even implying, for a second, there is some shady business going on. Quite the other way around, but it is not happening out in the open. It is not a fault or deliberate omission but transparency is a muscle that needs practice. The more we do it the easier it gets. Yes it is a private entity but still interfacing with an open-source project.

It is a very happy case that we have a network of trust. I trust person A, who trusts B, and so on, and so far it has been working amazingly. But this is not an excuse for not being transparent, because this is a limited-capacity project with voluntary involvement.

So to summarize, in my opinion: in-person or otherwise, folks can bundle up and do great work, but this is not going to be enough to justify excluding others from chiming in or not documenting what has been done. Late or not, feedback is feedback, especially since “customer-facing” parts require relentless scrutiny.

We should not refrain from getting public feedback. It is often very unproductive and seriously soul-draining to read a bunch of irrelevant stuff, since everyone has an opinion, but I think it is a necessary evil. That’s why we moved here from a mailing list in the first place: to increase reach.

Once we make peace with it, I am positive that we will not need sync/async discussion guidelines for adults, but instead will have a common understanding of what the work entails, just like how we learned to care about not breaking backwards compatibility. This also means that, unlike responsibility, functional ownership is not a thing in open-source.

ilayn · August 9, 2024, 11:28am

By the way, I don’t need any explanation for the examples I used. I just included them to make a point, not to start investigations on them.

lucascolley · August 9, 2024, 12:30pm

Indeed, it is always necessary at some point in the process. Even poor feedback which is quite off the mark (I have given plenty) is enough to make you think twice, which is often sufficient to catch improvements. As Ralf said above, we can still work in private, and that is often very useful, in cases such as massive projects which would take much longer if every minor step was scrutinised, or when developers are very much learning on-the-go. As you identified, we are short on maintainer capacity and time is precious. We just need to be aware of the risks of extra work and discussion that increase (roughly) proportionally to the amount of private work and (proposed) decisions made in private. I think the same logic applies just as well to individuals as groups. So I think that

was pretty on the mark.

I think we have to consider that new developers appear often. That is why documentation and guidelines are useful - it is a lot easier to learn best practice by osmosis if you are around for a few years than if you are around for a few weeks. I think that what is being proposed here is not really telling adults how to have discussions, but rather putting to paper our experience on which types of discussions/decisions are seen as allowed/useful/essential, and when they should happen.

ilayn · August 9, 2024, 12:59pm

I wholeheartedly agree with that; it is also an important point that I missed completely in my diatribe.

I just don’t want to hide behind some made-up paperwork to arrange our relationship with each other. There is an undeniable social aspect to it and often times, it seems to me that, we have to confront it head on by having difficult discussions.

tupui · August 9, 2024, 7:10pm

Mostly agree with what you said! In particular, I am fully onboard with the transparency part. (I really tried my best in that instance to get people involved, but that is a separate topic.)

I do still think that the social aspect is important, especially when we have stories like XZ and, even more recently, this hacker who got into an org and released malware. We, and Python in general, are still somehow flying under the radar. But with all the AI push, it’s only a matter of time before folks realise how vulnerable our communities are security-wise.

On the working group part, I do think that in some instances, e.g. if a group of folks get a grant, or there is a company like Quansight with folks assigned to a project, we/the group can assign work to specific folks. I really think that can work if, as I said above, we accepted giving more responsibilities to maintainers. Taking the stats example as you used it, but keeping it short: the group consist(ed) of active stats maintainers, and I do think that in that instance it would be reasonable to give the group more liberty. But again, as said above, this still means that there is communication, ways to get more involved, etc.

There is a lot of energy in the community, especially with new joiners, as we see with all our new maintainers. I always feel like we could do more if we trusted each other more. Our bar is high to get there; we should back each other up more and give ourselves some more slack. Git is here, we have a slow release cycle, let’s go. It’s fine to break stuff sometimes and we have plenty of time to fix things before the big day.

story645 · August 9, 2024, 7:32pm

Yes, but frequently the folks asking “how do decisions get made” and “who makes them?” are newer maintainers and contributors who are on the path to being maintainers, usually trying to get in something on the bigger side, and they honestly usually don’t have the context to wade into the social aspect.

Some of this is mitigated through mentorship or championing, but it would also be so nice to have a clear “this is who you {have to, should} loop in when doing {x}” that is agreed upon.

ilayn · August 10, 2024, 3:32pm

I am fine with typing something up. I just don’t think the effect will be as substantial as we hope. Maybe I am wrong, hence I have no objections to the guide.

But can someone please tell me where these newcomer interactions are happening? I have been hearing about these hypothetical newcomers for years now, but not once have I seen it happen as long as I have been involved with SciPy, which again circles back to transparency.

If you are doing something on behalf of the SciPy team, let folks know about it. We can’t act on something we never hear about, which was the topic of another discussion some time ago.

We also would not know if an encounter goes south between a maintainer and another party and we will be left with “he-said/she-said” drama. This is a very real possibility in this day and age and not a risk we need to take.

tupui · August 10, 2024, 3:48pm

Off the top of my head (if I understand your question correctly): sprints at SciPy/PyCon/PyLadies/etc., newcomers hours (the one Melissa is doing), various mentorships some of us did (e.g. me recently with Lucas), the normal community call, some random issues or PRs, the Scientific Python Discord, our Slack. I know we also have folks teaching. Etc.

lucascolley · August 10, 2024, 5:48pm

In terms of “where”, what Pamphile said! In terms of “hypothetical newcomers”, I would say that me, thalassemia, dietbru, fancidev and nickodell have all been “newcomers” at some point in the past year alone - and that’s only mentioning people who are now members of the SciPy org. I might have to brush up on my philosophy reading if we’re only hypothetical people.

ilayn · August 10, 2024, 9:25pm

By hypothetical I mean folks who come in through online channels. I hope we are not handing out brochures and then saying “here, read all this” as our newcomer policy.

I do not remember the folks you mention searching for a decision-making procedure while we just shrugged. If I missed it, my apologies, but I don’t think it happened anywhere public. So if it happened on private channels, then they (and also yourself) actually count as the newcomers implied above. Hence I am not sure how that answers my question, and it gets back into the transparency domain. Are people acting on SciPy’s behalf? I don’t know, but if they do, I trust they are doing the right thing. What happens if they don’t do the right thing or things go sideways? Do we need a public drama to test it out?

Also, if they are already newcomers, who is handling them? If there is a process in place, I don’t know of any such procedure. So I really don’t follow what the premise of a newcomer is here. Hence my wording, hypothetical.

EDIT: Come to think of it, this is a separate issue I’m starting to poke around and I don’t really have anything concrete to argue about. So let me stop here so that the discussion does not derail from the original subject.

melissawm · August 10, 2024, 11:03pm

I believe we are talking about two separate issues here, and maybe I can clarify one of them.

How are we handling newcomers at SciPy? The monthly newcomers meetings are public and announced in the open calendar and on Discourse (previously the mailing list). I’ve been organizing these for a while, first as part of a grant and now as a volunteer. The guidance that people ask for in these meetings is about GitHub workflow, building SciPy, where to find the code they want to edit, how to handle reviews, etc. These questions are also what people ask in the Slack. If this is something the community is not interested in, we can rethink this approach or stop it entirely.
Is it beneficial for new contributors to understand the decision-making processes of an open source project? I’d argue that it is, because that is how we keep transparency. I don’t know if the people contributing are asking for it explicitly, but there may be people we are not hearing from.
The point of helping newcomers is to potentially bring new people into the project and make sure it is sustainable long-term. Hopefully, they will one day become decision-makers.

As far as the initial points of the post, I don’t have much to add except to say +1 for transparency.

story645 · August 11, 2024, 3:21am

Sorry, I was speaking from my experience at Matplotlib because I can’t read and thought you all were discussing a SPEC. That being said, it sounds like you all are having the same sorts of issues.

I don’t think anyone is saying that. I think it’s like Melissa said: the issue is probably more whether folks feel like they can confidently answer, and I think that requires transparency around the decision-making process. I think this really hits on it:

Your process is maybe better, but I think community-wide we wanna empower newer maintainers to take on larger decision-making work - like approving/merging big PRs, championing PRs, and mentoring for GSOC/GSOD - and that is a lot easier when the decision-making process is transparent and somewhat well defined/described.

ETA: but also I’m gonna get out of this thread now and leave it to you all to discuss, sorry again!

lucascolley · August 11, 2024, 11:19pm

Don’t worry, we were (a potential SPEC at least)!

fancidev · August 14, 2024, 2:47am

+1 on the proposed guidelines!