What does it mean to be a project that is welcoming to newcomers?

In the age of AI, LLMs, code agents, and assistants, what does it mean to be an open-source project that is welcoming to newcomers?

When I started out, we put a lot of work into things like contributing guidelines, tutorials on setting up and using Git(Hub), giving extra attention to Pull Requests created by newcomers, mentoring people to level them up, and describing the contributor journey from first contact to becoming a maintainer. Lots of effort went into reducing the friction for first-time contributors.

You could submit PRs that were missing a lot of things, and maintainers would be excited to help you fill in the gaps, correct mistakes, etc.

Today you can probably create a PR like one of those with an AI-assisted editor in five minutes and put it up. Should we be, and are we, as welcoming to these PRs today as we were 5-10 years ago? My feeling is no. I think the reason is that it feels like not a lot of effort went into making the PR, yet it will require significant maintainer effort to review.

So, this makes me wonder: What does it mean to be a project that is welcoming to newcomers?

7 Likes

I know I don’t know the answer, and I probably don’t even know all the questions you could ask yourself. I’m interested to hear what others are thinking.

Here are some things I’ve pondered, usually without a clear answer:

  • Stop using “good first issue” - use a similar label, but one that isn’t aggregated on sites like https://goodfirstissue.dev/
  • Tool up! - use AI agents to reply to PRs that look like they were substantially created by AI tools
  • Create an AGENTS.md to improve the quality of AI-based contributions
  • State that your project requires significant open-source experience, and refer to that when closing PRs from beginners
  • Patiently explain how to contribute to projects, and that it requires more than what your AI tool can do
  • Take up woodworking and leave open-source behind

I think “ten years ago” the problem was that there were hardly any people contributing to open-source projects, especially if you don’t count contributions from maintainers of other projects. As a result, it made sense to increase participation.

Today it feels like there are too many new contributors, or at least the ratio of beginner contributions to reviewers has significantly worsened.

How to (significantly) increase the number of reviewers is a problem to which I don’t know a solution. It seems like it requires patience and personal mentorship, just like ten years ago :smiley: I’m not sure this is a bad thing.

1 Like

Thanks for thinking about this in public @betatim!

I agree with you, although I’ve found AI-generated content to not yet be that common or problematic for PRs - much more so for issues and security reports.

Re too many vs. not enough new contributors: I think it has always depended, and still does, strongly on how well-known a project is. We’ve had too many really new / drive-by contributors in NumPy and SciPy for years; the balance has just gradually worsened a little over time. For a brand new project though, it may still be a priority to find one or two new contributors who start helping structurally.

+1, labels like these have been highly problematic for years. They attract a ton of drive-by contributions, and I have a feeling (hard to prove) that the chance of a new contributor turning into a maintainer is negatively correlated with that contributor starting with this label.

Promising new contributors are almost always motivated by a problem/topic, not by “getting on the scoreboard”. They’re not that hard to spot, and prioritizing reviews for them and reaching out to them with encouragement and an offer of mentorship after they’ve made a number of solid contributions is still the best way to grow your team, I’d say.

I’m still strongly in the “no bots” camp. Getting a low-quality reply from a bot or having a bot close your contribution as stale is still the most common way to really turn me off from ever wanting to contribute again to some project. So yeah you may filter out low-quality newcomers, but you’ll also reject promising ones.

I don’t have a clear answer to this question, but I feel clear guidance in contributor docs about what contributors can roughly expect, and what to do when those expectations aren’t met, should be a key part of any answer.

3 Likes

Nothing to do with AI but related to the question of this thread

My opinion about AI reviewers is that they take some amount of effort and setup in order to pay attention to the things that you care about in the project.

If you go through this effort in order to make the reviews good, that raises a question: why not use it on all contributions, not just AI-written ones?

I partially agree. On one hand, if I get a review from an LLM, it makes me feel… ignored? It’s like when you’re on hold, and the hold music stops to say, “Your call is very important to us. We are experiencing higher than usual call volumes.” It’s a communication that communicates the opposite of what it literally says.

On the other hand, I think some kinds of review that I do really could be automated, e.g. “This commit message doesn’t follow our commit message policy. For future reference, I would suggest a commit message like …” Another example would be helping a new contributor set up a working development environment.
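To make that concrete, here is a rough sketch of the kind of commit message check I mean. The policy regex, the allowed prefixes, and the canned reply are made up for illustration - a real project would encode its own rules.

```python
import re
import subprocess

# Hypothetical policy: commit subjects must look like "TYPE: short description",
# e.g. "ENH: add support for sparse inputs". The prefixes are invented for this example.
POLICY = re.compile(r"^(API|BUG|DOC|ENH|MAINT|TST): .{10,72}$")

def check_commits(base: str = "origin/main") -> int:
    """Return the number of commit subjects on this branch that violate the policy."""
    subjects = subprocess.run(
        ["git", "log", "--format=%s", f"{base}..HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    violations = [s for s in subjects if not POLICY.match(s)]
    for subject in violations:
        # The canned reply a bot (or a busy reviewer) could post on the PR.
        print(
            f"Commit message {subject!r} doesn't follow our commit message policy. "
            "For future reference, I would suggest a message like 'ENH: <short summary>'."
        )
    return len(violations)

if __name__ == "__main__":
    raise SystemExit(check_commits())
```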

1 Like

We do both of these things without AI in Pixi! For the first one, a short workflow like pixi/.github/workflows/lint-pr.yml at main · prefix-dev/pixi · GitHub works.

For the second, you just need to use Pixi :slight_smile: which I am actively working on enabling in SciPy.

1 Like

Yeah, automated reviews don’t bother me much, AI or otherwise, but the most infuriating thing in the universe is when you go through the trouble of filing a bug, it’s completely ignored, and then three years later a bot closes it with a message like “There’s been no activity on this bug report, so we assume the bug has magically solved itself and is no longer an issue”.

2 Likes

I can offer the perspective of a contributor who used AI to try to contribute to scikit-image and networkx by optimizing their performance.

I understand that it’s ever easier to contribute new code with AI, and that this can increase the workload on reviewers. My perspective is that if the quality of the contribution is good and it helps the project, then encouraging it could be helpful. If you define a quality bar and the contributor meets the bar, then the contribution should be accepted. This could help expand the community and improve the package quality/features at an accelerated pace. AI reviewers + status checks can also help reduce some of the work.

1 Like

I would agree that the AI part isn’t the core issue as such[1], but rather a continuous drift and increased lopsidedness in the kind/style of contributions (e.g. we don’t do hacktoberfest because it was always mostly noise).

One example is that I think I recently see far more issues that literally nobody in the world actually cares about or ever will care about! Not even the poster; they just have some tooling.[2]
Honestly, if I knew it was an automated tool, I would be tempted to just close such issues with a single sentence (or a PR, if I don’t care enough) and at least ask for them to be bunched up into one large issue. But without knowing, it feels awfully unfriendly…
Of course these are often real issues, but unintentionally or not they waste time just by existing, needing triage, and cluttering the issue tracker.

I agree that a large part of this is the perceived effort imbalance. It’s frustrating to spend more time getting a PR done than the contributor did, and that for contributions where it’s not even clear the contributor has a learning experience.

But I am not sure that tooling up helps much? A clearer “checklist” (which could be AI assisted/reviewed) before even opening an issue might help a little? (I.e. it tells you “this is missing a test, a PR without one may not be reviewed” or “the issue doesn’t say how it affects you, without that it may be considered non-relevant”.)
Since we don’t have tooling for that, I suppose it might also still be OK for a first PR/issue, but it would have to be very clear that it is a bot and that it’s basically OK to ignore.
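Purely as an illustration, something like the sketch below is what I have in mind - the required sections and the wording are invented, and whether it runs as a CI job or a bot replying to the issue is left open.

```python
# Hypothetical pre-triage checklist; the section names and hints are made up for this example.
REQUIRED_SECTIONS = {
    "Reproducing code example": "a PR/issue without a reproducer may not be reviewed",
    "How this affects you": "without this the issue may be considered non-relevant",
}

def checklist_reply(body: str) -> str | None:
    """Return a clearly bot-authored reply listing missing sections, or None if all are present."""
    missing = [
        f"- Missing '{section}': {hint}."
        for section, hint in REQUIRED_SECTIONS.items()
        if section.lower() not in body.lower()
    ]
    if not missing:
        return None
    return (
        "Automated checklist (it is OK to ignore this, a human will still look at your issue):\n"
        + "\n".join(missing)
    )
```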

Or maybe I just need to get better at posting: “Look, this isn’t super high quality and it’s about something nobody cares about… Sorry, but don’t be too surprised if it just gets closed in a few months?!” While, I guess, at the same time trying to be more encouraging to good (especially second-time?) contributors?


  1. Some projects have concerns about AI contributions, but those are not what this discussion is about. ↩︎

  2. Similar things have been going on for years with e.g. bogus security issues, up to CVEs… I wouldn’t be surprised if the ratio of actual security issues is higher on the normal issue tracker than through “security issue reporting” channels. ↩︎

1 Like

A few weeks (months?) have passed. In the meantime I’ve started (and stopped and started again) using AI to work on projects. Jump to the end for my thoughts; read the middle waffle bit if you want to hear about my adventures using AI to build more and more complex things.


One thing I’m learning is that AI tools are eager to do things, and lots of them. I used to joke that an AI tool is like a summer student, but I increasingly think this is pretty accurate. They are smart, have some knowledge of the field, are very motivated, and have lots of time to do things (no meetings, etc.). They will explore all sorts of avenues, some more promising, some less. And you can guide them by providing constraints.

AI tools seem to be very much like this. They perform much better for tasks where you have constraints that keep them on the straight and narrow - for example, if you have an existing implementation that you can declare to be “the truth”.

The first project I tackled, where I had no chance of evaluating the “code quality”, was creating a small web app, Swiss Alpine Maps - it gives me something I’ve wanted for years but never started because HTML+JS is quite foreign to me. Here the constraints I specified were “single file, load JS from a CDN, no build steps, no React” - lots of things not to do. Inspired by reading Useful patterns for building HTML tools.

Next on my list of experiments was creating a random forest implementation in pure Python, using only PyTorch and its tricks/tools/infrastructure to achieve good performance. Here the constraints are provided by the existing scikit-learn implementation: you can compare to it, use it as a benchmark, etc. After a few afternoons of working on this I think I have an implementation that is mostly correct (I made the mistake of adding binning as a feature, so you can’t expect 1:1 results with scikit-learn - a learning for the future), about 5-10x slower than scikit-learn on a CUDA GPU, but it uses an outrageous amount of memory. What I learnt from this is that there aren’t enough constraints and architecture specifications from me to help the AI make progress once the basics are there. At the moment, when tasked with improving memory or runtime performance, it benchmarks, optimises, gives up, again and again. Each time it starts the loop it basically decides the same thing is the problem, tries the same fix, and learns that this doesn’t fix it. To make progress here I think I need to spend more time setting up constraints, ways to make notes from learnings, etc. (I’ve not published it yet; I might do that as a way to preserve one of my early attempts at using AI to do something complicated, even if it is ultimately useless.)
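For context, the “existing implementation as the truth” constraint boils down to a parity check like the one below. The torch_forest module is hypothetical - it just stands in for whatever the AI produced - and the tolerance is something I picked for illustration, not a principled threshold.

```python
# Sketch of using scikit-learn as "the truth" for an AI-written implementation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

import torch_forest  # hypothetical module containing the AI-generated PyTorch implementation

X, y = make_classification(n_samples=20_000, n_features=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reference = RandomForestClassifier(n_estimators=100, random_state=0)
reference.fit(X_train, y_train)

candidate = torch_forest.RandomForestClassifier(n_estimators=100, random_state=0)
candidate.fit(X_train, y_train)

# Because of the extra binning step, exact 1:1 agreement with scikit-learn isn't expected;
# instead require the accuracy to stay within a small tolerance of the reference.
ref_acc = reference.score(X_test, y_test)
cand_acc = candidate.score(X_test, y_test)
assert cand_acc >= ref_acc - 0.01, (ref_acc, cand_acc)
```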

Based on this learning experience I spent some time creating (with the help of AI) design documents for how array API support in scikit-learn works: what patterns we have, what the anti-patterns are, what the testing strategy is, that we do not accept performance regressions for NumPy, etc. Most of these docs are great, also for humans; some of the things in them needed fixing. After doing this I spent about a day working with the AI to convert the GaussianProcessRegressor to have array API support. The changes so far look reasonable, and I don’t think I could have done it myself in under a day (especially given that I didn’t pay super close attention to this task but attended meetings, etc. at the same time - deep work this was not). Right now I’m in the phase of establishing performance, and from that I’ll decide what to do next. I have learnt that the AI is pleased with its work when it works, but that this is not the end of the road - I want the torch CUDA version to be faster, or at the very least not slower. And no regression in performance.
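For those curious what “establishing performance” looks like in practice, here is a rough sketch of the comparison I’m running. It assumes the (not yet merged) GaussianProcessRegressor array API changes are applied locally and a CUDA-capable PyTorch install, and it relies on scikit-learn’s array_api_dispatch config option; the data sizes are arbitrary.

```python
# Rough sketch of the NumPy vs. torch-CUDA timing comparison, assuming the
# GaussianProcessRegressor array API changes are applied to the local checkout.
from time import perf_counter

import numpy as np
import torch
from sklearn import config_context
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
X_np = rng.standard_normal((2_000, 10))
y_np = rng.standard_normal(2_000)

def timed_fit(X, y):
    gpr = GaussianProcessRegressor()
    start = perf_counter()
    gpr.fit(X, y)
    return perf_counter() - start

numpy_time = timed_fit(X_np, y_np)

X_torch = torch.as_tensor(X_np, device="cuda")
y_torch = torch.as_tensor(y_np, device="cuda")
with config_context(array_api_dispatch=True):
    torch_time = timed_fit(X_torch, y_torch)

# The goal: torch on CUDA should be faster, or at the very least not slower.
print(f"NumPy fit: {numpy_time:.2f}s, torch CUDA fit: {torch_time:.2f}s")
```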


TL;DR: AI works a lot better if you provide architecture and design constraints - explicit instructions on patterns and anti-patterns. You can store these as Markdown documents in your repository. Creating them takes a bit of time and care, but they save you a lot of time in the long run.

Should we spend time adding this stuff to our projects? Is it worth the maintenance effort? Does it make it easier to use AI to make contributions to the project? Is it akin to having a linter set up for contributors to “just use”?