I had a quick check and it looks like Microsoft doesn't have an automatic clean-up bot running, which is really nice to see.
Compare it to the Istio project, which just deletes your issue if someone doesn't respond to it after 28 days. Despite the issue often being quite real.
Also I wonder how these larger projects will fare once Github rolls out the Discussions feature.
Auto cleanup bots are one of my biggest pet peeves in recent years. As another commenter mentioned, angular is the worst offender for me. I've seen countless _real_ issues get closed over the years because of inactivity by core maintainers.
The only reason I knew those github issues existed in the first place was because I wasted hours (sometimes days) of my life looking into an issue only to eventually stumble upon them and realize it's a bug and I'm not "holding it wrong". Good luck to the next person that goes spelunking on the internet for help on those issues.
If only there was a way you could mark an issue as "old, but still possibly relevant." Perhaps by adding a label, or something.... Doesn't all issue tracking software support either labels, tags, or custom statuses beyond "open, todo, in progress, closed"?
I think having a bot tag, but not close, issues over a certain age makes a lot of sense.
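For the sake of illustration, here is a minimal sketch of what such a tagging-only bot could look like against the GitHub REST API. The repository, token, 180-day cutoff, and "stale" label are all placeholder assumptions; a real project would more likely use an off-the-shelf GitHub Action than a raw script like this.

```python
# Sketch: label (but never close) open issues that haven't been updated in a while.
# OWNER, REPO, TOKEN, the cutoff, and the "stale" label are placeholders.
from datetime import datetime, timedelta, timezone

import requests

OWNER, REPO = "example-org", "example-repo"    # hypothetical repository
TOKEN = "ghp_..."                              # personal access token (placeholder)
API = f"https://api.github.com/repos/{OWNER}/{REPO}"
HEADERS = {"Authorization": f"token {TOKEN}", "Accept": "application/vnd.github+json"}

cutoff = datetime.now(timezone.utc) - timedelta(days=180)

page = 1
while True:
    resp = requests.get(f"{API}/issues",
                        params={"state": "open", "per_page": 100, "page": page},
                        headers=HEADERS)
    resp.raise_for_status()
    issues = resp.json()
    if not issues:
        break
    for issue in issues:
        if "pull_request" in issue:            # the issues endpoint also returns PRs
            continue
        updated = datetime.fromisoformat(issue["updated_at"].replace("Z", "+00:00"))
        labels = {label["name"] for label in issue["labels"]}
        if updated < cutoff and "stale" not in labels:
            # Tag it so triage views can filter it out -- but leave it open.
            requests.post(f"{API}/issues/{issue['number']}/labels",
                          json={"labels": ["stale"]}, headers=HEADERS).raise_for_status()
    page += 1
```

A new comment on a tagged issue could then simply remove the label, instead of forcing anyone to argue for a reopen.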
The @angular repositories do that too... The worst are the "lock bots", which prevent any new information from being added to the issue... which is super annoying... But it must be hard for a maintainer to deal with the notifications.
Yeah, but sweeping the issue under the rug is not the way to go.
Sure, you have closed the issue because you did not pay attention to it for the last year. But the problem is still there. So the next person who stumbles upon the old, locked issue will now open a new one. Now you have two issues.
Seems they do have automatic cleanup, but it's not documented: if the label "needs more info" was added more than a week ago, they close the issue.
The quest for the perfect issue grooming continues.
I think this is perfectly acceptable. If the user hasn’t demonstrated a reproducible problem and more information is required, the user has a week to submit said information. If not, the issue is cleaned up automatically. I don’t see why the maintainers should have to manually revisit each ticket if the OP fails to respond.
And auto-closing doesn’t mean the issue can’t be reopened. If OP posts additional information after the grace period, the maintainers will be notified and the ticket can be reopened.
Yeah, guess we all have different processes for dealing with reports/GitHub Issues. For me, a closed issue signals a closure of some sort, either that it's not accurate, been fixed, or any other reason. Simply that time has passed does not count as "resolution" for me.
Maybe another status "stale" would be the solution here. That way, maintainers would be able to focus on "active" issues while the people who reported the issue could still add information to the issue later on to move it from "stale" back to "open" / "active".
I'd reverse it and say you should have a status for when something has been discussed and the implementation path is clear: "Ready for development" or something like that. Then people who are looking for things to develop can filter by that. Anything that is not "Ready for development" either needs triaging or another type of closure.
An automatic clean-up bot could help but is usually not thought through (as numerous examples in this thread describe).
There are two common cases where some automation can help:
- User reports bug, dev explains that "it's not a bug, here's why, xxx" (or otherwise provides a satisfactory response) and user, even if satisfied, fails to return to close bug. So instead devs have to remember to close the bug which they usually do immediately after responding, which comes off as abrupt, even though it's not meant that way.
- User reports bug, dev responds with request for more info ("I can't duplicate; could you provide a better test case?"), user never responds.
In both cases it's OK to time out these old reports (which are essentially clutter). You could provide a way for the dev to flag this as one not to be auto-cleaned up.
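To make that concrete, the decision could boil down to something like the check below. The label names ("answered", "needs-more-info", "keep-open") and the 60-day window are made up for illustration, not any project's real policy.

```python
# Sketch of the timeout policy described above. Label names and the
# 60-day window are illustrative assumptions only.
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=60)

def should_auto_close(labels: set, last_activity: datetime,
                      last_comment_by_maintainer: bool) -> bool:
    """Only time out reports the maintainers are waiting on, never flagged ones."""
    if "keep-open" in labels:                    # dev flagged it: never auto-close
        return False
    waiting_on_reporter = (
        ("answered" in labels or "needs-more-info" in labels)
        and last_comment_by_maintainer           # the ball is in the reporter's court
    )
    stale = datetime.now(timezone.utc) - last_activity > STALE_AFTER
    return waiting_on_reporter and stale
```

That way the bot only ever fires when the last word was the maintainer's, which covers both cases above without closing reports nobody has responded to.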
Another improvement, while I'm at it: provide a Godbolt-like way to submit a test case that can be auto-run (i.e. a CI case). Then the bug could be auto-closed if it's in the current release but already fixed in the dev sources. Not all bugs or applications could take advantage of this, of course.
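As a rough sketch of how that could work (the build paths, the repro-script convention, and the label names are all assumptions, not an existing service): the reporter submits a script that exits non-zero while the bug is present, CI runs it against both the released build and the dev build, and the result drives the triage decision.

```python
# Sketch: run a reporter-submitted repro script against two builds and decide
# what to do with the report. Build paths and the exit-code convention are
# hypothetical; exit code 0 means "bug not reproduced".
import subprocess

def reproduces(build_path: str, repro_script: str) -> bool:
    """Return True if the repro script fails, i.e. the bug is present."""
    result = subprocess.run([build_path, repro_script], timeout=300)
    return result.returncode != 0

def triage(release_build: str, dev_build: str, repro_script: str) -> str:
    if not reproduces(release_build, repro_script):
        return "cannot-reproduce"   # likely environment-specific; ask for more info
    if not reproduces(dev_build, repro_script):
        return "fixed-in-dev"       # candidate for auto-closing with an explanatory note
    return "confirmed"              # real, still-open bug: keep it
```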
Microsoft's automatically closed tickets are a running gag with my development team. They are pathetic: problems don't go away because you ignore them.
I remember when the issue tracker for TypeScript or VSCode was still relatively new, the issue quality was much better. Now if you look at their issues, it's like trying to wade through spam, I hope they have specialized people who don't get demoralized from reading the issue list.
For example, in just the last hour there was an issue titled "Slow performance" with basically a list of extensions and OS info. To track that one down, they would have to test each and every extension listed, and still might not have a good understanding of what is going on.
The bigger your OS project gets, the more you have a need for help curating and triaging issues; you need a team of customer support, basically. And patience. So much patience.
I would close the issue because the OP didn't search enough. An MCVE should be provided with steps to reproduce. The OP should do the work of testing each extension one by one.
This is one area where I feel github is really letting us down.
Everything ends up in a giant "bag" of 1000+ "issues". While you can go into issues and sort by a given label, this has to be done actively and constantly kept up to date.
I'd prefer something more like an 'inbox', where issues can be filed into different categories which moves them out of the 'new' section. This is what most large projects are doing anyway with labels.
A large part of this comes down to the venue. The early part of GitHub really exemplified the "move fast and break things" mentality. GitHub broke the software world's ability to grapple with bugs, both in affordances i.e. the tools that it exposes for managing them, and, it would seem, cognitively. The way that GitHub approached bug tracking is one of the most frustrating examples of throwing away all progress just to start over from scratch and ignore everything that came before.
A majority of GitHub's users' first experience interacting with a bug tracker was probably on GitHub, so they never really knew any better. The rest seem to be experiencing some collective amnesia about how to effectively file and otherwise triage/manage bugs. Basically none of the issues you stumble upon start with a clear and anodyne set of steps to reproduce. Every "issue" is a conversation. (And this seems to be not only tolerated but encouraged. It's madness.) About half are support requests, maybe more. Even when GitHub is used "correctly" to file bona fide bugs, on the whole, project maintainers seem to treat it as a general intake area for everyone else to file issues, and the project authors themselves hardly use it as their own database for known bugs. They're all jotting down vague descriptions in a text file that stays on their local machine or something, who knows.
The entire phenomenon and attitude has to be one of the top 5 most annoying things about software development in 2020.
You are right, but I'd myself prefer something folder-based. You could use the same argument to say mail clients don't need explicit folders, but most people seem to want them :)
You can do this with GitHub Projects (the Trello-like interface). For my personal projects, I use GitHub Issues as a dumping ground for ideas, which automatically go into a "Triage" (inbox) column, which I go through and assign labels and priorities every week or month depending on the project.
The downside with GitHub Projects is that you can't automate based on labels, so the issues need to be organised into columns manually if you have more than the simple one-board To-Do/In-Progress/Done setup. Though search and filters help slightly with that.
> I'd prefer something more like an 'inbox', where issues can be filed into different categories which moves them out of the 'new' section. This is what most large projects are doing anyway with labels.
I definitely prefer this over trackers that for some inexplicable reason have different flows for bugs, requests, API comments, features and what not. The "inbox" you speak of already exists: unlabelled issues are your inbox. :)
I can understand why so many issues get created. I've had to read through quite a few node.js debugging issues because they've made changes to the feature.
Before, I could just do `node --inspect index.js` and get the debugger to auto attach with my node version set using NVM. There are a few different flags to try. I ended up wasting a few hours just to get debugging working again. Now I have to actually set a launch profile (I've got one which works), but wading through all the issues because they've changed what is, to me, a fundamental feature was just plain frustrating.
There is _so_ much churn and so many worthless changes in the JS ecosystem. AFAIK it's the only ecosystem that has this much churn, which proves that it's not necessary.
My best example of this is the fact that I can no longer `yarn install <package>`. Instead, yarn informs me, the command is now `yarn add <package>`. For god's sake.
Every time I upgrade or build a project more than a week old, I'm virtually guaranteed to have a deprecation warning somewhere in my stack; that's infuriating too.
> AFAIK it's the only ecosystem that has this much churn, which proves that it's not necessary.
Ah, it's funny what people qualify as "proof" these days. JavaScript is in the unique position of being the only language integrated into browsers; surely you recognize that this is a big factor in the ecosystem?
What do "dev question" and "user question" mean? Are these support questions that get asked using a GitHub Issue as the medium, rather than bug reports?
The VSCode team recommends this as a way to get help. They used to have multiple channels, e.g. UserVoice, but closed them all in favour of just using GitHub Issues. I can't see anything obviously wrong with that.
Institutional memory. If you work on a project long enough, you tend to remember these things. You may not recall the exact issue id, obviously, but you know you've seen something like this and there's a limited number of keywords you have to search to get there.
Nobody does this in an N^2 way.
See simhash, minhash, etc.
Trivial faster-than-N^2 algorithm:
1. Compute simhash of each issue - O(1) in N, the number of issues.
2. Sort the issues by hamming distance of simhash (N log N in the number of issues)
3. Pick the K issues before or after the current issue in the results (O(1) in the number of issues)
You can precompute/incrementally update any of these steps as issues are added. This would make find duplicates itself O(1) because it would just be step 3.
If you put this in terms of the size of the text in the issues instead of the number of issues (which is what you used), it changes the time bounds, but, for example, it's not any worse than sorting/comparing the issues as text strings.
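For what it's worth, here is a rough sketch of that pipeline in Python, using a plain 64-bit simhash over whitespace tokens. The tokenisation and the sort-then-scan-neighbours step are simplifications; production systems usually add a block-permutation trick so step 3 catches more near-duplicates.

```python
# Sketch of the simhash near-duplicate search outlined above. The tokenisation
# and single sort order are simplifications of what a real system would do.
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    counts = [0] * bits
    for token in text.lower().split():
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, c in enumerate(counts) if c > 0)

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def near_duplicates(issues: dict, issue_id: int, k: int = 5) -> list:
    """Step 2: sort by fingerprint; step 3: rank the K neighbours on each side."""
    fingerprints = {i: simhash(text) for i, text in issues.items()}   # precomputable
    ranked = sorted(issues, key=lambda i: fingerprints[i])
    pos = ranked.index(issue_id)
    window = ranked[max(0, pos - k): pos + k + 1]
    return sorted((i for i in window if i != issue_id),
                  key=lambda i: hamming(fingerprints[i], fingerprints[issue_id]))
```

With the fingerprints stored and the sorted order maintained incrementally as issues arrive, the per-issue lookup is just the window scan, which matches the O(1) claim for step 3.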
You're not finding duplicates within open issues, you must find them among ALL issues. There are 20 times more, making that 400 times more pairs (10 billion).
Hopefully they can edit or label issues in such a way that helps with searching for duplicates; older issues become a knowledge base / documentation. IIRC it's what Stack Overflow tried to do, turn questions into wiki pages / references.
The problem is that projects like VSCode are racking up these bugs and letting them snowball. It's not that they can't sort them out with 30+ full-time professional QAs at Microsoft.
Once upon a time, GNOME had 300 bugs in the Evolution Bugzilla, and it was felt to be "the end of the world."
What they need to do is just what GNOME projects did a decade+ ago: make a really hard, like HAAAARD!, feature freeze, and keep doing a long series of "housekeeping" releases as long as needed before unfreezing work on features.
The wiki is a great reference, but searches for the word "grooming" on Bing [¹], Google [²] and DuckDuckGo [³] all lead to references to pedophilia.
The definition of that word is changing and not accepting that fact looks dated. Might just be my opinion, feel free to comment if you disagree.
Just because others are using a word for a different meaning doesn’t mean you should abandon its original meaning and just give up on it.
Grooming is not an irregular word. The act of grooming oneself is hopefully a regular activity. Brushing your teeth, trimming your nails, getting hair cuts, etc. The word is also regularly used with pets.
IMO this is like when, a while back, all the news sites were talking about how the OK hand sign was a white supremacist symbol. If everyone had stopped using it, then perhaps it would still be seen as a symbol for that. But everyone thought that was dumb and kept using it the way it always has been, and that new association was lost.
Dozens and dozens of terms in our industry have "other" meanings.
For example it is common to talk about "pegging" a server -- a reference to how physical speedometers and other such gauges used to work. "Pegging" has a more recent sexual meaning as well.
And then of course there is "penetration" testing in the security sphere.
And so on and so on. It's fine. There is no conflict.
I remember hearing and reading "pegged" in an IT context back in the 1990s, at least 10-15 years before the sex term seemed to gain widespread usage. It was definitely used in entirely work-safe contexts where one would never dream of making anal sex references.
Of course, now that you mention it, I think your belief is likely common enough to make me reconsider its use! I think a lot of folks hearing "pegging" in 2020 probably think of the sex practice before they think of speedometers -- a lot of people don't have cars, and many that do never notice the little pegs.
However, I do think "pegging" (a somewhat obscure IT jargon term, with obscure etymology) is in a different situation than "grooming," an incredibly common everyday word. Retiring a bit of jargon is entirely different than retiring an everyday word.
And the problem is that we do cater to the 1%. See GitHub trying to replace the master branch with a main branch, or Python replacing the master-slave terminology, and so forth. These are just two examples; there are billions. There is a guy (or a bot) who has opened over 3k issues in projects about the master-slave terminology change[1]. GitHub is doing nothing against it, and yes, they know about it. You could replace GitHub with loads of other major companies and whatnot; they are not only allowing it, but encouraging it. They are siding with the 1%, and you can clearly see the effects.
[1] https://github.com/bopopescu (check out his "contribution" activity: "Joined GitHub", and then "Opened 3,226 other pull requests in 3,208 repositories")
Yes github actually fucked me up with the whole master/main thing the other day. I was trying to push a local repo I had been working on for a few days to a new github repo. Cost me an hour of head scratching until I realised what the deal was. I've defaulted all my repos back to "master" in line with git CLI which is the single source of truth.
Github is on my personal shitlist for a number of reasons though and this is nowhere near #1.