Wikipedia:Bots/Requests for approval/Hazard-Bot 34
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Hazard-SJ (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 03:27, Monday, December 28, 2015 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: GitHub
Function overview: Updates the list of common mistakes for WikiProject Fix common mistakes
Links to relevant discussions (where appropriate): Wikipedia:Bot requests#Update the lists at WikiProject Fix common mistakes, Wikipedia:Bot requests/Archive 63#Bot to updated lists at WikiProject Fix common mistakes
Edit period(s): Perhaps monthly
Estimated number of pages affected: 24, plus possibly the log table to make 25
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details: At least for now, I'm planning to manually trigger this task whenever I'm aware of an updated dump (and it's available on Tool Labs). The script will go through all articles, searching for the common mistakes checked by the WikiProject (currently those listed at Wikipedia:WikiProject Fix common mistakes#Log, though it's flexible enough to change). This task is only about updating the lists, not fixing the "mistakes" (if they are indeed mistakes, that is). To reduce false positives, I'll be searching for the mistakes with both a leading and a trailing space in the text. Also, I'm making available a page (possibly User:Hazard-Bot/Common mistakes blacklist, though I'm open to alternatives) to list pages to exclude from the lists (there definitely will always be false positives, so this will be a means of avoiding the same set of recurring pages on the lists on every run). Hopefully this is straightforward enough. Now to notify the WikiProject of their late Christmas present. Hazard SJ 03:27, 28 December 2015 (UTC)
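The space-padded search and the page exclusion list described in the function details can be illustrated with a minimal sketch. It is not taken from the bot's source: the function name, the parameter names, and the shape of the input (an iterable of title and text pairs) are all assumptions made for illustration.

```python
# Hypothetical sketch; names and data shapes are illustrative assumptions,
# not the bot's actual code.

def find_mistakes(pages, mistakes, excluded_titles=frozenset()):
    """Map each mistake to the titles of pages whose text contains it.

    `pages` is an iterable of (title, text) pairs, e.g. read from a dump.
    """
    hits = {mistake: [] for mistake in mistakes}
    for title, text in pages:
        if title in excluded_titles:
            continue  # recurring false positives listed on the exclusion page
        for mistake in mistakes:
            # Require a space on both sides, so that "a a" is not found
            # inside "Alabama a state".
            if f" {mistake} " in text:
                hits[mistake].append(title)
    return hits
```

For example, calling `find_mistakes(pages, ["a a", "an a"], excluded_titles={"Some page"})` would skip "Some page" entirely and report the remaining pages per mistake.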
Discussion
Approved for trial (50 edits or 30 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Pretty straightforward, give it a test, report back results. — xaosflux Talk 04:08, 28 December 2015 (UTC)
Thank you Hazard-SJ for my Christmas present. I'm the poor sap who currently does this manually, so I wholeheartedly approve of this. I also add a leading and a trailing space when creating the list, so Hazard is doing the same things I currently do. I use "blacklists" on CheckWikipedia. However, I haven't used a blacklist for Fix Common Mistakes, but this is a good idea. I would put the blacklist under the project's subpage. Thank you again. Bgwhite (talk) 06:43, 28 December 2015 (UTC)
- @Bgwhite: you're welcome. I also realize I should have used the name "whitelist" as opposed to "blacklist". Anyways, how does Wikipedia:WikiProject Fix common mistakes/Whitelist or Wikipedia:WikiProject Fix common mistakes/Whitelisted pages sound? Let me know what you want. Also, I see that @JJMC89: got a quick start on Wikipedia:WikiProject Fix common mistakes/a a from the first batch of lists. How were they? Hazard SJ 14:50, 28 December 2015 (UTC)
- @Bgwhite: I should probably automate this too. Hazard SJ 15:03, 28 December 2015 (UTC)
- I think the simple ones (both words the same) should be okay. I started working on an a and I've noticed some regular phrases that should probably be excluded: an a cappella, an a priori, an a la carte, an a fortiori, and an a posteriori. (I only saw the last two a couple of times.) I remember seeing some pages with math/formulas/code that would always show up in the dump but not necessarily have an error, but I didn't take note of them. I'll try to keep track of new ones that I run into. — JJMC89 (T·C) 15:38, 28 December 2015 (UTC)
- @Bgwhite and JJMC89: After a little talk with The Earwig, I've made some changes. The set of mistakes to be checked can now be configured from User:Hazard-Bot/FIX/Scan configuration. Each level 2 heading identifies the mistake, then unordered lists (* ...) within the sections can identify exceptions. I've filled in the mistakes, and added some of the "an a" exceptions (feel free to update them). As for the previous "blacklist", I've corrected the name to "whitelist", and it's now at User:Hazard-Bot/FIX/Whitelisted pages. This would probably come in handy for perhaps pages with quotes that contain errors, or whatever else the case may be, since specific phrases can be directly included within the configuration page. Again, moving those pages to subpages of the WikiProject is perfectly fine (possibly Wikipedia:WikiProject Fix common mistakes/Scan configuration and Wikipedia:WikiProject Fix common mistakes/Whitelisted pages?), and we'll probably also want to add some protection to minimize tampering. Since I just did a scan and don't have the next dump as yet, I'll hold off on the next run for a bit. (P.S. @The Earwig: I tried using \b instead of the spaces, but that also included -, creating multiple false positives, and I was unable to figure out how to exclude that single character.) Hazard SJ 08:16, 29 December 2015 (UTC)
- The next enwiki dump is currently being generated, so I'll hopefully be able to run that within the next few days. Hazard SJ 06:46, 22 January 2016 (UTC)
- @Bgwhite: but we had so much fun! No? (Kidding.)
- @Hazard-SJ: that is AWESOME. I'm typically only working on here once a month or so (Great Userbox War etc., don't ask) so I just saw this. But I repeat - AWESOME! Anything that can improve WP:FIX, I'm all for it. Thank you so much! Let me know how it works out! Sct72 (talk) 00:38, 23 January 2016 (UTC)
- @Bgwhite, JJMC89, and Xaosflux: The next batch is out (January 2016 dump)! Hazard SJ 17:04, 3 February 2016 (UTC)
- Hazard-SJ (Gene Rayburn) Hazard is soooooo slow. (crowd) How slow is he? (Gene Rayburn) A <blank> beat Hazard in a 100m "dash".
- The February dump started today. Should be ready in a couple of days. To be fair, January's was late in starting up, but that won't stop me from giving you a hard time :) Bgwhite (talk) 21:30, 3 February 2016 (UTC)
- And the February dump still hasn't been completed :) Hazard SJ 07:35, 13 February 2016 (UTC)
- @Hazard-SJ: Are we good to go here? — Earwig talk 22:15, 3 February 2016 (UTC)
- @The Earwig: Possibly, I haven't encountered any problems so far, and the edits are to a limited set of pages (which can be controlled by Wikipedia:WikiProject Fix common mistakes/Scan configuration). It would be nice to have that page, as well as Wikipedia:WikiProject Fix common mistakes/Whitelisted pages, furnished with a silver lock (semi-protection). Additionally, a new set of suggestions came by, which I addressed below, so hopefully we're good there as well. Hazard SJ 07:35, 13 February 2016 (UTC)
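The configuration layout described in the thread above (a level 2 heading per mistake, bullets for exceptions) and the \b problem with hyphens can be sketched roughly as follows. This is an illustration under assumptions, not the bot's code: the sample page text, the function names, and the use of lookarounds in place of literal spaces are all hypothetical, and exceptions are assumed to begin with the mistake itself (as "an a cappella" begins with "an a").

```python
import re

# Assumed sample of the configuration page; the real page may differ.
CONFIG_TEXT = """\
== an a ==
* an a cappella
* an a priori

== a a ==
"""

def parse_config(wikitext):
    """Return {mistake: [exception phrases]} from level 2 headings and bullets."""
    config = {}
    current = None
    for line in wikitext.splitlines():
        # Matches "== text ==" but not "=== text ===".
        heading = re.fullmatch(r"==\s*([^=].*?)\s*==\s*", line)
        if heading:
            current = heading.group(1)
            config[current] = []
        elif line.startswith("*") and current is not None:
            config[current].append(line.lstrip("*").strip())
    return config

def count_hits(text, mistake, exceptions):
    """Count occurrences of `mistake` that do not start an exception phrase."""
    # One possible alternative to \b: the phrase must not touch a word
    # character or a hyphen on either side.
    pattern = re.compile(r"(?<![\w-])" + re.escape(mistake) + r"(?![\w-])")
    hits = 0
    for match in pattern.finditer(text):
        if not any(text.startswith(exc, match.start()) for exc in exceptions):
            hits += 1
    return hits
```

Note that the lookarounds are not equivalent to the space-padded search the bot settled on: they also match at the start of the text and next to punctuation, so whether they reduce or add false positives would need checking against a real dump.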
Looking good. If possible, it may be good to not match inside comments (<!--...-->), some tags (<score>...</score> [may have parameters inside the opening tag], <math>...</math>, <source>...</source>, <pre>...</pre>), and file names ([[File:(ignore me)|.*?]], |image=, etc.). — JJMC89 (T·C) 07:46, 5 February 2016 (UTC)
- @JJMC89: I didn't get |image=, but cb3970f should have covered the other things you requested. Hazard SJ 07:35, 13 February 2016 (UTC)
- Everything looks good to me. Thanks. — JJMC89 (T·C) 08:14, 13 February 2016 (UTC)
- Approved. — Earwig talk 03:40, 28 February 2016 (UTC)
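JJMC89's suggestion in the discussion above (skip comments, certain tags, and file names) might be approached by blanking those regions before searching. The sketch below is a guess at one way to do it, not a reproduction of commit cb3970f: the patterns and the function name are hypothetical, and |image= parameters are deliberately left unhandled, as in the thread.

```python
import re

# Regions to skip; these patterns are illustrative assumptions.
SKIP_PATTERNS = [
    r"<!--.*?-->",                                   # HTML comments
    r"<(score|math|source|pre)\b[^>]*>.*?</\1\s*>",  # tags, attributes allowed
    r"(?<=\[\[File:)[^|\]]*",                        # file name in [[File:...]]
]
SKIP_RE = re.compile("|".join(SKIP_PATTERNS), re.DOTALL | re.IGNORECASE)

def strip_skipped(text):
    """Replace skipped regions with a newline before searching for mistakes.

    A newline (rather than a space or nothing) keeps the text on either
    side of a removed region from joining into a new space-padded phrase.
    """
    return SKIP_RE.sub("\n", text)
```

Only the file name is blanked in a [[File:...]] link, so captions are still checked. Regex-based stripping like this does not handle nested or unclosed tags; a wikitext parser would be more robust.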
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.